Semiparametric Regression Based on Multiple Sources
|
|
- Richard Robbins
- 5 years ago
- Views:
Transcription
1 Semiparametric Regression Based on Multiple Sources Benjamin Kedem Department of Mathematics & Inst. for Systems Research University of Maryland, College Park Give me a place to stand and rest my lever on, and I can move the Earth, (Archimedes, B.C.) STAT Seminar, November 1, 2012
2 1. Cheng, K.F., and Chu, C.K. (2004), Semiparametric density estimation under a two-sample density ratio model, Bernoulli, 10(4), Fokianos, K. (2004). Merging information for semiparametric density estimation. Journal of the Royal Statistical Society, Series B, 66, Fokianos, K., Kedem, B., Qin, J., Short, D.A. (2001). A semiparametric approach to the one-way layout. Technometrics, 43, McGlynn K.A., Sakoda1 L.C., Rubertone M.V., Sesterhenn I.A., Lyu C., Graubard B.I., Erickson R.L. (2007). Body size, dairy consumption, puberty, and risk of testicular germ cell tumors. American Journal of Epidemiology, 165, Qin, J., Zhang, B. (1997). A goodness of fit test for logistic regression models based on case-control data. Biometrika, 84, Qin, J., and Zhang, B. (2005). Density Estimation Under A Two-Sample Semiparametric Model. Nonparametric Statistics, 1, Kedem, B., Kim, E-Y, Voulgaraki, A., and Graubard, B. (2009). Two-Dimensional Semiparametric Density Ratio Modeling of Testicular Germ Cell Data. Statistics in Medicine, 2009, 28, Voulgaraki, A., Kedem, B., and Graubard, B.(2012). "Semiparametric Regression in Testicular Germ Cell Data", Annals of Applied Statistics, 6, Supplement: DOI: /12-AOAS552SUPP
3 Abstract 1. Information is combined from multiple multivariate sources. 2. Density ratio model: Multivariate distributions are regressed" on a reference distribution. 3. Random cocariates. 4. Kernel density estimator from many data sources: more efficient than the traditional single-sample kernel density estimator. 5. Each multivariate distribution and the corresponding conditional expectation (regression) of interest are estimated from the combined data using all sources. 6. Graphical and quantitative diagnostic tools are suggested to assess model validity. 7. The method is applied in quantifying the effect of height and age on weight of germ cell testicular cancer patients. 8. Comparison with multiple regression, GAM, and nonparametric kernel regression.
4 Main Points Density Ratio Examples Semiparametric Statistical Formulation Combined semiparametric density estimators Some Simulation Results Application to Testicular Germ Cell Cancer Summary
5 Main Points Density Ratio Examples Semiparametric Statistical Formulation Combined semiparametric density estimators Some Simulation Results Application to Testicular Germ Cell Cancer Summary
6 Main Points Density Ratio Examples Semiparametric Statistical Formulation Combined semiparametric density estimators Some Simulation Results Application to Testicular Germ Cell Cancer Summary
7 Main Points Density Ratio Examples Semiparametric Statistical Formulation Combined semiparametric density estimators Some Simulation Results Application to Testicular Germ Cell Cancer Summary
8 Main Points Density Ratio Examples Semiparametric Statistical Formulation Combined semiparametric density estimators Some Simulation Results Application to Testicular Germ Cell Cancer Summary
9 Main Points Density Ratio Examples Semiparametric Statistical Formulation Combined semiparametric density estimators Some Simulation Results Application to Testicular Germ Cell Cancer Summary
10 Outline
11 TRMM sensors: Likely distortions of ground truth
12 Reference: Ground truth
13 Multiple filtering of a signal f 1 (ω) = H 1 (ω) 2 f (ω).. (1). f q (ω) = H q (ω) 2 f (ω) That is, q distortions or multiple tilting of the same reference spectral density f.
14 Linear system with Gaussian noise X t = αx t 1 + ɛ t, ɛ = (ɛ 1t,..., ɛ qt, ɛ mt ) ɛ jt g j N(0, σ 2 j ), j = 1,..., q, m. Choose g m g. We get many distortions of the same reference g: g 1 (x) = e α 1+β 1 x 2 g(x)... g q (x) = e αq+βqx2 g(x)
15 Analysis of variance Consider the classical one-way ANOVA with m = q + 1 independent normal random samples: x 11,..., x 1n1 g 1 (x)... x q1,..., x qnq g q (x) x m1,..., x mnm g m (x) g j (x) N( µ j, σ 2 ), j = 1,..., m.
16 Then, holding g m (x) g(x) as a reference: g 1 (x) = exp(α 1 + β 1 x)g(x)... g q (x) = exp(α q + β q x)g(x) α j = µ2 m µ 2 j 2 σ 2, β j = µ j µ m σ 2, j = 1,..., q
17 k-parameter exponential families { k } g(x, θ) = d(θ)s(x) exp c i (θ)t i (x) i=1 Let g j (x) g(x, θ j ), g(x) g(x, θ m ). Again, q distortions of a reference g(x): g 1 (x) = exp{α 1 + β 1 h(x)}g(x)... g q (x) = exp{α q + β qh(x)}g(x)
18 Case-control: Multinomial logistic regression (Prentice and Pyke 1979). RV y s.t. P(y = j) = π j, m j=1 π j = 1. Assume: For j = 1,..., m, and any h(x), P(y = j x) = exp( α j + β jh(x)) 1+ q k=1 exp( α k + β kh(x)) Define: f (x y = j) = g j (x), j = 1,..., m Then with α j = α j + log[ π m /π j ], j = 1,..., q, and g m g, g j (x) = exp( α j + β j h(x))g(x), j = 1,..., q.
19 Weighted Distributions Rao(1965) unified the concept of weighted distributions of the form: W (x; α)p(x; θ) p w (x; θ, α) = E[W (X; α)] where W (x; α) is known. This is an example of tilting of a reference distribution. By changing the weight W (x; α) we obtain different distortions of the same reference p(x; θ).
20 Biased sampling Length-biased Sampling: Vardi (1982) introduced y F(y) = 1/µ xdg(x), y 0, 0 where µ = 0 xdg(x) <. This is a tilt model. Biased Sampling/Selection Bias: Vardi (1985), and Gill, Vardi, Wellner (1988) considered the more general biased sampling model F(y) = W (G) 1 y w(x)dg(x), where w(x) is known and W (G) = w(x)dg(x). This is a tilt model. F, G are obtained by NPMLE.
21 Biased sampling Length-biased Sampling: Vardi (1982) introduced y F(y) = 1/µ xdg(x), y 0, 0 where µ = 0 xdg(x) <. This is a tilt model. Biased Sampling/Selection Bias: Vardi (1985), and Gill, Vardi, Wellner (1988) considered the more general biased sampling model F(y) = W (G) 1 y w(x)dg(x), where w(x) is known and W (G) = w(x)dg(x). This is a tilt model. F, G are obtained by NPMLE.
22 Comparison Distributions (Parzen 1977,...,2009) Sequence of cdf s: {F 1,..., F q } G, with cont. densities f 1,..., f q, g. Comparison Distributions are defined as: D j (u; G, F j ) = F j (G 1 (u)), 0 < u < 1, j = 1,..., q Then by differentiation, with x = G 1 (u): f 1 (x) = d(g(x); G, F 1 )g(x) f q (x) = d(g(x); G, F q )g(x)
23 A general structure emerges of a reference behavior (distribution) and its many distortions: g 1 = w 1 g g q = w q g How can we take advantage of this? Can the distorted information be of use? Can the distorted and reference information be fused as to improve the reference information? In other words, can the bad improve the good?
24 A general structure emerges of a reference behavior (distribution) and its many distortions: g 1 = w 1 g g q = w q g How can we take advantage of this? Can the distorted information be of use? Can the distorted and reference information be fused as to improve the reference information? In other words, can the bad improve the good?
25 A general structure emerges of a reference behavior (distribution) and its many distortions: g 1 = w 1 g g q = w q g How can we take advantage of this? Can the distorted information be of use? Can the distorted and reference information be fused as to improve the reference information? In other words, can the bad improve the good?
26 A general structure emerges of a reference behavior (distribution) and its many distortions: g 1 = w 1 g g q = w q g How can we take advantage of this? Can the distorted information be of use? Can the distorted and reference information be fused as to improve the reference information? In other words, can the bad improve the good?
27 A general structure emerges of a reference behavior (distribution) and its many distortions: g 1 = w 1 g g q = w q g How can we take advantage of this? Can the distorted information be of use? Can the distorted and reference information be fused as to improve the reference information? In other words, can the bad improve the good?
28 The relationship between a reference and its distortions or tilts opens the door to a useful general statistical approach based on fused or combined information from many sources. A case in point is the estimation of the conditional expectation from fused multivariate samples.
29 Outline
30 m = q + 1 independent p-dim multivariate samples: x ij = (x ij1, x ij2,..., x ijp ) g i (x 1,..., x p ), i = 1,..., q, m, j = 1,..., n i and x i1, x i2,..., x ini iid gi Reference or baseline probability density function (pdf): g g m (x) g m (x 1,..., x p )
31 Density ratio model: or equivalently g i (x) g(x) = w(x, θ i) (2) g i (x) = w(x, θ i )g(x) (3) 1. g i (x), g(x) are unknown. 2. w is a known positive and continuous function. 3. θ i are unknown d-dimensional parameters. 4. pdf s may be continuous or discrete. 5. Normal assumption is not made.
32 Problem: Use the combined data from all the samples x 11,..., x 1n1,...x m1,..., x mnm of size n = n n m to estimate all the parameters θ i, i = 1,..., q, and all the densities g 1,..., g q and g m = g.
33 G(x) G m (x), p ij = dg(x ij ) = dg m (x ij ). The empirical log-likelihood is given by: l = log L = m n i q n i log(p ij ) + log(w(x ij, θ i )) (4) i=1 j=1 i=1 j=1 subject to the constraints: p ij 0, m n i p ij = 1, i=1 j=1 m n i p ij w(x ij, θ k ) = 1 for k = 1,..., q. i=1 j=1 (5)
34 ˆp ij = 1 n 1 + q k=1 ˆµ k 1 [ w(x ij, ˆθ k ) 1 ], (6) Ĝ(x) = Ĝm(x) = m n i ˆp ij I(x ij x) (7) i=1 j=1 More generally, for l = 1,..., m and w(x ij, ˆθ m ) 1, Ĝ l (x) = m n i ˆp ij w(x ij, ˆθ l )I(x ij x) (8) i=1 j=1
35 Outline
36 Traditional single sample kernel density estimator: ˆf (x) = 1 nh p n n ( ) x xi K i=1 h n (9) h n is a sequence of bandwidths such that: 1. h n 0, nh p n as n. 2. K (x) is defined for p-dimensional x. 3 K (x) 0, symmetric around R p K (x)dx = 1. Under certain conditions, ˆf (x) is a consistent estimator of f (x) (Parzen 1962, Shao 2003).
37 Fokianos (2004), Cheng and Chu (2004), Qin and Zhang (2005), Voulgaraki, Kedem, Graubard (VKG) (2012), use the the probabilities ˆp ij instead of 1/n: ĝ l (x) = 1 h p n m n i ( ) x xij ˆp ij ŵ l (x ij )K i=1 j=1 h n (10)
38 VKG 2012: Assume that K ( ) is a nonnegative bounded symmetric function with K (x)dx = 1, x xk (x)dx = k 2 > 0. Assume that g l is continuous at x and twice differentiable in a neighborhood of x. If, as n, h n = O(n 1 4+p ), then ( nhn p ĝ l (x) g l (x) 1 2 h2 n u 2 g l (x ) ) D x x uk (u)du N(0, σ 2 (x)) where σ 2 (x) = w l(x)g l (x) m k=1 ζ kw k (x) K 2 (u)du for any fixed x.
39 The mean integrated square error (MISE) is defined as: ( ĝl MISE(ĝ l (x)) = E (x) g l (x) ) 2 dx (11) Minimizing the MISE with respect to h n, the optimal bandwidth is: hn (p/n) w l (x)g l (x)/ [ m k=1 = ζ kw k (x) ] dx K 2 (u)du ( ) 2 u ( 2 g l (x)/ x x )uk (u)du dx 1 4+p (12)
40 For sufficient large n, an alternative way to find h n is by minimizing h p n n n i=1 i =1 ˆp(t i )ŵ l (t i )ˆp(t i )ŵ l (t i ) K (z)k ( z + t i t i h n 2 ( xli x lj n l (n l 1)hn p K h n i j ) dz ). (13)
41 VKG 2012: If ˆf is the classical single-sample multivariate kernel density estimator of g l, then As n, h n 0 and nh p n, (a) AMISE(ĝ l ) AMISE(ˆf ) (b) Using optimal bandwidths, the proposed semiparametric density estimator ĝ l (x) is more efficient than ˆf (x), i.e for every l eff (ˆf, ĝ l ) AMISE (ĝ l ) AMISE (ˆf ) 1 where AMISE is the optimal AMISE.
42 Diagnostic Plots and Goodness of Fit Define ( ) } k Rα,k { 2 = 1 exp xα n i x α (14) n i : Sample size of ith sample. x α: The number of times the estimated semiparametric cdf falls in the estimated 1 α confidence interval obtained from the corresponding empirical cdf, both evaluated at the sample points. k > 0 abd α chosen by the user. R 2 α,k takes values between 0 and 1. Computing R 2 α,k is both simple and fast.
43 Qin and Zhang (1997): R3 2 = exp( n max G i Ĝi ) (15) Clearly, R3 2 takes values between 0 and 1.
44 Simulation Results: Ĝi vs. G i, i = 1, 2 Simulation ( 1: g ) 1 N((0, 0), Σ)) (case), g 2 N((0, 0), Σ)) (control), 4 2 Σ =, n = 40, n 2 = 30. Simulation 1: G1 Simulation 1: G2 Joint Sem. CDF case Joint Sem. ref. CDF control Joint Empcdf case Joint Empcdf control
45 Simulation ( 2: g ) 1 N((0, 0), Σ)) (case), g 2 N((1, 1), Σ)) (control) with 3 1 Σ =, n = 200, n 2 = 200. Simulation 2: G1 Simulation 2: G2 Joint Sem. CDF case Joint Sem. ref. CDF control Joint Empcdf case Joint Empcdf control
46 Simulation 3: g 1 from standard two dimensional Multivariate Cauchy (case) ( and g from two dimensional Multivariate Cauchy (control) with µ = (1, 1), V = 5 10 n 1 = 200, n 2 = 200. ), Simulation 3: G1 Simulation 3: G2 Joint Sem. CDF case Joint Sem. ref. CDF control Joint Empcdf case Joint Empcdf control
47 Simulation 4: g 1 from standard two dimensional Multivariate Cauchy (case) and g 2 from uniform distribution on the triangle (0, 0), (6, 0), ( 3, 4) (control), and n 1 = 200, n 2 = 200. Simulation 4: G1 Simulation 4: G2 Joint Sem. CDF case Joint Sem. ref. CDF control Joint Empcdf case Joint Empcdf control
48 Table: Comparison of R3 2 and R2.05,2 for 100 repetitions of case and control. Simulation Group R3 2 R.05,2 2 (1) Case Control (2) Case Control (3) Case Control (4) Case Control
49 Bandwidth (BW) selection using cross validation. Case Control Same BW Diff. BW s Same BW Diff. BW s h h 1 h 2 h h 1 h 2 Leave One Out Simulation Simulation Alternative (12) Simulation Simulation
50 Outline
51 Suppose we have m multivariate samples of p-dim vectors: (Random Covariates, Response) = (x 1,..., x (p 1), y) We wish to estimate m conditional expectations: E(y x 1,..., x (p 1) )
52 Data: i = 1,..., q, m, j = 1,..., n i, (x ij1, x ij2,..., x ij(p 1), y ij ) g i (x 1,..., x (p 1), y). Reference: Distortions: g g m (x 1,..., x (p 1), y) g i (x 1,..., x (p 1), y), i = 1,..., q
53 Model: As in ratio of pdf s from N(µ 1, Σ), N(µ 2, Σ) assume g i (x) g(x) = w(x, α i, β i ) exp(α i + β ix), i = 1,..., q (16) where x = (x 1,..., x (p 1), y) and β i = (β i1,..., β ip ).
54 Let t denote the vector of combined data of length n = n 1 + n n m. Following the method of constrained empirical likelihood we obtain score equations for ˆα j and ˆβ j : l α j = l β j = n i=1 n i=1 ρ j w j (t i ) 1 + ρ 1 w 1 (t i ) + + ρ q w q (t i ) + n j = 0 (17) ρ j w j (t i )t i 1 + ρ 1 w 1 (t i ) + + ρ q w q (t i ) + n j (x ji1,..., y ji ) (18) = 0 i=1 j = 1,..., q and ρ j = n j /n m w j (t i ) = exp(α j + β j t i)
55 ˆp i = 1 n m Ĝ(t) = 1 n m ρ 1 ŵ 1 (t i ) + + ρ q ŵ q (t i ) n I(t i t) 1 + ρ 1 ŵ 1 (t i ) + + ρ q ŵ q (t i ) i=1 (19) (20) where ŵ j (t i ) = exp(ˆα j + ˆβ jt i ) As n, the estimators ˆθ = (ˆα 1,, ˆα q, ˆβ 1,, ˆβ q ) are asymptotically normal (Qin and Zhang 1997, Lu 2007).
56 Under the p-dimensional density ratio model we can predict the response y given the covariate information x 1, x 2,..., x (p 1) for any of the m data sets as follows: n j ĝ j (x 1,..., x (p 1), y i ) Ê j (y x 1,..., x (p 1) ) = y i, j = 1,..., q, m. y i ĝ j (x 1,..., x (p 1), y i ) i (21) The ĝ j in (??) are the semiparametric kernel density estimates.
57 Assume that the data are bounded. Then: (a) As n, h 0 and nh p, Ê(y x) E(y x) g(x)dx 0 in the mean square sense. (b) If, in addition 0 < A < g(x), then Ê(y x) E(y x) dx 0 in the mean square sense.
58 Outline
59 Testicular germ cell tumor (TGCT) is a common cancer among U.S. men, mainly in the age group of years (McGlynn et al 2003). Increased risk is significantly related to height, whereas body mass index was not found significant (McGlynn et al 2007). Using a two dimensional semiparametric model, it was shown that jointly height and weight are significant risk factors (Kedem et al 2009). We use TGCT data with 3 variables: age, height (cm), weight (kg). Case: n 1 = 763. Control: n 2 = 928. n = 1691 individuals. We consider two cases: 2D with variables height and weight 3D with variables height, weight and age. In both cases the control distribution is the reference distribution.
60 2D Problem: E(Weight Height) Diagnostic plots of Ĝi versus G i, i = 1, 2 evaluated at (height,weight) pairs. Case Group Control Group Joint Sem. CDF Joint Sem. CDF Joint Empirical CDF Joint Empirical CDF
61 2D Case Regression 2D TGCT: E[Weight/Height] for G1 Residual plot for 2D TGCT Case weight Actual points Ref. Regression Semiparametric GAM NW hat(e) height Sem. E[weight/height]
62 2D Control Regression 2D TGCT: E[Weight/Height] for G2 Residual plot for 2D TGCT Control weight Actual points Ref. Regression Semiparametric GAM NW hat(e) height Sem. E[weight/height]
63 3D Problem: E(Weight Height, Age) Diagnostic plots of Ĝi versus G i, i = 1, 2 evaluated at (age,height,weight) triplets. Case Group Control Group Joint Sem. CDF Joint Sem. CDF Joint Empirical CDF Joint Empirical CDF
64 3D Residuals Residual plots for the semiparametric model in the 3D TGCT data set. Residual plot for 3D TGCT Case Residual plot for 3D TGCT Control hat(e) hat(e) Sem. E[weight/height, age] Sem. E[weight/height, age]
65 Some joint probabilities of age, height and weight in the case and control groups. Probability Case Control Pr(A 45, H , W ) Pr(A 26, H , W ) Pr(A 29, H , W ) Pr(A 33, H , W ) Pr(A 34, H , W ) Pr(A 37, H , W ) Pr(A 40, H , W ) Pr(A 43, H , W ) Pr(A 45, H , W )
66 Case-control weight and Ê[weight height, age]. Empty entries in the table correspond to subjects with the same height and age, but possibly different weights. Control Case Age Height Weight Ê[W H, A] Weight Ê[W H, A]
67 Case-control weight and Ê[weight height, age]. Empty entries in the table correspond to subjects with the same height and age, but possibly different weights. Control Case Age Height Weight Ê[W H, A] Weight Ê[W H, A]
68 Case-control weight and Ê[weight height, age]. Empty entries in the table correspond to subjects with the same height and age, but possibly different weights. Control Case Age Height Weight Ê[W H, A] Weight Ê[W H, A]
Semiparametric Regression Based on Multiple Sources
Semiparametric Regression Based on Multiple Sources Benjamin Kedem Department of Mathematics & Inst. for Systems Research University of Maryland, College Park Give me a place to stand and rest my lever
More informationABSTRACT. Semiparametric Regression and Mortality Rate Prediction. Anastasia Voulgaraki, Doctor of Philosophy, 2011
ABSTRACT Title of dissertation: Semiparametric Regression and Mortality Rate Prediction Anastasia Voulgaraki, Doctor of Philosophy, 2011 Dissertation directed by: Professor Benjamin Kedem Department of
More informationProblem Selected Scores
Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected
More informationStatistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach
Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score
More informationMinimum Hellinger Distance Estimation in a. Semiparametric Mixture Model
Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationEmpirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design
1 / 32 Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design Changbao Wu Department of Statistics and Actuarial Science University of Waterloo (Joint work with Min Chen and Mary
More informationPrimal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing
Primal-dual Covariate Balance and Minimal Double Robustness via (Joint work with Daniel Percival) Department of Statistics, Stanford University JSM, August 9, 2015 Outline 1 2 3 1/18 Setting Rubin s causal
More informationHistogram Härdle, Müller, Sperlich, Werwatz, 1995, Nonparametric and Semiparametric Models, An Introduction
Härdle, Müller, Sperlich, Werwatz, 1995, Nonparametric and Semiparametric Models, An Introduction Tine Buch-Kromann Construction X 1,..., X n iid r.v. with (unknown) density, f. Aim: Estimate the density
More informationLecture 3. Truncation, length-bias and prevalence sampling
Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in
More information12 - Nonparametric Density Estimation
ST 697 Fall 2017 1/49 12 - Nonparametric Density Estimation ST 697 Fall 2017 University of Alabama Density Review ST 697 Fall 2017 2/49 Continuous Random Variables ST 697 Fall 2017 3/49 1.0 0.8 F(x) 0.6
More information41903: Introduction to Nonparametrics
41903: Notes 5 Introduction Nonparametrics fundamentally about fitting flexible models: want model that is flexible enough to accommodate important patterns but not so flexible it overspecializes to specific
More informationUnbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.
Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More informationA note on profile likelihood for exponential tilt mixture models
Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential
More information1: PROBABILITY REVIEW
1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following
More information36. Multisample U-statistics and jointly distributed U-statistics Lehmann 6.1
36. Multisample U-statistics jointly distributed U-statistics Lehmann 6.1 In this topic, we generalize the idea of U-statistics in two different directions. First, we consider single U-statistics for situations
More informationGeneralized logit models for nominal multinomial responses. Local odds ratios
Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationCalibration Estimation of Semiparametric Copula Models with Data Missing at Random
Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Econometrics Workshop UNC
More informationSTAT 512 sp 2018 Summary Sheet
STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationContinuous Random Variables
1 / 24 Continuous Random Variables Saravanan Vijayakumaran sarva@ee.iitb.ac.in Department of Electrical Engineering Indian Institute of Technology Bombay February 27, 2013 2 / 24 Continuous Random Variables
More informationRobustness to Parametric Assumptions in Missing Data Models
Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice
More informationEcon 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines
Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the
More informationECE 4400:693 - Information Theory
ECE 4400:693 - Information Theory Dr. Nghi Tran Lecture 8: Differential Entropy Dr. Nghi Tran (ECE-University of Akron) ECE 4400:693 Lecture 1 / 43 Outline 1 Review: Entropy of discrete RVs 2 Differential
More informationUnbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.
Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it
More informationDistribution-Free Distribution Regression
Distribution-Free Distribution Regression Barnabás Póczos, Alessandro Rinaldo, Aarti Singh and Larry Wasserman AISTATS 2013 Presented by Esther Salazar Duke University February 28, 2014 E. Salazar (Reading
More informationWeek 9 The Central Limit Theorem and Estimation Concepts
Week 9 and Estimation Concepts Week 9 and Estimation Concepts Week 9 Objectives 1 The Law of Large Numbers and the concept of consistency of averages are introduced. The condition of existence of the population
More informationAdvanced Statistics II: Non Parametric Tests
Advanced Statistics II: Non Parametric Tests Aurélien Garivier ParisTech February 27, 2011 Outline Fitting a distribution Rank Tests for the comparison of two samples Two unrelated samples: Mann-Whitney
More informationPh.D. Qualifying Exam Friday Saturday, January 6 7, 2017
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a
More informationAn Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data
An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)
More informationChapter 7. Hypothesis Testing
Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function
More information1.1 Basis of Statistical Decision Theory
ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016 Lecture 1: Introduction Lecturer: Yihong Wu Scribe: AmirEmad Ghassami, Jan 21, 2016 [Ed. Jan 31] Outline: Introduction of
More informationMIT Spring 2015
MIT 18.443 Dr. Kempthorne Spring 2015 MIT 18.443 1 Outline 1 MIT 18.443 2 Batches of data: single or multiple x 1, x 2,..., x n y 1, y 2,..., y m w 1, w 2,..., w l etc. Graphical displays Summary statistics:
More informationTest Problems for Probability Theory ,
1 Test Problems for Probability Theory 01-06-16, 010-1-14 1. Write down the following probability density functions and compute their moment generating functions. (a) Binomial distribution with mean 30
More informationNuisance parameter elimination for proportional likelihood ratio models with nonignorable missingness and random truncation
Biometrika Advance Access published October 24, 202 Biometrika (202), pp. 8 C 202 Biometrika rust Printed in Great Britain doi: 0.093/biomet/ass056 Nuisance parameter elimination for proportional likelihood
More informationGauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA
JAPANESE BEETLE DATA 6 MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA Gauge Plots TuscaroraLisa Central Madsen Fairways, 996 January 9, 7 Grubs Adult Activity Grub Counts 6 8 Organic Matter
More informationCalibration Estimation for Semiparametric Copula Models under Missing Data
Calibration Estimation for Semiparametric Copula Models under Missing Data Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Economics and Economic Growth Centre
More informationDensity estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas
0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity
More informationBickel Rosenblatt test
University of Latvia 28.05.2011. A classical Let X 1,..., X n be i.i.d. random variables with a continuous probability density function f. Consider a simple hypothesis H 0 : f = f 0 with a significance
More informationChapter 4: Asymptotic Properties of the MLE (Part 2)
Chapter 4: Asymptotic Properties of the MLE (Part 2) Daniel O. Scharfstein 09/24/13 1 / 1 Example Let {(R i, X i ) : i = 1,..., n} be an i.i.d. sample of n random vectors (R, X ). Here R is a response
More informationMathematical Preliminaries
Mathematical Preliminaries Economics 3307 - Intermediate Macroeconomics Aaron Hedlund Baylor University Fall 2013 Econ 3307 (Baylor University) Mathematical Preliminaries Fall 2013 1 / 25 Outline I: Sequences
More informationMaximum Smoothed Likelihood for Multivariate Nonparametric Mixtures
Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures David Hunter Pennsylvania State University, USA Joint work with: Tom Hettmansperger, Hoben Thomas, Didier Chauveau, Pierre Vandekerkhove,
More informationHT Introduction. P(X i = x i ) = e λ λ x i
MODS STATISTICS Introduction. HT 2012 Simon Myers, Department of Statistics (and The Wellcome Trust Centre for Human Genetics) myers@stats.ox.ac.uk We will be concerned with the mathematical framework
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth
More informationIntroduction An approximated EM algorithm Simulation studies Discussion
1 / 33 An Approximated Expectation-Maximization Algorithm for Analysis of Data with Missing Values Gong Tang Department of Biostatistics, GSPH University of Pittsburgh NISS Workshop on Nonignorable Nonresponse
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationRecent Advances in the analysis of missing data with non-ignorable missingness
Recent Advances in the analysis of missing data with non-ignorable missingness Jae-Kwang Kim Department of Statistics, Iowa State University July 4th, 2014 1 Introduction 2 Full likelihood-based ML estimation
More informationLecture 2: Repetition of probability theory and statistics
Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:
More informationSpring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n =
Spring 2012 Math 541A Exam 1 1. (a) Let Z i be independent N(0, 1), i = 1, 2,, n. Are Z = 1 n n Z i and S 2 Z = 1 n 1 n (Z i Z) 2 independent? Prove your claim. (b) Let X 1, X 2,, X n be independent identically
More informationSmooth simultaneous confidence bands for cumulative distribution functions
Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang
More informationStatement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationTesting Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata
Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function
More informationSpatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood
Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October
More informationBoundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta
Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta November 29, 2007 Outline Overview of Kernel Density
More informationModelling Non-linear and Non-stationary Time Series
Modelling Non-linear and Non-stationary Time Series Chapter 2: Non-parametric methods Henrik Madsen Advanced Time Series Analysis September 206 Henrik Madsen (02427 Adv. TS Analysis) Lecture Notes September
More informationQuantitative Economics for the Evaluation of the European Policy. Dipartimento di Economia e Management
Quantitative Economics for the Evaluation of the European Policy Dipartimento di Economia e Management Irene Brunetti 1 Davide Fiaschi 2 Angela Parenti 3 9 ottobre 2015 1 ireneb@ec.unipi.it. 2 davide.fiaschi@unipi.it.
More informationUNIVERSITY OF CALIFORNIA, SAN DIEGO
UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department
More informationA Conditional Approach to Modeling Multivariate Extremes
A Approach to ing Multivariate Extremes By Heffernan & Tawn Department of Statistics Purdue University s April 30, 2014 Outline s s Multivariate Extremes s A central aim of multivariate extremes is trying
More informationQuantile Regression for Residual Life and Empirical Likelihood
Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu
More informationMachine Learning. Theory of Classification and Nonparametric Classifier. Lecture 2, January 16, What is theoretically the best classifier
Machine Learning 10-701/15 701/15-781, 781, Spring 2008 Theory of Classification and Nonparametric Classifier Eric Xing Lecture 2, January 16, 2006 Reading: Chap. 2,5 CB and handouts Outline What is theoretically
More informationMathematics Ph.D. Qualifying Examination Stat Probability, January 2018
Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from
More informationA Sampling of IMPACT Research:
A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/
More informationChapter 3: Maximum Likelihood Theory
Chapter 3: Maximum Likelihood Theory Florian Pelgrin HEC September-December, 2010 Florian Pelgrin (HEC) Maximum Likelihood Theory September-December, 2010 1 / 40 1 Introduction Example 2 Maximum likelihood
More information1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ).
Estimation February 3, 206 Debdeep Pati General problem Model: {P θ : θ Θ}. Observe X P θ, θ Θ unknown. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Examples: θ = (µ,
More informationON SOME TWO-STEP DENSITY ESTIMATION METHOD
UNIVESITATIS IAGELLONICAE ACTA MATHEMATICA, FASCICULUS XLIII 2005 ON SOME TWO-STEP DENSITY ESTIMATION METHOD by Jolanta Jarnicka Abstract. We introduce a new two-step kernel density estimation method,
More informationRandom fraction of a biased sample
Random fraction of a biased sample old models and a new one Statistics Seminar Salatiga Geurt Jongbloed TU Delft & EURANDOM work in progress with Kimberly McGarrity and Jilt Sietsma September 2, 212 1
More informationA Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints
Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note
More informationExercises and Answers to Chapter 1
Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean
More information1 Empirical Likelihood
Empirical Likelihood February 3, 2016 Debdeep Pati 1 Empirical Likelihood Empirical likelihood a nonparametric method without having to assume the form of the underlying distribution. It retains some of
More informationBrief Review on Estimation Theory
Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on
More informationFinal Examination Statistics 200C. T. Ferguson June 11, 2009
Final Examination Statistics 00C T. Ferguson June, 009. (a) Define: X n converges in probability to X. (b) Define: X m converges in quadratic mean to X. (c) Show that if X n converges in quadratic mean
More informationMathematics Qualifying Examination January 2015 STAT Mathematical Statistics
Mathematics Qualifying Examination January 2015 STAT 52800 - Mathematical Statistics NOTE: Answer all questions completely and justify your derivations and steps. A calculator and statistical tables (normal,
More informationEEL 5544 Noise in Linear Systems Lecture 30. X (s) = E [ e sx] f X (x)e sx dx. Moments can be found from the Laplace transform as
L30-1 EEL 5544 Noise in Linear Systems Lecture 30 OTHER TRANSFORMS For a continuous, nonnegative RV X, the Laplace transform of X is X (s) = E [ e sx] = 0 f X (x)e sx dx. For a nonnegative RV, the Laplace
More informationWEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION
WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION Michael Amiguet 1, Alfio Marazzi 1, Victor Yohai 2 1 - University of Lausanne, Institute for Social and Preventive Medicine, Lausanne, Switzerland 2 - University
More informationChapter 10. U-statistics Statistical Functionals and V-Statistics
Chapter 10 U-statistics When one is willing to assume the existence of a simple random sample X 1,..., X n, U- statistics generalize common notions of unbiased estimation such as the sample mean and the
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject
More informationOPTIMISATION CHALLENGES IN MODERN STATISTICS. Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan
OPTIMISATION CHALLENGES IN MODERN STATISTICS Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan How do optimisation problems arise in Statistics? Let X 1,...,X n be independent and identically distributed
More informationIntroduction. Linear Regression. coefficient estimates for the wage equation: E(Y X) = X 1 β X d β d = X β
Introduction - Introduction -2 Introduction Linear Regression E(Y X) = X β +...+X d β d = X β Example: Wage equation Y = log wages, X = schooling (measured in years), labor market experience (measured
More informationChapter 2 Inference on Mean Residual Life-Overview
Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate
More informationStatistics 3858 : Maximum Likelihood Estimators
Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,
More informationNon-parametric Inference and Resampling
Non-parametric Inference and Resampling Exercises by David Wozabal (Last update 3. Juni 2013) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend
More informationEstimation for nonparametric mixture models
Estimation for nonparametric mixture models David Hunter Penn State University Research supported by NSF Grant SES 0518772 Joint work with Didier Chauveau (University of Orléans, France), Tatiana Benaglia
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationEmpirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications
Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications Fumiya Akashi Research Associate Department of Applied Mathematics Waseda University
More informationChapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70
Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:
More informationPackage Rsurrogate. October 20, 2016
Type Package Package Rsurrogate October 20, 2016 Title Robust Estimation of the Proportion of Treatment Effect Explained by Surrogate Marker Information Version 2.0 Date 2016-10-19 Author Layla Parast
More informationParameters Estimation for a Linear Exponential Distribution Based on Grouped Data
International Mathematical Forum, 3, 2008, no. 33, 1643-1654 Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data A. Al-khedhairi Department of Statistics and O.R. Faculty
More informationChapter 5 continued. Chapter 5 sections
Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationMann-Whitney Test with Adjustments to Pre-treatment Variables for Missing Values and Observational Study
MPRA Munich Personal RePEc Archive Mann-Whitney Test with Adjustments to Pre-treatment Variables for Missing Values and Observational Study Songxi Chen and Jing Qin and Chengyong Tang January 013 Online
More informationMultivariate Random Variable
Multivariate Random Variable Author: Author: Andrés Hincapié and Linyi Cao This Version: August 7, 2016 Multivariate Random Variable 3 Now we consider models with more than one r.v. These are called multivariate
More informationF & B Approaches to a simple model
A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys
More informationSingle Index Quantile Regression for Heteroscedastic Data
Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University JSM, 2015 E. Christou, M. G. Akritas (PSU) SIQR JSM, 2015
More informationTests of independence for censored bivariate failure time data
Tests of independence for censored bivariate failure time data Abstract Bivariate failure time data is widely used in survival analysis, for example, in twins study. This article presents a class of χ
More informationSTAT Chapter 5 Continuous Distributions
STAT 270 - Chapter 5 Continuous Distributions June 27, 2012 Shirin Golchi () STAT270 June 27, 2012 1 / 59 Continuous rv s Definition: X is a continuous rv if it takes values in an interval, i.e., range
More informationModeling the scale parameter ϕ A note on modeling correlation of binary responses Using marginal odds ratios to model association for binary responses
Outline Marginal model Examples of marginal model GEE1 Augmented GEE GEE1.5 GEE2 Modeling the scale parameter ϕ A note on modeling correlation of binary responses Using marginal odds ratios to model association
More informationModification and Improvement of Empirical Likelihood for Missing Response Problem
UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu
More informationEfficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models
NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia
More informationSparse Nonparametric Density Estimation in High Dimensions Using the Rodeo
Outline in High Dimensions Using the Rodeo Han Liu 1,2 John Lafferty 2,3 Larry Wasserman 1,2 1 Statistics Department, 2 Machine Learning Department, 3 Computer Science Department, Carnegie Mellon University
More information