Assessing small sample bias in coordinate-based meta-analyses for fMRI

1 Assessing small sample bias in coordinate-based meta-analyses for fMRI. F. Acar, R. Seurinck & B. Moerkerke. IBS Channel Network Conference, Hasselt, 25 April 2017

2 What is fMRI?


6 What is fMRI? The brain is divided into > 100,000 voxels, and the BOLD response is measured in each voxel.

8 Thresholding methods. Thresholding: in which voxels is activation larger than can be expected by chance? Correct for the multiple testing problem. Different methods: uncorrected threshold; False Discovery Rate; Random Field Theory (FWE). Results.
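Of these corrections, the False Discovery Rate step can be made concrete with the Benjamini-Hochberg step-up procedure over the voxel-wise p-values; a minimal Python sketch (the function name and toy p-values are illustrative, not from the talk):

```python
import numpy as np

def fdr_threshold(pvals, q=0.05):
    """Benjamini-Hochberg step-up: boolean mask of 'active' voxels."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # largest k with p_(k) <= (k/m) * q; reject all voxels up to that rank
    below = ranked <= (np.arange(1, m + 1) / m) * q
    mask = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        mask[order[: k + 1]] = True
    return mask
```

All voxels whose p-value is at or below the largest p_(k) satisfying p_(k) ≤ (k/m)·q are declared active, which controls the expected proportion of false discoveries at level q.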

9 A problem of power and reproducibility. Small sample sizes: median n = 15 (Carp, 2012); this causes more false positives (FPs) and false negatives (FNs) (Button et al., 2013); experiments are very expensive. Statistical tests in over 100,000 voxels: multiple testing problem, explosion of FPs. Multiple comparison corrections (FDR, FWER, RFT, ...): increase in specificity, dramatic decrease in sensitivity. Studies with small sample sizes tend to employ more lenient thresholds => small sample bias.


12 Why meta-analysis? Low power? Increase N! Would it not be great if we could re-use existing research? Yearly > 5,000 publications with keyword 'fMRI' in Web of Science. Yes: meta-analysis, a statistical tool to combine the results of multiple studies. Aim: derive pooled estimates to approach the truth in the population. [Plot: yearly count of fMRI publications]

14 Meta-analysis. Classic meta-analysis: originally a univariate approach; focus on effect sizes; weighted average. How does this translate to a meta-analysis of fMRI studies?

16 Coordinate-based meta-analysis (ALE). At each voxel, ALE = 1 − ∏_i (1 − MA_i): the union of the modeled activation (MA) maps over studies i.
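The ALE union over studies is a voxel-wise one-liner once the per-study modeled activation maps are available; a sketch (assuming each MA_i has already been built by smoothing the study's peaks with a Gaussian kernel, values in [0, 1]):

```python
import numpy as np

def ale_score(ma_maps):
    """Voxel-wise ALE = 1 - prod_i (1 - MA_i): the probability that at
    least one study's modeled activation is present at the voxel.
    ma_maps: array of shape (n_studies, n_voxels)."""
    ma = np.asarray(ma_maps, dtype=float)
    return 1.0 - np.prod(1.0 - ma, axis=0)
```

For two studies with MA = 0.5 at a voxel, the ALE value is 1 − 0.5 · 0.5 = 0.75.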

17 Coordinate-based meta-analysis

19 Coordinate-based meta-analysis: for each cluster, record n (participants per study) and the number of contributing peaks.

20 [Plot: number of contributing peaks per cluster against individual study sample size n]

22 Cluster 2: slope = 0.02, p = 0.4484. [Plot: individual study sample size by cluster contribution (No/Yes)]

24 Cluster 3: slope = 0.01, p < 0.001. [Plot: individual study sample size by cluster contribution (No/Yes)]

25 Regression. We plot individual study sample size as a function of cluster contribution. Possible scenarios: no publication bias; no effect; publication bias. [Three panels: individual study sample size by cluster contribution (No/Yes)]


28 Effect of lenient thresholding. What is the effect of small sample bias on the regression?

29 Effect of lenient thresholding. Simulated 500 meta-analyses with 43 studies: approx. 39 small studies (n < 31) and 4 large studies; 1 target area with activation. Lenient (p < 0.001, uncorrected) or non-lenient (FDR whole-brain corrected, q < 0.05) threshold. Two scenarios: (a) no lenient thresholding; (b) a lenient threshold for the small studies in which no activation was found with the FDR threshold. ALE meta-analysis with the statistically significant peaks.


32 Effect of lenient thresholding. Select 1 t-map from a meta-analysis (NeuroVault). Compute the average effect size in a region of interest. Compute power in the region of interest for different sample sizes: with standard thresholding (FDR, q < 0.01); with lenient thresholding (n > 30: FDR, q < 0.01; n ≤ 30: uncorrected, p < 0.05). Simulate cluster contribution based on power (× 100); this depends on sample size, effect size and thresholding method. Plot cluster contribution with and without lenient thresholding.
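The "simulate cluster contribution based on power" step can be sketched as follows, assuming a one-sided one-sample t-test per study; the fixed per-voxel alphas (0.001 for the strict rule, 0.05 for the lenient one) stand in for the FDR-based thresholds and are illustrative only:

```python
import numpy as np
from scipy import stats

def power_one_sample_t(n, d, alpha):
    """Power of a one-sided one-sample t-test with standardized effect d."""
    df = n - 1
    tcrit = stats.t.ppf(1 - alpha, df)
    # noncentral t with noncentrality d * sqrt(n) under the alternative
    return 1 - stats.nct.cdf(tcrit, df, d * np.sqrt(n))

def simulate_contribution(sample_sizes, d, lenient, rng):
    """Bernoulli cluster contribution per study, driven by its power.
    Under lenient thresholding, small studies (n <= 30) use alpha = 0.05."""
    alphas = [0.05 if (lenient and n <= 30) else 0.001 for n in sample_sizes]
    p = [power_one_sample_t(n, d, a) for n, a in zip(sample_sizes, alphas)]
    return rng.binomial(1, p)
```

Regressing the simulated 0/1 contributions on sample size then gives the slopes compared on the next slide.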


36 Effect of lenient thresholding. Standard thresholding: slope = 6.38; lenient thresholding: slope = 4.95. [Two panels: individual study sample size by cluster contribution (No/Yes)]

37 Effect of lenient thresholding. Goal: assess small sample bias with as little information as possible. Remarks: little attention is paid to the validity of activated clusters (raise awareness); look at the number of contributing peaks, the sample sizes of contributing studies, and the robustness of activated clusters.

39 Discussion: FSN (fail-safe N)

40 Discussion: FSN. The number of null studies that can be added depends on the thresholding method, the sample size, and the number of peaks. We developed a tool to generate null studies based on the parameters of the meta-analysis.

42 Thank you!

43 Small Sample Inference for the Probabilistic Index Model: A Flexible Class of Rank Tests. Gustavo Guimarães de Castro Amorim (1), joint work with Olivier Thas (1, 2), Karel Vermeulen (1), Stijn Vansteelandt (3) and Jan De Neve (4). (1) Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University, Belgium; (2) National Institute for Applied Statistics Research Australia (NIASRA), University of Wollongong, Wollongong, Australia; (3) Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium; (4) Department of Data Analysis, Ghent University, Belgium. gustavoguimaraesdecastroamorim@ugent.be. Small Sample Inference for the Probabilistic Index Model, 1 / 21


46 Probabilistic Index Model (PIM). Models the conditional probability that the outcome of one randomly chosen subject exceeds the outcome of another independently chosen subject, given their respective covariates [Thas et al., 2012, JRSS-B]: P(Y < Y* | X, X*) = g^{-1}(Z^T β), the probabilistic index, where β is the p-dimensional parameter of interest, Z is a function of (X, X*), and g(·) is a link function (e.g. logit or probit).

47 Probabilistic Index Model. PIMs: semiparametric regression models, available on CRAN:
> library(pim)
> pim(formula, data, link = c("logit", "probit", "identity"),
      model = c("difference", "marginal", "regular", "customized"), ...)
Can be used to generate classical (and new) rank tests, supplementing them with interpretable effect sizes.

48 Relation to rank tests. Example: two-sample problem. Let Y^(k), with Y^(k) ~ F_k, be the outcome in the kth group (k = 1, 2), and define a dummy variable X^(k) = k − 1. Take the identity link, g^{-1}(Zβ) = (X^(2) − X^(1))β, so that P(Y^(1) < Y^(2) | X^(1) = 0, X^(2) = 1) = β.

49 Relation to rank tests. Rejecting H0: β = 1/2 in favor of H1: β ≠ 1/2, i.e. H1: P(Y^(1) < Y^(2) | X^(1) = 0, X^(2) = 1) ≠ 1/2: the WMW test procedure is embedded in PIM modelling (just like the t-test procedure is embedded in linear regression in a two-sample design).
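With the identity link, the estimated β is the empirical probabilistic index, which coincides with the WMW U-statistic divided by n₁·n₂; a quick Python check (toy data, purely illustrative):

```python
import numpy as np
from scipy.stats import mannwhitneyu

def probabilistic_index(y1, y2):
    """Empirical P(Y1 < Y2) + 0.5 * P(Y1 = Y2) over all n1*n2 pairs."""
    y1, y2 = np.asarray(y1), np.asarray(y2)
    less = (y1[:, None] < y2[None, :]).mean()
    ties = (y1[:, None] == y2[None, :]).mean()
    return less + 0.5 * ties

y1 = [1.2, 3.4, 2.2]        # group 1 (n1 = 3)
y2 = [2.9, 4.1, 5.0, 3.3]   # group 2 (n2 = 4)
u = mannwhitneyu(y2, y1).statistic  # U counts pairs with y2 > y1 (ties halved)
beta_hat = probabilistic_index(y1, y2)
```

Here beta_hat equals u / (3 · 4), so testing H0: β = 1/2 reproduces the WMW test while β itself is an interpretable effect size.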

50 Relation to rank tests. Similarly, the PIM generates the Kruskal-Wallis rank test, the Friedman rank test, the Mack-Skillings test, and many others. In addition, the PIM generates more flexible rank tests which allow for covariate adjustment (more in De Neve and Thas [2015, JASA]).

52 However... we rely on asymptotic normality: β̂ is asymptotically normal (see Vermeulen et al. [2017, submitted]). [Little, 2006, The American Statistician]

54 Small sample inference. Example: we generate data from the normal linear model Y = Xα + ε, with X ~ N(0, 1); fit a correctly specified probit-PIM for β; and construct standard Wald confidence intervals. Interested in the empirical coverage (over 1,000 Monte Carlo simulations). [Plot: distribution of β̂ for n = 15]

55 Small sample inference. [Plot: empirical coverage of the standard Wald interval based on β̂]

56 Small sample inference. Again, we generate data from the normal linear model Y = X^T α + ε, with X ~ N(0, I) and α^T = (0, ..., 0) of length p = 2, ..., 5; fit a correctly specified probit-PIM for β; and construct a standard Wald confidence region. Interested in the empirical coverage (over 1,000 Monte Carlo simulations).

57 Small sample inference. [Plot: empirical coverage of the standard Wald confidence region based on β̂]


60 Small sample inference. We need better methods for small sample inference. Alternatives: bootstrapping [Jiang and Kalbfleisch, 2012, Sankhya B]: β is estimated by solving a set of estimating equations, but the PIM models all pairwise comparisons, giving an inflated number of estimating functions, so this is computationally intensive. Empirical likelihood [Owen, 1990, Annals of Statistics].

61 Empirical likelihood. For the PIM, L(β) = max_w { ∏_{i=1}^n n·w_i : w_i > 0, i = 1, ..., n; Σ_{i=1}^n w_i = 1; Σ_{i,j=1}^n w_i w_j U(X_i, X_j; β) = 0 }, where U(·) is the PIM estimating function. Issues: (1) non-linear in the weights w_i; (2) the restriction might not have a solution; (3) computationally intensive [Chen et al., 2008, JCGS].


66 Building on empirical likelihood methods, we proposed the bias-reduced adjusted jackknife empirical likelihood method. Jackknife empirical likelihood (deals with #1: non-linearity in the weights w_i) [Jing et al., 2009, JASA]: (1) apply the jackknife method to U(X, X*; β); (2) form jackknife pseudo-values; (3) apply the usual empirical likelihood to them. Adjusted empirical likelihood (deals with #2: the restriction might not have a solution) [Chen et al., 2008, JCGS]. The bias reduction minimizes the second-order bias of β̂.
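The jackknife step can be sketched generically: pseudo-values turn a (possibly pairwise, non-linear) statistic into approximately independent quantities to which the ordinary empirical likelihood applies (function name and data below are illustrative):

```python
import numpy as np

def jackknife_pseudo_values(x, stat):
    """V_i = n*T(x) - (n-1)*T(x with observation i removed).
    The usual empirical likelihood is then applied to the V_i."""
    x = np.asarray(x)
    n = len(x)
    t_full = stat(x)
    return np.array([n * t_full - (n - 1) * stat(np.delete(x, i))
                     for i in range(n)])
```

As a sanity check, for stat = np.mean the pseudo-values reduce to the observations themselves.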

68 Theorem. Let β₀ be the true value of β. Under regularity conditions, −2 log{R(β₀)} = −2 log{(n + 1)^(n+1) L(β₀)} → χ²_p in distribution as n → ∞. If p > 1, replace χ²_p by F_{n,p}, where F_{n,p} ~ ((n − 1)p / (n − p)) · F(p, n − p) [Owen, 1990, Annals of Statistics].
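The χ² and F calibrations can be compared numerically; a sketch with scipy (the helper name is illustrative):

```python
from scipy import stats

def el_critical_value(n, p, level=0.95):
    """Cutoff for -2 log R(beta0): the asymptotic chi^2_p quantile, or for
    p > 1 the rescaled F calibration (n-1)p/(n-p) * F(p, n-p), which is
    wider in small samples and converges to chi^2_p as n grows."""
    if p == 1:
        return stats.chi2.ppf(level, 1)
    return (n - 1) * p / (n - p) * stats.f.ppf(level, p, n - p)
```

For n = 15 and p = 3 the F-calibrated cutoff is noticeably larger than the χ²₃ cutoff (≈ 7.81), which is what restores coverage in small samples.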

69 Are there any improvements? (Credits to xkcd.com)

71 Results. [Plot: empirical coverage of the standard Wald interval based on β̂ and of the bias-reduced adjusted jackknife empirical likelihood]


74 Results. [Plot: empirical coverage of the standard Wald interval based on β̂, the bias-reduced adjusted jackknife empirical likelihood, and the bias-reduced adjusted jackknife empirical likelihood with the F-approximation]


77 Conclusion. Probabilistic index models: a flexible semiparametric regression model, applicable to discrete, continuous or ordinal data; gives consistent and asymptotically normal estimates. Strongly related to rank tests: the PIM can also be used to generate rank tests for complex designs, with the additional advantage of supplementing them with an effect size. Currently available on CRAN.

78 Conclusion. Inference in small samples: the bias-reduced adjusted jackknife empirical likelihood gives coverage close to the nominal value for sample sizes as small as 20, and is in the process of being added to the pim R package. PIMs can therefore be used for small sample inference.

79 Bibliography
J. Chen, A. M. Variyath, and B. Abraham. Adjusted empirical likelihood and its properties. JCGS, 17(2), 2008.
J. De Neve and O. Thas. A regression framework for rank tests based on the probabilistic index model. JASA, 110(511), 2015.
W. Jiang and J. D. Kalbfleisch. Bootstrapping U-statistics: applications in least squares and robust regression. Sankhya B, 74(1):56-76, 2012.
B.-Y. Jing, J. Yuan, and W. Zhou. Jackknife empirical likelihood. JASA, 104(487), 2009.
R. J. Little. Calibrated Bayes: a Bayes/frequentist roadmap. The American Statistician, 60(3), 2006.
A. B. Owen. Empirical likelihood ratio confidence regions. Annals of Statistics, 18(1):90-120, 1990.
O. Thas, J. De Neve, L. Clement, and J.-P. Ottoy. Probabilistic index models. JRSS-B, 74(4), 2012.
K. Vermeulen, G. Amorim, J. De Neve, O. Thas, and S. Vansteelandt. Semiparametric estimation of probabilistic index models: efficiency and bias. 2017, submitted.

80 Positive-definite multivariate spectral estimation: a geometric wavelet approach. Joris Chau (Boursier FRIA) and Rainer von Sachs. IBS Channel Network Conference 2017, April 25, 2017. Institute of Statistics, Biostatistics and Actuarial Sciences, Université catholique de Louvain, Belgium.

81 Background and motivation. In estimating the covariance matrix of a (non-degenerate) complex random vector, the target is Hermitian positive definite (HPD), i.e. Σ = Σ* and x*Σx > 0 for each nonzero x ∈ C^d. Our interest is in statistical problems where the target f(ω) is a curve of HPD matrices, i.e. for ω ∈ [0, 1), f(ω) = f(ω)* and x*f(ω)x > 0 for each nonzero x ∈ C^d. We focus in particular on nonparametric spectral estimation of a multivariate stationary time series, where the underlying spectral matrix is a curve of HPD matrices across frequency. Classical approach: (1) compute the periodogram matrix, a noisy (asymptotically unbiased but inconsistent) HPSD estimator of the spectral matrix; (2) smooth the periodogram to get a consistent HPD estimate of the spectral matrix (by e.g. kernel regression, projection estimators, multitaper estimators¹, etc.). ¹ Thomson, D.J. (1982). Spectrum estimation and harmonic analysis. Proc. IEEE 70.

82 Background and motivation. Typically, to guarantee positive-definiteness of the estimator, one applies equivalent smoothing parameters to each matrix component (e.g. the same bandwidth parameter for kernel regression, or the same number of tapers for multitaper estimators). Link to Shiny-app:

83 Introduction: geometry of the space of HPD matrices P_{d×d}. A (d × d) spectral matrix f(ω) at frequency ω is HPD, i.e. f(ω) ∈ P_{d×d}. P_{d×d} is not a vector space (e.g. possibly p₁ − p₂ ∉ P_{d×d}), and P_{d×d} endowed with the Euclidean distance is an incomplete metric space. However, P_{d×d} is a well-studied Riemannian manifold: it locally looks like R^{d²} and can be equipped with a Riemannian metric (a smooth family of inner products, one for each p ∈ P_{d×d}). Rahman et al. (2005)² develop wavelet transforms for manifold-valued data with tractable Exp-/Log-maps (local bijective maps between M and T_p(M)) and the notion of a midpoint between p₁, p₂ ∈ M. Based on these ideas, we can also construct wavelet transforms acting only on the Riemannian manifold P_{d×d}. ² Rahman, I. U., Drori, I., Stodden, V. C., Donoho, D. L., and Schröder, P. (2005). Multiscale representations for manifold-valued data. Multiscale Modeling & Simulation, 4(4).

84 Preliminaries and tools. (1) Distance function: a specific choice of (invariant) Riemannian metric induces the following manifold distance³: δ(p₁, p₂) = ‖Log(p₁^{-1/2} ∗ p₂)‖_F = (Σ_{i=1}^d log(λ_i(p₁^{-1} p₂))²)^{1/2}, with the notation y ∗ x := y x y* for the matrix congruence transformation. In particular, δ(p₁, p₂) < ∞ for each p₁, p₂ ∈ P_{d×d}, and singular matrices are pushed to the boundary of the metric space. (2) Geodesics: the metric space (P_{d×d}, δ) is complete, and the unique geodesic joining any two points p₁, p₂ ∈ P_{d×d} is γ(p₁, p₂, t) = p₁^{1/2} ∗ (p₁^{-1/2} ∗ p₂)^t, 0 ≤ t ≤ 1. The midpoint Mid(p₁, p₂) := γ(p₁, p₂, 1/2) is defined as the halfway point along this geodesic. ³ Pennec, X., Fillard, P., Ayache, N. (2006). A Riemannian framework for tensor computing. International Journal of Computer Vision, 66(1).
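Both the distance and the geodesic midpoint are easy to compute numerically; a sketch using scipy (real SPD matrices for simplicity; function names are illustrative):

```python
import numpy as np
from scipy.linalg import eigvalsh, fractional_matrix_power as fmp

def riem_dist(p1, p2):
    """Affine-invariant Riemannian distance between positive-definite
    matrices: sqrt(sum_i log(lambda_i)^2), where lambda_i are the
    eigenvalues of p1^{-1} p2 (generalized eigenvalues of (p2, p1))."""
    lam = eigvalsh(p2, p1)
    return np.sqrt(np.sum(np.log(lam) ** 2))

def midpoint(p1, p2):
    """Halfway point gamma(p1, p2, 1/2) along the unique geodesic."""
    s, si = fmp(p1, 0.5), fmp(p1, -0.5)
    return s @ fmp(si @ p2 @ si, 0.5) @ s
```

For p₁ = I and p₂ = diag(e², 1), the distance is 2 and the midpoint is diag(e, 1).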

85 Preliminaries and tools. (3) Exp-/Log-maps: the Exp-maps are global diffeomorphisms Exp_p : T_p(P_{d×d}) → P_{d×d} from the tangent space (attached at a point p) to the manifold, via Exp_p(h) = p^{1/2} ∗ Exp(p^{-1/2} ∗ h), where Exp(·) denotes the ordinary matrix exponential. Similarly, the Log-maps Log_p : P_{d×d} → T_p(P_{d×d}) are defined as the (unique) inverse exponential maps. [Figure: illustration of geodesics on the manifold and Log-/Exp-maps relating P_{d×d} to T_{p₀}(P_{d×d})]
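The Exp-/Log-maps likewise translate directly into code; a sketch for real SPD input (names illustrative):

```python
import numpy as np
from scipy.linalg import expm, logm, fractional_matrix_power as fmp

def exp_map(p, h):
    """Exp_p: tangent space at p -> manifold, via the congruence
    p^{1/2} * Exp(p^{-1/2} * h), with Exp the matrix exponential."""
    s, si = fmp(p, 0.5), fmp(p, -0.5)
    return s @ expm(si @ h @ si) @ s

def log_map(p, q):
    """Log_p: the inverse of Exp_p, using the matrix logarithm."""
    s, si = fmp(p, 0.5), fmp(p, -0.5)
    return s @ logm(si @ q @ si) @ s
```

By construction the two maps are inverses: log_map(p, exp_map(p, h)) recovers h for any symmetric tangent vector h.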

86 Forward wavelet transform. [Figure: midpoint pyramid of the curve (P_n(ω_k))_k, built bottom-up such that M_{j−1,k} := Mid(M_{j,2k}, M_{j,2k+1})]


89 Forward wavelet transform. Compute the pyramid of imputed midpoints (M̃_{j,k})_{j,k}. To impute midpoints M̃_{j,2k}, M̃_{j,2k+1}: Step (1) collect the 2D + 1 closest neighbors (M_{j,k+l}), −D ≤ l ≤ D, of M_{j,k}, with D ≥ 0; Step (2) transform each neighbor to T_{M_{j,k}}(P_{d×d}) ≅ H_{d×d} by the Log-map, and decompose it in terms of an ONB of H_{d×d} (a real vector space); Step (3) impute finer-scale real-valued coefficients by (ordinary) polynomial interpolation, and jump back to P_{d×d} using the Exp-map.

91 Forward wavelet transform. Compute the pyramid of wavelet coefficients (D_{j,k})_{j,k}. Given the true and imputed midpoints M_{j,k}, M̃_{j,k}, the wavelet coefficients are defined as a difference in the tangent space: D_{j,k} := Log(M̃_{j,k}^{-1/2} ∗ M_{j,k}) ∈ T_Id(P_{d×d}). Note that ‖D_{j,k}‖_Id = ‖D_{j,k}‖_F = δ(M̃_{j,k}, M_{j,k}) by definition of the distance function.

92 Forward wavelet transform. [Figure: equivalent representation of (P_n(ω_k))_k in terms of the wavelet coefficients (D_{j,k})_{j,k}]

93 Periodogram denoising: permutation-invariance of the wavelet spectral estimator. Other nonparametric multivariate spectral estimation methods⁴ ⁵ rely on smoothing the Cholesky decomposition of the (pre-smoothed) periodogram. The Cholesky-smoothed spectral estimator is not necessarily equivariant under a permutation of the ordering of the time series. To be precise: if π(1, ..., d) is a permutation of the ordering of the time series, then a natural requirement is that f̂_π(ω) = U_π f̂(ω) U_π*, with U_π the permutation matrix corresponding to π(1, ..., d). The wavelet-thresholded spectral estimator satisfies this permutation-invariance for any permutation π(1, ..., d), whereas the Cholesky-based spectral estimators generally do not. ⁴ Dai, M., and Guo, W. (2004). Multivariate spectral analysis using Cholesky decomposition. Biometrika 91(3). ⁵ Rosen, O., and Stoffer, D. S. (2007). Automatic estimation of multivariate spectra via smoothing splines. Biometrika 94(2).

Concluding remarks

Link to Shiny-app:

This talk is based on the paper: Chau, J., and von Sachs, R. (2017). Positive-definite multivariate spectral estimation: a geometric wavelet approach (currently under review).

An R package, pdspecest, containing the tools to perform wavelet-based multivariate spectral estimation and clustering, is available on CRAN. Install the latest development version via devtools::install_github("jorischau/pdspecest").

Other applications exploiting the geometric properties of the space of Hermitian positive-definite matrices as a Riemannian manifold include extending the notion of data depth to observations in this space (available soon in the R package pdspecest), as well as time-varying multivariate spectral estimation and multivariate spectral estimation for replicated time series.

Thank you!

IBS Channel Network Conference 2017
Positive-definite multivariate spectral estimation: a geometric wavelet approach
Joris Chau (Boursier FRIA), Rainer von Sachs
April 25, 2017
Institute of Statistics, Biostatistics and Actuarial Sciences, Université catholique de Louvain, Belgium


More information

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department

More information

Calibration Estimation for Semiparametric Copula Models under Missing Data

Calibration Estimation for Semiparametric Copula Models under Missing Data Calibration Estimation for Semiparametric Copula Models under Missing Data Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Economics and Economic Growth Centre

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Modelling temporal structure (in noise and signal)

Modelling temporal structure (in noise and signal) Modelling temporal structure (in noise and signal) Mark Woolrich, Christian Beckmann*, Salima Makni & Steve Smith FMRIB, Oxford *Imperial/FMRIB temporal noise: modelling temporal autocorrelation temporal

More information

Cross-Validation with Confidence

Cross-Validation with Confidence Cross-Validation with Confidence Jing Lei Department of Statistics, Carnegie Mellon University UMN Statistics Seminar, Mar 30, 2017 Overview Parameter est. Model selection Point est. MLE, M-est.,... Cross-validation

More information

Lecture 14: Variable Selection - Beyond LASSO

Lecture 14: Variable Selection - Beyond LASSO Fall, 2017 Extension of LASSO To achieve oracle properties, L q penalty with 0 < q < 1, SCAD penalty (Fan and Li 2001; Zhang et al. 2007). Adaptive LASSO (Zou 2006; Zhang and Lu 2007; Wang et al. 2007)

More information

arxiv: v1 [math.st] 1 Dec 2014

arxiv: v1 [math.st] 1 Dec 2014 HOW TO MONITOR AND MITIGATE STAIR-CASING IN L TREND FILTERING Cristian R. Rojas and Bo Wahlberg Department of Automatic Control and ACCESS Linnaeus Centre School of Electrical Engineering, KTH Royal Institute

More information