Statstical models, selection effects and sampling bias
|
|
- Beryl Henry
- 5 years ago
- Views:
Transcription
1 Statstical models, selection effects and sampling bias Department of Statistics University of Chicago Statslab Cambridge Nov
2 Outline 1 Gaussian models Binary regression model Attenuation of treatment effect Problems with conventional models 2 Sampling bias Volunteer samples Joint distributions of Y given x Randomization 3 Ewes and lambs Notation Estimating functions Interference 4
3 Conventional regression model Gaussian models Binary regression model Attenuation of treatment effect Problems with conventional models Fixed index set U (always infinite): u 1, u 2,... subjects, plots... Covariate x(u 1 ), x(u 2 ),... (non-random, vector-valued) Response Y (u 1 ), Y (u 2 ),... (random, real-valued) Regression model: Sample = finite ordered subset u 1,..., u n (distinct in U) For each sample with configuration x = (x(u 1 ),..., x(u n )) Response distribution p x (y) on R n depends on x Kolmogorov consistency condition on distributions p x ( ) x is not a random variable p x ( ) is not the conditional distribution given x
4 Gaussian regression model Gaussian models Binary regression model Attenuation of treatment effect Problems with conventional models Covariate x(u) = (x 1 (u), x 2 (u)) partitioned into two components x 1 (u) = (variety(u), treatment(u),...) affecting the mean x 2 (u) = (block(u), coordinates(u)) affecting covariances Example: response distribution for a fixed sample of n plots p x (y A; θ) = N n (X 1 β, σ 2 0 I n + σ 2 1 K [x 2])(A) A R n, K [x] = {K (x i, x j )} block-factor models: K (i, j) = 1 if block(i) = block(j) spatial/temporal models: K (i, j) = exp( x 2 (i) x 2 (j) /τ) Generalized random field: K (i, j) = ave x i, x j log x x Equivalent to Y (u) = µ(u) + ɛ(u) + η(u) with ɛ η,...
5 Binary regression model (GLMM) Gaussian models Binary regression model Attenuation of treatment effect Problems with conventional models Units: u 1, u 2,... subjects, patients, plots (labelled) Covariate x(u 1 ), x(u 2 ),... (non-random, X -valued) Latent process η on X (Gaussian, for example) Responses Y (u 1 ),... conditionally independent given η logit pr(y (u) = 1 η) = α + βx(u) + η(x(u)) Joint distribution for sample having configuration x p x (y) = E η n i=1 e (α+βx i +η(x i ))y i 1 + e α+βx i +η(x i ) parameters α, β, K, K (x, x ) = cov(η(x), η(x )).
6 Attenuation of treatment effect Gaussian models Binary regression model Attenuation of treatment effect Problems with conventional models η is a random effect (associated with clinical centre) logit pr(y i = 1 η, Trt = 0) = α + η(x i ) logit pr(y i = 1 η, Trt = 1) = α + η(x i ) + τ Unconditionally pr(y i = 1 Trt = t) = logit pr(y i = 1 Trt = t) = α + τ t e α+η+τ t 1 + e α+η+τ t φ(η; σ 2 ) dη Attenuation: τ τ/ σ 2 Treatment effect = odds(success w/treat)/odds(success without) Treatment effect: τ or τ?
7 Gaussian models Binary regression model Attenuation of treatment effect Problems with conventional models Binary regression model: computation GLMM computational problem: Options: p x (y) = n R n i=1 e (α+βx i +η(x i ))y i 1 + e α+βx φ(η; K ) dη i +η(x i ) Taylor approx: Laird and Ware; Schall; Breslow and Clayton, McC and Nelder, Drum and McC,... Laplace approximation: Wolfinger 1993; Shun and McC 1994 Numerical approximation: Egret E.M. algorithm: McCulloch 1994 for probit models Monte Carlo: Z&L,...
8 But,..., wait a minute... Gaussian models Binary regression model Attenuation of treatment effect Problems with conventional models p x (y) is the distribution for each fixed sample of n units. But... the sample might not be predetermined volunteer samples in clinical trials; pre-screening of patients to increase compliance; behavioural studies in ecology; marketing studies, with purchase events as units; public policy: crime type with crime events as units Ergo, x is also random, so we need a joint distribution. Q1: For a bivariate process, what does p x (y) represent? Q2: Is it necessarily the case that p x (y) = p(y x)?... even if sample size is random?
9 Gaussian models Binary regression model Attenuation of treatment effect Problems with conventional models
10 Gaussian models Binary regression model Attenuation of treatment effect Problems with conventional models Problems in the application of conventional models Clinical trials / market research / traffic studies / crime... (i) Operational interpretation of a sample as a fixed subset or as a random subset independent of the process (ii) Sample units generated by a random process sequential recruitment, purchase events, traffic studies... (iii) Population also generated by a random process in time animal populations, purchase events, crime events,... (iv) Samples: random, sequential, quota,... (v) Conditional distribution given observed random x versus stratum distribution for fixed x
11 y = 2 Sampling bias Volunteer samples Joint distributions of Y given x Randomization y = 1 y = 0 X A point process on C X for C = {0, 1, 2}, and the superposition process on X. Intensity λ r (x) for class r: r = 0, 1, 2. x-values auto-generated by the superposition process with intensity λ.(x). To each auto-generated unit there corresponds an x-value and a y-value.
12 Binary point process model Sampling bias Volunteer samples Joint distributions of Y given x Randomization Intensity process λ 0 (x) for class 0, λ 1 (x) for class 1 Log ratio: η(x) = log λ 1 (x) log λ 0 (x) Events form a PP with intensity λ on {0, 1} X. Conventional GLMM calculation (Bayesian and frequentist): pr(y = 1 x, λ) = λ 1(x) λ. (x) = eη(x) 1 + e η(x) ( ) ( ) λ1 (x) e η(x) pr(y = 1 x) = E = E λ. (x) 1 + e η(x) GLMM calculation is correct in a sense, but irrelevant there might not be an event at x!
13 Sampling bias Volunteer samples Joint distributions of Y given x Randomization Correct calculation for auto-generated units pr(event of type r in dx λ) = λ r (x) dx + o(dx) pr(event of type r in dx) = E(λ r (x)) dx + o(dx) pr(event in SPP in dx λ) = λ. (x) dx + o(dx) pr(event in SPP in dx) = E(λ. (x)) dx + o(dx) pr(y (x) = r SPP event at x) = Eλ r (x) Eλ. (x) ( λr (x) ) = E x SPP = Eλ ( ) r (x) λ. (x) Eλ. (x) E λr (x) λ. (x) Sampling bias: Distn for fixed x versus distn for autogenerated x.
14 Two ways of thinking Sampling bias Volunteer samples Joint distributions of Y given x Randomization First way: waiting for Godot! Fix x X and wait for an event to occur at x pr(y = 1 λ, x) = λ 1(x) ( λ.(x) ) pr(y = 1; x) = E λ1 (x) = E(Y i i : X i = x) λ.(x) Conventional, mathematically correct, but seldom relevant Second way: come what may! SPP event occurs at x, a random point in X joint density at (y, x) proportional to E(λ y (x)) = m y (x) x has marginal density proportional to E(λ. (x)) = m. (x) pr(y = 1 x SPP) = Eλ ( ) 1(x) Eλ. (x) E λ1 (x) = E(Y i i : X i = x) λ. (x)
15 Ewes and lambs Notation Estimating functions Interference Log Gaussian illustration of sampling bias η 0 (x) GP(0, K ), λ 0 (x) = exp(η 0 (x)) η 1 (x) GP(α + βx, K ), λ 1 (x) = exp(η 1 (x)) η(x) = η 1 (x) η 0 (x) GP(α + βx, 2K ), K (x, x) = σ 2 One-dimensional sampling distributions: ( ) e η(x) ρ(x) = p x (Y = 1) = E 1 + e η(x) logit(ρ(x)) α + β x ( β < β ) e α+βx+σ2/2 π(x) = pr(y = 1 x SPP) = Eλ 1(x) Eλ. (x) = e σ2 /2 + e α+βx+σ2 /2 logit pr(y = 1 x SPP) = α + βx No approximation; no attenuation
16 Ewes and lambs Notation Estimating functions Interference Alternative formulation involving selection Fix the population U Associate with each u a random response intensity λ (u) y,t ( t = 0 t = 1 y = 0 λ (u) 00 λ (u) ) 01 y = 1 λ (u) 10 λ (u) 11 implying a volunteer intensity λ (u).. (random and small) pr(u Sample) = E(λ (u).. ) pr(u S & t u = t) = E(λ (u).t ) = m (u).t pr(y u = 1 u S & t u = t) = m (u) 1t /m (u).t E(λ 1t /λ.t )
17 Ewes and lambs Notation Estimating functions Interference Conditional joint distributions of Y given x Sampling with fixed x quota ( ) λyi (x i ) p x (y) = E λ. (x i ) ( ) e y i η(x i ) = E 1 + e η(x i ) coincides with standard GLMM model Sequential sampling fixed time: #x random p(y x) = E λ yi (x i ) e R λ.(x)ν(dx) E λ. (x i ) e R λ.(x)ν(dx)
18 Ewes and lambs Notation Estimating functions Interference Randomization: Simple treatment effect Event x > pair (t(x), y(x)), treatment status, outcome H 0 H a Expected intensity (t, y) λ y (x)γ t λ y (x)γ t,y m y (x)γ t,y (0, 0) λ 0 (x)γ 0 λ 0 (x)γ 00 m 0 (x)γ 00 (0, 1) λ 1 (x)γ 0 λ 1 (x)γ 01 m 1 (x)γ 01 (1, 0) λ 0 (x)γ 1 λ 0 (x)γ 10 m 0 (x)γ 10 (1, 1) λ 1 (x)γ 1 λ 1 (x)γ 11 m 1 (x)γ 11 Conditional on λ: odds(y (x) = 1 t, λ) = λ 1 (x)γ t1 /(λ 0 (x)γ t0 ) odds ratio = τ = γ 11 γ 00 /(γ 01 γ 10 ) Treatment effect constant in x and independent of λ
19 GLMM analysis for fixed x Ewes and lambs Notation Estimating functions Interference Sample (X, Y, t) 1, (X, Y, t) 2,... observed sequentially GLMM analysis for a single event at x λ 1 (x)γ t1 pr(y (x) = 1 t, λ) = λ 1 (x)γ t1 + λ 0 (x)γ ( t0 ) λ 1 (x)γ t1 pr(y (x) = 1 t) = E λ 1 (x)γ t1 + λ 0 (x)γ t0 odds(y (x) = 1 t = 1) odds ratio = odds(y (x) = 1 t = 0) = τ Conclusion: treatment effect is attenuated log τ < log τ
20 Ewes and lambs Notation Estimating functions Interference Point process analysis for autogenerated units Sample (X, Y, t) 1, (X, Y, t) 2,... observed sequentially Correct analysis for a single event at x λ 1 (x)γ t1 pr(y (x) = 1 t, λ) = λ 1 (x)γ t1 + λ 0 (x)γ t0 E(λ 1 (x))γ t1 pr(y (x) = 1 t, x SPP) = E(λ 1 (x))γ t1 + E(λ 0 (x))γ t0 odds(y (x) = 1 t, x SPP) = m 1 (x)γ t1 /(m 0 (x)γ t0 ) odds ratio = γ 11 γ 00 /(γ 01 γ 10 ) = τ Conclusion: No attenuation of treatment effect.
21 ... Problem: assess the effectiveness of treatment for scour in lambs Randomized trial: flock of 100 ewes due to produce lambs in Feb-March 10% zero; 40% single; 50% twins 140 lambs Randomize assignment of treatment (t = 0 or t = 1) to lambs randomization protocol... Response Y = 1 if lamb suffers from 6 weeks Lambs are the experimental units, but... Ewe = litter = cluster: (use index i for ewe, l for lamb) Responses Y il, Y il in same cluster are correlated
22 Correlation in clusters Standard GLMM model for correlation in clusters Introduce an iid cluster-specific random effect {η i }. eη i +α+βt il pr(y il = 1 η, t) = 1 + e η i +α+βt il Assume Y -values conditionally independent given η. Hence, the model distributions are e η i +α+βt il pr(y il = 1; t) = E η 1 + e η eα +β til i +α+βt il 1 + e α +β t il p t (y) = ( e η i +α+βt il ) E η 1 + e η i +α+βt il i l Use GLMM software to estimate β or β... or Bayesian software to obtain posterior distribution Does this make sense in the context?
23 Simple model: correct analysis Response distn w/o treatment for ewe i given λ w.p. λ (0) y w.p. λ (1) y (y {0, 1}) (y 1, y 2 ) w.p. λ (2) y 1,y 2 (y 1, y 2 {0, 1}) Distribution of #lambs: iid E(λ (0), λ (1)., λ (2).. ) Joint response distn incl treatment: (y, t) w.p. λ (1) y γ ty (y, t {0, 1}) (y 1, y 2 ) w.p. λ (2) y 1,y 2 γ t1 y 1 γ t2 y 2 such that γ.y = 1. (γ 11 γ 00 /(γ 10 γ 01 ) = e β ) constant
24 Simple model: correct analysis contd. pr(m i = 1, Y i = y, t i = t λ) = λ (1) y γ ty pr(m i = 1, Y i = y, t i = t) = E(λ (1) y )γ ty y, t {0, 1} odds(y i = 1 t i = 1) odds(y i = 1 t i = 0) = E(λ(1) 1 )E(λ(1) 0 ) γ 11γ 00 E(λ (1) 1 )E(λ(1) 0 ) γ = e β 10γ 01 (t i = 1 implies m i = 1): Compare with incorrect analysis pr(y il = 1 λ, t il = t) = λ (1) 1 γ t1 λ (1) 1 γ t1 + λ (1) 0 γ t0 log odds(y il = 1 t il = 1) log odds(y il = 1 t il = 0) β with β < β, giving the illusion of attenuation. = eη i +βt 1 + e η i +βt (GL
25 Attenuation: π(t) = eα+βt e η+α+βt 1 + e α+βt ρ(t) = E η 1 + e η+α+βt eα +β t 1 + e α +β t Estimating functions (sum over lambs) S(t, y) = lambs Y l π(t l ) S (t, y) = lambs Y l ρ(t l ) (GLMM, PA) Mean and variance of S and S E(S(t, y)) = 0; E(S (t, y)) 0 var(s(y, y)) =?
26 Notation: meaning of E(Y i X i = x) Exchangeable sequence (Y 1, X 1 ), (Y 2, X 2 ),... with binary Y implies conditionally iid given λ Stratum x: U x = {i : X i = x} an infinite random subsequence Stratum average: ave{y i : i U x } = λ 1 (x)/λ. (x) Stratum mean = expected value of stratum average: ( ) λ1 (x) ρ(x) = E eα +β x λ. (x) 1 + e α +β x is declared target in much biostatistical work (PA) Correct calculation for a random stratum in SPP: ( λ1 (x) ) π(x) = E x SPP = E(λ 1(x)) λ. (x) E(λ. (x)) = eα+βx 1 + e α+βx Conditional mean π(x) = E(Y i X i = x) versus stratum mean ρ(x) = E(Y i i : X i = x)
27 Consequences of ambiguous notation Sample (Y 1, X 1 ), (Y 2, X 2 ),... observed sequentially ( λ1 (x) ) π(x) = E x SPP = eα+βx λ. (x) 1 + e α+βx = E(Y i X i = x) ( ) λ1 (x) ρ(x) = E eα +β x λ. (x) 1 + e α +β x = E(Y i i : X i = x) (ρ(x) computed by logistic-normal integral) Conventional PA estimating function Y i ρ(x i ) is such that and same for Y 2,.... E(Y 1 ρ(x) X 1 = x) = π(x) ρ(x) 0
28 Estimating functions done correctly Mean intensity for class r: m r (x) = E(λ r (x)) π r (x) = m r (x)/m. (x); ρ r (x) = E(λ r (x)/λ. (x)) E(Y i ) = ρ(x) for each i in U x = {u : X u = x} For autogenerated x, E(Y x SPP) = π(x) ρ(x) T (x, y) = x SPP h(x)(y (x) π(x)) has zero mean for auto-generated configurations x. Note: E(T x) 0; average is also over configurations
29 Explanation of unbiasedness z = {(x 1, y 1 ),...} configuration generated in (0, t). x = {x 1,..., } marginal configuration (SPP) z is a random measure with mean t m r (x) ν(dx) at (r, x) x is marginal random measure with mean t m. (x) ν(dx) at x π r (x) = m r (x)/m. (x) Hence E(z(r, dx)) = π r (x)e(x(dx)) for all (r, x) implies T (x, y) = = x SPP X h(x)(y (x) π(x)) h(x) has zero expectation. But E(T x) 0. ( ) z(r, dx) π r (x)x(dx)
30 variance calculation: binary case (y, x) generated by point process; T (x, y) = h(x)(y (x) π(x)) x SPP E(T (x, y)) = 0; E(T x) 0 var(t ) = h 2 (x)π(x)(1 π(x)) m. (x) dx X + h(x)h(x ) V (x, x ) m.. (x, x ) dx dx X 2 + h(x)h(x ) 2 (x, x )m.. (x, x ) dx dx X 2 V : spatial or within-cluster correlation; : interference
31 What is interference? Physical/biological interference: distribution of Y (u) depends on x(u ) Sampling interference for autogenerated units m r (x) = E(λ r (x)); m rs (x, x ) = E(λ r (x)λ s (x )) Univariate distributions: π r (x) = m r (x)/m. (x) = pr(y (x) = r x SPP) Bivariate distributions: π rs (x, x ) = m rs (x, x )/m.. (x, x ) π rs (x, x ) = pr(y (x) = r, Y (x ) = s x, x SPP) Hence π r. (x, x ) = pr(y (x) = r x, x SPP) r (x, x ) = π r. (x, x ) π r (x) No second-order sampling interference if r (x, x ) = 0
32 Inference: Conventional Gaussian model Model p x (A) = N n (Xβ, Σ x = σ0 2 I n + K [x])(a) Rationalization Y (i) = x i i + η(x(i)) Stratum average: Ȳ (U x) = ave{y i i : x i = x} = x β + η(x) Conditional distribution of Y u for u U x given observation y, X Y u data N(x β + k Σ 1 x (y Xβ), Σ uu k Σ 1 x k) Ȳ (U x ) data N(x β + k Σ 1 x (y Xβ), K (x, x) k Σ 1 x k) k i = K (x, x i ), (such as e x x i or x x i 3 ) Stratum average is a random variable, not a parameter Estimate is a distribution (not a function of sufficient statistic) Likewise for GLMMs
33 Inference and predicton for the PP model For a sequential sample Observation (x, y) (x (0),..., x (k 1) ) Product density m r (x (r) ) = E( x x (r) λ r (x)) for class r Conditional distribution as a random labelled partition of x: p(y x) m 0 (x (0) ) m k 1 (x (k 1) ) For a subsequent autogenerated event p(y (x ) = r data, x SPP) m r (x (r), x )/m r (x (r) ) Use likelihood or estimating function to estimate parameters Use conditional distribution for inference/prediction
34 Brief summary of conclusions (i) Reasonable case for fixed-population model in certain areas laboratory work; field trials; veterinary trials;... (ii) Good case for autogenerated units in other areas clinical trials; marketing; crime; animal behaviour (iii) The choice matters in random-effects models (iv) π(x) = E(Y i X i = x) versus ρ(x) = E(Y i i : X i = x) conditional distribution versus stratum distribution attenuation or non-attenuation (v) What is modelled and estimated by PA? claims to estimate ρ(x) but actually estimates π(x) (vi) What does the GLMM likelihood estimate? Difficult to say; probably neither
Regression models, selection effects and sampling bias
Regression models, selection effects and sampling bias Peter McCullagh Department of Statistics University of Chicago Hotelling Lecture I UNC Chapel Hill Dec 1 2008 www.stat.uchicago.edu/~pmcc/reports/bias.pdf
More informationSampling bias in logistic models
Sampling bias in logistic models Department of Statistics University of Chicago University of Wisconsin Oct 24, 2007 www.stat.uchicago.edu/~pmcc/reports/bias.pdf Outline Conventional regression models
More informationGeneralized linear models III Log-linear and related models
Generalized linear models III Log-linear and related models Peter McCullagh Department of Statistics University of Chicago Polokwane, South Africa November 2013 Outline Log-linear models Binomial models
More informationSampling bias and logistic models
J. R. Statist. Soc. B (2008) 70, Part 4, pp. 643 677 Sampling bias and logistic models Peter McCullagh University of Chicago, USA [Read before The Royal Statistical Society at a meeting organized by the
More informationSampling bias and logistic models
Proofs subject to correction. Not to be reproduced without permission. Contributions to the discussion must not exceed 0 words. Contributions longer than 0 words will be cut by the editor. 1 1 1 1 1 1
More informationPartition models and cluster processes
and cluster processes and cluster processes With applications to classification Jie Yang Department of Statistics University of Chicago ICM, Madrid, August 26 and cluster processes utline 1 and cluster
More informationPQL Estimation Biases in Generalized Linear Mixed Models
PQL Estimation Biases in Generalized Linear Mixed Models Woncheol Jang Johan Lim March 18, 2006 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized
More informationNon-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models
Optimum Design for Mixed Effects Non-Linear and generalized Linear Models Cambridge, August 9-12, 2011 Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models
More informationBayesian Multivariate Logistic Regression
Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of
More informationLatent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent
Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary
More informationComputer Vision Group Prof. Daniel Cremers. 4. Gaussian Processes - Regression
Group Prof. Daniel Cremers 4. Gaussian Processes - Regression Definition (Rep.) Definition: A Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution.
More informationComputer Vision Group Prof. Daniel Cremers. 9. Gaussian Processes - Regression
Group Prof. Daniel Cremers 9. Gaussian Processes - Regression Repetition: Regularized Regression Before, we solved for w using the pseudoinverse. But: we can kernelize this problem as well! First step:
More informationGeneralized Linear Models Introduction
Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,
More informationImproper mixtures and Bayes s theorem
and Bayes s theorem and Han Han Department of Statistics University of Chicago DASF-III conference Toronto, March 2010 Outline Bayes s theorem 1 Bayes s theorem 2 Bayes s theorem Non-Bayesian model: Domain
More informationGibbs Sampling in Endogenous Variables Models
Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take
More informationLecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu
Lecture: Gaussian Process Regression STAT 6474 Instructor: Hongxiao Zhu Motivation Reference: Marc Deisenroth s tutorial on Robot Learning. 2 Fast Learning for Autonomous Robots with Gaussian Processes
More informationLinear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks.
Linear Mixed Models One-way layout Y = Xβ + Zb + ɛ where X and Z are specified design matrices, β is a vector of fixed effect coefficients, b and ɛ are random, mean zero, Gaussian if needed. Usually think
More informationPh.D. Qualifying Exam Friday Saturday, January 6 7, 2017
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a
More informationGoals. PSCI6000 Maximum Likelihood Estimation Multiple Response Model 2. Recap: MNL. Recap: MNL
Goals PSCI6000 Maximum Likelihood Estimation Multiple Response Model 2 Tetsuya Matsubayashi University of North Texas November 9, 2010 Learn multiple responses models that do not require the assumption
More informationBayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence
Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns
More informationGaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012
Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature
More informationGauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA
JAPANESE BEETLE DATA 6 MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA Gauge Plots TuscaroraLisa Central Madsen Fairways, 996 January 9, 7 Grubs Adult Activity Grub Counts 6 8 Organic Matter
More informationModeling Real Estate Data using Quantile Regression
Modeling Real Estate Data using Semiparametric Quantile Regression Department of Statistics University of Innsbruck September 9th, 2011 Overview 1 Application: 2 3 4 Hedonic regression data for house prices
More informationEstimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004
Estimation in Generalized Linear Models with Heterogeneous Random Effects Woncheol Jang Johan Lim May 19, 2004 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure
More informationLecture 1: Bayesian Framework Basics
Lecture 1: Bayesian Framework Basics Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de April 21, 2014 What is this course about? Building Bayesian machine learning models Performing the inference of
More informationThe consequences of misspecifying the random effects distribution when fitting generalized linear mixed models
The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models John M. Neuhaus Charles E. McCulloch Division of Biostatistics University of California, San
More informationHigh-Throughput Sequencing Course
High-Throughput Sequencing Course DESeq Model for RNA-Seq Biostatistics and Bioinformatics Summer 2017 Outline Review: Standard linear regression model (e.g., to model gene expression as function of an
More informationMultivariate Survival Analysis
Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in
More informationShu Yang and Jae Kwang Kim. Harvard University and Iowa State University
Statistica Sinica 27 (2017), 000-000 doi:https://doi.org/10.5705/ss.202016.0155 DISCUSSION: DISSECTING MULTIPLE IMPUTATION FROM A MULTI-PHASE INFERENCE PERSPECTIVE: WHAT HAPPENS WHEN GOD S, IMPUTER S AND
More informationStat 535 C - Statistical Computing & Monte Carlo Methods. Lecture 15-7th March Arnaud Doucet
Stat 535 C - Statistical Computing & Monte Carlo Methods Lecture 15-7th March 2006 Arnaud Doucet Email: arnaud@cs.ubc.ca 1 1.1 Outline Mixture and composition of kernels. Hybrid algorithms. Examples Overview
More informationSTAT 518 Intro Student Presentation
STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible
More informationTime Series and Dynamic Models
Time Series and Dynamic Models Section 1 Intro to Bayesian Inference Carlos M. Carvalho The University of Texas at Austin 1 Outline 1 1. Foundations of Bayesian Statistics 2. Bayesian Estimation 3. The
More informationMachine Learning. Bayesian Regression & Classification. Marc Toussaint U Stuttgart
Machine Learning Bayesian Regression & Classification learning as inference, Bayesian Kernel Ridge regression & Gaussian Processes, Bayesian Kernel Logistic Regression & GP classification, Bayesian Neural
More informationFractional Imputation in Survey Sampling: A Comparative Review
Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical
More informationσ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =
Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,
More informationA Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks
A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks Y. Xu, D. Scharfstein, P. Mueller, M. Daniels Johns Hopkins, Johns Hopkins, UT-Austin, UF JSM 2018, Vancouver 1 What are semi-competing
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationA New Generalized Gumbel Copula for Multivariate Distributions
A New Generalized Gumbel Copula for Multivariate Distributions Chandra R. Bhat* The University of Texas at Austin Department of Civil, Architectural & Environmental Engineering University Station, C76,
More informationUndirected Graphical Models
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional
More informationSTA 216, GLM, Lecture 16. October 29, 2007
STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural
More informationCharles E. McCulloch Biometrics Unit and Statistics Center Cornell University
A SURVEY OF VARIANCE COMPONENTS ESTIMATION FROM BINARY DATA by Charles E. McCulloch Biometrics Unit and Statistics Center Cornell University BU-1211-M May 1993 ABSTRACT The basic problem of variance components
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationReview. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis
Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,
More informationEco517 Fall 2014 C. Sims FINAL EXAM
Eco517 Fall 2014 C. Sims FINAL EXAM This is a three hour exam. You may refer to books, notes, or computer equipment during the exam. You may not communicate, either electronically or in any other way,
More informationStandard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j
Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )
More informationLog Gaussian Cox Processes. Chi Group Meeting February 23, 2016
Log Gaussian Cox Processes Chi Group Meeting February 23, 2016 Outline Typical motivating application Introduction to LGCP model Brief overview of inference Applications in my work just getting started
More informationLecture 2: From Linear Regression to Kalman Filter and Beyond
Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation
More informationI. Multinomial Logit Suppose we only have individual specific covariates. Then we can model the response probability as
Econ 513, USC, Fall 2005 Lecture 15 Discrete Response Models: Multinomial, Conditional and Nested Logit Models Here we focus again on models for discrete choice with more than two outcomes We assume that
More informationCOS513 LECTURE 8 STATISTICAL CONCEPTS
COS513 LECTURE 8 STATISTICAL CONCEPTS NIKOLAI SLAVOV AND ANKUR PARIKH 1. MAKING MEANINGFUL STATEMENTS FROM JOINT PROBABILITY DISTRIBUTIONS. A graphical model (GM) represents a family of probability distributions
More informationLECTURE 11: EXPONENTIAL FAMILY AND GENERALIZED LINEAR MODELS
LECTURE : EXPONENTIAL FAMILY AND GENERALIZED LINEAR MODELS HANI GOODARZI AND SINA JAFARPOUR. EXPONENTIAL FAMILY. Exponential family comprises a set of flexible distribution ranging both continuous and
More informationCovariance modelling for longitudinal randomised controlled trials
Covariance modelling for longitudinal randomised controlled trials G. MacKenzie 1,2 1 Centre of Biostatistics, University of Limerick, Ireland. www.staff.ul.ie/mackenzieg 2 CREST, ENSAI, Rennes, France.
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationCTDL-Positive Stable Frailty Model
CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland
More informationPATTERN RECOGNITION AND MACHINE LEARNING
PATTERN RECOGNITION AND MACHINE LEARNING Chapter 1. Introduction Shuai Huang April 21, 2014 Outline 1 What is Machine Learning? 2 Curve Fitting 3 Probability Theory 4 Model Selection 5 The curse of dimensionality
More informationA note on multiple imputation for general purpose estimation
A note on multiple imputation for general purpose estimation Shu Yang Jae Kwang Kim SSC meeting June 16, 2015 Shu Yang, Jae Kwang Kim Multiple Imputation June 16, 2015 1 / 32 Introduction Basic Setup Assume
More informationSTAT 526 Advanced Statistical Methodology
STAT 526 Advanced Statistical Methodology Fall 2017 Lecture Note 10 Analyzing Clustered/Repeated Categorical Data 0-0 Outline Clustered/Repeated Categorical Data Generalized Linear Mixed Models Generalized
More informationBayesian Inference. Chapter 9. Linear models and regression
Bayesian Inference Chapter 9. Linear models and regression M. Concepcion Ausin Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in Mathematical Engineering
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationEffective Computation for Odds Ratio Estimation in Nonparametric Logistic Regression
Communications of the Korean Statistical Society 2009, Vol. 16, No. 4, 713 722 Effective Computation for Odds Ratio Estimation in Nonparametric Logistic Regression Young-Ju Kim 1,a a Department of Information
More informationNonparametric Bayesian Methods - Lecture I
Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics
More informationReconstruction of individual patient data for meta analysis via Bayesian approach
Reconstruction of individual patient data for meta analysis via Bayesian approach Yusuke Yamaguchi, Wataru Sakamoto and Shingo Shirahata Graduate School of Engineering Science, Osaka University Masashi
More informationCOMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017
COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University FEATURE EXPANSIONS FEATURE EXPANSIONS
More informationSparse Linear Models (10/7/13)
STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine
More informationUQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables
UQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables To be provided to students with STAT2201 or CIVIL-2530 (Probability and Statistics) Exam Main exam date: Tuesday, 20 June 1
More informationImproving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates
Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/
More informationNonresponse weighting adjustment using estimated response probability
Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy
More informationGAUSSIAN PROCESS REGRESSION
GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationRecitation 2: Probability
Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationFrailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.
Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk
More informationStatistics 203: Introduction to Regression and Analysis of Variance Course review
Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying
More informationFigure 36: Respiratory infection versus time for the first 49 children.
y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 218 Outlines Overview Introduction Linear Algebra Probability Linear Regression 1
More informationHybrid Dirichlet processes for functional data
Hybrid Dirichlet processes for functional data Sonia Petrone Università Bocconi, Milano Joint work with Michele Guindani - U.T. MD Anderson Cancer Center, Houston and Alan Gelfand - Duke University, USA
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationDiscussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs
Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Michael J. Daniels and Chenguang Wang Jan. 18, 2009 First, we would like to thank Joe and Geert for a carefully
More informationChris Bishop s PRML Ch. 8: Graphical Models
Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular
More informationChapter 16. Structured Probabilistic Models for Deep Learning
Peng et al.: Deep Learning and Practice 1 Chapter 16 Structured Probabilistic Models for Deep Learning Peng et al.: Deep Learning and Practice 2 Structured Probabilistic Models way of using graphs to describe
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationMachine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io
Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem
More informationSemiparametric Generalized Linear Models
Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student
More informationLecture 5: GPs and Streaming regression
Lecture 5: GPs and Streaming regression Gaussian Processes Information gain Confidence intervals COMP-652 and ECSE-608, Lecture 5 - September 19, 2017 1 Recall: Non-parametric regression Input space X
More informationRandom Effects Models for Network Data
Random Effects Models for Network Data Peter D. Hoff 1 Working Paper no. 28 Center for Statistics and the Social Sciences University of Washington Seattle, WA 98195-4320 January 14, 2003 1 Department of
More informationLecture 25: Review. Statistics 104. April 23, Colin Rundel
Lecture 25: Review Statistics 104 Colin Rundel April 23, 2012 Joint CDF F (x, y) = P [X x, Y y] = P [(X, Y ) lies south-west of the point (x, y)] Y (x,y) X Statistics 104 (Colin Rundel) Lecture 25 April
More informationUsing the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes 1
Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, Discrete Changes 1 JunXuJ.ScottLong Indiana University 2005-02-03 1 General Formula The delta method is a general
More informationLecture 6: Bayesian Inference in SDE Models
Lecture 6: Bayesian Inference in SDE Models Bayesian Filtering and Smoothing Point of View Simo Särkkä Aalto University Simo Särkkä (Aalto) Lecture 6: Bayesian Inference in SDEs 1 / 45 Contents 1 SDEs
More informationHierarchical Modelling for Univariate Spatial Data
Hierarchical Modelling for Univariate Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department
More informationGroup Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology
Group Sequential Tests for Delayed Responses Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Lisa Hampson Department of Mathematics and Statistics,
More informationSTAT5044: Regression and Anova
STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation
More informationMissing Covariate Data in Matched Case-Control Studies
Missing Covariate Data in Matched Case-Control Studies Department of Statistics North Carolina State University Paul Rathouz Dept. of Health Studies U. of Chicago prathouz@health.bsd.uchicago.edu with
More information,..., θ(2),..., θ(n)
Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 305 Part VII
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationMixed models in R using the lme4 package Part 7: Generalized linear mixed models
Mixed models in R using the lme4 package Part 7: Generalized linear mixed models Douglas Bates University of Wisconsin - Madison and R Development Core Team University of
More informationLinear Models for Classification
Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,
More informationModel Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao
Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley
More informationGeneralized Linear Models
Generalized Linear Models Advanced Methods for Data Analysis (36-402/36-608 Spring 2014 1 Generalized linear models 1.1 Introduction: two regressions So far we ve seen two canonical settings for regression.
More informationHierarchical Modeling for Univariate Spatial Data
Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More information