Solar Spectral Analyses with Uncertainties in Atomic Physical Models Xixi Yu Imperial College London xixi.yu16@imperial.ac.uk 3 April 2018 Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 1 / 35
Acknowledgments Joint work with the International Space Science Institute (ISSI) team Improving the Analysis of Solar and Stellar Observations Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 2 / 35
The Solar Corona The solar corona is a complex and dynamic system Measuring physical properties in any solar region is important for understanding the processes that lead to these events Figure: The photospheric magnetic field measured with HMI, million degree emission observed with the AIA Fe IX 171,A channel, and high temperature loops observed with XRT Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 3 / 35
Aim We want to infer physical quantities of the solar atmosphere (density, temperature, path length, etc.), but we only observe intensity Inferences rely on models for the underlying atomic physics How to address uncertainty in the atomic physics models? Figure: Hinode spacecraft. Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 4 / 35
Physical Parameters n k 1 : number of free electrons per unit volume in plasma T k : electron temperature d k : path length through the solar atmosphere θ k = (log n k, log d k ) m: index of the emissivity curve Expected intensity of line with wavelength λ: ɛ (m) λ ɛ (m) λ (n k, T k )n 2 k d k (n k, T k ) is the plasma emissivity for the line with wavelength λ in pixel k 1 Subscript k is the pixel index Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 5 / 35
Data: Observed Intensity Data from the Extreme-Ultraviolet Imaging Spectrometer (EIS) on Hinode spacecraft. Figure: Example EIS spectrum of seven Fe XIII lines Spectral lines with wavelengths Λ = {λ 1,..., λ J } Observed intensities for K pixels and J wavelengths: ˆD = {D k = (I kλ1,..., I kλj ), k = 1,..., K} Standard deviation σ kλj are also known Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 6 / 35
Uncertainty: Emissivity Emissivity: how strongly energy is radiated at a given wavelength Simulated from a model accounting for uncertainty in the atomic data Suppose a collection of M emissivity curves are known M = {ɛ (m) λ (n k, T k ), λ Λ, m = 1,..., M} Figure: A simplified level diagram for the transitions relevant to the 7 lines considered here. Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 7 / 35
Preceding Work Fully Bayesian Model I: Use Bayesian Methods to incorporate information in the data for narrowing the uncertainty in the atomic physics calculation There are only M = 1000 equally likely emissivity curves as a priori Solution: Obtain a sample of M that accounts for the uncertainty in the atomic data, m m is treated as an unknown parameter Obtain sample from p(m, θ D) via two-step Monte Carlo (MC) samplers or Hamiltonian Monte Carlo (HMC) Conclusions: We are able to incorporate uncertainties in atomic physics calculations into analyses of solar spectroscopic data Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 8 / 35
Preceding Work: Fully Bayesian Model I Independent prior distributions Likeihood L(m, θ k D k ) p(m, θ k ) = p(m) p(log n k ) p(log d k ) (1) m DiscreteUniform({1,..., M}) (2) log 10 n k Uniform(min = 7, max = 12) (3) log 10 d k Cauchy(center = 9, scale = 5) (4) ( indep I kλ m, n k, d k Normal Joint posterior distribution ɛ (m) λ ) (n k, T k )n 2 k d k, σkλ 2, for λ Λ (5) p(m, θ k D k ) L(m, θ k D k ) p(m, θ k ), (6) Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 9 / 35
Preceding Work: Pragmatic VS Fully Bayesian Models Pragmatic Bayesian posterior distribution p(m, θ k D k ) = p(θ k D k, m) p(m). (7) Fully Bayesian posterior distribution p(m, θ k D k ) = p(θ k D k, m) p(m D k ). (8) Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 10 / 35
Preceding Work: Multimodal Posterior Distributions Bimodal posterior distributions occur Two modes correspond to two emissivity curves Inaccurate relative size of two modes in HMC Reason: Not enough emissivity curves Challenge: Sparse selection of emissivity curves Density 0 10 20 30 40 H Algorithm Density 0 5 10 15 20 H Algorithm 9.20 9.25 9.30 log ne 9.30 9.40 9.50 9.60 log ds Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 11 / 35
Preceding Work: Multimodal Posterior Distributions A computational issue: Inaccurate relative size of two modes A computational solution: Adding a few synthetic replicate emissivity curves with augmented set M aug M: M aug /M = {w 1 Emis 471 + w 2 Emis 368 }, where (w 1, w 2 ) = (0.75, 0.25), (0.50, 0.50), &(0.25, 0.75) Density 0 10 20 30 40 H Algorithm Density 0 5 10 15 20 H Algorithm 9.20 9.25 9.30 log ne 9.30 9.40 9.50 9.60 log ds Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 12 / 35
Aim Come up with a way to efficiently represent the high dimensional joint distribution of the uncertainty of the emissivity curves. Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 13 / 35
Comparison of Current and Preceding Models Model I (Done) Joint posterior distribution: p(m, θ k D k ) L(m, θ k D k ) p(m) p(θ k ) p(m) = 1 M A computational trick: adding a few synthetic replicate emissivity curves Model II (On going...) Joint posterior distribution: p(c(r k ), θ k D k ) L(C(r k ), θ k D k ) p(r k, θ k ) r k is the PCA transformation of emissivity curve, C p(r) is a high dimensional distribution An algorithm: summarizing the distribution with multivariate standard Normal distribution via principal component analysis (PCA) Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 14 / 35
Principal Component Analysis In log space J = 16 PCs capture 99% of total variation PCA generated emissivity curve replicate based on the first J PCs C rep = C + J r j β j v j, j=i where C: average of all 1000 emissivity curves r j : random variate generated from the standard Normal distribution β 2 j, v j: eigenvalue and eigenvector of component j in the PCA representation Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 15 / 35
Plot of Original Emissivity Curves C 61 59 57 55 C C_ave 1.0 0.0 0.5 1.0 0 50 100 150 var index 0 50 100 150 var index Top panel: Each part corresponds one of the seven lines Light gray area: all 1000 emissivities Dark gray area: middle 68% of emissivities Solid black curve: C Bottom panel: Same as above, but using C C Colored dashed curves: six randomly selected curves Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 16 / 35
Plot of PCA Generated Emissivity Curves C C_ave 1.0 0.0 0.5 1.0 C C_ave 1.0 0.0 0.5 1.0 0 50 100 150 var index 0 50 100 150 var index Light & dark gray areas are same as above Top panel: Dashed lines: all 1500 PCA generated emissivities Dotted lines: the middle 68% of PCA emissivities Bottom panel: Colored dashed curves: selection of PCA curves Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 17 / 35
Plot of Eigenvectors PC1 eigenvector 0.15 0.10 0.05 0.00 0.05 0.10 7 8 9 10 11 12 PC2 eigenvector 0.15 0.10 0.05 0.00 0.05 7 8 9 10 11 12 PC3 eigenvector 0.10 0.05 0.00 0.05 0.10 7 8 9 10 11 12 PC4 eigenvector 0.10 0.05 0.00 0.05 0.10 0.15 7 8 9 10 11 12 logn logn logn logn PC5 eigenvector 0.10 0.00 0.05 0.10 0.15 PC6 eigenvector 0.15 0.05 0.05 0.10 0.15 PC7 eigenvector 0.20 0.15 0.10 0.05 0.00 PC8 eigenvector 0.10 0.00 0.05 0.10 0.15 196.52 200.02 201.12 202.04 203.16 203.83 209.92 7 8 9 10 11 12 7 8 9 10 11 12 7 8 9 10 11 12 7 8 9 10 11 12 logn logn logn logn Figure: Plot of eigenvectors of 7 lines along log n grid for the first 16 PCs. Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 18 / 35
Principal Component Analysis PC9 eigenvector 0.1 0.0 0.1 0.2 PC10 eigenvector 0.2 0.1 0.0 0.1 PC11 eigenvector 0.2 0.1 0.0 0.1 PC12 eigenvector 0.15 0.05 0.05 0.15 7 8 9 10 11 12 7 8 9 10 11 12 7 8 9 10 11 12 7 8 9 10 11 12 logn logn logn logn PC13 eigenvector 0.2 0.1 0.0 0.1 0.2 0.3 0.4 PC14 eigenvector 0.2 0.1 0.0 0.1 PC15 eigenvector 0.2 0.1 0.0 0.1 0.2 PC16 eigenvector 0.2 0.0 0.1 0.2 0.3 0.4 196.52 200.02 201.12 202.04 203.16 203.83 209.92 7 8 9 10 11 12 7 8 9 10 11 12 7 8 9 10 11 12 7 8 9 10 11 12 logn logn logn logn Figure: Plot of eigenvectors of 7 lines along log n grid for the first 16 PCs. Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 19 / 35
Alg I: Hamiltonian Monte Carlo via Stan Use Stan (mc-stan.org) to sample (r (l), θ (l) ) p(r, θ D), for l = 1,..., L. where p(r, θ D) p(d r, θ) p(r) p(θ) p(r) Normal(0, I ) log 10 n Uniform(min = 7, max = 12) log 10 d Cauchy(center = 9, scale = 5) Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 20 / 35
Alg I: Compare Stan Results From Model I & Model II Density 0 10 20 30 40 Density 0 5 10 15 9.10 9.20 9.30 9.40 logn 9.2 9.4 9.6 9.8 logds The results from Model II (white hist) can mitigate the multimodal from Model I (gray hist) Problem: Stan is time-consuming Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 21 / 35
Alg I: Scatterplot matrices from Stan Results Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 22 / 35
Alg I: Scatterplot matrices from Stan Results Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 23 / 35
Alg I: Scatterplot matrices from Stan Results Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 24 / 35
Alg II: Independence Sampler (IS) Metropolis Hastings (MH) within Fully Bayesian Gibbs Sampler Step 1: For j = 1,..., J, sample r [prop] j N (µ = 0, sd = 1). Step 2: Set r [prop] = (r [prop] 1,..., r [prop] J ) and C [prop] = C + J j=1 r [t+1] j β j v j. Step 3: Sample θ [prop] Q MVN (θ r [prop] ). Step 4: Compute ρ = p(c [prop], θ [prop] D) Q MVN (θ (t), r (t) ) p(c (t), θ (t) D) Q MVN (θ [prop], r [prop] ) (9) Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 25 / 35
Alg II: Independence Sampler Use a multivariate normal distribution Q MVN (θ, r) to fit a sample obtained from the pragmatic Bayesian posterior. Suppose we have the pragmatic Bayesian samples, {(r (1) pb, θ(1) (T ) pb ),..., (r pb, θ(t ) pb )} Set the multivariate normal distribution Q MVN (θ, r) to be, ( ) θ r r Normal(µ r = 0, Σ 22 = I ) (( ˆµθ Normal µ r ) ( )) ˆΣ11 ˆΣ12, ˆΣ 21 Σ 22 ˆµ θ, ˆΣ 11, ˆΣ 12 : sample mean and sample variance of θ obtained from the pragmatic Bayesian sample, sample covariance between θ and r The conditional distribution of θ given r is, Q MVN (θ r) Normal(ˆµ θ + ˆΣ 12 r, ˆΣ 11 ˆΣ 12 ˆΣ 21 ) achieved via multivariate normal linear regression. Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 26 / 35
Alg II: Several Possible Proposals of r Normal(0, I ) Normal(r (471), I ) or Normal(r (368), I ) Mixture distribution: p Normal(r (471), σ 2 I ) + (1 p) Normal(r (368), σ 2 I ) where p and σ are tuning parameters Normal(µ Stan, Σ Stan ) or Normal(µ Stan, Σ Stan 2) Student t 4 Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 27 / 35
Alg II: Proposals of r r Normal(0, I ) log n_e 9.22 9.24 9.26 9.28 9.30 9.32 ACF 0.0 0.2 0.4 0.6 0.8 1.0 0 2000 4000 6000 8000 10000 log ds 9.30 9.35 9.40 9.45 9.50 e1 0.5 0.0 0.5 1.0 1.5 2.0 0 2000 4000 6000 8000 10000 0 10 20 30 40 0 2000 4000 6000 8000 10000 Lag Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 28 / 35
Alg II: Proposals of r r Normal(r (471), I ) or Normal(r (368), I ) log n_e 9.20 9.25 9.30 0 1000 2000 3000 4000 5000 log ds 9.30 9.40 9.50 9.60 0 1000 2000 3000 4000 5000 ACF 0.0 0.2 0.4 0.6 0.8 1.0 e1 0.5 1.0 1.5 2.0 2.5 3.0 0 5 10 15 20 25 30 35 0 1000 2000 3000 4000 5000 Lag Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 29 / 35
Alg II: Proposals of r r p Normal(r (471), σ 2 I ) + (1 p) Normal(r (368), σ 2 I ) where p = 0.5 and σ = 2 are tuning parameters log n_e 9.0 9.1 9.2 9.3 9.4 0 1000 2000 3000 4000 5000 log ds 9.2 9.4 9.6 9.8 0 1000 2000 3000 4000 5000 ACF 0.0 0.2 0.4 0.6 0.8 1.0 e1 2 1 0 1 2 3 4 5 0 5 10 15 20 25 30 35 0 1000 2000 3000 4000 5000 Lag Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 30 / 35
Alg II: Proposals of r r Normal(µ Stan, Σ Stan ) or Normal(µ Stan, Σ Stan 2) log n_e 9.10 9.15 9.20 9.25 9.30 9.35 0 10000 20000 30000 40000 log ds 9.2 9.3 9.4 9.5 9.6 9.7 9.8 0 10000 20000 30000 40000 ACF 0.0 0.2 0.4 0.6 0.8 1.0 e1 1 0 1 2 3 0 10 20 30 40 0 10000 20000 30000 40000 Lag Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 31 / 35
Alg II: Proposals of r r Student t 4 θ r multivariate t linear regression with df = 4 log n_e 9.15 9.20 9.25 9.30 0 10000 20000 30000 40000 log ds 9.30 9.40 9.50 9.60 0 10000 20000 30000 40000 ACF 0.0 0.2 0.4 0.6 0.8 1.0 e1 2 1 0 1 2 3 0 10 20 30 40 0 10000 20000 30000 40000 Lag Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 32 / 35
Alg III: Random Walk Sampler (TBD) Use Random Walk Step 1: For j = 1,..., J, sample r [prop] j Step 2: Set r [prop] = (r [prop] 1,..., r [prop] C [prop] = C + J j=1 r [t+1] j β j v j. J N (µ = r [t] j, sd = σ r ). ) and Step 3: Sample θ [prop] Q MVN (θ r [prop] ). Step 4: Compute ρ = p(c [prop], θ [prop] D) Q MVN (θ (t) r (t) ) p = (C (t), θ (t) D) Q MVN (θ [prop] r [prop] ) (10) Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 33 / 35
Alg III Tuning parameter σ = 0.6 r is initialized from Normal(r (471), I ) log n_e 9.0 9.1 9.2 9.3 9.4 log ds 9.2 9.4 9.6 9.8 10.0 0 1000 2000 3000 4000 5000 0 1000 2000 3000 4000 5000 ACF 0.0 0.2 0.4 0.6 0.8 1.0 e1 4 2 0 2 4 0 5 10 15 20 25 30 35 0 1000 2000 3000 4000 5000 Lag Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 34 / 35
Alg IV: Mixture of Independence Sampler and Random Walk Sampler (TBD) Step 1: Sample u 1 Uniform(0, 1) Step 2: If u 1 < p m, go to Alg III, Random Walk Sampler else, go to Alg II, Independence Sampler Two tuning parameters p m and σ r Xixi Yu (Imperial College London) Statistical methods in Astrophysics 3 April 2018 35 / 35