Quantifying conformation fluctuation induced uncertainty in bio-molecular systems

Guang Lin, Dept. of Mathematics & School of Mechanical Engineering, Purdue University

Collaborative work with Huan Lei, Xiu Yang, Bin Zhang, Nathan Baker, PNNL

2015 IMA Hot Topics Workshop on Uncertainty Quantification in Materials Modeling, Purdue University, West Lafayette, IN, Aug. 31, 2015.

arXiv:1408.5629

This work is supported by DOE grant for the Collaboratory on Mathematics for Mesoscopic Modeling of Materials (CM4).

Construct stochastic model of conformation fluctuation Numerical methods to construct the surrogated model Numerical Example: SASA of individual/total residues

Background & Motivation

A biomolecule in equilibrium: static or dynamic?
Target properties: deterministic or stochastic?

Figure: Tube diagram of the molecule Trypsin inhibitor (PDB code: 5pti) under equilibrium.

Quantifying the influence of conformational uncertainty in biomolecular solvation

Outline

1. Construct stochastic model of conformation fluctuation
2. Numerical methods to construct the surrogated model
3. Numerical Example: SASA of individual/total residues

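Construct the stochastic model

Approximate the potential energy of molecule fluctuation by

$$V(\bar{R}, R) = \frac{\gamma}{2} \sum_{i<j} (r_{ij} - \bar{r}_{ij})^2 \, h(r_c - \bar{r}_{ij})$$

$\bar{R}^T = [\bar{r}_1^T \ \bar{r}_2^T \ \cdots \ \bar{r}_N^T]$ - equilibrium position;
$R^T = [r_1^T \ r_2^T \ \cdots \ r_N^T]$ - instantaneous position;
$\gamma$ - elastic coefficient;
$r_c$ - cut-off distance.

Define the Hessian matrix

$$H = \begin{bmatrix} H_{11} & H_{12} & \cdots & H_{1N} \\ H_{21} & H_{22} & \cdots & H_{2N} \\ \vdots & \vdots & & \vdots \\ H_{N1} & H_{N2} & \cdots & H_{NN} \end{bmatrix}, \quad H_{ij} = \begin{bmatrix} \dfrac{\partial^2 V}{\partial X_i \partial X_j} & \dfrac{\partial^2 V}{\partial Y_i \partial X_j} & \dfrac{\partial^2 V}{\partial Z_i \partial X_j} \\ \dfrac{\partial^2 V}{\partial X_i \partial Y_j} & \dfrac{\partial^2 V}{\partial Y_i \partial Y_j} & \dfrac{\partial^2 V}{\partial Z_i \partial Y_j} \\ \dfrac{\partial^2 V}{\partial X_i \partial Z_j} & \dfrac{\partial^2 V}{\partial Y_i \partial Z_j} & \dfrac{\partial^2 V}{\partial Z_i \partial Z_j} \end{bmatrix}$$

Fluctuation correlation between the residues i and j:

$$\langle \Delta R_i \, \Delta R_j^T \rangle = \frac{k_B T}{\gamma} \left[ H^{-1} \right]_{ij}$$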
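The Hessian of a harmonic spring network like the one above can be assembled pair by pair. The following is a minimal sketch, not the authors' code. It assumes that h is the Heaviside step function (springs only between residues closer than r_c at equilibrium) and that H is the Hessian of V/γ evaluated at the equilibrium positions, so that γ enters only through the prefactor k_B T/γ. The function name `anm_hessian` and the five-point toy geometry are hypothetical.

```python
import numpy as np

def anm_hessian(R_bar, r_c):
    """Hessian of V/gamma at equilibrium for a cut-off spring network.

    R_bar : (N, 3) equilibrium positions, r_c : cut-off distance.
    Assumes h is a Heaviside step, so only pairs with distance <= r_c interact.
    """
    R_bar = np.asarray(R_bar, dtype=float)
    N = R_bar.shape[0]
    H = np.zeros((3 * N, 3 * N))
    for i in range(N):
        for j in range(i + 1, N):
            d = R_bar[j] - R_bar[i]
            s2 = d @ d
            if s2 > r_c ** 2:
                continue                      # h(r_c - r_ij) = 0: no spring
            block = -np.outer(d, d) / s2      # off-diagonal 3x3 block H_ij
            H[3*i:3*i+3, 3*j:3*j+3] = block
            H[3*j:3*j+3, 3*i:3*i+3] = block
            H[3*i:3*i+3, 3*i:3*i+3] -= block  # diagonal blocks: minus the row sum
            H[3*j:3*j+3, 3*j:3*j+3] -= block
    return H

# Toy geometry (hypothetical): five points that are not coplanar.
R_bar_demo = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
H_demo = anm_hessian(R_bar_demo, r_c=10.0)
eig_demo = np.linalg.eigvalsh(H_demo)
```

With all pairs connected, the toy network has 3N - 6 = 9 nonzero eigenvalues; the remaining six correspond to rigid-body translations and rotations, which is why the inverse in the correlation formula is taken over the nonzero modes only.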

Full stochastic model

Eigenvalue decomposition

$$H = W \Lambda W^T, \quad \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_{3N-6})$$

$\lambda_i$ - the i-th nonzero eigenvalue of H, $w_i$ - the i-th eigenvector of H.

Correlation matrix C can be determined by

$$C_{ij} = \langle \Delta R_i \, \Delta R_j^T \rangle, \quad C = \frac{k_B T}{\gamma} W \Lambda^{-1} W^T = U U^T$$

Stochastic conformation states are generated by

$$R(\xi) = \bar{R} + \Delta R(\xi), \quad \Delta R(\xi) = U \xi$$

$\xi$ - $(3N-6)$-dimensional standard normal random vector.

Target property: $X(\xi) := X(R(\xi))$.
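The sampling step can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' implementation: the factor is taken as U = sqrt(k_B T/γ) W Λ^{-1/2}, which is one valid choice satisfying C = U U^T, and the toy Hessian below is a random positive semi-definite matrix with six zero modes, standing in for a real one. The names `fluctuation_factor` and `sample_conformations` are hypothetical.

```python
import numpy as np

def fluctuation_factor(H, kBT_over_gamma=1.0, n_zero=6):
    """Return U with U U^T = (k_B T / gamma) * W Lambda^{-1} W^T over the nonzero modes."""
    lam, W = np.linalg.eigh(H)              # eigenvalues in ascending order
    lam, W = lam[n_zero:], W[:, n_zero:]    # drop the rigid-body (zero) modes
    return np.sqrt(kBT_over_gamma) * W / np.sqrt(lam)

def sample_conformations(R_bar, U, n_samples, rng):
    """Draw R(xi) = R_bar + U xi with xi standard normal of dimension U.shape[1]."""
    xi = rng.standard_normal((n_samples, U.shape[1]))
    return R_bar + xi @ U.T, xi

rng = np.random.default_rng(0)
B = rng.standard_normal((15, 9))
H_toy = B @ B.T                             # toy stand-in: rank 9, six zero eigenvalues
U = fluctuation_factor(H_toy)
R_bar = np.zeros(15)                        # flattened equilibrium coordinates (toy)
samples, xi = sample_conformations(R_bar, U, 1000, rng)
```

Each row of `samples` is one conformation state; a target property X would then be evaluated on each row to give X(ξ).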

Reduced stochastic model

For a local property $X^{\{p\}}$ on residue p, $\nabla_R X^{\{p\}}$ could be sparse, i.e., $\partial X^{\{p\}} / \partial R_j = 0$ if $X^{\{p\}}$ on residue p is independent of $R_j$.

Correlation matrix C can be reduced to $\tilde{C}$:

$$\tilde{C}_{ij} = C_{ij} \, h(r_c^p - r_{ip}) \, h(r_c^p - r_{jp}), \quad r_{pi} = \lVert r_p - r_i \rVert, \quad r_{pj} = \lVert r_p - r_j \rVert$$

$r_c^p$ - cut-off distance for $X^{\{p\}}$.

Figure: Sketch of a typical reduced correlation matrix (blocks $\tilde{C}$ and $C^{\{p\}}$; both axes: residue label).

$$C^{\{p\}} = U^{\{p\}} U^{\{p\}T}, \quad R^{\{p\}}(\xi^{\{p\}}) = \bar{R}^{\{p\}} + U^{\{p\}} \xi^{\{p\}}, \quad d = 3 \sum_{i}^{N} h(r_c^p - r_{ip})$$

$$X^{\{p\}}(\xi^{\{p\}}) := X^{\{p\}}(R(\xi^{\{p\}}))$$

$\xi^{\{p\}}$: d-dimensional normal random vector.
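The reduction amounts to keeping the rows and columns of C that belong to residues within the cut-off of residue p, then factorizing the resulting block. A minimal sketch follows, assuming h is a Heaviside step, distances are measured between equilibrium positions, and the coordinates are ordered residue by residue as (x, y, z). The function `reduced_model`, the chain geometry, and the random covariance are hypothetical illustrations.

```python
import numpy as np

def reduced_model(C, R_bar, p, r_c_p):
    """Return kept residue indices, the block C^{p}, and a factor U^{p} with U U^T = C^{p}."""
    R_bar = np.asarray(R_bar, dtype=float)
    dist = np.linalg.norm(R_bar - R_bar[p], axis=1)
    keep = np.flatnonzero(dist <= r_c_p)               # residues with h(r_c^p - r_ip) = 1
    dof = (3 * keep[:, None] + np.arange(3)).ravel()   # their x, y, z coordinate indices
    C_p = C[np.ix_(dof, dof)]
    lam, V = np.linalg.eigh(C_p)
    U_p = V * np.sqrt(np.clip(lam, 0.0, None))         # symmetric-eigen factor of C^{p}
    return keep, C_p, U_p

rng = np.random.default_rng(0)
N = 6
R_bar = np.column_stack([np.arange(N, dtype=float), np.zeros(N), np.zeros(N)])  # toy chain
B = rng.standard_normal((3 * N, 3 * N))
C = B @ B.T                                            # toy covariance (hypothetical)
keep, C_p, U_p = reduced_model(C, R_bar, p=2, r_c_p=1.5)
d = U_p.shape[0]                                       # reduced random dimension
```

In this toy chain the cut-off keeps residues 1, 2 and 3, so the random dimension drops from 18 to d = 9.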

Generalized polynomial chaos (gPC) expansion in uncertainty quantification (UQ) (Ghanem and Spanos 1991; Xiu and Karniadakis 2002)

Equation labels: quantity of interest (e.g., force, velocity, etc.); truncation error; i.i.d. random variables; input samples.

Generalized Polynomial Chaos

Generalized Polynomial Chaos (gPC) helps to represent uncertainty. In practice, we truncate the gPC expansion up to polynomial order P:

$$X(\xi) \approx \tilde{X}(\xi) = \sum_{|\alpha|=0}^{P} c_\alpha \psi_\alpha(\xi)$$

Construct gPC expansion: probabilistic collocation methods (e.g., tensor product, sparse grid method, etc.)

$$\int X(\xi) \psi_\alpha(\xi) \, dP(\xi) \approx \sum_{i=1}^{Q} X^i \psi_\alpha(\xi^i) w^i,$$

where $\xi^i$ are the collocation points, $X^i = X(\xi^i)$ the values of X at those points, and $w^i$ the weights.

Major challenge for high-dimensional systems:
large number of collocation points (e.g., $d = 27$, $P = 2$, $Q = 7.6 \times 10^{12}$ tensor product points);
sensitivity to the numerical error that accompanies X.
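The point count quoted above can be reproduced with a short calculation, shown here as a sketch. It assumes P + 1 nodes per dimension in the tensor grid (an assumption, though it matches the quoted 7.6 × 10^12) and a total-degree basis. The one-dimensional projection uses Gauss-Hermite quadrature with normalized probabilists' Hermite polynomials; the helper names are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def n_tensor_points(d, P):
    """Tensor-product grid size, assuming P + 1 nodes per dimension."""
    return (P + 1) ** d

def n_basis_terms(d, P):
    """Number of gPC terms with total degree <= P."""
    return math.comb(d + P, P)

def project_1d(f, order, n_nodes):
    """Coefficients c_n = E[f(xi) psi_n(xi)] for xi ~ N(0, 1), by Gauss-Hermite quadrature."""
    x, w = hermegauss(n_nodes)
    w = w / math.sqrt(2.0 * math.pi)          # rescale weights to sum to one
    coeffs = []
    for n in range(order + 1):
        e = np.zeros(n + 1)
        e[n] = 1.0
        psi = hermeval(x, e) / math.sqrt(math.factorial(n))   # normalized He_n
        coeffs.append(np.sum(f(x) * psi * w))
    return np.array(coeffs)

Q_demo = n_tensor_points(27, 2)               # 7625597484987, about 7.6e12
N_demo = n_basis_terms(27, 2)                 # 406
coeffs_demo = project_1d(lambda x: x ** 2, order=2, n_nodes=3)
```

The contrast is the point of the slide: 406 unknown coefficients, yet roughly 7.6 × 10^12 tensor-product evaluations of X to compute them by quadrature.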

Brief introduction to compressive sensing

Consider a linear system $\Psi_{M \times N} \, c_{N \times 1} = u_{M \times 1}$, i.e., $\Psi c = u$.

When $M < N$, the system is underdetermined; it is still possible to obtain c if c is a sparse vector. We may obtain it by solving the following optimization problem:

$$(P_{h,0}): \ \min_c \|c\|_h \ \text{subject to} \ \Psi c = u.$$

When $\Psi c + e = u$, where $\|e\|_2 \le \epsilon$, we modify $(P_{h,0})$ as

$$(P_{h,\epsilon}): \ \min_c \|c\|_h \ \text{subject to} \ \|\Psi c - u\|_2 \le \epsilon.$$

1. E.J. Candès, J. Romberg, T. Tao, IEEE Trans. Inform. Theory, 2006.
2. D.L. Donoho, M. Elad, V.N. Temlyakov, IEEE Trans. Inform. Theory, 2006.
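To see that a sparse c can indeed be recovered from M < N equations, the sketch below uses orthogonal matching pursuit (OMP), a greedy heuristic for the h = 0 problem. This is a swapped-in technique chosen because it needs only NumPy; it is not the solver used in the talk, and an ℓ1 solver would be the more usual choice for h = 1. The matrix sizes and the sparse test vector are hypothetical.

```python
import numpy as np

def omp(Psi, u, tol=1e-10, max_atoms=None):
    """Greedy sparse solve of Psi c = u: add one column at a time, refit by least squares."""
    M, N = Psi.shape
    max_atoms = M if max_atoms is None else max_atoms
    col_norms = np.linalg.norm(Psi, axis=0)
    selected = np.zeros(N, dtype=bool)
    support, coef = [], np.zeros(0)
    residual = u.copy()
    while len(support) < max_atoms and np.linalg.norm(residual) > tol:
        corr = np.abs(Psi.T @ residual) / col_norms
        corr[selected] = 0.0                      # never pick the same column twice
        k = int(np.argmax(corr))
        selected[k] = True
        support.append(k)
        coef, *_ = np.linalg.lstsq(Psi[:, support], u, rcond=None)
        residual = u - Psi[:, support] @ coef
    c = np.zeros(N)
    c[support] = coef
    return c

rng = np.random.default_rng(0)
M, N = 80, 200                                    # underdetermined: M < N
c_true = np.zeros(N)
c_true[[5, 50, 120]] = [3.0, -2.0, 1.5]           # 3-sparse vector (hypothetical)
Psi = rng.standard_normal((M, N))
u = Psi @ c_true
c_hat = omp(Psi, u)
```

For noisy data the same loop can be stopped at a residual tolerance of ε, which mirrors the constraint in $(P_{h,\epsilon})$.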

Application to generalized polynomial chaos

Consider a gPC expansion $X(\xi) = \sum_\alpha c_\alpha \psi_\alpha(\xi)$. By sampling $\xi$, we obtain:

$$X(\xi^1) \approx \sum_{\alpha=0}^{N} c_\alpha \psi_\alpha(\xi^1), \quad X(\xi^2) \approx \sum_{\alpha=0}^{N} c_\alpha \psi_\alpha(\xi^2), \quad \ldots$$

which can be cast into the linear system:

$$\begin{bmatrix} \psi_0(\xi^1) & \psi_1(\xi^1) & \cdots & \psi_N(\xi^1) \\ \psi_0(\xi^2) & \psi_1(\xi^2) & \cdots & \psi_N(\xi^2) \\ \vdots & \vdots & & \vdots \\ \psi_0(\xi^M) & \psi_1(\xi^M) & \cdots & \psi_N(\xi^M) \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_N \end{bmatrix} \approx \begin{bmatrix} X(\xi^1) \\ X(\xi^2) \\ \vdots \\ X(\xi^M) \end{bmatrix},$$

i.e., $\Psi c + e = X$, where e is related to the truncation error.
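Building Ψ means evaluating every basis polynomial at every sample. A minimal sketch is given below, assuming Gaussian inputs, normalized probabilists' Hermite polynomials, and a total-degree truncation; the ordering of the multi-indices and the function names are hypothetical choices. The brute-force index enumeration is only practical for small d.

```python
import itertools
import math
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def multi_indices(d, P):
    """All multi-indices in d variables with total degree <= P (small d only)."""
    idx = [a for a in itertools.product(range(P + 1), repeat=d) if sum(a) <= P]
    return sorted(idx, key=lambda a: (sum(a), a))

def hermite_1d(x, n):
    """Normalized probabilists' Hermite polynomial He_n(x) / sqrt(n!)."""
    e = np.zeros(n + 1)
    e[n] = 1.0
    return hermeval(x, e) / math.sqrt(math.factorial(n))

def measurement_matrix(xi, P):
    """Psi[m, k] = psi_{alpha_k}(xi^m) for samples xi of shape (M, d)."""
    M, d = xi.shape
    alphas = multi_indices(d, P)
    Psi = np.ones((M, len(alphas)))
    for col, alpha in enumerate(alphas):
        for k, n in enumerate(alpha):
            if n:
                Psi[:, col] *= hermite_1d(xi[:, k], n)
    return Psi, alphas

rng = np.random.default_rng(0)
xi = rng.standard_normal((5, 3))      # M = 5 samples in d = 3 dimensions
Psi, alphas = measurement_matrix(xi, P=2)
```

Each row of `Psi` corresponds to one sample and each column to one basis function, so the outputs X(ξ^m) stacked in a vector give the right-hand side of the system.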

Critical Problem

The sparsity of the gPC expansion is unknown a priori.

Sparsity

Exact sparse: norm [...]; norm [...]. Example: [...] is called [...]-sparse if [...].
Nearly sparse: [...] is sparse if [...]. Example: best [...]-sparse approximation of [...].

Compressive Sensing in UQ

Classical works: Donoho, Candes, Tao, Romberg, Boyd 2004-2009.
Orthonormal polynomial systems: Rauhut and Ward 2012.
Application in UQ: Doostan and Owhadi 2010.
Bayesian model uncertainty method: Karagiannis, Lin 2014.
Mixed Shrinkage Prior procedure: Karagiannis, Konomi, Lin 2014.
Sampling strategy: Rauhut and Ward 2012; Yan, Guo and Xiu 2012; Xu and Zhou 2014; Hampton and Doostan 2015.
Enhancing sparsity: Candes, Wakin and Boyd 2008; Yang and Karniadakis 2013; Peng, Hampton and Doostan 2014.
Adaptive basis selection: Jakeman, Eldred and Sargsyan 2015.
Incorporating gradient information: Jakeman, Eldred and Sargsyan 2015; Lei, Yang, Zheng, Lin and Baker 2014; Peng, Hampton and Doostan 2015.

Increase the sparsity in the optimization

Reweighted $\ell_1$ minimization (Candes, Wakin and Boyd 2008; Yang and Karniadakis 2013): [...]

It can be achieved iteratively: [...]

Increase the sparsity intrinsically

Difficulties

How to obtain [...]? Understanding of the physical model.
How to compute the PDF of [...]? [...] may not be independent, which may be an issue when generating a new set of orthonormal polynomials.
Does the matrix [...] still have good properties?

A Special Case

We consider a special case: [...] are i.i.d. Gaussian, i.e., [...], and the mapping is a rotation: [...] are still i.i.d. Gaussian.

[...] are Hermite polynomials. [...] are the values of Hermite polynomials at another set of input samples generated in the same manner (e.g., randn in MATLAB).
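The claim that rotated i.i.d. Gaussian variables remain i.i.d. Gaussian can be checked numerically. In this sketch the rotated variables are written χ = Aξ, following the ξ and χ labels that appear in the later figure captions; the matrix name A and the construction of a random orthogonal matrix by QR factorization are assumptions made for illustration. QR may return a reflection rather than a proper rotation, and the argument holds for any orthogonal matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4

# Random orthogonal matrix (hypothetical stand-in for the rotation in the talk).
A, _ = np.linalg.qr(rng.standard_normal((d, d)))

xi = rng.standard_normal((100_000, d))   # i.i.d. standard normal inputs
chi = xi @ A.T                           # rotated inputs, one sample per row

# Cov(chi) = A Cov(xi) A^T = A A^T = I, so chi is again standard normal.
cov_chi = np.cov(chi, rowvar=False)
```

Because the distribution is unchanged, the same Hermite basis stays orthonormal in χ, so no new polynomial family has to be constructed after the rotation.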

Example

Rotation Matrix

Active subspace method by Constantine, Dow and Wang (2014). Define the gradient matrix (outer product of gradient): [...], where [...] is symmetric and [...] is unknown!
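In the active subspace method, the gradient matrix is the expected outer product of the gradient, and its eigenvectors give the directions along which the output varies most. The sketch below estimates it by Monte Carlo for a toy function whose exact gradient is known. In the talk the gradient is unknown, so using an exact gradient here is purely an illustrative assumption, as are the function f, the vector a, and the name `rotation_from_gradients`.

```python
import numpy as np

def rotation_from_gradients(grads):
    """Rows of A are eigenvectors of G = mean(grad grad^T), sorted by decreasing eigenvalue."""
    G = grads.T @ grads / grads.shape[0]
    lam, V = np.linalg.eigh(G)           # G is symmetric positive semi-definite
    order = np.argsort(lam)[::-1]
    return V[:, order].T, lam[order]

rng = np.random.default_rng(0)
a = np.array([3.0, 4.0, 0.0])            # toy direction (hypothetical)
xi = rng.standard_normal((500, 3))

# Toy model f(xi) = (a . xi)^2 with exact gradient 2 (a . xi) a.
grads = 2.0 * (xi @ a)[:, None] * a
A, lam = rotation_from_gradients(grads)
chi = xi @ A.T                           # rotated variables chi = A xi
```

Here f depends on all of ξ1 and ξ2 but on χ1 alone, so its expansion in χ has fewer nonzero coefficients, which is the sense in which a rotation can increase sparsity.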

Rotation matrix

Iteratively Construct Rotation Matrix

Given the input samples [...] and the output samples [...], the rotation matrix can be obtained iteratively to (possibly) improve its performance: [...]. In other words, [...], where [...] is the number of iterations.

Summary of the Algorithm

1. [...].
2. Set iteration counter [...] and set [...].
3. Construct measurement matrix [...] as [...], and compute the gPC coefficients [...] with compressive sensing method.
4. Compute rotation matrix [...] based on [...].
5. If [...] is close to identity matrix or permutation matrix, stop. Otherwise, set [...], i.e., [...], and go to step 3.

Solvent Accessible Surface Area of individual residues

Figure: Molecule Trypsin inhibitor (PDB code: 5pti) under equilibrium.

Figure: Probability density function of the solvent accessible surface area (SASA) of the 14th residue. Curves: full correlation matrix; reduced correlation matrix; zero off-diagonal elements. Axes: SASA (horizontal), probability density distribution (vertical).

Figure: gPC coefficients $c_\alpha$ with respect to $\xi$ and $\chi$. Axes: gPC basis index (horizontal), gPC coefficient (vertical).

Figure: Eigenvalues of the gradient matrix G and the correlation matrix C. Axes: index (horizontal), normalized eigenvalue (vertical).

Relative $L_2$ error

We compute the relative $L_2$ error by

$$\varepsilon = \left( \sum_{i}^{N_s} \left| X(\xi^i) - \tilde{X}(\xi^i) \right|^2 \Big/ \sum_{i}^{N_s} \left| X(\xi^i) \right|^2 \right)^{1/2},$$

where $N_s$ is the number of sampling data ($N_s = 10^6$).

Figure (two panels): Relative $L_2$ error versus number of samples (200 to 600). Legend: CS ($\xi$), p = 2 and p = 3; CS ($\chi$), p = 2 and p = 3; Sp level 1 (55 samples); Sp level 2 (1513 samples). Symbols - gPC expansions $\tilde{X}(\chi)$ and $\tilde{X}(\xi)$ by compressive sensing method. Dashed lines - sparse grid points on level 1 and 2.
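The error measure is a one-line computation once reference values and surrogate values are available at the same samples. The sketch below assumes the square-root form of the ratio of sums of squares, which is the usual meaning of a relative L2 error; the function name and the two-point example are hypothetical.

```python
import numpy as np

def relative_l2_error(x_ref, x_surrogate):
    """sqrt( sum |X - X_tilde|^2 / sum |X|^2 ) over a common set of samples."""
    x_ref = np.asarray(x_ref, dtype=float)
    x_surrogate = np.asarray(x_surrogate, dtype=float)
    return float(np.sqrt(np.sum((x_ref - x_surrogate) ** 2) / np.sum(x_ref ** 2)))

# Tiny worked example: the surrogate misses the second value entirely.
err_demo = relative_l2_error([3.0, 4.0], [3.0, 0.0])   # sqrt(16 / 25) = 0.8
err_zero = relative_l2_error([1.0, 2.0], [1.0, 2.0])   # identical inputs give 0.0
```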

Probability density function

Figure (a): Probability density distribution of SASA. Legend: MC ($10^6$ samples); SP level 1 (55 samples); SP level 2 (1513 samples); CS (300 samples); MC (300 samples).

Figure (b): K-L divergence versus number of samples (200 to 600). Legend: SP level 1 (55 samples); MC (300 samples); MC (1200 samples); CS ($\chi$).
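A Kullback-Leibler divergence between a reference density and a surrogate density can be estimated from samples of each. The talk does not say how its values were computed, so the following is only one plausible sketch: both sample sets are binned on a shared grid, a small constant keeps empty bins from producing log(0), and the divergence is taken from the first argument to the second. The function name, bin count, and Gaussian toy samples are all assumptions.

```python
import numpy as np

def kl_divergence_hist(samples_p, samples_q, bins=50, value_range=None, eps=1e-12):
    """Histogram estimate of D_KL(p || q) = sum p log(p / q) on shared bins."""
    if value_range is None:
        lo = min(np.min(samples_p), np.min(samples_q))
        hi = max(np.max(samples_p), np.max(samples_q))
        value_range = (lo, hi)
    p, _ = np.histogram(samples_p, bins=bins, range=value_range)
    q, _ = np.histogram(samples_q, bins=bins, range=value_range)
    p = (p + eps) / np.sum(p + eps)       # regularize empty bins, then normalize
    q = (q + eps) / np.sum(q + eps)
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
reference = rng.normal(90.0, 5.0, 20_000)  # toy stand-in for reference SASA samples
shifted = rng.normal(95.0, 5.0, 20_000)    # toy stand-in for a biased surrogate
kl_same = kl_divergence_hist(reference, reference)
kl_shift = kl_divergence_hist(reference, shifted)
```

The estimate depends on the bin count and on the regularization constant, so values are comparable only when those settings are held fixed across the methods being compared.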

Total SASA

Figure: Relative $L_2$ error of the total SASA by gPC expansion. Legend: CS ($\xi$); CS ($\chi$); SP level 1 (337 samples). Axes: number of samples (800 to 2000, horizontal), relative $L_2$ error (vertical).

Figure: Mean error of the surrogated model for the total SASA in different regimes. Legend: CS (800 samples); CS (1600 samples). Axes: SASA (3250 to 3450, horizontal), $\langle \mathrm{SASA}_N - \mathrm{SASA}_{gPC} \rangle$ (vertical).

Summary

We proposed a framework based on gPC expansion to quantify the conformation-fluctuation-induced uncertainty in bio-molecular systems.
We proposed a method to elevate the sparsity of the gPC expansion, yielding a more accurate surrogate model.
This method is well suited for UQ studies of high-dimensional bio-molecular systems, where sample points are often accompanied by numerical error.

Acknowledgement

We acknowledge helpful discussions with T. Goddard, G. Karniadakis, W. Pan, G. Schenter, X. Wan, Z. Zhang, W. Zhou.
We acknowledge financial support from the DOE grant for the new Collaboratory on Mathematics for Mesoscopic Modeling of Materials.