Tokamak profile database construction incorporating Gaussian process regression

A. Ho 1, J. Citrin 1, C. Bourdelle 2, Y. Camenen 3, F. Felici 4, M. Maslov 5, K.L. van de Plassche 1,4, H. Weisen 6 and JET Contributors

1 DIFFER - Dutch Institute for Fundamental Energy Research, Eindhoven, The Netherlands
2 CEA, IRFM, Saint Paul Lez Durance, France
3 IIFS/PIIM, CNRS - Université de Provence, Marseille, France
4 Eindhoven University of Technology, Eindhoven, The Netherlands
5 EURATOM-CCFE Fusion Association, Culham Science Centre, Abingdon, UK
6 Swiss Plasma Centre, EPFL, Lausanne, Switzerland

JET Contributors: see the author list of "Overview of the JET results in support to ITER" by X. Litaudon et al., to be published in Nuclear Fusion Special issue: overview and summary reports from the 26th Fusion Energy Conference (Kyoto, Japan, October 2016)

May 31

Using NNs to accelerate turbulent transport coefficient estimation

Goal: Emulate linear gyrokinetic (GK) turbulence simulations using high-dimensional (>20D) neural networks (NN)
J. Citrin, S. Breton, F. Felici, et al., Real-time capable first principle based modelling of tokamak turbulent transport, Nuclear Fusion, vol. 55, no. 9, p , 2015

- Improves tractability of calculations
  - Off-line discharge optimization
  - Controller design and / or real-time control
- Lots of data needed to train NN, limited by cost of GK simulations
- Discharge data repositories = simulation inputs in relevant subspace

Workflow of data extraction from large repositories

Processing chain:
- Discharge selection
- Time window selection
- Profile fitting
- Sanity and consistency checks per discharge
- Profile database, training set sampling
- Quality profile subset for validation

Losses marked along the chain: 15% loss, 50% loss?

Parameters: $T_{e,i}$, $\nabla T_{e,i}$, $n_{e,i,imp}$, $\nabla n_{e,i,imp}$, $q$, $\hat{s}$, $\omega_{tor}$, $S_{Q,n}$, etc.

Important to obtain reliable error estimates on derivative quantities!

Brief overview of Gaussian process regression

A regression technique based on Bayesian statistical principles

- Data pair(s): $(x, y)$
- Basis function(s): $\Phi(x)$
- Weight(s): $w$
- Output error / noise: $\varepsilon$

$$ y = \Phi(x)^T w + \varepsilon $$
$$ p(w \mid x, y) = \frac{p(y \mid x, w)\, p(w)}{p(y \mid x)} $$
$$ w \sim \mathcal{N}(0, \Sigma_w), \qquad \varepsilon \sim \mathcal{N}(0, \sigma_n^2) $$

Gaussians allow analytical solutions and use of the kernel trick:

$$ E\left[ y(x)\, y(x') \right] = \Phi(x)^T \Sigma_w \Phi(x') + \sigma_n^2 \delta_{xx'} \equiv k(x, x') + \sigma_n^2 \delta_{xx'} $$

Free parameters, $\Theta$, are defined by the choice of kernel, i.e. $k(x, x') \to k(x, x', \Theta)$

Making predictions of $y_*$ given $x_*$, with $K(x, x_*) = k(x = x, x' = x_*)$:

$$ \bar{y}_* = K(x_*, x) \left[ K(x, x) + \sigma_n^2 I \right]^{-1} y $$
$$ V[y_*] = K(x_*, x_*) - K(x_*, x) \left[ K(x, x) + \sigma_n^2 I \right]^{-1} K(x, x_*) $$
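The prediction equations can be sketched in a few lines of NumPy. This is a minimal illustration, not the workflow's actual implementation: the function names, the squared-exponential kernel with prefactor `amp`, the hyperparameter values and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def se_kernel(xa, xb, amp=1.0, sigma_l=0.3):
    """Squared-exponential kernel matrix between 1D point sets xa and xb."""
    d = xa[:, None] - xb[None, :]
    return amp * np.exp(-0.5 * (d / sigma_l) ** 2)

def gp_predict(x, y, x_star, sigma_n=0.1, **kernel_args):
    """Predictive mean and variance at x_star given noisy data (x, y)."""
    K = se_kernel(x, x, **kernel_args) + sigma_n**2 * np.eye(x.size)
    K_s = se_kernel(x_star, x, **kernel_args)
    K_ss = se_kernel(x_star, x_star, **kernel_args)
    # Cholesky factorisation avoids forming the explicit matrix inverse.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_s @ alpha                          # K(x*, x) [K + sigma_n^2 I]^-1 y
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)  # diagonal of V[y*]
    return mean, var

# Synthetic profile-like data (illustrative only, not JET data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 15)
y = np.cos(0.5 * np.pi * x) + 0.05 * rng.standard_normal(x.size)
x_star = np.linspace(0.0, 1.0, 50)
mean, var = gp_predict(x, y, x_star, sigma_n=0.05)
```

Only the diagonal of the predictive covariance is returned here, which is what an error bar on each fitted point needs.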

Example: Squared-exponential (SE) kernel

$$ \Phi(x) = \{\phi_\mu(x)\}, \qquad \phi_\mu(x) = \exp\left( -\frac{(x - \mu)^2}{2 l^2} \right) $$
$$ k(x, x') = \sqrt{\pi}\, l\, \sigma_w \exp\left( -\frac{(x - x')^2}{2 \left( \sqrt{2}\, l \right)^2} \right) $$
$$ \Theta = \left\{ \sigma_a = \sqrt{\pi}\, l\, \sigma_w, \; \sigma_l = \sqrt{2}\, l \right\} $$

Sample $y$ drawn from model with random values for $w$

More kernels are available, each with additional properties

For more info on statistical theory and constructing kernels, see:
- C. Bishop, Pattern Recognition and Machine Learning. New York, NY: Springer, 2006
- C. Rasmussen and C. Williams, Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press,
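The link between the Gaussian basis functions and the SE kernel can be checked numerically. The sketch below assumes a dense, evenly spaced grid of basis centres `mu` and interprets `sigma_w` as the prior weight variance per unit density of centres, so each weight has variance `sigma_w * dmu`; the grid limits and values are illustrative choices, not taken from the slides.

```python
import numpy as np

l, sigma_w = 0.2, 1.0
x = np.linspace(0.0, 1.0, 21)

# Dense grid of basis-function centres, extending well beyond the data range.
mu = np.linspace(-2.0, 3.0, 501)
dmu = mu[1] - mu[0]

# Phi[i, j] = phi_mu_j(x_i) = exp(-(x_i - mu_j)^2 / (2 l^2))
Phi = np.exp(-((x[:, None] - mu[None, :]) ** 2) / (2.0 * l**2))

# Weight-space covariance Phi Sigma_w Phi^T with Sigma_w = sigma_w * dmu * I.
K_basis = (sigma_w * dmu) * (Phi @ Phi.T)

# Closed-form SE kernel with sigma_a = sqrt(pi) l sigma_w and sigma_l = sqrt(2) l.
sigma_a = np.sqrt(np.pi) * l * sigma_w
sigma_l = np.sqrt(2.0) * l
K_se = sigma_a * np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * sigma_l**2))

# One sample y drawn from the model with random values for w (noise-free).
rng = np.random.default_rng(1)
w = rng.normal(0.0, np.sqrt(sigma_w * dmu), size=mu.size)
y_sample = Phi @ w
```

With these settings `K_basis` and `K_se` agree to numerical precision, which is the kernel trick in action: the infinite set of basis functions never has to be evaluated once the kernel is known.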

Optimizing the kernel hyperparameters using gradient ascent

The denominator of Bayes' theorem is called the marginal likelihood:

$$ p(y \mid x) = \int p(y \mid x, w)\, p(w)\, dw $$

Combined with the Gaussian assumptions, this gives the log-marginal-likelihood:

$$ \log p(y \mid x) = \underbrace{-\frac{1}{2} y^T \left( K + \sigma_n^2 I \right)^{-1} y}_{\text{goodness of fit}} \; \underbrace{- \frac{1}{2} \log \left| K + \sigma_n^2 I \right|}_{\text{kernel complexity}} \; \underbrace{- \frac{n_x}{2} \log 2\pi}_{\text{size of data set}} $$

Cost function: the log-marginal-likelihood, maximized w.r.t. the hyperparameters, $\Theta$:
- One of many ways to determine $\Theta$
- Gives $\Theta$ with maximum probability to match input data
- Gradient ascent search algorithm implemented with Monte Carlo kernel restarts, to reduce the risk of settling on a local rather than the global maximum
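A rough sketch of gradient ascent on the log-marginal-likelihood with random restarts is given below. It is not the implementation used for the database: the SE kernel, the log-parametrisation `theta = log(amp, ell, sigma_n)`, the finite-difference gradient, the backtracking step rule, the bounds and the synthetic data are all assumptions chosen to keep the example short.

```python
import numpy as np

def log_marginal_likelihood(theta, x, y):
    """LML for an SE kernel; theta = log(amp, ell, sigma_n) (assumed layout)."""
    amp, ell, sigma_n = np.exp(theta)
    d = x[:, None] - x[None, :]
    K = amp * np.exp(-0.5 * (d / ell) ** 2) + sigma_n**2 * np.eye(x.size)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    fit = -0.5 * y @ alpha                      # goodness of fit
    complexity = -np.sum(np.log(np.diag(L)))    # -0.5 log|K + sigma_n^2 I|
    size = -0.5 * x.size * np.log(2.0 * np.pi)  # size of data set
    return fit + complexity + size

def numerical_gradient(f, theta, eps=1e-5):
    """Central finite-difference gradient of f at theta."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        grad[i] = (f(theta + step) - f(theta - step)) / (2.0 * eps)
    return grad

def gradient_ascent(f, theta0, lower, upper, rate=0.05, n_iter=200):
    """Ascend f from theta0, halving the step whenever f would decrease."""
    theta, best = theta0.copy(), f(theta0)
    for _ in range(n_iter):
        step = rate * numerical_gradient(f, theta)
        while np.linalg.norm(step) > 1e-8:
            trial = np.clip(theta + step, lower, upper)
            value = f(trial)
            if value >= best:
                theta, best = trial, value
                break
            step *= 0.5
        else:
            break  # no improving step found: treat as converged
    return theta, best

# Synthetic data (illustrative only).
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 20)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

lower = np.log([1e-2, 1e-2, 1e-2])
upper = np.log([1e2, 1e1, 1e0])
objective = lambda th: log_marginal_likelihood(th, x, y)

# Random restarts: keep the best local maximum found.
results = [gradient_ascent(objective, rng.uniform(lower, upper), lower, upper)
           for _ in range(8)]
theta_best, lml_best = max(results, key=lambda r: r[1])
```

Each restart can end in a different local maximum; taking the best of several restarts makes a poor local solution less likely but does not guarantee the global maximum.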

Choice of kernels used to fit experimental profiles

Rational Quadratic:

$$ k(x, x') = \sigma^2 \left( 1 + \frac{(x - x')^2}{2 \alpha l^2} \right)^{-\alpha}, \qquad \Theta = \{\sigma, \alpha, l\} $$

- Imposes required smoothness for sufficiently large $l$
- Higher degree of flexibility compared to Squared-Exponential
- Fails to provide good fits through pedestal when present

Gibbs:

$$ k(x, x') = \sigma^2 \exp\left( -\frac{(x - x')^2}{2 l^2} \right), \qquad l = l_0 - l_1 \exp\left( -\frac{(x - \mu)^2}{2 \sigma_l^2} \right), \qquad \Theta = \{\sigma, l_0, l_1, \sigma_l\} $$

- Imposes general smoothness through sufficiently large $l_0$
- Allows rough pedestal fitting by reduction of $l$ near LCFS
- Not perfectly reliable, but provides a decent estimate
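Both kernels are simple to write down in NumPy. In the sketch below the Gibbs kernel uses the textbook one-dimensional form from Rasmussen and Williams, with the length scale evaluated at both points and a normalising prefactor; it reduces to the expression on the slide when $l$ is constant. Whether the workflow uses exactly this form is an assumption, as are the function names and default values.

```python
import numpy as np

def rational_quadratic(xa, xb, sigma=1.0, alpha=1.0, ell=0.3):
    """Rational quadratic kernel matrix for 1D inputs."""
    d2 = (xa[:, None] - xb[None, :]) ** 2
    return sigma**2 * (1.0 + d2 / (2.0 * alpha * ell**2)) ** (-alpha)

def warped_length(x, l0=0.3, l1=0.25, mu=0.95, sigma_l=0.05):
    """Length scale l(x) = l0 - l1 exp(-(x - mu)^2 / (2 sigma_l^2))."""
    return l0 - l1 * np.exp(-((x - mu) ** 2) / (2.0 * sigma_l**2))

def gibbs(xa, xb, sigma=1.0, **warp):
    """Gibbs kernel with a Gaussian-warped, input-dependent length scale."""
    la = warped_length(xa, **warp)[:, None]
    lb = warped_length(xb, **warp)[None, :]
    d2 = (xa[:, None] - xb[None, :]) ** 2
    prefactor = np.sqrt(2.0 * la * lb / (la**2 + lb**2))
    return sigma**2 * prefactor * np.exp(-d2 / (la**2 + lb**2))

rho = np.linspace(0.0, 1.0, 40)
K_rq = rational_quadratic(rho, rho)
K_gibbs = gibbs(rho, rho)  # short length scale near rho = mu, long elsewhere
```

Placing `mu` near the edge shortens the length scale only there, so the fit can follow a steep pedestal while staying smooth in the core.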

Decision chart of the extraction workflow

Processing steps: Extract data from DDAs; Calculate coordinate systems; Perform GP fits; Apply sanity checks; Calculate GK parameters; Save

Decision nodes:
- EFIT data exists? N: Reject
- $n_e$, $T_e$ data exists? Y: Perform GP fits
- Jump at $\rho > 0.8$? Y: Gibbs kernel; N: Regular kernel
- $T_i$ exists? Y: Perform GP; N: $T_i = T_e$
- $n_{imp}$ exists? Y: Perform GP; N: Use $Z_{eff,est}$
- Constr. $q$ exists? Y: Use constrained $q$; N: Apply bias
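The branching in the chart can be summarised as a small decision function. This is a hypothetical sketch: the function name, argument names and returned labels are invented for illustration, and it assumes that missing $n_e$, $T_e$ data leads to rejection in the same way as missing EFIT data.

```python
def plan_extraction(has_efit, has_ne_te, edge_jump,
                    has_ti, has_nimp, has_constrained_q):
    """Return the processing choices for one time window, or None if rejected."""
    # Assumption: both missing EFIT and missing n_e / T_e data reject the window.
    if not (has_efit and has_ne_te):
        return None
    return {
        # A jump at rho > 0.8 indicates a pedestal, so use the Gibbs kernel.
        "kernel": "gibbs" if edge_jump else "regular",
        # Fit T_i if measured, otherwise fall back to T_i = T_e.
        "t_i": "gp_fit" if has_ti else "set_equal_to_t_e",
        # Fit n_imp if measured, otherwise use the estimated Z_eff.
        "n_imp": "gp_fit" if has_nimp else "use_zeff_estimate",
        # Use the constrained q if available, otherwise apply the bias.
        "q": "use_constrained" if has_constrained_q else "apply_bias",
    }

plan = plan_extraction(has_efit=True, has_ne_te=True, edge_jump=True,
                       has_ti=False, has_nimp=True, has_constrained_q=False)
```

Keeping the decisions in one place like this makes it easy to log why a given window was rejected or which fallbacks were used.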

Sample workflow output (JET #70274)

Experimental feature: Pedestal fitting using Gibbs kernel with Gaussian warping
- Allows for pedestal location estimation

Estimation of systematic coordinate mapping errors

Compared standard EFIT to EFIT constrained with Faraday rotation / MSE measurements

Figure: JET #89091
Figure: Normalized poloidal flux differences with fitted 3rd-order polynomials for outer radii (red) and inner radii (blue)

Impact of estimated bias on profiles (JET #91366)

- General result: $\delta\rho \sim 0.05$ shift towards edge
- For purposes of turbulence + transport integrated modelling, the primary concern is the shift of the boundary condition (pedestal shoulder location)
- Effect of this shift on such modelling has not yet been investigated

Implemented checks to identify high quality data for validation

1. Integrated heat deposition profiles vs. injected heating power
   - Validity check of heat deposition profiles
2. Heat deposition profiles vs. accumulated equipartition flux ($T_i \neq T_e$)
   - Validity check of fitted $T$ profiles from GP
   - Equipartition flux does not yet account for isotope of main ion (easily implementable)
   - For windows without CX, can be used to estimate how far $T_i$ can deviate from $T_e$
3. Integrated pressure ($\frac{3}{2} nT$) profiles vs. total plasma energy
   - Validity check of fitted $n$, $T$ profiles from GP

Open to additional suggestions for checks, provided they are computationally quick to perform
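As an illustration of the third check, the sketch below integrates $\frac{3}{2} nT$ over the plasma volume and compares it with a measured total energy. The function names, the unit conventions (densities in m^-3, temperatures in eV, $dV/d\rho$ in m^3), the two-species pressure and the 20% tolerance are all assumptions for the example, not values from the workflow.

```python
import numpy as np

E_CHARGE = 1.602176634e-19  # joules per eV

def thermal_energy(rho, dvol_drho, n_e, t_e, n_i, t_i):
    """Volume integral of (3/2) n T summed over electrons and ions, in joules."""
    pressure = E_CHARGE * (n_e * t_e + n_i * t_i)
    integrand = 1.5 * pressure * dvol_drho
    # Trapezoidal rule over the radial grid.
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(rho))

def passes_energy_check(w_fit, w_measured, rel_tol=0.2):
    """True if fitted and measured energies agree within rel_tol (hypothetical)."""
    return abs(w_fit - w_measured) <= rel_tol * abs(w_measured)

# Flat illustrative profiles in a plasma of total volume 80 m^3.
rho = np.linspace(0.0, 1.0, 101)
dvol_drho = 2.0 * 80.0 * rho
flat = np.ones_like(rho)
w_fit = thermal_energy(rho, dvol_drho, 1e19 * flat, 1e3 * flat,
                       1e19 * flat, 1e3 * flat)
ok = passes_energy_check(w_fit, w_measured=4.0e5)
```

A check of this kind costs one quadrature per time window, which keeps it well within the requirement that checks be computationally quick.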

Summary and next steps

Summary:
- Gaussian process regression workflow for mass automated profile fitting (JET data) for extracting parameter subspace for GK runs
- Estimated bias between unconstrained and constrained EFIT (using Faraday rotation and/or MSE) for boundary condition modification in integrated modelling
- Basic sanity checks applied to identify high quality data set, used to test performance of future NN within integrated modelling suites

Next Steps:
- Improve checks to specify validation set for integrated modelling
- Extend workflow to include data from other machines via adapters
- Detailed analysis of JET parameter space for problem characterization / dimensionality reduction
- Perform gyrokinetic simulations (QuaLiKiz, linear GENE) to build the training set for NN

Using statistics on bias to estimate stochastic errors

In theory, x-errors can be applied by enhancing y-errors proportionally to the gradient of a model fit made without x-errors:

$$ K(x, x) + \sigma_n^2 I \;\rightarrow\; K(x, x) + \sigma_n^2 I + \nabla y \, \Sigma_x \, \nabla^T y $$

A. McHutchon and C. Rasmussen, Gaussian process training with input noise, Advances in Neural Information Processing Systems, pp , 2011

However, the resulting error bar sizes yield nonsensical fits, pointing to inappropriate error estimation
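For one-dimensional profiles with independent errors, the enhancement reduces to adding the x-error, scaled by the local slope, in quadrature to the y-error. The sketch below assumes a first-pass fit `y_fit` is already available and uses a finite-difference slope; the function name and the numbers are illustrative only.

```python
import numpy as np

def inflate_y_errors(x, y_fit, sigma_y, sigma_x):
    """Effective y-error after propagating x-errors through the fit slope."""
    dy_dx = np.gradient(y_fit, x)  # slope of the fit made without x-errors
    return np.sqrt(sigma_y**2 + (dy_dx * sigma_x) ** 2)

# Linear illustrative fit with slope 2.
x = np.linspace(0.0, 1.0, 11)
y_fit = 2.0 * x
sigma_eff = inflate_y_errors(x, y_fit,
                             sigma_y=np.full(x.size, 0.3),
                             sigma_x=np.full(x.size, 0.2))
```

The inflated errors would then replace $\sigma_n$ in a second fit, so steep regions such as the pedestal receive the largest enhancement.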
