Control-oriented model learning with a recurrent neural network



1 Control-oriented model learning with a recurrent neural network. M. A. Bucci, O. Semeraro, A. Allauzen, L. Cordier, G. Wisniewski, L. Mathelin. 20 November 2018, APS Atlanta.


2 Kuramoto-Sivashinsky (KS): $\partial_t u = -\nabla^4 u - \nabla^2 u - \frac{1}{2} |\nabla u|^2$. It models diffusive instabilities in a flame front. The solution is steady or chaotic; the length of the domain, L, is the critical parameter. Numerically solved with L = 22; Fourier spatial discretization with 64 points; implicit time-marching scheme with dt = [value missing in source]. Figure (axis labels: time, L).
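Why the domain length is the critical parameter can be seen from the linearization about u = 0: a Fourier mode exp(ikx) grows at the rate k^2 - k^4, and a periodic domain of length L only admits k = 2*pi*n/L. The sketch below counts the linearly unstable modes; the function names and the cutoff n_max = 32 (the Nyquist mode for 64 points) are assumptions made for illustration, not taken from the slides.

```python
import math


def growth_rate(k):
    """Linear growth rate of the mode exp(i k x) for u_t = -u_xxxx - u_xx - (u_x)^2 / 2."""
    return k**2 - k**4


def unstable_modes(L, n_max=32):
    """Mode numbers n whose wavenumber 2*pi*n/L has a positive linear growth rate."""
    modes = []
    for n in range(1, n_max + 1):
        k = 2.0 * math.pi * n / L
        if growth_rate(k) > 0.0:
            modes.append(n)
    return modes


if __name__ == "__main__":
    for L in (6.0, 22.0):
        print(L, unstable_modes(L))
```

For L = 22 the first three modes are unstable, while for L below 2*pi none is, so the trivial solution is linearly stable there.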

3 Model vs Model-Free. Model-based control: training a model for the dynamics allows Model Predictive Control approaches and/or an Opposite Control approach. Model-free control: Deep Reinforcement Learning algorithms (DQN, DDQN, DDPG, ...) solve the Bellman equation to maximize the objective function. The solution of the Bellman equation is a necessary and sufficient condition for the optimality of the control policy only if the whole phase space is known. A model valid in the whole phase space can be used to explore the phase space effectively. Example: Kuramoto-Sivashinsky control with DDPG, driving the KS system from the equilibrium solution E3 to E2 for L = 22.
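For reference, the Bellman optimality equation that these algorithms approximate can be written for the action-value function as below. The notation is an assumption, since the slide does not define it: s is the state, a the action, r the reward, s' the next state, and \gamma a discount factor in [0, 1).

```latex
Q^{*}(s, a) = \mathbb{E}\left[\, r(s, a) + \gamma \max_{a'} Q^{*}(s', a') \,\right]
```

The maximization over a' and the expectation over s' both range over the states the system can reach, which is why the optimality statement requires knowledge of the whole phase space.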


8 Long time horizon prediction. Neural network models are extremely powerful for forecasting chaotic dynamics [1, 2, 3]. Figure: prediction of Kuramoto-Sivashinsky chaotic dynamics; picture from [1].
Recurrent Neural Network (LSTM), with memory state $c_t$ and input data $x_t$ (diagram legend: layer, pointwise operation, copy):
Forget gate: $F_t = \sigma(x_t U_f + h_{t-1} W_f)$
Update gate: $\tilde{C}_t = \tanh(x_t U_c + h_{t-1} W_c)$, $I_t = \sigma(x_t U_i + h_{t-1} W_i)$, $c_t = F_t \odot c_{t-1} + I_t \odot \tilde{C}_t$
Output gate: $O_t = \sigma(x_t U_o + h_{t-1} W_o)$, $h_t = O_t \odot \tanh(c_t)$
[1] Pathak, Jaideep, et al. (2018). [2] Vlachas, Pantelis R., et al. (2018). [3] Pathak, Jaideep, et al. (2017).
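The gate equations can be read as one update step. Below is a minimal sketch in plain Python that follows the slide's formulas literally (row vectors, no bias terms, as on the slide); the function names and the dictionary layout of the weights are hypothetical, and a real implementation would use a tensor library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(x, U, h, W):
    """Return the row vector x U + h W, with U of shape (n_in, n_hid) and W (n_hid, n_hid)."""
    n_hid = len(W[0])
    return [
        sum(x[i] * U[i][j] for i in range(len(x)))
        + sum(h[k] * W[k][j] for k in range(len(h)))
        for j in range(n_hid)
    ]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM update; p maps 'U_f', 'W_f', 'U_c', ... to weight matrices (hypothetical layout)."""
    F = [sigmoid(z) for z in affine(x, p["U_f"], h_prev, p["W_f"])]          # forget gate
    C_tilde = [math.tanh(z) for z in affine(x, p["U_c"], h_prev, p["W_c"])]  # candidate memory
    I = [sigmoid(z) for z in affine(x, p["U_i"], h_prev, p["W_i"])]          # update gate
    O = [sigmoid(z) for z in affine(x, p["U_o"], h_prev, p["W_o"])]          # output gate
    c = [f * cp + i * ct for f, cp, i, ct in zip(F, c_prev, I, C_tilde)]     # new memory state
    h = [o * math.tanh(cv) for o, cv in zip(O, c)]                           # new output
    return h, c
```

With all weights set to zero every gate equals 0.5 and the candidate memory is zero, so the step simply halves the memory state; this makes a convenient sanity check.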

16 KS predictability. Input: $u_n$, $c_n$. Output: $u_{n+1}$, $c_{n+1}$. Architecture: 2 LSTM layers with 256 neurons, followed by 1 linear layer with 64 neurons. With few data, the NN can predict the solution for a long period. Prediction from an unseen initial condition fails. Artificial stable solutions might arise. Spurious correlations can be obtained even in the presence of a large dataset if the data are taken along poorly chosen trajectories. Figure panels: training and prediction; initialize memory, then prediction.
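The two phases named in the figure, "initialize memory" and "prediction", can be sketched as a closed-loop forecast: the recurrent state is first warmed up on known data, then the model's own output is fed back as its next input. The `step` callable and its signature are hypothetical stand-ins for the trained network; the slides do not give the actual code.

```python
def closed_loop_forecast(step, init_state, warmup, n_steps):
    """Warm up the memory on known inputs, then predict autoregressively.

    step(x, state) -> (x_next_pred, new_state) is a hypothetical one-step model.
    """
    if not warmup:
        raise ValueError("warmup must contain at least one input")
    state = init_state
    pred = None
    for x in warmup:              # "initialize memory": inputs come from data
        pred, state = step(x, state)
    forecast = []
    for _ in range(n_steps):      # "prediction": inputs are previous outputs
        forecast.append(pred)
        pred, state = step(pred, state)
    return forecast
```

For example, with the toy model `step = lambda x, s: (2 * x, s + 1)`, the call `closed_loop_forecast(step, 0, [1.0], 3)` returns `[2.0, 4.0, 8.0]`.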

21 Robust learning: open questions. The propagator f of the K-S system is approximated by an LSTM neural network architecture: $x_{n+1} = f(x_n)$, $\hat{x}_{n+1} = \mathrm{LSTM}(x_n)$. NN training: $\mathcal{L} = \min \lVert x_{n+1} - \hat{x}_{n+1} \rVert$. To verify: $f \approx \mathrm{LSTM}$? The neural network training minimizes the distance between the true chaotic trajectory and the predicted one. Is this procedure enough to recover a model that is statistically representative of the KS system? What is the discriminant information that the NN learns during training? Can we introduce deterministic information in the data to achieve a statistically correct model?
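One reason a trajectory-distance loss is hard to interpret for chaotic systems is that a tiny one-step error is amplified exponentially along a rollout. The toy sketch below uses the logistic map as a stand-in for the propagator f and a slightly perturbed copy as a stand-in for the learned model; both choices are assumptions made purely for illustration and are not the KS or LSTM setup of the slides.

```python
def logistic(x, r=4.0):
    """Chaotic toy propagator on [0, 1]."""
    return r * x * (1.0 - x)


def compare(x0=0.2, n_steps=100, dr=1e-6):
    """Return (largest one-step error, largest rollout error) of a perturbed model."""
    x_true, x_model = x0, x0
    one_step_err, rollout_err = 0.0, 0.0
    for _ in range(n_steps):
        # one-step error: both maps start from the same true state
        err = abs(logistic(x_true) - logistic(x_true, 4.0 - dr))
        one_step_err = max(one_step_err, err)
        # rollout error: each map is iterated on its own output
        x_true = logistic(x_true)
        x_model = logistic(x_model, 4.0 - dr)
        rollout_err = max(rollout_err, abs(x_true - x_model))
    return one_step_err, rollout_err


if __name__ == "__main__":
    print(compare())
```

The one-step error stays below 1e-6 while the rollout error becomes of order one, which is why statistical quantities such as the correlation dimension are a more meaningful check of the model than pointwise agreement.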

23 Theoretical amount of data and ergodic measurement. A chaotic system is well characterized by the correlation dimension $D_2$, obtained from the correlation sum
$C(m, \varepsilon) = \frac{2}{(N-m)(N-m+1)} \sum_{i=m}^{N} \sum_{j=i+1}^{N} \Phi(\varepsilon - \lVert X_i - X_j \rVert)$
Grassberger-Procaccia (1987): $C(m, \varepsilon) \propto \varepsilon^{D_2}$.
Minimum data to converge $D_2$:
$N > (D/\varepsilon)^{D_2/2}$, Eckmann & Ruelle (1991)
$N > 2(D_2 + 1)^{D_2}$, Essex (1991)
$N > \left[ \frac{R(2-Q)}{2(1-Q)} \right]^{2D_2+1}$, Baker & Gollub (1996)
Lorenz system: $\dot{X} = \sigma(Y - X)$, $\dot{Y} = -XZ + \rho X - Y$, $\dot{Z} = XY - \beta Z$, with $\sigma = 10$, $\beta = 8/3$, $\rho = 28$, for which $D_2 = 2.06$. Parameters: dt = 0.01, m = 5, τ = 29, N = [value missing in source].
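The correlation sum and the resulting slope estimate of D_2 can be sketched in a few lines. This is a brute-force O(N^2) version that takes the already embedded vectors X_i as input; the function names and the two-radius slope estimate are assumptions for illustration, not the procedure used for the slides.

```python
import math


def correlation_sum(points, eps):
    """Fraction of distinct pairs of points closer than eps."""
    n = len(points)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < eps:
                count += 1
    return 2.0 * count / (n * (n - 1))


def d2_slope(points, eps_small, eps_large):
    """Slope of log C(eps) against log eps between two radii."""
    c_small = correlation_sum(points, eps_small)
    c_large = correlation_sum(points, eps_large)
    return math.log(c_large / c_small) / math.log(eps_large / eps_small)


if __name__ == "__main__":
    # Points on a straight segment: the estimate should be close to 1.
    line = [(i / 400.0, 0.0) for i in range(400)]
    print(d2_slope(line, 0.01, 0.1))
```

In practice the slope is fitted over a range of radii where log C is linear in log eps, and the data bounds listed above indicate how many points that fit needs.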

30 Choice of the data. 3 datasets to train the LSTM neural network model: (1) 1 trajectory on the chaotic attractor, N = [value missing in source]; (2) 9 trajectories randomly initialized on the chaotic attractor, N = 3000; (3) 9 trajectories from fixed points, N = 3000. Fixed points: $E_0 = (0, 0, 0)$, $E_{1,2} = (\pm\sqrt{\beta(\rho-1)}, \pm\sqrt{\beta(\rho-1)}, \rho-1)$. Kawahara G., Uhlmann M., & Van Veen, L. (2012). The significance of simple invariant solutions in turbulent flows. Annual Review of Fluid Mechanics, 44,
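The three fixed points follow from setting the right-hand side of the Lorenz system to zero, which the sketch below checks numerically. The seeding rule in `perturbed_starts` (points on a small circle around each fixed point, three per fixed point to give nine trajectories) is a hypothetical choice; the slides do not say how the trajectories are started.

```python
import math

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz_rhs(state):
    """Right-hand side of the Lorenz system."""
    x, y, z = state
    return (SIGMA * (y - x), -x * z + RHO * x - y, x * y - BETA * z)


def fixed_points():
    """E0 and the symmetric pair E1, E2."""
    a = math.sqrt(BETA * (RHO - 1.0))
    return [(0.0, 0.0, 0.0), (a, a, RHO - 1.0), (-a, -a, RHO - 1.0)]


def perturbed_starts(point, n, radius=1e-3):
    """n initial conditions on a small circle around a fixed point (hypothetical rule)."""
    return [
        (
            point[0] + radius * math.cos(2.0 * math.pi * k / n),
            point[1] + radius * math.sin(2.0 * math.pi * k / n),
            point[2],
        )
        for k in range(n)
    ]


if __name__ == "__main__":
    for p in fixed_points():
        print(p, lorenz_rhs(p))
```

Each start can then be integrated forward with any time stepper to produce the training trajectories.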

31 Learning with different strategies. Computational time 80% cheaper than the standard strategy. Estimated correlation dimension in the three panels: $D_2 = 2.06 \pm 0.1$, $D_2 = 0.13 \pm 0.81$, $D_2 = 2.17 \pm 0.09$. Figure: for each panel, $C(\varepsilon)$ versus $\varepsilon$ and $Z_{n+1}$ versus $Z_n$.


32 KS results. The training dataset is composed of 64 trajectories leaving from each invariant solution of KS with L = 22: 4 equilibrium states plus 2 traveling waves, plus their symmetries (Cvitanović, P., Davidchack, R. L., & Siminos, E., 2010). Figure panels: KS, LSTM, Error. Architecture: $x_n$ → CNN → LSTM → CNN → $x_{n+1}$. The instantaneous solution is embedded by a CNN to take into account the spatial periodicity of the solution. The resulting model can be used starting from any initial condition without a large loss of the statistical properties of the dynamics.
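One common way for a convolutional layer to respect spatial periodicity is circular padding, where the stencil wraps around the ends of the domain. The sketch below assumes that mechanism, which the slides do not confirm, and shows the property it buys: shifting the input field shifts the output by the same amount.

```python
def circular_conv1d(u, kernel):
    """1D convolution with periodic wrap-around; the output has len(u) samples."""
    n = len(u)
    half = len(kernel) // 2
    return [
        sum(kernel[j] * u[(i + j - half) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def roll(u, shift):
    """Cyclically shift a list to the right by `shift` positions."""
    shift %= len(u)
    return u[-shift:] + u[:-shift] if shift else list(u)


if __name__ == "__main__":
    u = [float(i * i) for i in range(8)]
    stencil = [1.0, -2.0, 1.0]  # second-difference stencil as an example kernel
    print(circular_conv1d(roll(u, 3), stencil) == roll(circular_conv1d(u, stencil), 3))
```

Without the wrap-around, the outputs near the two ends of the domain would depend on an arbitrary padding value rather than on the neighbouring points of the periodic solution.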

33 Conclusions. The LSTM architecture is useful for extrapolating dynamics from a chaotic trajectory. A model trained on just one chaotic trajectory is not valid for predicting the dynamics along an unseen chaotic trajectory. A statistically correct model can be recovered if a physically informed dataset is used for the training. The neural network approximation makes it possible to design control policies for non-linear systems using reinforcement learning. Acknowledgment: ANR/DGA Flowcon project, ANR-17-ASTR-0022.
