Deep Learning for Partial Differential Equations (PDEs)

Kailai Xu (kailaix@stanford.edu)    Bella Shi (bshi@stanford.edu)    Shuyi Yin (syin3@stanford.edu)

CS230: Deep Learning, Winter 2018, Stanford University, CA. (LaTeX template borrowed from NIPS 2017.)

Abstract

Partial differential equations (PDEs) are used throughout science and engineering, but solving PDEs with traditional methods in high dimensions suffers from the curse of dimensionality. The newly emerging deep learning techniques are promising for resolving this problem because of their success in many other high-dimensional problems. In this project we derive and propose a coupled deep neural network for solving the Laplace problem in two and higher dimensions. Numerical results show that the model is efficient for problems of modest dimension. We also show the current limitations of the algorithm in high dimensions and propose an explanation.

1 Introduction & Related Work

Solving partial differential equations (PDEs) numerically is among the most computation-intensive aspects of engineering and scientific applications. Recently, deep learning has emerged as a powerful technique in many applications, and this success inspires us to apply it to solving PDEs. One of its main features is that it can represent complex-shaped functions effectively, whereas traditional finite basis-function representations may require a large number of parameters or sophisticated bases. It can also be treated as a black-box approach for solving PDEs, and may therefore serve as a first-to-try method. The biggest challenge is that the topic is rather new and there is little literature to refer to (for some examples, see [2]-[7]); whether it can work reasonably well remains to be explored.

I.E. Lagaris et al. [2] already applied artificial neural networks to solving PDEs. Limited by computational resources, they used only a single hidden layer and modeled the solution u(x, t) by a neural network $\hat u(x, t; \theta)$. Minimizing the residual loss

    $L(\theta) = \big(\mathcal{L}\hat u(x, t; \theta) - f\big)^2$    (1)

leads to a very accurate solution (up to 7 digits) of

    $\mathcal{L}u = f,$    (2)

where $\mathcal{L}$ is some differential operator. Since 2017, many authors have begun to apply deep neural networks to solving PDEs. Raissi et al. [3] consider

    $u_t = \mathcal{N}(t, x, u, u_x, u_{xx}, \ldots), \qquad f := u_t - \mathcal{N}(t, x, u, u_x, u_{xx}, \ldots),$    (3)

where u(x, t) is also represented by a neural network. The parameters of the network can be learned by minimizing

    $\sum_{i=1}^{N} \big( |u(t^i, x^i) - u^i|^2 + |f(t^i, x^i)|^2 \big),$    (4)

where the term $|u(t^i, x^i) - u^i|^2$ fits the data on the boundary and $|f(t^i, x^i)|^2$ minimizes the PDE residual at the collocation points.
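As an illustration of the residual loss (1) for the Poisson case $\mathcal{L} = \Delta$, the sketch below evaluates $(\Delta\hat u - f)^2$ on a batch of collocation points using automatic differentiation. This is only a minimal sketch under our own assumptions (a PyTorch implementation, a small tanh network, and a zero right-hand side); the report does not prescribe these details.

```python
import torch
import torch.nn as nn

def mlp(d_in=2, width=64, depth=3):
    """Small fully connected network representing u_hat(x; theta)."""
    layers, d = [], d_in
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.Tanh()]
        d = width
    return nn.Sequential(*layers, nn.Linear(d, 1))

def laplacian(model, x):
    """Sum of second derivatives of model(x) w.r.t. each input coordinate, via autodiff."""
    x = x.clone().requires_grad_(True)
    u = model(x)
    (grad,) = torch.autograd.grad(u.sum(), x, create_graph=True)
    lap = torch.zeros(x.shape[0])
    for k in range(x.shape[1]):
        (g2,) = torch.autograd.grad(grad[:, k].sum(), x, create_graph=True)
        lap = lap + g2[:, k]
    return lap

# Residual loss (1) for Delta u = f on a batch of collocation points in the unit square.
model = mlp()
x = torch.rand(128, 2)           # collocation points
f = torch.zeros(128)             # hypothetical right-hand side
loss = ((laplacian(model, x) - f) ** 2).mean()
loss.backward()                  # gradients w.r.t. theta, ready for an optimizer step
```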

2 Dataset and Features

The data set is generated randomly, both on the boundary and in the interior of the domain. For every boundary-network update (Line 3 of the training loop), we randomly generate $\{(x_i, y_i)\}_{i=1}^M$ with $(x_i, y_i) \in \partial\Omega$ and compute $g_i = g_D(x_i, y_i)$; the data are then fed into the boundary loss to compute the loss and its derivatives. For every PDE-network update (Line 4), we randomly generate $\{(x_i, y_i)\}_{i=1}^N$ with $(x_i, y_i) \in \Omega$ and compute $f_i = f(x_i, y_i)$; the data are then fed into the interior (PDE-residual) loss to compute the loss and its derivatives.

Figure 1: Data generation for training the neural network. We sample randomly from the interior of the domain $\Omega$ and from its boundary $\partial\Omega$; one marker type shows the sampling points within the domain and the other shows the sampling points on the boundary.

3 Approach

Our main focus is elliptic PDEs, which in general form read

    $\dfrac{\partial}{\partial x}\Big(p(x, y)\,\dfrac{\partial u}{\partial x}\Big) + \dfrac{\partial}{\partial y}\Big(q(x, y)\,\dfrac{\partial u}{\partial y}\Big) + r(x, y)\,u = f(x, y)$  in $\Omega$,
    $u = g_D(x, y)$  on $\partial\Omega_D$,
    $p(x, y)\,\dfrac{\partial u}{\partial x}\,n_x + q(x, y)\,\dfrac{\partial u}{\partial y}\,n_y + c(x, y)\,u = g_N(x, y)$  on $\partial\Omega_N$.    (5)

In the case $p = q \equiv 1$, $r \equiv 0$, $\partial\Omega_D = \partial\Omega$, we obtain the Poisson equation

    $\Delta u = f$  in $\Omega$,  $u = g_D$  on $\partial\Omega$.    (6)

In the case f = 0, we obtain the Laplace equation. Our algorithm for solving the Poisson equation (6) is as follows: we approximate u with

    $u(x, y; w_1, w_2) = A(x, y; w_1) + B(x, y)\,N(x, y; w_2),$

where $A(x, y; w_1)$ is a neural network that approximates the boundary condition,

    $A(x, y; w_1)\big|_{\partial\Omega} \approx g_D,$    (7)

and where $B(x, y)$ satisfies $B(x, y)\big|_{\partial\Omega} \equiv 0$. The structure of our network is shown in Fig. 2; a code sketch of the sampling and of this ansatz is given below.

4 Experiments/Results/Discussion

To demonstrate the effectiveness of our method, we applied the method of manufactured solutions (MMS): we construct equations with known analytic solutions and then compare the numerical solutions against them. We consider problems on the unit square $\Omega = [0, 1]^2$, and the reference solution is computed on a uniform 50 x 50 grid. The tests cover well-behaved solutions, solutions with peaks, and some less regular solutions, all of them manufactured.
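To make the sampling of Section 2 and the ansatz of Section 3 concrete, here is a minimal sketch on the unit square. The report does not specify the cutoff function B or the framework; the choice $B(x, y) = x(1-x)y(1-y)$, which vanishes on $\partial\Omega$, and the PyTorch framing are our assumptions.

```python
import torch
import torch.nn as nn

def sample_interior(m):
    """{(x_i, y_i)} drawn uniformly from the interior of Omega = (0, 1)^2."""
    return torch.rand(m, 2)

def sample_boundary(m):
    """{(x_i, y_i)} drawn uniformly from the boundary of the unit square."""
    t = torch.rand(m, 1)
    side = torch.randint(0, 4, (m, 1))
    x = torch.where(side == 0, torch.zeros_like(t),
        torch.where(side == 1, torch.ones_like(t), t))
    y = torch.where(side == 2, torch.zeros_like(t),
        torch.where(side == 3, torch.ones_like(t), t))
    return torch.cat([x, y], dim=1)

def mlp(width=64, depth=3):
    """Small fully connected network, used for both A(.; w1) and N(.; w2)."""
    layers, d = [], 2
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.Tanh()]
        d = width
    return nn.Sequential(*layers, nn.Linear(d, 1))

A = mlp()   # boundary network: trained so that A ~ g_D on the boundary
N = mlp()   # PDE network: corrects the interior

def B(xy):
    """Assumed cutoff that vanishes on the boundary of [0, 1]^2."""
    x, y = xy[:, 0:1], xy[:, 1:2]
    return x * (1 - x) * y * (1 - y)

def u(xy):
    """Coupled ansatz u = A + B * N; equals A on the boundary by construction."""
    return A(xy) + B(xy) * N(xy)

print(B(sample_boundary(5)))   # exactly zero: boundary values come from A alone
```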

Figure 2: The structure of our network. We proposed this idea inspired by GANs: we update the boundary network several times before updating the PDE network once. Our loss function is made up of two components, a boundary loss and an interior loss.

4.1 Example 1: Well-behaved Solution

Consider the analytical solution

    $u(x, y) = \sin \pi x \, \sin \pi y, \quad (x, y) \in [0, 1]^2;$    (8)

then we have

    $f(x, y) = \Delta u(x, y) = -2\pi^2 \sin \pi x \, \sin \pi y,$    (9)

and the boundary condition is the zero boundary condition. The hyper-parameters of the network are $\varepsilon = 10^{-5}$, $L = 3$, $n^{[1]} = n^{[2]} = n^{[3]} = 64$. In the figures below, the red profile shows the exact solution while the green profile shows the neural network approximation. Figure 3 (a) shows the initial profile of the neural network approximation, while Fig. 3 (c) shows that after 300 iterations it is nearly impossible to distinguish the exact solution from the numerical approximation.

Figure 3: Evolution of the neural network approximation. At 300 iterations it is hard to distinguish the numerical solution from the exact solution by eye.
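The self-contained sketch below puts the pieces together for Example 1: a GAN-style loop that performs several boundary-network updates per PDE-network update, as described in the Figure 2 caption. The cutoff B, the 5:1 update ratio, the Adam optimizer, and the learning rates are our assumptions; the report specifies only $\varepsilon = 10^{-5}$ and three hidden layers of width 64.

```python
import math
import torch
import torch.nn as nn

def mlp(width=64, depth=3):
    layers, d = [], 2
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.Tanh()]
        d = width
    return nn.Sequential(*layers, nn.Linear(d, 1))

A, N = mlp(), mlp()                          # boundary network and PDE network
opt_A = torch.optim.Adam(A.parameters(), lr=1e-3)
opt_N = torch.optim.Adam(N.parameters(), lr=1e-3)

B = lambda p: p[:, 0:1] * (1 - p[:, 0:1]) * p[:, 1:2] * (1 - p[:, 1:2])  # vanishes on the boundary
u = lambda p: A(p) + B(p) * N(p)             # coupled ansatz

g_D = lambda p: torch.zeros(len(p), 1)       # zero boundary data for Example 1
f = lambda p: -2 * math.pi**2 * torch.sin(math.pi * p[:, 0:1]) * torch.sin(math.pi * p[:, 1:2])

def sample_boundary(m):
    t, side = torch.rand(m, 1), torch.randint(0, 4, (m, 1))
    x = torch.where(side == 0, torch.zeros_like(t), torch.where(side == 1, torch.ones_like(t), t))
    y = torch.where(side == 2, torch.zeros_like(t), torch.where(side == 3, torch.ones_like(t), t))
    return torch.cat([x, y], dim=1)

def laplacian(func, p):
    p = p.clone().requires_grad_(True)
    (g,) = torch.autograd.grad(func(p).sum(), p, create_graph=True)
    return sum(torch.autograd.grad(g[:, k].sum(), p, create_graph=True)[0][:, k:k+1] for k in range(2))

eps = 1e-5                                   # early-termination threshold for the boundary loss
for it in range(300):
    for _ in range(5):                       # several boundary updates per outer iteration (assumed ratio)
        xb = sample_boundary(128)
        L_b = ((A(xb) - g_D(xb)) ** 2).mean()
        if L_b.item() < eps:                 # early termination once the threshold is reached
            break
        opt_A.zero_grad(); L_b.backward(); opt_A.step()
    xi = torch.rand(128, 2)                  # interior collocation points
    L_i = ((laplacian(u, xi) - f(xi)) ** 2).mean()
    opt_N.zero_grad(); L_i.backward(); opt_N.step()
```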

Figure 4: Loss and error of the neural network. For $L_b$ we adopted an early-termination strategy, so the error stays around $10^{-5}$ once it reaches the threshold. The PDE loss $L_i$ and the $L_2$ error of the approximation keep decreasing, indicating the effectiveness of the algorithm.

4.2 Example 2: Solutions with a Peak

Consider the manufactured solution

    $u(x, y) = \exp[-1000(x - 0.5)^2 - 1000(y - 0.5)^2] + v(x, y),$    (10)

where $v(x, y)$ is a well-behaved solution, such as the one in Section 4.1. Figure 5 shows the plot of $\exp[-1000(x - 0.5)^2 - 1000(y - 0.5)^2]$. We classify this kind of function as ill-behaved because of its dramatic change over a small area, which leads to a very large Laplacian. This makes it difficult for the neural network approximation to work: the gradient seen by the optimizer can be very large, and a generic step size pushes the solution toward badly behaved ones. As an illustration, let $v(x, y) = \sin(\pi x)$. If we do not perform the singularity subtraction and run the neural network program directly, we obtain Fig. 5: the numerical solution diverges.

Figure 5: Plot of $\exp[-1000(x - 0.5)^2 - 1000(y - 0.5)^2]$, which has a peak at (0.5, 0.5), and the evolution of the approximation across iterations. This extreme behavior poses difficulty for the neural network approximation, since the Laplacian around (0.5, 0.5) changes dramatically and can be very large. The solution converges on the boundary after several iterations, but because of the peak it diverges within the domain.

4.3 Example 3: Less Regular Solutions

Consider the manufactured solution

    $u(x, y) = y^{0.6};$    (11)

the derivative of the solution goes to infinity as $y \to 0$, which makes the loss function hard to optimize. Meanwhile, we notice that for any interval $[\delta, 1]$, $\delta > 0$, the derivatives of $u(x, y)$ on $[0, 1] \times [\delta, 1]$ are bounded; that is to say, only the derivatives near $y = 0$ cause trouble for the optimization. Figure 6 shows the profile of the neural network approximation (see Section 4.1 for a description of the surfaces and notation). The convergence is less satisfactory than in Section 4.1. Figure 7 shows the loss and error with respect to the training iterations. The boundary loss did not reach the threshold $\varepsilon$ within 1000 iterations, and both the loss and the error oscillate across the iterations, although they possess decreasing trends. This shows the difficulty of dealing with less regular solutions. Even at 1000 iterations we can still observe an obvious mismatch between the exact solution and the numerical solution (Fig. 6).

Figure 6: Numerical solution at iteration 1000. Near $y = 0$, where the derivative $\partial u / \partial y$ blows up, there is a clear mismatch (distortion) between the network approximation and the exact solution.

Figure 7: Loss and error of the neural network. The boundary loss did not reach the threshold $\varepsilon$ within 1000 iterations. Both the loss and the error show oscillatory behavior across the iterations, although they possess decreasing trends.
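The manufactured solutions of Examples 2 and 3 can be generated symbolically; the sketch below (our own illustration, using SymPy) computes $f = \Delta u$ for the peak solution (10) and shows why the peak is hard: the Laplacian at (0.5, 0.5) is on the order of $4 \times 10^3$, far larger than for the well-behaved solution of Example 1.

```python
import sympy as sp

x, y = sp.symbols("x y")

# Peak solution (10) with the well-behaved part v(x, y) = sin(pi x), as in the text.
u_peak = sp.exp(-1000 * (x - sp.Rational(1, 2))**2
                - 1000 * (y - sp.Rational(1, 2))**2) + sp.sin(sp.pi * x)
f_peak = sp.diff(u_peak, x, 2) + sp.diff(u_peak, y, 2)   # f = Laplacian(u)

print(float(f_peak.subs({x: 0.5, y: 0.5})))   # about -4009.9: very large residual scale at the peak
print(float(f_peak.subs({x: 0.0, y: 0.0})))   # essentially 0 away from the peak

# Less regular solution (11): u = y**0.6, whose y-derivative blows up as y -> 0+.
u_irr = y ** sp.Rational(3, 5)
print(sp.diff(u_irr, y))                      # 3/(5*y**(2/5)) -> infinity as y -> 0
```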

4.4 Higher Dimensions

After our trial on the 2D problem, we further tried 5D, 7D, 8D and 10D problems, varying the number of layers in each case. From the plots presented, we observe that when the dimension is small, increasing the number of layers does not improve the ultimate performance of the approximation; the convergence speed, however, increases when there are more layers in the network. This holds for (a) and (b), where the dimensions are 2 and 5. Our approximation in high dimensions is, surprisingly, constrained by computational power. In the 7D approximation all settings show promise, as the $L_2$ loss keeps declining, but we do not observe convergence because the training iterations are not sufficient; to converge, many more steps would be needed. For the 10D approximation, either the approximation does not help at all, or convergence takes so long that we do not have enough iterations to observe it. This result is reasonable: for a high-dimensional problem the solution itself is complex and difficult to approximate, and our network may simply not have enough parameters for the task. We do not have space to plot the 8D and 10D approximations; the process runs so long, even on AWS, that it almost looks as though the loss is not being reduced at all.

Figure 8: Performance of the approximation for domains of different dimensions. In low-dimensional problems convergence is fast, and the more layers the neural network has, the faster the convergence; the number of layers does not affect the ultimate performance, measured in $L_2$ loss. The higher the dimension, the more difficult it is to observe convergence, i.e. the more iterations the approximation requires. Notice in (c) that for the 7D approximation we cannot see convergence within 30K iterations, although in general the loss is being reduced.
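The report does not spell out the d-dimensional implementation, so the sketch below is our assumption of how the 2D sampling and Laplacian generalize to the unit hypercube $[0, 1]^d$: interior points are uniform in the cube, boundary points fix one randomly chosen coordinate at 0 or 1, and the Laplacian becomes a sum of d second derivatives, so each interior update costs d extra backward passes.

```python
import torch

def sample_interior(n, d):
    """Uniform samples in the open unit hypercube (0, 1)^d."""
    return torch.rand(n, d)

def sample_boundary(n, d):
    """Samples on the boundary of [0, 1]^d: pick one coordinate and clamp it to 0 or 1."""
    x = torch.rand(n, d)
    idx = torch.randint(0, d, (n,))
    face = torch.randint(0, 2, (n,)).float()
    x[torch.arange(n), idx] = face
    return x

def laplacian(u, x):
    """Sum of the d second derivatives of u at each sample; costs d backward passes."""
    x = x.clone().requires_grad_(True)
    (g,) = torch.autograd.grad(u(x).sum(), x, create_graph=True)
    return sum(torch.autograd.grad(g[:, k].sum(), x, create_graph=True)[0][:, k]
               for k in range(x.shape[1]))

# Example: PDE residual of a trial network on 10D collocation points.
d = 10
net = torch.nn.Sequential(torch.nn.Linear(d, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
res = laplacian(net, sample_interior(256, d))   # shape (256,), one residual per point
```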

5 Conclusion/Future Work

In this project we proposed a novel deep learning approach to solve the Laplace/Poisson problem

    $\Delta u = f$  in $\Omega$,  $u = g_D$  on $\partial\Omega$.    (12)

We also showed results for different kinds of solutions and discussed the high-dimensional case. In the future, we will generalize the results to other types of PDEs, and also investigate algorithms for ill-behaved solutions, such as those with peaks, exploding gradients, or oscillations.

6 Contributions

Kailai worked on the mathematical formulation of the methods, while Bella and Shuyi worked on model tuning and realization. The source code is available at https://github.com/kailaix/nnpde.

References

[1] Wang, Zixuan, et al. Sparse grid discontinuous Galerkin methods for high-dimensional elliptic equations. Journal of Computational Physics 314 (2016): 244-263.
[2] I.E. Lagaris, A. Likas and D.I. Fotiadis. Artificial Neural Networks for Solving Ordinary and Partial Differential Equations, 1997.
[3] Maziar Raissi. Deep Hidden Physics Models: Deep Learning of Nonlinear Partial Differential Equations, 2018.
[4] Justin Sirignano and Konstantinos Spiliopoulos. DGM: A Deep Learning Algorithm for Solving Partial Differential Equations, 2018.
[5] Weinan E, Jiequn Han, and Arnulf Jentzen. Deep Learning-Based Numerical Methods for High-Dimensional Parabolic Partial Differential Equations and Backward Stochastic Differential Equations, 2017.
[6] Zichao Long, Yiping Lu, Xianzhong Ma, and Bin Dong. PDE-Net: Learning PDEs from Data, 2018.
[7] Modjtaba Baymani, Asghar Kerayechian, Sohrab Effati. Artificial Neural Networks Approach for Solving Stokes Problem. Applied Mathematics, 2010.
[8] M.M. Chiaramonte and M. Kiener. Solving Differential Equations Using Neural Networks.
[9] Peng, Shige, and Falei Wang. BSDE, path-dependent PDE and nonlinear Feynman-Kac formula. Science China Mathematics 59.1 (2016): 19-36.