Neural Network Tutorial & Application in Nuclear Physics. Weiguang Jiang ( 蒋炜光 ) UTK / ORNL


Machine Learning: Logistic Regression, Gaussian Processes, Neural Networks, Support Vector Machines, Random Forests, Genetic Algorithms.

Machine Learning: establish a function. Image recognition: f( ) = Panda. Playing Go: f( ) = 3-4 (next move). Extrapolation: f( ) = g.s. energy: -27.63 MeV.

Machine Learning Framework. Image recognition: f( ) = Panda. The model f is a complex function with lots of parameters (a black box): f_1( ) = Panda, f_1( ) = Dog; f_2( ) = Panda, f_2( ) = Cat.

Machine Learning: Supervised Learning. A set of functions: f_1( ) = Panda, f_1( ) = Dog; f_2( ) = Panda, f_2( ) = Cat. The goodness of a function f is judged on training data. Function inputs: images; function outputs: labels (bird, Timon, dog).

Machine Learning: Supervised Learning. A set of functions f_1, f_2, f_3, ... Training: judge the goodness of each function f on the training data (bird, Timon, dog) and find the best function f. Testing: use f on new inputs (dog).

Three Steps for Machine Learning. Step 1: Define a set of functions. Step 2: Evaluate the function. Step 3: Choose the best function.

Three Steps for Machine Learning Step 1: Neural Network Step 2: Evaluate the function Step 3: Choose the best function

Neural Network & Machine Learning. An example of a feedforward neural network (NN): the input layer (for a 128 x 128 x 3 color image, 49152 neurons, each a chroma value 0-255), weights, hidden layer(s), and the output layer (e.g. 0 1 0 0 0 = panda).

Feedforward Neural Network. Each neuron computes a weighted sum of the previous layer, z_1 = x_1 w_1 + ... + x_n w_n + bias, and passes it through an activation function, a = σ(z). Feedforward: the value of every neuron depends only on the previous layer.

Activation functions give non-linearity to the NN. Examples: Sigmoid, σ(z) = 1/(1 + e^(-z)), and ReLU, a = max(0, z).
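A minimal Python sketch (NumPy assumed) of the weighted sum from the previous slide and these two activation functions:

    import numpy as np

    def sigmoid(z):
        # squashes any real z into (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def relu(z):
        # keeps positive values, zeroes out negatives
        return np.maximum(0.0, z)

    # a single neuron: weighted sum of the inputs plus a bias, then activation
    x = np.array([0.5, -1.2, 3.0])   # inputs from the previous layer
    w = np.array([0.1, 0.4, -0.3])   # weights
    b = 0.2                          # bias
    z = np.dot(w, x) + b
    print(sigmoid(z), relu(z))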

Neural Network Function. Input layer (X): i neurons; one hidden layer: j neurons; output layer (Y): k neurons. Z_j = Σ_i X_i W_ij + b_j, X_j = σ(Z_j), Y_k = Σ_j X_j W_jk + b_k.

Tensor operation. Training data: p samples; input layer (X): i neurons; one hidden layer: j neurons; output layer (Y): k neurons. In matrix form, Z_(p,j) = X_(p,i) W_(i,j) + b_(p,j), where the bias row is repeated for every sample; the hidden activations are X_(p,j) = σ(Z_(p,j)); and the output is Y_(p,k) = X_(p,j) W_(j,k) + b_(p,k).
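As a rough sketch, the layer equations above in NumPy for a batch of p samples (the sizes and array names are illustrative):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    p, i, j, k = 32, 6, 10, 5            # samples, input, hidden, output sizes
    X = np.random.rand(p, i)             # input layer, shape (p, i)
    W1, b1 = np.random.randn(i, j), np.zeros(j)
    W2, b2 = np.random.randn(j, k), np.zeros(k)

    Z = X @ W1 + b1                      # Z_(p,j) = X_(p,i) W_(i,j) + b, bias broadcast over rows
    H = sigmoid(Z)                       # hidden activations X_(p,j) = sigma(Z_(p,j))
    Y = H @ W2 + b2                      # output Y_(p,k) = X_(p,j) W_(j,k) + b
    print(Y.shape)                       # (32, 5)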

Three Steps for Machine Learning Step 1: Neural Network Step 2: Evaluate the function Step 3: Choose the best function

Evaluate a network. Image recognition: introduce a loss function that describes the performance of the network (e.g. MSE, cross entropy). For an input image x the network outputs y = f(x), e.g. (0.2, 0.8), while the supervised label is ŷ = (0, 1). The total loss L = Σ_p l_p is summed over all training samples; the smaller, the better. MSE: l_p = Σ_k (y_k - ŷ_k)^2 / k.
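A small sketch of the per-sample MSE and the total loss, reusing the (0.2, 0.8) vs (0, 1) example above:

    import numpy as np

    def mse_per_sample(y_pred, y_true):
        # l_p = sum_k (y_k - yhat_k)^2 / k for one sample
        return np.mean((y_pred - y_true) ** 2)

    y_pred = np.array([0.2, 0.8])        # network output for one image
    y_true = np.array([0.0, 1.0])        # supervised label
    l_p = mse_per_sample(y_pred, y_true) # 0.04
    # total loss: L = sum of l_p over all training samples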

Three Steps for Machine Learning Step 1: Neural Network Step 2: Evaluate the function Step 3: Choose the best function

Learning: find the best function. Ultimate goal: find the set of network parameters that minimizes the total loss L. Gradient descent (used even for AlphaGo): pick an initial value for the network parameters w; compute ∂L/∂w on the training data; update the parameters w ← w - η ∂L/∂w; repeat until ∂L/∂w is small enough. This procedure is what is called machine learning.
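A minimal sketch of this loop for a single parameter, using a toy quadratic loss whose gradient is known analytically (the real ∂L/∂w comes from backpropagation, next slide):

    def loss(w):
        return (w - 3.0) ** 2            # toy loss with its minimum at w = 3

    def grad(w):
        return 2.0 * (w - 3.0)           # dL/dw for the toy loss

    w, eta = 0.0, 0.1                    # initial parameter and learning rate
    while abs(grad(w)) > 1e-6:           # repeat until dL/dw is small enough
        w = w - eta * grad(w)            # w <- w - eta * dL/dw
    print(w)                             # approaches 3.0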

Backpropagation: an efficient way to compute ∂L/∂w. With Z_j = Σ_i X_i W_ij + b_j, X_j = σ(Z_j), Y_k = Σ_j X_j W_jk + b_k and MSE l_p = Σ_k (y_k - ŷ_k)^2 / k, function signals flow forward through the network while error signals flow backward: backpropagation (BP).

Backpropagation. Output layer (Y_k = Σ_j X_j W_jk + b_k): ∂l/∂W_jk = ∂l/∂y_k · x_j and ∂l/∂b_k = ∂l/∂y_k, where for the MSE loss l_p = Σ_k (y_k - ŷ_k)^2 / k the error signal is ∂l/∂y_k = 2 (y_k - ŷ_k) / k. Hidden layer (Z_j = Σ_i X_i W_ij + b_j, X_j = σ(Z_j)): the sigmoid derivative is ∂X/∂Z = 1/(1+e^(-z)) · (1 - 1/(1+e^(-z))) = σ(z)(1 - σ(z)), and ∂l/∂W_ij = ∂l/∂z_j · x_i, ∂l/∂b_j = ∂l/∂z_j.
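A sketch of these gradients for one sample in a one-hidden-layer network with sigmoid activation and MSE loss (array names and sizes are illustrative):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.random.rand(6)                        # input
    W1, b1 = np.random.randn(6, 10), np.zeros(10)
    W2, b2 = np.random.randn(10, 2), np.zeros(2)
    y_true = np.array([0.0, 1.0])

    # forward pass (function signals)
    z = x @ W1 + b1
    h = sigmoid(z)
    y = h @ W2 + b2

    # backward pass (error signals)
    dl_dy = 2.0 / y.size * (y - y_true)          # dl/dy_k = 2 (y_k - yhat_k) / k
    dl_dW2 = np.outer(h, dl_dy)                  # dl/dW_jk = dl/dy_k * x_j
    dl_db2 = dl_dy                               # dl/db_k = dl/dy_k
    dl_dh = W2 @ dl_dy                           # error propagated to the hidden layer
    dl_dz = dl_dh * h * (1.0 - h)                # sigmoid'(z) = sigma(z) (1 - sigma(z))
    dl_dW1 = np.outer(x, dl_dz)                  # dl/dW_ij = dl/dz_j * x_i
    dl_db1 = dl_dz                               # dl/db_j = dl/dz_j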

Optimizer. Gradient descent is like walking in the desert blindfolded: you cannot see the whole picture. Variants: SGD, Momentum, AdaGrad, Adam.
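As one example, a rough sketch of the momentum update on the same toy loss as before: the velocity remembers previous gradients and smooths the walk.

    def grad(w):
        return 2.0 * (w - 3.0)           # toy gradient, minimum at w = 3

    w, v = 0.0, 0.0
    eta, beta = 0.1, 0.9                 # learning rate and momentum coefficient
    for _ in range(200):
        v = beta * v - eta * grad(w)     # accumulate a velocity from past gradients
        w = w + v                        # step along the velocity
    print(w)                             # close to 3.0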

Deep learning. Deep just means more hidden layers. Deeper is better, but not too deep. Deep vs. wide.

Deep learning: the gradient vanishing/exploding problem. With n hidden layers, where w_q is the weight of the q-th hidden layer, ∂l/∂x = ∂l/∂y · w and ∂l/∂w_q = ∂l/∂y · w_n w_(n-1) w_(n-2) ... w_(q+1) · x, so the gradient of an early layer contains a product over all later weights: if they are mostly smaller than 1 it vanishes, if larger than 1 it explodes.
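A tiny numeric illustration, assuming each layer contributes one multiplicative weight factor to the gradient as in the formula above:

    import numpy as np

    n = 30                               # number of hidden layers
    print(np.prod(np.full(n, 0.5)))      # |w| < 1 everywhere: ~9e-10, gradient vanishes
    print(np.prod(np.full(n, 1.5)))      # |w| > 1 everywhere: ~2e5, gradient explodes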

Overfitting. Training data and testing data can be different. Solutions: get more training data; create more training data; dropout; L1/L2 regularization; early stopping.

Friendly tool: Keras Python $ apt-get install python3-pip $ pip3 install keras $ pip3 install tensorflow
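Once installed, a minimal Keras sketch of the kind of network discussed above, also using the overfitting remedies from the previous slide (dropout, L2 regularization, early stopping). Layer sizes and hyperparameters are illustrative, and x_train / y_train are placeholders for your own data:

    import numpy as np
    from tensorflow.keras import models, layers, regularizers, callbacks

    x_train = np.random.rand(1000, 2)    # placeholder inputs (e.g. Nmax, hbar*omega)
    y_train = np.random.rand(1000, 1)    # placeholder targets (e.g. g.s. energies)

    model = models.Sequential([
        layers.Dense(32, activation="sigmoid", input_shape=(2,),
                     kernel_regularizer=regularizers.l2(1e-4)),   # L2 regularization
        layers.Dropout(0.2),                                      # dropout
        layers.Dense(1)                                           # one output neuron
    ])
    model.compile(optimizer="adam", loss="mse")

    early_stop = callbacks.EarlyStopping(patience=20, restore_best_weights=True)
    model.fit(x_train, y_train, validation_split=0.2,
              epochs=500, batch_size=32, callbacks=[early_stop])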

Neural network simulation & extrapolation: NN application in nuclear physics. [Figure: NCSM calculations of the 4He ground-state energy as a function of ħω.] * Negoita G. A., Luecke G. R., Vary J. P., et al., Deep Learning: A Tool for Computational Nuclear Physics, arXiv:1803.03215 (2018).

Neural network simulation & extrapolation. [Figure: NCSM calculations of the 4He radius as a function of ħω.]

Neural network simulation & extrapolation. The minimum g.s. energy for each Nmax drops exponentially with Nmax. * Forssén, C., Carlsson, B. D., Johansson, H. T., Sääf, D., Bansal, A., Hagen, G., & Papenbrock, T., Large-scale exact diagonalizations reveal low-momentum scales of nuclei, Phys. Rev. C 97, 034328 (2018).

Neural network for coupled cluster. CCSDT equations: we want to truncate the 3p3h configurations, so a NN is introduced to select the more important configurations.

Neural network for coupled cluster. The input layer includes all the quantum numbers of the 3p3h amplitudes (n, l, j, j_tot, t_z). [Figures: NN selection of 3p3h amplitudes for 8He and 16O.]

Neural network in nuclear matter. Train with different combinations of c_D and c_E. [Figure: results as a function of density.]

Neural network uncertainty analysis. Even with the same input data and network structure, the NN gives different results in mutually independent trainings. Sources of uncertainty: 1. random initialization of the network parameters; 2. different divisions between training and validation data; 3. data shuffling (limited batch size). Solutions: 1. ???; 2. k-fold cross validation; 3. increase the batch size.
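A rough sketch of such an ensemble study: train the same architecture several times with independent random initializations and look at the spread of the predictions (build_model, the data arrays and all sizes are placeholders):

    import numpy as np
    from tensorflow.keras import models, layers

    def build_model():
        # same architecture every time; only the random initialization differs
        m = models.Sequential([
            layers.Dense(32, activation="sigmoid", input_shape=(2,)),
            layers.Dense(1)
        ])
        m.compile(optimizer="adam", loss="mse")
        return m

    x_train = np.random.rand(500, 2)     # placeholder training data
    y_train = np.random.rand(500, 1)
    x_new = np.random.rand(10, 2)        # points where we want predictions

    predictions = []
    for _ in range(20):                  # 20 independently trained networks
        model = build_model()            # fresh random initialization each time
        model.fit(x_train, y_train, epochs=100, batch_size=32, verbose=0)
        predictions.append(model.predict(x_new, verbose=0))

    predictions = np.array(predictions)  # shape (20, 10, 1)
    spread = predictions.std(axis=0)     # width of the prediction distribution per point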

Ensembles of Neural Networks: 4He radius. The distribution of the NN-predicted radius is Gaussian; we define its full width at half maximum as the uncertainty for a given NN structure. The NN uncertainty decreases with more training data (Nmax 4-12: 1.45; Nmax 4-16: 1.44; Nmax 4-20: 1.44), and the nearly identical predicted radii indicate that the NN generalizes well.

More complex case: 4He g.s. energy, with two peaks. [Figure panels: Nmax 4-10, Nmax 4-12, Nmax 4-14.]

4He g.s. energy, with two peaks. [Figure panels: Nmax 4-16, Nmax 4-18, Nmax 4-20.]

We can separate the two peaks using features (the loss) provided by the NN.