A Video from Google DeepMind.

Size: px

Start display at page:

Download "A Video from Google DeepMind."

Mercy West
5 years ago
Views:

1 A Video from Google DeepMind

2 Can machine learning teach us cluster updates? Lei Wang Institute of Physics, CAS Li Huang and LW, LW,

3 Local vs Cluster algorithms

4 Local vs Cluster algorithms is slower than

5 Local vs Cluster algorithms Algorithmic innovation outperforms Moore s law!

6 Recommender Systems Learn preferences Recommendations

7 Recommender Systems Learn preferences Recommendations

8 Restricted Boltzmann Machine Smolensky 1986 Hinton and Sejnowski 1986 NX M X N X MX E (x, h) = a i x i b j h j x i W ij h j i=1 j=1 i=1 j=1 Energy based model h 2 {0, 1} M h 1 h 2 h 3... hm x 1 x 2... x N x 2 {0, 1} N

9 Restricted Boltzmann Machine Smolensky 1986 Hinton and Sejnowski 1986 NX M X N X MX E (x, h) = a i x i b j h j x i W ij h j i=1 j=1 i=1 j=1 Energy based model h 2 {0, 1} M h 1 h 2 h 3... hm x 1 x 2... x N x 2 {0, 1} N

10 Restricted Boltzmann Machine Smolensky 1986 Hinton and Sejnowski 1986 NX M X N X MX E (x, h) = a i x i b j h j x i W ij h j i=1 j=1 i=1 j=1 Energy based model h 2 {0, 1} M Probabilities p(x, h) e E(x,h) h 1 h 2 h 3... hm p(x) = X h p(x, h) x 1 x 2... x N x 2 {0, 1} N p(h x) =p(x, h)/p(x)

11 Restricted Boltzmann Machine Smolensky 1986 Hinton and Sejnowski 1986 NX M X N X MX E (x, h) = a i x i b j h j x i W ij h j i=1 j=1 i=1 j=1 Energy based model h 2 {0, 1} M Probabilities p(x, h) e E(x,h) h 1 h 2 h 3... hm p(x) = X h p(x, h) x 1 x 2... x N x 2 {0, 1} N p(h x) =p(x, h)/p(x) Universal approximator of probability distributions. Freund and Haussler, Le Roux and Bengio, 2008

12 Generative Learning Learning: Fit to the target distribution p(x) (x) target probability Learn the model from data Hinton 2002 Generating: Blocked Gibbs sampling h p(h x) x p(x 0 h) x 0 Generate more data from the learned model

13 MNIST database of handwritten digits learned weight W ij Theano deep learning tutorial

14 MNIST database of handwritten digits learned weight W ij h 1 h 2 h 3... hm x 1 x 2... x N Theano deep learning tutorial

15 MNIST database of handwritten digits learned weight W ij h 1 h 2 h 3... hm x 1 x 2... x N Theano deep learning tutorial

16 Challenge Which one is written by human? Salakhutdinov, 2015 Deep learning summer school

17 Idea Model the QMC data with an RBM, then sample!om the RBM Li Huang and LW, cf. Liu, Qi, Meng, Fu, Liu, Shen, Qi, Meng, Fu, Xu, Qi, Liu, Fu, Meng,

18 Why is it useful? When the fitting is perfect, we can completely bypass the quantum part of the QMC cf. Torlai, Melko, Even with an imperfect fitting, the RBM can still guide the QMC sampling RBM is a recommender system for the QMC simulations

Why is it useful? When the fitting is perfect, we can completely bypass the quantum part of the QMC cf. Torlai, Melko,1606.

19 Why is it useful? When the fitting is perfect, we can completely bypass the quantum part of the QMC cf. Torlai, Melko, Even with an imperfect fitting, the RBM can still guide the QMC sampling RBM is a recommender system for the QMC simulations Bonus: it is also fun to see what it discovers for the physical model

20 Learned Weight of a 8*8 Falicov-Kimball model

21 Accept or not? A(x! x 0 )=min apple 1, T (x 0! x) T (x! x 0 ) (x 0 ) (x) The art of Monte Carlo methods Sample the RBM, and propose the move to QMC p(h x) x h h 0 min 1, p(h0 ) p(h) p(x 0 h 0 ) x 0

22 Accept or not? A(x! x 0 )=min apple 1, T (x 0! x) T (x! x 0 ) (x 0 ) (x) Detailed balance condition for the RBM T (x 0! x) T (x! x 0 ) = p(x) p(x 0 ) Sample the RBM, and propose the move to QMC p(h x) x h h 0 min 1, p(h0 ) p(h) p(x 0 h 0 ) x 0

23 Accept or not? Acceptance rate of recommended update Li Huang and LW, A(x! x 0 )=min apple 1, p(x) p(x 0 ) (x0 ) (x) RBM Physical model surrogate function force bias R. M. Neal, Bayesian learning for neural networks, J. S. Liu, Monte Carlo strategies in scientific computing, 2008 S. Zhang, Auxiliary-field QMC for correlated electron systems, 2013

24 Question Can Boltzmann Machines Discover Cluster Updates? LW,

25 Cluster Update in a Nutshell Swendsen and Wang,1987

26 Cluster Update in a Nutshell Sample auxiliary bond variables Swendsen and Wang,1987

27 Cluster Update in a Nutshell Build clusters Swendsen and Wang,1987

28 Cluster Update in a Nutshell Flip clusters randomly Swendsen and Wang,1987

29 Cluster Update in a Nutshell Rejection free cluster update! Swendsen and Wang,1987

30 visible s i = ±1 p(h s) hidden h` =0, 1 flip clusters Swendsen-Wang Boltzmann Machine Ising spins Visible units Auxiliary bond variables Build clusters Flip clusters Hidden units Sample h given s Sample s given h

31 Boltzmann Machine for cluster update E(s, h) = X` (Ws i`s j` + b) h` s i` s j` s 2 { 1, +1} Vertex h` =1 h 2 {0, 1} Edge

32 Boltzmann Machine for cluster update E(s, h) = X` (Ws i`s j` + b) h` s i` s j` s 2 { 1, +1} Vertex h` =0 h 2 {0, 1} Edge

33 Boltzmann Machine for cluster update E(s, h) = X` (Ws i`s j` + b) h` s i` s j` s 2 { 1, +1} Vertex h` =0 h 2 {0, 1} Edge

34 Cluster Update in the BM language

35 Cluster Update in the BM language Sample hidden variables given visible units

36 Cluster Update in the BM language Inactive hidden units break the links

37 Cluster Update in the BM language Randomly flip clusters of visible units

38 Cluster Update in the BM language Voila!

39 Boltzmann Machine for cluster update E(s, h) = X` (Ws i`s j` + b) h` s i` s j` s 2 { 1, +1} Vertex h` =0 h 2 {0, 1} Edge Rejection free Monte Carlo updates when 1+eb+W 1+e b W = e2 J

40 Boltzmann Machine for cluster update E(s, h) = X` (Ws i`s j` + b) h` s i` s j` s 2 { 1, +1} Vertex h` =0 h 2 {0, 1} Edge Rejection free Monte Carlo updates when 1+eb+W 1+e b W = e2 J Encompass general cluster algorithm frameworks Niedermeyer,1988 Kandel and Domany,1991 Kawashima and Gubernatis,1995

41 Boltzmann Machine for cluster update E(s, h) = X` (Ws i`s j` + b) h` s i` s j` s 2 { 1, +1} Vertex h` =0 h 2 {0, 1} Edge Rejection free Monte Carlo updates when 1+eb+W 1+e b W = e2 J Encompass general cluster algorithm frameworks In general Niedermeyer,1988 E(s, h) =E(s) Kandel and Domany,1991 Kawashima and Gubernatis,1995 X [W F (s)+b ] h feature

policies which can be optimized for efficiency The hidden units learn

42 Learn Generate Boltzmann Machines are learnable so they can adapt to various physical problems Boltzmann Machines parametrize Monte Carlo policies which can be optimized for efficiency The hidden units learn to play smart roles: Fortuin-Kasteleyn and Hubbard-Stratonovich transformations

43 Plaquette Ising model K However, for K 0, NO simple and efficient global update method is known

44 Plaquette Ising model K e Ks 1s 2 s 3 s 4! X h e W (s 1s 2 +s 3 s 4 )h

45 Plaquette Ising model K e Ks 1s 2 s 3 s 4! X h e W (s 1s 2 +s 3 s 4 )h Given s, sample h on each plaquette independently Given h, the Boltzmann Machine is an ordinary Ising model with modulated interactions, which can be sampled efficiently Rejection free cluster update!

46 Plaquette Ising model K e Ks 1s 2 s 3 s 4! X h e W (s 1s 2 +s 3 s 4 )h LW,

47 Machine learning for many-body physics New tools for (quantum) many-body problems Moreover, it offers a new way of thinking Can we make new scientific discovery with it? Can one design better algorithms with it? LW, Li Huang and LW, Li Huang, Yi-feng Yang and LW, LW,

Machine learning for many-body physics New tools for (quantum) many-body problems Moreover, it offers a new way of thinking Can we make new scientific discovery with

48 Machine learning for many-body physics New tools for (quantum) many-body problems Moreover, it offers a new way of thinking Can we make new scientific discovery with it? Can one design better algorithms with it? The fun just starts! LW, Li Huang and LW, Li Huang, Yi-feng Yang and LW, LW,

49 Quantum many-body physics for ML Quantum entanglement perspective on deep learning Jing Chen, Song Cheng, Haidong Xie, LW, and Tao Xiang, Dong-Ling Deng, Xiaopeng Li and S. Das Sarma, Xun Gao, L.-M. Duan, Y. Huang and J. E. Moore, MNIST database random images from the Deep Learning book by Goodfellow, Bengio, Courville

50 A(x! x 0 )=min apple 1, p(x) p(x 0 ) (x0 ) (x) h 2 {0, 1} M x 2 {0, 1} N h 1 h 2 h 3... hm x 1 x 2... x N RBM Physical model e.g. Falikov-Kimball (classical field + fermions) model ln[ (x)] = U 2 NX x i + ln det h 1+e H(x) i while for the RBM i=1 H ij = K ij + ij U (x i 1/2) ln[p(x)] = NX a i x i + MX ln 1+e b j+ P N i=1 x iw ij i=1 j=1

51 Train the RBM p(x) (x) Supervised learning of the RBM NX i=1 = U 2 a i x i + NX i=1 MX j=1 ln 1+e b j+ P N i=1 x iw ij x i + ln det 1+e H x 1 x 2 x N......

Accelerated Monte Carlo simulations with restricted Boltzmann machines

Accelerated Monte Carlo simulations with restricted Boltzmann machines Accelerated simulations with restricted Boltzmann machines Li Huang 1 Lei Wang 2 1 China Academy of Engineering Physics 2 Institute of Physics, Chinese Academy of Sciences July 6, 2017 Machine Learning