Applications of the tensor RG method to machine learning

1 Applications of the tensor RG method to machine learning Yannick Meurice The University of Iowa With Sam Foreman, Joel Giedt, and Judah Unmuth-Yockey Happy birthday Thom! TGASEPC, August 19, 2017

2 Content of the talk 1 The Tensor renormalization group (TRG) method in a nutshell 2 What is Machine Learning? 3 RG ideas applied to digit recognition using the MNIST data The MNIST data The multiple layer perceptron Principal Components Analysis (PCA): relevant features Coarse graining procedures 4 The 2D Ising model near T c (Sam Foreman) 5 Work in progress 6 Conclusions

3 Tensor Renormalization Group (TRG) Blocking in configuration space is hard! (2D Ising example) Tensor RG: blocking is simple and exact! TRG: first implementation of the Wilson program for lattice models without uncontrollable approximations In this talk, we focus on the 2D Ising model example Not discussed here: O(N) and principal chiral models, gauge models (Ising, U(1) and SU(2)), and their quantum simulators For details see: PRB , PRD 88, and PRD , PRA , PRD , PRA , (PRD in press)

4 The Renormalization Group (RG) for lattice models Main ingredients: 1 Replace microscopic degrees of freedom by coarser ones (block spins, decimation, coarse graining...). 2 Rescale the new variables in such a way that the new interactions (encoded in a Hamiltonian or tensor...) can be compared to the original ones. 3 Find the fixed points of the RG transformation defined by 1. and 2. 4 Linearize the RG transformation at the fixed point; separate relevant (expanding) and irrelevant (contracting) directions. The relevant directions characterize the universal large distance behavior.

5 The 2D Ising model System of spins on a square lattice: σ_{x,y} = ±1 Hamiltonian: H = −∑_{x,y} σ_{x,y}(σ_{x+1,y} + σ_{x,y+1}) Partition function: Z = ∑_{{σ}} e^{−βH} Second order phase transition (order-disorder) at the self-dual β_c: tanh(β_c) = exp(−2β_c), which implies β_c = ln(1 + √2)/2 ≈ 0.4407 Transfer matrix exactly diagonalizable at finite volume (Kaufman) Critical exponents are known exactly First model to try when developing a new renormalization group (RG) method
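
As a quick cross-check of the self-duality condition quoted above, here is a minimal Python snippet (a sketch; names are illustrative) that solves tanh(β_c) = exp(−2β_c) numerically and compares it with the closed form ln(1 + √2)/2:

```python
import numpy as np
from scipy.optimize import brentq

# Self-duality condition for the 2D Ising model: tanh(beta_c) = exp(-2*beta_c)
beta_c = brentq(lambda b: np.tanh(b) - np.exp(-2.0 * b), 0.1, 1.0)
print(beta_c)                          # ~0.440687
print(np.log(1.0 + np.sqrt(2.0)) / 2)  # closed form ln(1 + sqrt(2))/2, same value
```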

6

7

8

9 Block Spinning in Configuration Space: Step 1 Square blocks in an A-B checkerboard; the B blocks are fixed backgrounds. We can block spin in the A blocks. Example: Pr(φ_A = 4 | φ_i^background) ∝ exp(β(4 + ∑_{i=1}^{8} φ_i^background)) (Diagram: an A block surrounded by B blocks in the checkerboard.)

10 Block Spinning in Configuration Space: Step 2, etc. The next step is to try to block spin in the B blocks. We can use step 1 and block spin in a given B block, but there are 20 background spins. Finding the effective energy function is nontrivial. (Diagram: a B block surrounded by A blocks and by already-blocked B' and B'' blocks.)

11 TRG blocking: it's simple and exact! For each link: exp(βσ₁σ₂) = cosh(β)(1 + tanh(β)σ₁σ₂) = cosh(β) ∑_{n₁₂=0,1} (√tanh(β) σ₁ √tanh(β) σ₂)^{n₁₂}. Regroup the four terms involving a given spin σ_i and sum over its two values ±1. The results can be expressed in terms of a tensor T^{(i)}_{xx'yy'} which can be visualized as a cross attached to the site i with the four legs covering half of the four links attached to i. The horizontal indices x, x' and vertical indices y, y' take the values 0 and 1, as does the index n₁₂. T^{(i)}_{xx'yy'} = f_x f_{x'} f_y f_{y'} δ(mod[x + x' + y + y', 2]), where f_0 = 1 and f_1 = √tanh(β). The delta symbol is 1 if x + x' + y + y' is zero modulo 2 and zero otherwise.
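
The local tensor defined above can be written down directly; the following is a minimal sketch, assuming the index ordering (x, x', y, y') and the half-link weight f₁ = √tanh β as reconstructed above:

```python
import numpy as np

def ising_tensor(beta):
    """Local TRG tensor T_{x x' y y'} for the 2D Ising model,
    with f_0 = 1 and f_1 = sqrt(tanh(beta)) on each half-link."""
    f = np.array([1.0, np.sqrt(np.tanh(beta))])
    T = np.zeros((2, 2, 2, 2))
    for x in range(2):
        for xp in range(2):
            for y in range(2):
                for yp in range(2):
                    if (x + xp + y + yp) % 2 == 0:   # parity constraint (closed paths)
                        T[x, xp, y, yp] = f[x] * f[xp] * f[y] * f[yp]
    return T
```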

12 TRG blocking (graphically) Exact form of the partition function: Z = (cosh(β))^{2V} Tr ∏_i T^{(i)}_{xx'yy'}. Tr means contractions (sums over 0 and 1) over the links. Reproduces the closed paths of the HT expansion. TRG blocking separates the degrees of freedom inside the block, which are integrated over, from those kept to communicate with the neighboring blocks. (Figure: four tensors with outer legs x₁, x₂, x₁', x₂', y₁, y₂, y₁', y₂' and internal legs x_U, x_D, x_L, x_R, contracted into a blocked tensor with collective indices X, X', Y, Y'.)

13 TRG Blocking (formulas) Blocking defines a new rank-4 tensor T'_{XX'YY'} where each index now takes four values. T'_{X(x₁,x₂) X'(x₁',x₂') Y(y₁,y₂) Y'(y₁',y₂')} = ∑_{x_U,x_D,x_R,x_L} T_{x₁ x_U y₁ y_L} T_{x_U x₁' y₂ y_R} T_{x_D x₂' y_R y₂'} T_{x₂ x_D y_L y₁'}, where X(x₁, x₂) is a notation for the product states, e.g., X(0, 0) = 1, X(1, 1) = 2, X(1, 0) = 3, X(0, 1) = 4. The partition function can be written exactly as Z = (cosh(β))^{2V} Tr ∏_{2i} T'^{(2i)}_{XX'YY'}, where 2i denotes the sites of the coarser lattice with twice the lattice spacing of the original lattice.
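
A sketch of the corresponding contraction with np.einsum is given below; the pairing of the internal legs x_U, x_D, x_L, x_R follows the formula above, and the exact plaquette geometry is otherwise an assumption:

```python
import numpy as np

def block_tensor(T):
    """One TRG blocking step: contract four rank-4 tensors (index order
    x, x', y, y') on a 2x2 plaquette into a single coarse tensor with
    collective indices X=(x1,x2), X'=(x1',x2'), Y=(y1,y2), Y'=(y1',y2')."""
    Tp = np.einsum('iume,uknr,dlrp,jdeo->ijklmnop', T, T, T, T)
    D = T.shape[0]
    # merge (x1,x2)->X, (x1',x2')->X', (y1,y2)->Y, (y1',y2')->Y'
    return Tp.reshape(D * D, D * D, D * D, D * D)
```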

14 Accurate exponents from approximate tensor renormalizations (YM, Phys. Rev. B 87, ) For the Ising model on square and cubic lattices, the truncation method (HOSVD) sharply singles out a surprisingly small subspace of dimension two. In the two-state limit, the transformation can be handled analytically, yielding a value for the critical exponent ν much closer to the exact value 1 than obtained in Migdal-Kadanoff approximations. Alternative blocking procedures that preserve the isotropy can improve the accuracy to ν = and respectively. Applications to other classical lattice models are possible, including models with fermions. TRG could become a competitor of the Monte Carlo method, suited to calculating critical exponents accurately, taking continuum limits, and studying near-conformal systems in large volumes.

15 More than two states When a few more states are added, the quality of the approximation does not immediately improve. One first observes oscillations, false bifurcations, and approximate degeneracies. The same observations were made by Leo Kadanoff ( ) and collaborators. Can a systematic understanding of this phenomenon be reached using the exact solution at finite volume (B. Kaufman, 1949) or approximate functional evolution (Curtright and Zachos, 2011)? Figure: thermal exponent x_T = 1/ν versus the truncation size χ in various calculations, summarized by Efi Efrati, Zhe Wang, Amy Kolan, and Leo P. Kadanoff, Rev. Mod. Phys. 86, 647 (May 2014).

16 RG aspects of Machine Learning 1 What is Machine Learning? 2 RG ideas applied to digit recognition using the MNIST data The MNIST data The multiple layer perceptron Principal Components Analysis (PCA): relevant features Coarse graining procedures 3 The 2D Ising model near T c (Sam Foreman)

17 Rosenblatt perceptron (1958) and its digital version Using output functions of the form y_l = σ(∑_j W_{lj} v_j), you can recognize 91 percent of the digits of the MNIST data (v_j: pixels, W_{lj}: tunable weights, σ(x): the sigmoid function).

18 Machine Learning Course (Andrew Ng, Stanford) About this course: Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Topics include: (i) Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks). (ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning). (iii) Best practices in machine learning (bias/variance theory; innovation process in machine learning and AI). [...]

19 MNIST data 60,000 handwritten digits; the "correct" answer is known. (Figure: the first 9 MNIST digits.)

20 MNIST data: 28 × 28 grayscale pixels There is a UV cutoff (pixels are uniform), an IR cutoff close to the overall size, and a typical size (the width of the lines).

21 A perceptron for the MNIST data (one hidden layer) We consider a perceptron model where the visible variables v_i are the 28 × 28 = 784 pixels of the MNIST digits (grayscale values between 0 and 1). We decided to take 196 = 784/4 hidden variables h_k. Later (see the hierarchical approximations below), we will "attach" them rigidly to 2 × 2 blocks of pixels. The hidden variables are: h_k = σ(∑_{j=1}^{784} W^{(1)}_{kj} v_j), k = 1, ..., 196. We choose the sigmoid function σ(x) = 1/(1 + exp(−x)). The output variables are y_l = σ(∑_k W^{(2)}_{lk} h_k), with l = 0, 1, ..., 9, which we want to associate with the MNIST characters by having y_l ≈ 1 for l = digit while y_l ≈ 0 for the 9 others.
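
A minimal sketch of the forward pass described on this slide (bias terms omitted, as in the formulas above; array shapes are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(v, W1, W2):
    """Forward pass of the one-hidden-layer perceptron described above.
    v:  (784,) grayscale pixels in [0, 1]
    W1: (196, 784) input-to-hidden weights
    W2: (10, 196)  hidden-to-output weights"""
    h = sigmoid(W1 @ v)   # 196 hidden variables h_k
    y = sigmoid(W2 @ h)   # 10 outputs y_l, one per digit
    return h, y
```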

22 The sigmoid function The sigmoid function σ(x) = 1/(1 + exp(−x)) is a popular choice of activation function (0 at large negative input, 1 at large positive input). We have σ'(x) = σ(x) − σ(x)², which allows simple algebraic manipulations for the gradients. (Figure: the sigmoid function.)

23 Gradient search Given a learning set {v_i^{(n)}} with n = 1, 2, ..., N ≤ 60,000 and the corresponding target vectors {t_l^{(n)}}, we minimize the loss function E(W^{(1)}, W^{(2)}) = (1/2) ∑_{n=1}^{N} ∑_{l=0}^{9} (y_l^{(n)} − t_l^{(n)})², with y_l^{(n)} = σ(∑_k W^{(2)}_{lk} σ(∑_j W^{(1)}_{kj} v_j^{(n)})). The weight matrices W^{(1)} and W^{(2)} are initialized with random numbers following a normal distribution. They are then optimized using a gradient method with gradients: G^{(2)}_{lk} ≡ ∂E/∂W^{(2)}_{lk} = ∑_n (y_l − t_l)(y_l − y_l²) h_k, G^{(1)}_{kj} ≡ ∂E/∂W^{(1)}_{kj} = ∑_n ∑_l (y_l − t_l)(y_l − y_l²) W^{(2)}_{lk} (h_k − h_k²) v_j.
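
A per-sample sketch of these gradients (the sum over samples is left to the caller; the factor σ' = σ − σ² matches the sigmoid derivative quoted earlier):

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def gradients(v, t, W1, W2):
    """Gradients of the quadratic loss for a single sample (v, t);
    the sum over the N training samples is done by the caller."""
    h = sigmoid(W1 @ v)                    # (196,)
    y = sigmoid(W2 @ h)                    # (10,)
    delta_out = (y - t) * (y - y**2)       # uses sigma' = sigma - sigma^2
    G2 = np.outer(delta_out, h)            # dE/dW2, shape (10, 196)
    delta_hid = (W2.T @ delta_out) * (h - h**2)
    G1 = np.outer(delta_hid, v)            # dE/dW1, shape (196, 784)
    return G1, G2

# one gradient-descent step with learning rate eta:
#   W2 -= eta * G2;  W1 -= eta * G1
```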

24 Performance After one "learning cycle" (going through the entire MNIST training data), we get a performance of about 0.95 (number of correct identifications / number of attempts); after 10 learning cycles, the performance saturates near 0.98 (with 196 hidden variables and learning parameters 0.1 and 0.5). It is straightforward to introduce more hidden layers: y_l^{(n)} = σ(W^{(m)} ... σ(W^{(2)} σ(W^{(1)} v^{(n)}))...). With two hidden layers, the performance improves (but only very slightly). On the other hand, if we remove the hidden layer (as in Rosenblatt's experiment), the performance goes down to about 0.91.

25 Failures (2 percent) It is instructive to look at the outputs for the 2 percent of cases where the algorithm fails to identify the correct digit. Here, the outputs suggest 1, 7 and 9, with a slightly larger value for 1 than for 9, which is the correct answer according to the MNIST data. In the following, I will focus more on getting comparable performance with fewer learning parameters rather than on reducing the number of failures. (Figure: testing sample 320, a nine, and the outputs y_l for this sample.)

26 Recent attempts to use RG ideas It has been suggested by Mehta and Schwab ( ) that the hidden variables can be related to the RG "block variables" and that a hierarchical organization inspired by physics modeling could drastically reduce the number of learning parameters. This may be called "cheap" learning (Lin and Tegmark, ). The notion of criticality or fixed point has not been identified precisely on the ML side. It is not clear that the technical meaning of "relevant", as used in the RG context to describe unstable directions of the RG transformation linearized near a nontrivial fixed point, can be used generically in the ML context. We are exploring this question for the MNIST data (which is not "critical") and the more tunable Ising model.

27 Principal Component Analysis (PCA, 19th cent. tech.) Identifying "relevant directions" may allow a drastic reduction of the information necessary to calculate observables. Relevant directions: directions with the largest variance. v_i^{(n)}: grayscale value of the i-th pixel in the n-th sample. v̄_i: average grayscale value of the i-th pixel. Covariance matrix: C_{ij} = (1/N) ∑_{n=1}^{N} (v_i^{(n)} − v̄_i)(v_j^{(n)} − v̄_j). We can project the original data into a low-dimensional subspace corresponding to the largest eigenvalues of C_{ij}. Can we keep the crucial information in a small subspace? For the MNIST data, the 784-pixel digits are recognizable using projections onto low-dimensional subspaces.
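
A compact sketch of this PCA projection for flattened MNIST images (function name and shapes are illustrative):

```python
import numpy as np

def pca_project(V, d):
    """Project flattened images V (shape (N, 784)) onto the d leading
    principal components and map them back to pixel space."""
    vbar = V.mean(axis=0)                    # average grayscale value per pixel
    X = V - vbar
    C = X.T @ X / V.shape[0]                 # covariance matrix C_ij (784 x 784)
    evals, evecs = np.linalg.eigh(C)         # eigenvalues in ascending order
    U = evecs[:, -d:]                        # d leading eigenvectors ("eigenfaces")
    coords = X @ U                           # PCA coordinates, shape (N, d)
    return coords @ U.T + vbar               # low-dimensional reconstruction
```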

28 MNIST averages Figure: Empirical probability for 60,000 digits (includes all digits, left) and 5,923 zeros (right).

29 PCA eigenvectors ( eigenfaces")

30 PCA projections of an MNIST digit (Figure: PCA projections of an MNIST digit in subspaces of dimension 10, 20, 30, 40, 50, 60, 70 and 80, compared to the original 784-dimensional digit, lower right.)

31 Success rate of (drastic) PCA projections Figure: Success rate for PCA projections of the images (blue, using the PCA eigenvectors = images) or using the PCA coordinates only (red, no images) as a function of the dimension of the subspace. The green line represents the asymptotic value (98 percent) for the original version (dimension 784).

32 Blocking We have used the one-layer perceptron with blocked images. First we replaced each square of four pixels by a single pixel carrying the average value of the four blocked pixels (28 × 28 → 14 × 14). Using the blocked pictures with 49 hidden variables, the success rate goes down slightly (97 percent). Repeating once more, we obtain 7 × 7 images. With 25 hidden variables, we get a success rate of 92 percent. (Figure: a digit blocked once and blocked twice.)
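
The 2 × 2 averaging used here can be written in a couple of lines; a sketch assuming square images with an even side length:

```python
import numpy as np

def block_average(img):
    """Replace each 2x2 square of pixels by its average
    (e.g. 28x28 -> 14x14, then 14x14 -> 7x7)."""
    n = img.shape[0] // 2
    return img.reshape(n, 2, n, 2).mean(axis=(1, 3))
```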

33 Ising projection We can replace the grayscale pixels by black and white pixels. This barely affects the performance (97.6 percent) but drastically reduces the configuration space and allows a Restricted Boltzmann Machine treatment (not discussed here). (Figure: the original digit and its Ising-like version with graycut = 0.5.)
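
The Ising projection is just a threshold on the grayscale values; a one-function sketch with the graycut of 0.5 used above:

```python
import numpy as np

def ising_projection(img, graycut=0.5):
    """Map grayscale pixels in [0, 1] to black/white (0/1) pixels."""
    return (img > graycut).astype(np.int8)
```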

34 Cheap learning (hierarchical approximations) We have considered the hierarchical approximation where each hidden variable is only connected to a single 2 × 2 block of visible variables (pixels): ∑_j W^{(1)}_{lj} v_j → ∑_α W^{(1)}_{l,α} v_{l,α}, with α = 1, 2, 3, 4 the position in the 2 × 2 block and l = 1, ..., 196 the labeling of the blocks. Even though the number of parameters that we need to determine with the gradient method is significantly smaller (by a factor 196), the performance remains comparable. A generalization with 4 × 4 blocks leads to a 0.90 performance with 1/4 as many weights. This simplified version can be used as a starting point for a full gradient search (pretraining), but the hierarchical structure (sparsity of W_{ij}) is robust and remains visible during the training. This pretraining breaks the huge permutation symmetry of the hidden variables.
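
A sketch of this block-local first layer (each hidden variable sees only its own 2 × 2 block; the reshaping convention is an assumption):

```python
import numpy as np

def hidden_blocked(v_img, W1_blocks):
    """Hierarchical ("cheap") first layer: hidden variable l is connected
    only to its own 2x2 block of pixels.
    v_img:     (28, 28) image
    W1_blocks: (196, 4) one length-4 weight vector per block."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    blocks = v_img.reshape(14, 2, 14, 2).transpose(0, 2, 1, 3).reshape(196, 4)
    return sigmoid(np.sum(W1_blocks * blocks, axis=1))   # 196 hidden variables
```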

35 Transition to the 2D Ising model The MNIST data has a typical size built in, and UV details can be erased until the relevant information contained at that size is reached. One can think of the various digits as "phases", but there is nothing like a critical point. We now consider worm configurations for the 2D Ising model at different β = 1/T, some close to the critical value. The graphs for the Ising model calculations have been made by Sam Foreman.

36 Worm Algorithm 1 Randomly select a starting point on the lattice and set head = tail = j. Choose a nearest-neighbor site j' at random for a possible update. 2 If random() < R: move the head to the new site j' and update the bond number (the number of currently active bonds), n_b → n_b ± 1, where R is the acceptance ratio defined by R = βJ/(n_b + 1) when n_b → n_b + 1, and R = n_b/(βJ) when n_b → n_b − 1. (1) 3 While head ≠ tail, repeat 2. If head = tail, update the partition function estimator Z → Z + 1, and go to 1. 4 Note that when this update is accepted, the global number of active bonds N_b is updated by one, i.e. N_b → N_b ± 1. 5 For the Ising model, one can use resummed versions with links occupied 0 or 1 times with a weight tanh(β).
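
Below is a sketch of the resummed version mentioned in point 5 (each link occupied 0 or 1 times with weight tanh βJ); it is a simplified Metropolis worm and does not use the acceptance ratio of Eq. (1), which applies to the multiply-occupied representation. Here rng is assumed to be a numpy Generator, e.g. np.random.default_rng():

```python
import numpy as np

def worm_update(bonds, L, beta_J, rng):
    """One simplified Metropolis worm update in the resummed (0/1 bond)
    representation with link weight w = tanh(beta*J).
    bonds: dict {(site_a, site_b): 0 or 1} with site_a < site_b,
    on a periodic L x L lattice."""
    w = np.tanh(beta_J)
    tail = head = (int(rng.integers(L)), int(rng.integers(L)))
    while True:
        dx, dy = [(1, 0), (-1, 0), (0, 1), (0, -1)][rng.integers(4)]
        nxt = ((head[0] + dx) % L, (head[1] + dy) % L)
        link = (min(head, nxt), max(head, nxt))       # undirected link label
        occupied = bonds.get(link, 0)
        ratio = 1.0 / w if occupied else w            # toggle the bond occupation
        if rng.random() < ratio:
            bonds[link] = 1 - occupied
            head = nxt
        if head == tail:                              # worm closed: one update done
            return bonds
```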

37 Main points Legal worms (high-temperature graphs) = combinations of tensors. Specific heat: C_v = A(β) + B(β)⟨N_b⟩ + C(β)⟨(N_b − ⟨N_b⟩)²⟩, with N_b the number of links in a worm. ⟨(N_b − ⟨N_b⟩)²⟩ ∼ (2/π) ln(|T − T_c|). Conjecture: λ_max^PCA ≈ (3/2V)⟨(N_b − ⟨N_b⟩)²⟩. Blocking results: similar shapes as unblocked. Quantitative features and RG interpretations are in progress (not discussed here).

38 Sampling of Ising HT contributions ( worms")

39 Blocked worms (8x8)

40 Worms (16x16)

41 Blocked worms (16x16)

42 Blocking We implement a coarse-graining renormalization approach, where the lattice is divided into blocks of 2 × 2 squares. Each 2 × 2 square is then blocked into a single site, where the new external bonds in a given direction are determined by the number of active bonds exiting the square. If a given block has one external bond in a given direction, the blocked site retains this bond in the blocked configuration; otherwise it is ignored. Results for the leading PCA eigenvalue and the specific heat for blocked configurations are qualitatively similar to the unblocked results. These results and their RG interpretation are still preliminary and will not be presented here. The blocking procedure is approximate. Better approximations can be constructed with the TRG method (YM arxiv: , Phys. Rev. B 87, ). Worm implementations of the critical theory are in progress.
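
A sketch of this blocking rule for worm configurations, under the reading that a coarse bond is kept when exactly one of the two fine bonds leaving the block in that direction is active (variable names and the bond-array layout are assumptions):

```python
import numpy as np

def block_bonds(hbonds, vbonds):
    """Block a bond configuration on 2x2 squares.
    hbonds[x, y]: bond between sites (x, y) and (x+1, y) (periodic L x L),
    vbonds[x, y]: bond between sites (x, y) and (x, y+1).
    A coarse bond is kept when exactly one fine bond exits the block
    in that direction."""
    L = hbonds.shape[0]
    n = L // 2
    Hb = np.zeros((n, n), dtype=int)
    Vb = np.zeros((n, n), dtype=int)
    for X in range(n):
        for Y in range(n):
            x, y = 2 * X, 2 * Y
            right = hbonds[x + 1, y] + hbonds[x + 1, y + 1]   # bonds exiting to the right
            up = vbonds[x, y + 1] + vbonds[x + 1, y + 1]      # bonds exiting upward
            Hb[X, Y] = 1 if right == 1 else 0
            Vb[X, Y] = 1 if up == 1 else 0
    return Hb, Vb
```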

43 Specific Heat (Sam Foreman) The worm algorithm allows statistically exact calculations of the specific heat. In order to observe finite-size scaling effects, we performed this analysis for lattice sizes L = 4, 8, 16, 32.

44 Specific Heat (Sam Foreman)

45 Leading singularity fit

46 Comparison with exact results

47 PCA leading eigenvalue (Sam Foreman)

48 PCA leading eigenvalue (Sam Foreman)

49 PCA leading eigenvalue and specific heat

50 Conclusions ML has a clear RG flavor: how to extract relevant features from noisy pictures (configurations). RG ideas allow us to reduce the complexity of the perceptron algorithms for the MNIST data. Tensor RG methods are being used directly for the ML treatment of worm configurations of the 2D Ising model near criticality. Quantitative tests are in progress. Restricted Boltzmann Machines = Ising models. ML for Lattice Field Theory: compress configurations, find features or updates (Lei Wang ). Private funding for Lattice Field Theory?

51 Thanks for listening and waiting (the image is cooling!) β = 0.1

52 Thanks for listening and waiting (the image is cooling!) β = 0.2

53 Thanks for listening and waiting (the image is cooling!) β = 0.3

54 Thanks for listening and waiting (the image is cooling!) β = 0.4

55 Thanks for listening and waiting (the image is cooling!) β = 0.5

56 Thanks for listening and waiting (the image is cooling!) β = 0.6

57 Happy birthday Thom!

58 Acknowledgements: This research was supported in part by the Dept. of Energy under Award Numbers DE-SC , DE-SC , and DE-SC .

59 Truncation procedure M^{⟨ij⟩}_{X(x₁,x₂) X'(x₁',x₂') y y'} = ∑_{y''} T^{(i)}_{x₁ x₁' y y''} T^{(j)}_{x₂ x₂' y'' y'}, which can be put in a canonical form by using a Higher Order Singular Value Decomposition defined by a unitary transformation on each of the four indices (see Tao Xiang et al. PRB 86). The unitary transformation for each index is the one that diagonalizes the symmetric tensor obtained by summing over all the other indices in the following way: G_{X X̃} = ∑_{X', y, y'} M_{X X' y y'} M_{X̃ X' y y'}.
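
A sketch of this truncation step in Python (index ordering (x, x', y, y') as before; χ is the number of states kept; names are illustrative):

```python
import numpy as np

def hosvd_projector(T, chi):
    """Form M from two tensors, build G by summing over all indices but the
    first, and keep the chi eigenvectors of G with the largest eigenvalues."""
    D = T.shape[0]
    # M_{(x1,x2),(x1',x2'),y,y'} = sum_t T_{x1,x1',y,t} T_{x2,x2',t,y'}
    M = np.einsum('abyt,cdtz->acbdyz', T, T).reshape(D * D, D * D, D, D)
    # G_{X,Xt} = sum_{X',y,y'} M_{X,X',y,y'} M_{Xt,X',y,y'}
    G = np.einsum('aijk,bijk->ab', M, M)
    evals, evecs = np.linalg.eigh(G)           # ascending eigenvalues
    order = np.argsort(evals)[::-1][:chi]
    return evecs[:, order], evals[order]       # projector columns, kept eigenvalues
```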

60 Eigenvalues of G The two new states have the form |0'⟩ = cos φ |00⟩ + sin φ |11⟩ and |1'⟩ = (|01⟩ + |10⟩)/√2. The angle φ is obtained by diagonalizing the even-even block. The symmetric form of |1'⟩ is a consequence of G₃₃ = G₄₄, itself due to reflection symmetry.

61 A simple 3-parameter map With the two-state projection, we obtain a new rank-4 tensor with indices again taking two values and the same parity selection rule. We divide the new tensor by a constant in such a way that, dropping all the primes hereafter, T₀₀₀₀ = 1. In the isotropic case, reflection symmetry imposes T₁₀₁₀ = T₀₁₁₀ = T₁₀₀₁ = T₀₁₀₁ ≡ t₁, T₁₁₀₀ = T₀₀₁₁ ≡ t₂, T₁₁₁₁ ≡ t₃. For the initial tensor, we have what we call later the "Ising condition": t₁ = t₂ and t₃ = t₁², but this property is not preserved by the blocking procedure, which can be expressed as a mapping of the three-dimensional parameter space (t₁, t₂, t₃) into itself. The map consists of rational expressions involving the values of the tensor and trigonometric functions of φ. These are analytical expressions, and derivatives can be taken to calculate the linearized map.

62 Numerical Results A fixed point is found for β = , not far from the exact value . The fixed point is approximately t₁ = , t₂ = , t₃ = and shows significant departure from the Ising condition. The eigenvalues of the linearized RG transformation are λ₁ = 2.0177, λ₂ = and λ₃ = . The scaling factor b is 2, and this implies the values for the critical exponents ν = log b / log λ₁ and ω = −log λ₂ / log b ≈ 2.31, which can be compared with the exact values 1 and 2, respectively. For comparison, the (also isotropic) Migdal recursion for the D = 2 Ising model and b = 2 can be written as t' = 2t²/(1 + t⁴). The fixed point is at t = and corresponds to β_c = and ν = .

63 Restricted Boltzmann Machines (say for MNIST data) v (visible) and h (hidden) are vectors of binary variables (Ising spins). v are the 784 black and white pixels of a MNIST picture. p(v, h) = (1/Z) exp(∑_{kj} h_k W_{kj} v_j + ∑_k b_k h_k + ∑_j a_j v_j). p(v) = ∑_{{h_k=0,1}} (1/Z) exp(∑_{kj} h_k W_{kj} v_j + ∑_k b_k h_k + ∑_j a_j v_j). Z = ∑_{{h_k=0,1, v_j=0,1}} exp(∑_{kj} h_k W_{kj} v_j + ∑_k b_k h_k + ∑_j a_j v_j). Given an empirical image v, you can maximize the model probability (determine the learning parameters W_{kj} etc.) by using gradients: ∂ln p(v)/∂W_{kj} = (1/Z(v)) ∂Z(v)/∂W_{kj} − (1/Z) ∂Z/∂W_{kj}, with Z(v) = ∑_{{h_k=0,1}} exp(∑_{kj} h_k W_{kj} v_j + ∑_k b_k h_k + ∑_j a_j v_j). For Z(v), v corresponds to the black and white pixels of a given image; the summation over the binary hidden variables factorizes and can be performed analytically: ∏_k (1 + exp(b_k + ∑_j W_{kj} v_j)). On the other hand, Z requires the sum over the visible binary variables (v_j = (1 + σ_j)/2 are essentially Ising spins).
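
A sketch of the factorized sum over hidden variables and of the conditional p(h | v) it implies (names and shapes are illustrative, not a full training loop):

```python
import numpy as np

def log_unnormalized_p(v, W, a, b):
    """log Z(v): the sum over binary hidden variables done analytically, giving
    exp(sum_j a_j v_j) * prod_k (1 + exp(b_k + sum_j W_kj v_j)).
    v: (784,) binary pixels, W: (nh, 784), a: (784,), b: (nh,)."""
    return a @ v + np.sum(np.log1p(np.exp(b + W @ v)))

def sample_hidden(v, W, b, rng):
    """Conditional p(h_k = 1 | v) = sigmoid(b_k + sum_j W_kj v_j)."""
    p = 1.0 / (1.0 + np.exp(-(b + W @ v)))
    return (rng.random(p.shape) < p).astype(np.int8)
```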

64 Finding the phase from worm pictures (Sam Foreman) The one-layer perceptron with fully connected W can be used to try to determine the phase from a worm picture. The target values are determined using the known infinite-volume transition. Obviously, this information becomes more accurate as the size increases.

65 Output functions for fully connected W (Sam Foreman) Figure: Softmax outputs vs. temperature. These results can be improved using ConvNet algorithms which have a clear RG flavor (Sam Foreman, work in progress).
