http://tmva.sourceforge.net/

Multivariate Analysis, TMVA, and Artificial Neural Networks

Matt Jachowski
jachowski@stanford.edu
Multivariate Analysis
- Techniques dedicated to the analysis of data with multiple variables
- An active field: many recently developed techniques rely on the computational power of modern computers
Multivariate Analysis and HEP
- Goal is to classify events as signal or background
- A single event is described by several variables (energy, transverse momentum, etc.)
- Use all the variables together to classify the event: multivariate analysis!
Multivariate Analysis and HEP
- Rectangular cuts optimization is common
Multivariate Analysis and HEP
- Likelihood estimator analysis is also common
- Use of more sophisticated methods (neural networks, boosted decision trees) is not so common, though growing. Why?
  - Difficult to implement
  - Physicists are skeptical of new methods
Toolkit for Multivariate Analysis (TMVA)
- ROOT-integrated software package with several MVA techniques
- Automatic training, testing, and evaluation of MVA methods
- Guidelines and documentation describe the methods for users: this isn't a black box!
Toolkit for Multivariate Analysis (TMVA)
- Easy to configure methods
- Easy to plug in HEP data
- Easy to compare different MVA methods
TMVA in Action
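To make the workflow concrete, here is a minimal sketch of a TMVA training macro. It uses the present-day TMVA interface (TMVA::Factory plus TMVA::DataLoader, which post-dates the API of this 2006-era talk), and the file, tree, and variable names are placeholders:

    #include "TFile.h"
    #include "TTree.h"
    #include "TMVA/Factory.h"
    #include "TMVA/DataLoader.h"
    #include "TMVA/Types.h"

    void classify() {
       // Input file with signal and background trees (placeholder names).
       TFile* input = TFile::Open("data.root");
       TTree* sig = (TTree*)input->Get("TreeS");
       TTree* bkg = (TTree*)input->Get("TreeB");

       TFile* output = TFile::Open("TMVA.root", "RECREATE");
       TMVA::Factory factory("TMVAClassification", output,
                             "!V:AnalysisType=Classification");

       // Declare the discriminating variables and the event samples.
       TMVA::DataLoader loader("dataset");
       loader.AddVariable("var1", 'F');
       loader.AddVariable("var2", 'F');
       loader.AddSignalTree(sig, 1.0);
       loader.AddBackgroundTree(bkg, 1.0);
       loader.PrepareTrainingAndTestTree("", "SplitMode=Random:NormMode=NumEvents");

       // Book two methods to compare: decorrelated cuts and the MLP.
       factory.BookMethod(&loader, TMVA::Types::kCuts, "CutsD",
                          "VarTransform=Decorrelate");
       factory.BookMethod(&loader, TMVA::Types::kMLP, "MLP",
                          "HiddenLayers=N+1");

       // Automatic training, testing, and evaluation of all booked methods.
       factory.TrainAllMethods();
       factory.TestAllMethods();
       factory.EvaluateAllMethods();

       output->Close();
    }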
TMVA and Me
- TMVA started in October 2005: still young, with a very active group of developers
- My involvement:
  - Decorrelation for the Cuts method (mini project)
  - New artificial neural network implementation (main project)
Decorrelated Cuts Method
- Some MVA methods suffer if the data has linear correlations (e.g. likelihood estimator, cuts)
- Linear correlations can easily be transformed away
- I implemented this for the Cuts method
Decorrelated Cuts Method
- Find the square root C' of the covariance matrix C, i.e. C = C'C':
  - diagonalize: D = S^T C S
  - then C' = S D^(1/2) S^T
- Decorrelate the data: x' = (C')^(-1) x
- Apply cuts to the decorrelated data
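A sketch of this transformation using ROOT's matrix classes; the 2x2 covariance matrix and the event values are made-up numbers for illustration:

    #include "TMatrixD.h"
    #include "TMatrixDSym.h"
    #include "TMatrixDSymEigen.h"
    #include "TVectorD.h"
    #include <cmath>

    void decorrelate() {
       // Example 2x2 covariance matrix with a linear correlation (made-up numbers).
       TMatrixDSym C(2);
       C(0,0) = 1.0;  C(0,1) = 0.6;
       C(1,0) = 0.6;  C(1,1) = 2.0;

       // Diagonalize: D = S^T C S.
       TMatrixDSymEigen eigen(C);
       TMatrixD S = eigen.GetEigenVectors();
       TVectorD d = eigen.GetEigenValues();

       // Square root of C: C' = S D^(1/2) S^T, so that C = C'C'.
       TMatrixD sqrtD(2, 2);
       sqrtD(0,0) = std::sqrt(d(0));
       sqrtD(1,1) = std::sqrt(d(1));
       TMatrixD Cprime = S * sqrtD * TMatrixD(TMatrixD::kTransposed, S);

       // Decorrelation transformation: x' = (C')^(-1) x.
       TMatrixD CprimeInv(Cprime);
       CprimeInv.Invert();

       TVectorD x(2);
       x(0) = 1.2;  x(1) = -0.4;          // one event's variables (made up)
       TVectorD xprime = CprimeInv * x;   // decorrelated variables
    }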
Artificial Neural Networks (ANNs)
- Robust, non-linear MVA technique
Training an ANN
- The challenge is training the network
- Like a human brain, the network learns by seeing the data over and over again
- Technical details: ask me if you're really interested
MLP
- MLP (Multi-Layer Perceptron): my ANN implementation for TMVA
- MLP is TMVA's main ANN
- MLP serves as the base for any future ANN developments in TMVA
MLP Information & Statistics
- Implemented in C++
- Object-oriented
- 4,000+ lines of code
- 16 classes
Acknowledgements
- Joerg Stelzer
- Andreas Hoecker
- CERN
- University of Michigan
- Ford
- NSF
Questions?

(I have lots of technical slides in reserve that I would be glad to talk about)
Synapses and Neurons

[Diagram: inputs y_0, ..., y_n feed neuron j through synapse weights w_0j, ..., w_nj, producing activation v_j and output y_j]

- v_j = f(y_0, ..., y_n, w_0j, ..., w_nj)
- y_j = φ(v_j)
Synapses and Neurons

- v_j = f(y_0, ..., y_n, w_0j, ..., w_nj) = Σ_{i=0}^{n} w_ij y_i
- y_j = φ(v_j) = 1 / (1 + e^(−v_j))
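In code, one neuron's response is just a weighted sum passed through the sigmoid. A minimal sketch (the function names are illustrative, not the TMVA classes):

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Sigmoid activation: phi(v) = 1 / (1 + e^(-v)).
    double phi(double v) { return 1.0 / (1.0 + std::exp(-v)); }

    // Neuron output y_j = phi(sum_i w_ij * y_i).
    double neuronOutput(const std::vector<double>& y,
                        const std::vector<double>& w) {
       double v = 0.0;
       for (std::size_t i = 0; i < y.size(); ++i) v += w[i] * y[i];
       return phi(v);
    }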
Universal Approximation Theorem

- Every continuous function that maps intervals of real numbers to some output interval of real numbers can be approximated arbitrarily closely by a multi-layer perceptron with just one hidden layer (with non-linear activation functions):

  f(x) = Σ_j w_j σ(b_j + v_j · x)

  where w_j are the weights between the hidden and output layer, σ is the non-linear activation function, b_j is the bias, and v_j are the weights between the input and hidden layer applied to the inputs x.
Training an MLP

- Training event: desired output d = f(x_0, x_1, x_2, x_3)
- Network output: y = g(x_0, x_1, x_2, x_3)
- Error: e = d − y
Training an MLP

- Adjust the weights to minimize the error (or an estimator that is some function of the error):

  e_j(n) = d_j(n) − y_j(n)

  ε(n) = (1/2) Σ_{j ∈ output neurons} e_j(n)²

  ε_avg = (1/N) Σ_{n=1}^{N} ε(n)
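As a direct translation of the estimator into code (here d and y hold the desired and actual outputs of the output neurons for one event; the function names are illustrative):

    #include <cstddef>
    #include <vector>

    // eps(n) = 1/2 * sum over output neurons of e_j(n)^2, with e_j = d_j - y_j.
    double estimator(const std::vector<double>& d, const std::vector<double>& y) {
       double eps = 0.0;
       for (std::size_t j = 0; j < d.size(); ++j) {
          double e = d[j] - y[j];
          eps += 0.5 * e * e;
       }
       return eps;
    }

    // eps_avg: average of eps(n) over the N training events.
    double averageEstimator(const std::vector<double>& perEvent) {
       double sum = 0.0;
       for (double eps : perEvent) sum += eps;
       return sum / perEvent.size();
    }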
Back-Propagation Algorithm

- Make corrections in the direction of steepest descent:

  w_ij(n+1) = w_ij(n) + Δw_ij(n)

  Δw_ij(n) = −η ∂ε(n)/∂w_ij(n)

- Corrections are made to the output layer first, then propagated backwards
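A minimal sketch of one back-propagation step for a network with one hidden layer and a single sigmoid output; the layer sizes, learning rate η, and variable names are illustrative choices, not the actual MLP implementation:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    double phi(double v) { return 1.0 / (1.0 + std::exp(-v)); }

    // One gradient-descent update for a 1-hidden-layer MLP with a single
    // sigmoid output. wHid[j][i] connects input i to hidden neuron j;
    // wOut[j] connects hidden neuron j to the output.
    void trainStep(const std::vector<double>& x, double d,
                   std::vector<std::vector<double>>& wHid,
                   std::vector<double>& wOut, double eta) {
       const std::size_t nHid = wOut.size();

       // Forward pass: v_j = sum_i w_ij x_i, y_j = phi(v_j).
       std::vector<double> yHid(nHid);
       for (std::size_t j = 0; j < nHid; ++j) {
          double v = 0.0;
          for (std::size_t i = 0; i < x.size(); ++i) v += wHid[j][i] * x[i];
          yHid[j] = phi(v);
       }
       double vOut = 0.0;
       for (std::size_t j = 0; j < nHid; ++j) vOut += wOut[j] * yHid[j];
       double y = phi(vOut);

       // Error e = d - y; estimator eps = e^2 / 2.
       double e = d - y;

       // Output layer first: -d(eps)/d(wOut_j) = e * phi'(vOut) * yHid_j,
       // using phi'(v) = phi(v) * (1 - phi(v)) = y * (1 - y).
       double deltaOut = e * y * (1.0 - y);
       std::vector<double> wOutOld = wOut;
       for (std::size_t j = 0; j < nHid; ++j)
          wOut[j] += eta * deltaOut * yHid[j];

       // Then propagate the correction backwards to the hidden-layer weights.
       for (std::size_t j = 0; j < nHid; ++j) {
          double deltaHid = deltaOut * wOutOld[j] * yHid[j] * (1.0 - yHid[j]);
          for (std::size_t i = 0; i < x.size(); ++i)
             wHid[j][i] += eta * deltaHid * x[i];
       }
    }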