Density Estimation (III)
Frank Porter, SLUO Lectures on Statistics, August 2006

Yesterday:
- Cross-validation
- Adaptive kernels
- Variance (bootstrap)
- Bias (jackknife)
- Multivariate kernel estimation

Today:
- Series estimation
- Monte Carlo weighting
- Unfolding
- Non-parametric regression
- sPlots
Some References (I)

- Richard A. Tapia & James R. Thompson, Nonparametric Density Estimation, Johns Hopkins University Press, Baltimore (1978).
- David W. Scott, Multivariate Density Estimation, John Wiley & Sons, Inc., New York (1992).
- Adrian W. Bowman and Adelchi Azzalini, Applied Smoothing Techniques for Data Analysis, Clarendon Press, Oxford (1997).
- B. W. Silverman, Density Estimation for Statistics and Data Analysis, Monographs on Statistics and Applied Probability, Chapman and Hall (1986).
- K. S. Cranmer, Kernel Estimation in High Energy Physics, Comp. Phys. Comm. 136, 198 (2001) [hep-ex/ v1].
Some References (II)

- M. Pivk & F. R. Le Diberder, sPlot: a statistical tool to unfold data distributions, Nucl. Instr. Meth. A 555, 356 (2005).
- R. Cahn, How sPlots are Best (2005), rev splots best.pdf
- BaBar Statistics Working Group, Recommendations for Display of Projections in Multi-Dimensional Analyses, Statistics/Documents/MDgraphRec.pdf

Additional specific references will be noted in the course of the lectures.
Estimation Using Orthogonal Series (I)

We may take an alternative approach, and imagine expanding the PDF in a series of orthogonal functions:

    p(x) = \sum_{k=0}^{\infty} a_k \psi_k(x),

where

    a_k = \int \psi_k(x) p(x) \rho(x) \, dx = E[\psi_k(x) \rho(x)],

and

    \int \psi_k(x) \psi_l(x) \rho(x) \, dx = \delta_{kl}.

[\rho(x) is a weight function.]
Estimation Using Orthogonal Series (II)

Since the expansion coefficients are expectation values of functions, it is natural to substitute sample averages as estimators for them:

    \hat a_k = \frac{1}{n} \sum_{i=1}^{n} \psi_k(x_i) \rho(x_i),

and thus:

    \hat p(x) = \sum_{k=0}^{m} \hat a_k \psi_k(x),

where the number of terms m is chosen by some optimization criterion. Note the analogy between choosing m and choosing the smoothing parameter w in kernel estimators; and between choosing the kernel K and choosing the basis {\psi_k}.
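A minimal sketch of this estimator in Python with NumPy, assuming a cosine basis orthonormal on [0, 1] with weight \rho(x) = 1; the Beta(2, 2) toy sample, sample size, and cutoff m = 6 are illustrative choices, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sample from a Beta(2, 2) density on [0, 1].
x = rng.beta(2.0, 2.0, size=2000)

def psi(k, x):
    """Orthonormal cosine basis on [0, 1] with weight rho(x) = 1."""
    if k == 0:
        return np.ones_like(x)
    return np.sqrt(2.0) * np.cos(k * np.pi * x)

m = 6  # number of terms kept; plays the role of the smoothing parameter w
# Expansion coefficients estimated by sample averages: a_k = (1/n) sum_i psi_k(x_i).
a_hat = np.array([psi(k, x).mean() for k in range(m + 1)])

def p_hat(t):
    """Truncated-series density estimate at points t."""
    return sum(a_hat[k] * psi(k, t) for k in range(m + 1))

grid = np.linspace(0.0, 1.0, 501)
est = p_hat(grid)
area = est.mean()  # Riemann approximation of the integral over [0, 1]
```

The estimate integrates to one almost exactly (only the k = 0 coefficient contributes to the normalization with this basis), and peaks near x = 0.5 as the Beta(2, 2) density does; increasing m trades bias for variance, just as shrinking w does for a kernel estimator.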
Estimation Using Orthogonal Series (III)

We are actually rather familiar with estimation using orthogonal series: it is the method of moments!

[Figure: moments analysis from G. Brandenburg et al., Determination of the K*(1800) Spin Parity, SLAC-PUB-1670 (1975).]
Using Monte Carlo Models (I)

We often build up a data model using Monte Carlo computations of different processes, which are added together to get the complete model. This may involve weighting of events, if more integrated luminosity is simulated for some processes than for others. The overall simulated empirical density is then:

    \hat p(x) = \sum_{i=1}^{n} \rho_i \delta(x - x_i),

where the weights \rho_i are unity for unweighted events, or are scaled so that each process corresponds to an event sample of some desired integrated luminosity.
Using Monte Carlo Models (II)

The weights must be included in computing the sample covariance matrix (x_i has components x_i^{(k)}, k = 1, ..., d):

    V_{kl} = \frac{ \sum_{i=1}^{n} \rho_i (x_i^{(k)} - \mu_k)(x_i^{(l)} - \mu_l) }{ \sum_j \rho_j },

where \mu_k = \sum_i \rho_i x_i^{(k)} / \sum_j \rho_j is the sample mean in dimension k. Assuming we have transformed to a diagonal system using this covariance matrix, our product kernel density based on this simulation is then:

    \hat p_0(x) = \frac{1}{\sum_j \rho_j} \sum_{i=1}^{n} \rho_i \prod_{k=1}^{d} \frac{1}{w_k} K\!\left( \frac{x^{(k)} - x_i^{(k)}}{w_k} \right).

This may be iterated to obtain an adaptive kernel estimator, as discussed earlier.
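A one-dimensional sketch of the weighted estimators above, in Python with NumPy. The two-process mixture, the relative weight 0.5, and the Gaussian kernel with fixed width are all hypothetical choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical MC sample: two processes; twice the luminosity was simulated
# for the second, so its events carry weight 0.5 relative to the first.
x1 = rng.normal(0.0, 1.0, size=1000)   # process 1, weight 1.0
x2 = rng.normal(3.0, 1.0, size=2000)   # process 2, weight 0.5
x = np.concatenate([x1, x2])
rho = np.concatenate([np.ones(1000), np.full(2000, 0.5)])

# Weighted sample mean and variance (the 1-d case of the weighted covariance).
mu = np.sum(rho * x) / np.sum(rho)
var = np.sum(rho * (x - mu) ** 2) / np.sum(rho)

w = 0.3  # kernel width; in practice chosen by an optimization criterion

def p0(t):
    """Weighted Gaussian kernel density estimate at points t."""
    t = np.atleast_1d(t)[:, None]
    K = np.exp(-0.5 * ((t - x[None, :]) / w) ** 2) / np.sqrt(2.0 * np.pi)
    return (K * rho[None, :]).sum(axis=1) / (w * rho.sum())
```

With these weights the two processes contribute equally to the model, so the estimate is an even mixture of the two components, normalized to unit area.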
Unfolding

[Another big subject; our treatment will be cursory. Glen Cowan, Statistical Data Analysis, Oxford University Press (1998), devotes a chapter to unfolding.]

We may not be satisfied with merely estimating the density from which our sample {x_i} was drawn. The interesting physics may be obscured by convolution with uninteresting functions, for example efficiency dependences or radiative corrections. We assume the convolution function is known; often it is also estimated via auxiliary measurements. Because data fluctuate, unfolding usually also necessitates smoothing to control the fluctuations, referred to as regularization in this context.
Unfolding: Measurement of R

    R \equiv \frac{\sigma(e^+e^- \to \mathrm{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-) \ [\text{lowest order QED}]}

[Figure: R versus \sqrt{s} (GeV), showing the \rho, \omega, \phi, J/\psi, \psi(2S), \Upsilon, and Z resonances; R corrected for initial state radiation (from RPP 2006).]
Unfolding Formalism

Typical problem: we sample from a distribution smeared by some kernel function R(x, y):

    o(x) = \int R(x, y) p(y) \, dy.

We are given a sampling \hat o, and wish to estimate p. In principle, the solution is easy:

    p(y) = \int R^{-1}(y, x) \hat o(x) \, dx,

where

    \int R^{-1}(x, y) R(y, x') \, dy = \delta(x - x').

In practice, our observations are discrete, and we need to interpolate/smooth.
Unfolding: Iterative Approach (I)

If we don't know how (or are too lazy) to invert R, we may try an iterative solution. For example, consider the problem of unfolding radiative corrections. The observed cross section \sigma_E(s) is related to the interesting cross section \sigma according to:

    \sigma_E(s) = \sigma(s) + \delta\sigma(s), \quad \text{where} \quad \delta\sigma(s) = \int R(s, s') \sigma(s') \, ds'.

We form an iterative estimate for \sigma(s) according to:

    \hat\sigma_0(s) = \sigma_E(s),
    \hat\sigma_i(s) = \sigma_E(s) - \int R(s, s') \hat\sigma_{i-1}(s') \, ds', \quad i = 1, 2, \ldots

This is just the Neumann series solution to an integral equation!
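A discretized sketch of this iteration in Python with NumPy. The grid, the Gaussian "radiative" kernel, and its overall scale 0.3 (chosen so the kernel norm is below one and the Neumann series converges) are hypothetical:

```python
import numpy as np

# Toy version of the radiative-correction problem on a grid:
# sigma_E = sigma + R @ sigma, with a kernel of norm < 1.
n = 50
s = np.linspace(0.0, 1.0, n)
sigma_true = np.exp(-0.5 * ((s - 0.5) / 0.1) ** 2)

R = np.exp(-((s[:, None] - s[None, :]) / 0.05) ** 2)
R *= 0.3 / R.sum(axis=1, keepdims=True)   # each row sums to 0.3, so ||R|| < 1

sigma_E = sigma_true + R @ sigma_true     # "observed" cross section (no noise)

est = sigma_E.copy()                      # sigma_hat_0 = sigma_E
for _ in range(30):
    est = sigma_E - R @ est               # sigma_hat_i = sigma_E - R sigma_hat_{i-1}

max_err = np.max(np.abs(est - sigma_true))
```

Each pass shrinks the error by the norm of R, so thirty iterations recover the true spectrum to machine precision here; with real, noisy \sigma_E the iteration must be combined with smoothing, as the next slide notes.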
Unfolding: Iterative Approach (II)

Since \sigma_E(s) is measured at discrete s values and with some statistical precision, some smoothing/interpolation is still required.

[Figure: R in the region around charm threshold (from SLAC-PUB-4160).]
Unfolding: Regularization (I)

If we know R^{-1} we can incorporate the smoothing/interpolation more directly. We could use the techniques already described to form a smoothed estimate \hat o, and then use the transformation R^{-1} to obtain the estimator \hat p. For simplicity, consider here the problem of unfolding a histogram. Then we restate the earlier integral formula as:

    o_i = \sum_{j=1}^{k} R_{ij} p_j,

where R is a square matrix, assumed invertible.
Unfolding: Regularization (II)

A popular procedure is to form a likelihood (or \chi^2), but add an extra term, a regulator, to impose smoothing. The modified likelihood is maximized to obtain the estimate for {p_j}:

    \ln L \to \ln L' = \ln L + w S(\hat o).

The regulator function S(\hat o) as usual gets its smoothing effect by being somewhat non-local. A popular choice is a curvature term (with the sign chosen so that maximizing \ln L' minimizes the curvature, hence smooths):

    S(\hat o) = - \sum_{i=2}^{k-1} \left[ (\hat o_{i+1} - \hat o_i) - (\hat o_i - \hat o_{i-1}) \right]^2.
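A least-squares sketch of curvature regularization in Python with NumPy, swapping the likelihood for a \chi^2-style data term so the minimization has a closed form (Tikhonov-type). The smearing matrix, noise level, and regularization weight are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
k = 40
t = np.linspace(0.0, 1.0, k)
p_true = np.exp(-0.5 * ((t - 0.4) / 0.12) ** 2)

# Hypothetical Gaussian smearing matrix R, rows normalized.
R = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 0.05) ** 2)
R /= R.sum(axis=1, keepdims=True)

o = R @ p_true + rng.normal(0.0, 0.01, size=k)   # smeared, noisy observation

# Second-difference (curvature) operator: (D p)_i = p_{i+1} - 2 p_i + p_{i-1}.
D = np.zeros((k - 2, k))
for i in range(k - 2):
    D[i, i], D[i, i + 1], D[i, i + 2] = 1.0, -2.0, 1.0

def unfold(w):
    """Minimize ||R p - o||^2 + w ||D p||^2, the least-squares analogue of
    maximizing ln L + w S: closed form (R^T R + w D^T D)^-1 R^T o."""
    return np.linalg.solve(R.T @ R + w * D.T @ D, R.T @ o)

p_raw = unfold(0.0)     # pure inversion: noise amplified into wild oscillations
p_reg = unfold(1e-3)    # curvature-regularized: smooth, close to p_true

rough_raw = np.sum((D @ p_raw) ** 2)
rough_reg = np.sum((D @ p_reg) ** 2)
```

Comparing the two solutions demonstrates the point of the next slide: unfolding a smoother without regularization produces huge variances, while even a small curvature penalty tames them at the cost of a little bias.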
Unfolding: Regularization (III)

This is implemented, for example, in the RUN package; see V. Blobel: blobel/wwwrunf.html. Also, GURU: A. Höcker & V. Kartvelishvili, NIM A 372, 469 (1996). For more, see Glen Cowan's Durham conference paper, which has a nice demonstration of the importance of smoothing.

Note that the transformation R itself corresponds to a sort of smoother, as it acts non-locally. The act of unfolding a smoother can produce large variances.
Non-parametric Regression (I)

Regression is the problem of estimating the dependence of some response variable on a predictor variable. Given a dataset of predictor-response pairs {(x_i, y_i), i = 1, ..., n}, we write the relationship as:

    y_i = r(x_i) + \epsilon_i,

where the error \epsilon_i might also depend on x through the parameters of the sampling distribution it represents. We are used to solving this problem with parametric statistics: for example, for the dependence of accelerator background on beam current, we might try a power-law form. We may also bring our non-parametric methods to bear on this problem.
Non-parametric Regression (II)

The sampling of the response-predictor pairs may be a fixed design, in which the x_i values are deliberately selected, or a random design, in which (x_i, y_i) is drawn from some joint PDF. We'll work in the context of the random design here, and also will work in two dimensions. The regression function r may be expressed as:

    r(x) = E[y|x] = \int y \, p(y|x) \, dy = \frac{\int y \, p(x, y) \, dy}{\int p(x, y) \, dy}.
Non-parametric Regression: Local Mean Estimator

Let us construct an estimator for r by substituting a bivariate product kernel estimator for the unknown PDF p(x, y):

    \hat p(x, y) = \frac{1}{n w_x w_y} \sum_{i=1}^{n} K\!\left( \frac{x - x_i}{w_x} \right) K\!\left( \frac{y - y_i}{w_y} \right).

Assuming a symmetric kernel, after a little algebra we find:

    \hat r(x) = \sum_{i=1}^{n} y_i K\!\left( \frac{x - x_i}{w_x} \right) \Big/ \sum_{i=1}^{n} K\!\left( \frac{x - x_i}{w_x} \right).

This is known as the local mean estimator. Note the absence of dependence on w_y, and the linearity in the y_i.
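A sketch of the local mean (Nadaraya-Watson) estimator in Python with NumPy; the sinusoidal toy data, Gaussian kernel, and width w_x = 0.05 are hypothetical illustration choices:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical regression data: y = sin(2 pi x) + Gaussian noise.
n = 500
x = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.2, size=n)

wx = 0.05  # smoothing parameter in the predictor variable

def r_hat(t):
    """Local mean estimate: kernel-weighted average of the y_i.
    Note w_y never appears; it cancels in the ratio."""
    t = np.atleast_1d(t)[:, None]
    K = np.exp(-0.5 * ((t - x[None, :]) / wx) ** 2)
    return (K * y[None, :]).sum(axis=1) / K.sum(axis=1)

est = r_hat(np.array([0.25, 0.75]))  # near the peak and the trough
```

The estimate tracks sin(2\pi x) closely in the interior; its linearity in the y_i makes error propagation straightforward.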
Non-parametric Regression: Local Linear Estimator

We may achieve better properties by considering local polynomial estimators, corresponding to local polynomial fits to the data. This may be achieved with a least-squares minimization (the local mean is the result for a fit to a zero-order polynomial). Thus, the local linear regression estimate is given by:

    \hat r(x) = \frac{ \sum_{i=1}^{n} \left[ S_2(x) - S_1(x)(x_i - x) \right] K\!\left( \frac{x_i - x}{w_x} \right) y_i }{ S_2(x) S_0(x) - S_1(x)^2 },

where

    S_l(x) \equiv \sum_{i=1}^{n} (x_i - x)^l \, K\!\left( \frac{x_i - x}{w_x} \right).
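The same toy setup as for the local mean, now with a sketch of the local linear estimator built directly from the S_l(x) moment sums above (data and kernel width again hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)

# Same hypothetical data: y = sin(2 pi x) + Gaussian noise.
n = 500
x = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.2, size=n)
wx = 0.05

def r_hat_ll(t):
    """Local linear estimate via the moment sums S_l(x)."""
    t = np.atleast_1d(t)[:, None]
    d = x[None, :] - t                       # (x_i - x)
    K = np.exp(-0.5 * (d / wx) ** 2)
    S0 = K.sum(axis=1)
    S1 = (d * K).sum(axis=1)
    S2 = (d ** 2 * K).sum(axis=1)
    num = ((S2[:, None] - S1[:, None] * d) * K * y[None, :]).sum(axis=1)
    return num / (S2 * S0 - S1 ** 2)

est = r_hat_ll(np.array([0.0, 0.5]))  # at the boundary and mid-range
```

The "better properties" show up at the edges of the data: the local mean is biased at the boundary (it only averages points to one side), while the local linear fit adapts to the slope there.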
Examples: Local Linear Regression

[Figures: local linear regression examples (lowess), including the number of BaBar documents per year and tree height versus diameter.]
sPlots

The use of the density estimation technique known as sPlots has gained popularity in BaBar (and perhaps elsewhere?).

- Multivariate: uses the distribution on a subset of variables to predict the distribution in another subset.
- Based on a (parametric) model in the predictor variables, with different categories (e.g., "signal" and "background").
- Provides a means to visualize agreement with the model for each category.
- Provides an easy way to do background subtraction.
sPlots Formalism (I)

Assume a total of r + p parameters in the overall fit to the data:

- the expected number of events, N_j, j = 1, ..., r, in each category;
- distribution parameters, {\theta_i, i = 1, ..., p}.

We use a total of N events to estimate these parameters via a maximum likelihood fit to the sample {x}. We wish to find weights w_j(x_i), depending only on a subset {x'} \subset {x} (and implicitly on the unknown parameters), such that the asymptotic distribution of the weighted events in a variable y not in {x'} is the sampling distribution in y, for any chosen category j. We assume that y and {x'} are statistically independent within each category.
sPlots Formalism (II)

The weights which satisfy our criterion and produce minimum variance summed over the histogram are given by:

    w_j(e) = \frac{ \sum_{k=1}^{r} V_{jk} f_k(x_e) }{ \sum_{k=1}^{r} N_k f_k(x_e) },

where:

- w_j(e) is the weight for event e in category j;
- V is the covariance matrix from a "reduced" fit (i.e., excluding y):

      (V^{-1})_{jk} = \sum_{e=1}^{N} \frac{ f_j(x_e) f_k(x_e) }{ \left[ \sum_{i=1}^{r} N_i f_i(x_e) \right]^2 };

- N_k is the estimate of the number of events in category k, according to the reduced fit;
- f_j(x_e) is the PDF for category j evaluated at x_e.
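A sketch of these formulas in Python with NumPy for r = 2 categories. The toy PDFs, yields, and the simple EM iteration standing in for the reduced maximum likelihood fit are all hypothetical choices, not the sPlot paper's machinery:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical two-category sample: signal and background separated in x,
# with a control variable y independent of x within each category.
n_sig, n_bkg = 1000, 2000
x = np.concatenate([rng.normal(0.0, 0.5, n_sig), rng.uniform(-3.0, 3.0, n_bkg)])
y = np.concatenate([rng.normal(2.0, 1.0, n_sig), rng.normal(0.0, 1.0, n_bkg)])
n = len(x)

# Category PDFs in the discriminating variable x (known shapes here).
f_sig = np.exp(-0.5 * (x / 0.5) ** 2) / (0.5 * np.sqrt(2.0 * np.pi))
f_bkg = np.full_like(x, 1.0 / 6.0)

# "Reduced" ML fit for the yields, via EM iterations on the signal fraction.
fs = 0.5
for _ in range(200):
    fs = np.mean(fs * f_sig / (fs * f_sig + (1.0 - fs) * f_bkg))
N = np.array([fs * n, (1.0 - fs) * n])    # fitted yields (signal, background)

# Covariance of the yields: (V^-1)_jk = sum_e f_j f_k / D_e^2.
F = np.vstack([f_sig, f_bkg])             # shape (2, n_events)
D = N @ F                                 # D_e = sum_i N_i f_i(x_e)
V = np.linalg.inv((F / D) @ (F / D).T)

# sWeights: w_j(e) = sum_k V_jk f_k(x_e) / D_e.
W = (V @ F) / D                           # shape (2, n_events)

sum_w_sig = W[0].sum()                    # reproduces the fitted signal yield
y_sig_mean = np.sum(W[0] * y) / W[0].sum()  # background-subtracted mean of y
```

Filling a y histogram with weights W[0] gives the signal sPlot; the per-bin statistical error is then estimated from the sum of squared weights in each bin, as discussed on the errors slide.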
sPlots Formalism (III)

Finally, the sPlot is constructed by adding each event e with y = y_e to the y histogram (or scatter plot, etc., if y is multivariate), with weight w_j(e). The resulting histogram is then an estimator for the true distribution in y for category j.
sPlot Example (I)

Example from the BaBar B^0 \to \pi^+\pi^-, K^+\pi^- analysis, showing a comparison of the sPlot method (right figure) with a plot enhanced in signal fraction by a cut on the likelihood (left figure).

[Figure 1: Signal distribution of the \Delta E variable, events / 10 MeV versus \Delta E (GeV). The left figure is obtained by applying a cut on the likelihood ratio to enrich the data sample in signal events (about 60% of the signal is kept). The right figure shows the sPlot for signal (all events are kept). From: M. Pivk, sPlot: A Quick Introduction, arXiv:physics/ ]
sPlot Example (II)

[Figures: sPlots of m_ES (GeV/c^2) and of branching fractions versus m_{K\pi\pi} (GeV/c^2) for the modes B^+ \to K^+\pi^-\pi^+\gamma, B^0 \to K^+\pi^-\pi^0\gamma, B^0 \to K^0\pi^-\pi^+\gamma (K_S\pi^-\pi^+\gamma), and B^+ \to K^0\pi^+\pi^0\gamma (K_S\pi^+\pi^0\gamma). From hep-ex/ ]
sPlot Errors

Typically the sPlot variance in a bin is estimated simply as the sum of the squares of the weights. This sometimes leads to visually misleading impressions, due to fluctuations on small statistics.

- If the plot is being made for a distribution for which there is a prediction, then that distribution can be used to estimate the expected uncertainties, and these can be plotted.
- If there is no prediction, it is more difficult, but a (smoothed) estimate from the empirical distribution may be used to estimate the expected errors.
Summary (I)

We have looked at several topics concerning density estimation:

- EPDF, histograms, ideograms
- Kernel and series estimators
- Optimization considerations
- Error analysis
- Multivariate issues
- Unfolding
- Non-parametric regression
- sPlots
Summary (II)

We also left many things out; a few examples:

- Time series analyses
- Boundary complications
- Dimension reduction

Next: see Ilya Narsky's lectures on machine learning and the classification/discrimination problems. Monday, August 28 - Friday, September 1; Panofsky Auditorium, 10:00.
More informationMachine Learning 2017
Machine Learning 2017 Volker Roth Department of Mathematics & Computer Science University of Basel 21st March 2017 Volker Roth (University of Basel) Machine Learning 2017 21st March 2017 1 / 41 Section
More informationIntroduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones
Introduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 Last week... supervised and unsupervised methods need adaptive
More informationLecture 6. Regression
Lecture 6. Regression Prof. Alan Yuille Summer 2014 Outline 1. Introduction to Regression 2. Binary Regression 3. Linear Regression; Polynomial Regression 4. Non-linear Regression; Multilayer Perceptron
More informationCOMP 551 Applied Machine Learning Lecture 20: Gaussian processes
COMP 55 Applied Machine Learning Lecture 2: Gaussian processes Instructor: Ryan Lowe (ryan.lowe@cs.mcgill.ca) Slides mostly by: (herke.vanhoof@mcgill.ca) Class web page: www.cs.mcgill.ca/~hvanho2/comp55
More informationAlternatives to Basis Expansions. Kernels in Density Estimation. Kernels and Bandwidth. Idea Behind Kernel Methods
Alternatives to Basis Expansions Basis expansions require either choice of a discrete set of basis or choice of smoothing penalty and smoothing parameter Both of which impose prior beliefs on data. Alternatives
More informationStatistical Data Analysis Stat 3: p-values, parameter estimation
Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,
More informationNonparametric Methods Lecture 5
Nonparametric Methods Lecture 5 Jason Corso SUNY at Buffalo 17 Feb. 29 J. Corso (SUNY at Buffalo) Nonparametric Methods Lecture 5 17 Feb. 29 1 / 49 Nonparametric Methods Lecture 5 Overview Previously,
More informationStudies of charmonium production in e + e - annihilation and B decays at BaBar
Studies of charmonium production in e + e - annihilation and B decays at BaBar I. Garzia, INFN Sezione di Ferrara On behalf of the BaBar Collaboration XVI International Conference on Hadron Spectroscopy
More informationLecture 2: Linear regression
Lecture 2: Linear regression Roger Grosse 1 Introduction Let s ump right in and look at our first machine learning algorithm, linear regression. In regression, we are interested in predicting a scalar-valued
More informationCOMS 4721: Machine Learning for Data Science Lecture 1, 1/17/2017
COMS 4721: Machine Learning for Data Science Lecture 1, 1/17/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University OVERVIEW This class will cover model-based
More informationPhysics 509: Error Propagation, and the Meaning of Error Bars. Scott Oser Lecture #10
Physics 509: Error Propagation, and the Meaning of Error Bars Scott Oser Lecture #10 1 What is an error bar? Someone hands you a plot like this. What do the error bars indicate? Answer: you can never be
More informationFrank Porter February 26, 2013
116 Frank Porter February 26, 2013 Chapter 6 Hypothesis Tests Often, we want to address questions such as whether the possible observation of a new effect is really significant, or merely a chance fluctuation.
More informationMultivariate Distribution Models
Multivariate Distribution Models Model Description While the probability distribution for an individual random variable is called marginal, the probability distribution for multiple random variables is
More informationRandom Eigenvalue Problems Revisited
Random Eigenvalue Problems Revisited S Adhikari Department of Aerospace Engineering, University of Bristol, Bristol, U.K. Email: S.Adhikari@bristol.ac.uk URL: http://www.aer.bris.ac.uk/contact/academic/adhikari/home.html
More informationWinter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo
Winter 2019 Math 106 Topics in Applied Mathematics Data-driven Uncertainty Quantification Yoonsang Lee (yoonsang.lee@dartmouth.edu) Lecture 9: Markov Chain Monte Carlo 9.1 Markov Chain A Markov Chain Monte
More informationUnfolding Methods in Particle Physics
Unfolding Methods in Particle Physics Volker Blobel University of Hamburg, Hamburg, Germany 1 Inverse problems Abstract Measured distributions in particle physics are distorted by the finite resolution
More informationBaBar s Contributions. Veronique Ziegler Jefferson Lab
BaBar s Contributions Veronique Ziegler Jefferson Lab BaBar last day of running charm light quark heavy quark 2 3 _ Before 2003, 4 cs mesons known 2 S-wave mesons, D s (J P = 0 ) and D s (1 ) 2 P-wave
More informationNew Measurements of ψ(3770) Resonance Parameters & DD-bar Cross Section at BES-II & CLEO-c
New Measurements of ψ(3770) Resonance Parameters & DD-bar Cross Section at BES-II & CLEO-c The review talk is based on the talks given at ICHEP 04 by Anders Ryd, Ian Shipsey, Gang Rong 1 Outline Introduction
More information1 Data Arrays and Decompositions
1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is
More informationconditional cdf, conditional pdf, total probability theorem?
6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random
More informationMeasurements of the Proton and Kaon Form Factors via ISR at BABAR
Measurements of the Proton and Kaon Form Factors via ISR at BABAR Fabio Anulli INFN Sezione di Roma on behalf of the BABAR Collaboration HADRON 015 XVI International Conference on Hadron Spectroscopy 13
More informationLecture 3: Central Limit Theorem
Lecture 3: Central Limit Theorem Scribe: Jacy Bird (Division of Engineering and Applied Sciences, Harvard) February 8, 003 The goal of today s lecture is to investigate the asymptotic behavior of P N (
More informationD D Shape. Speaker: Yi FANG for BESIII Collaboration. The 7th International Workshop on Charm Physics May 18-22, 2015 Detroit, Michigan
D D Shape Speaker: Yi FANG for BESIII Collaboration The 7th International Workshop on Charm Physics May 18-22, 2015 Detroit, Michigan Y. Fang (IHEP) D D Shape Charm 2015 1 / 16 Outline 1 Introduction 2
More informationMA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems
MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Principles of Statistical Inference Recap of statistical models Statistical inference (frequentist) Parametric vs. semiparametric
More informationGibbs Sampling in Linear Models #2
Gibbs Sampling in Linear Models #2 Econ 690 Purdue University Outline 1 Linear Regression Model with a Changepoint Example with Temperature Data 2 The Seemingly Unrelated Regressions Model 3 Gibbs sampling
More informationMachine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.
Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted
More informationSome Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model
Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model 1. Introduction Varying-coefficient partially linear model (Zhang, Lee, and Song, 2002; Xia, Zhang, and Tong, 2004;
More informationManifold Learning for Signal and Visual Processing Lecture 9: Probabilistic PCA (PPCA), Factor Analysis, Mixtures of PPCA
Manifold Learning for Signal and Visual Processing Lecture 9: Probabilistic PCA (PPCA), Factor Analysis, Mixtures of PPCA Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inria.fr http://perception.inrialpes.fr/
More informationHypothesis testing:power, test statistic CMS:
Hypothesis testing:power, test statistic The more sensitive the test, the better it can discriminate between the null and the alternative hypothesis, quantitatively, maximal power In order to achieve this
More informationSTA 4273H: Sta-s-cal Machine Learning
STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our
More information