Hypothesis Testing with Kernels
|
|
- Elvin Palmer
- 6 years ago
- Views:
Transcription
1 (Gatsby Unit, UCL) PRNI, Trento June 22, 2016
2 Motivation: detecting differences in AM signals Amplitude modulation: simple technique to transmit voice over radio. in the example: 2 songs. Fragments from song 1 P x, song 2 P y.
3 Motivation: detecting differences in AM signals Amplitude modulation: simple technique to transmit voice over radio. in the example: 2 songs. Fragments from song 1 P x, song 2 P y. Question: P x P y?
4 Motivation: discrete domain - 2-sample testing How do we compare distributions? Given: 2 sets of text fragments (fisheries, agriculture). x 1 : Now disturbing reports out of Newfoundland show that the fragile snow crab industry is in serious decline. First the west coast salmon, the east coast salmon and the cod, and now the snow crabs off Newfoundland. x 2 : To my pleasant surprise he responded that he had personally visited those wharves and that he had already announced money to fix them. What wharves did the minister visit in my riding and how much additional funding is he going to provide for Delaps Cove, Hampton, Port Lorne, y 1 : Honourable senators, I have a question for the Leader of the Government in the Senate with regard to the support funding to farmers that has been announced. Most farmers have not received any money yet. y 2 : On the grain transportation system we have had the Estey report and the Kroeger report. We could go on and on. Recently programs have been announced over and over by the government such as money for the disaster in agriculture on the prairies and across Canada....
5 Motivation: discrete domain - 2-sample testing How do we compare distributions? Given: 2 sets of text fragments (fisheries, agriculture). x 1 : Now disturbing reports out of Newfoundland show that the fragile snow crab industry is in serious decline. First the west coast salmon, the east coast salmon and the cod, and now the snow crabs off Newfoundland. x 2 : To my pleasant surprise he responded that he had personally visited those wharves and that he had already announced money to fix them. What wharves did the minister visit in my riding and how much additional funding is he going to provide for Delaps Cove, Hampton, Port Lorne, y 1 : Honourable senators, I have a question for the Leader of the Government in the Senate with regard to the support funding to farmers that has been announced. Most farmers have not received any money yet. y 2 : On the grain transportation system we have had the Estey report and the Kroeger report. We could go on and on. Recently programs have been announced over and over by the government such as money for the disaster in agriculture on the prairies and across Canada.... Do tx i u and ty j u come from the same distribution, i.e. P x P y?
6 Motivation: discrete domain - independence testing How do we detect dependency? (paired samples) x 1 : Honourable senators, I have a question for the Leader of the Government in the Senate with regard to the support funding to farmers that has been announced. Most farmers have not received any money yet. x 2 : No doubt there is great pressure on provincial and municipal governments in relation to the issue of child care, but the reality is that there have been no cuts to child care funding from the federal government to the provinces. In fact, we have increased federal investments for early childhood development.... y 1 : Honorables sénateurs, ma question s adresse au leader du gouvernement au Sénat et concerne l aide financiére qu on a annoncée pour les agriculteurs. La plupart des agriculteurs n ont encore rien reu de cet argent. y 2 : Il est évident que les ordres de gouvernements provinciaux et municipaux subissent de fortes pressions en ce qui concerne les services de garde, mais le gouvernement n a pas réduit le financement qu il verse aux provinces pour les services de garde. Au contraire, nous avons augmenté le financement fédéral pour le développement des jeunes enfants....
7 Motivation: discrete domain - independence testing How do we detect dependency? (paired samples) x 1 : Honourable senators, I have a question for the Leader of the Government in the Senate with regard to the support funding to farmers that has been announced. Most farmers have not received any money yet. x 2 : No doubt there is great pressure on provincial and municipal governments in relation to the issue of child care, but the reality is that there have been no cuts to child care funding from the federal government to the provinces. In fact, we have increased federal investments for early childhood development.... y 1 : Honorables sénateurs, ma question s adresse au leader du gouvernement au Sénat et concerne l aide financiére qu on a annoncée pour les agriculteurs. La plupart des agriculteurs n ont encore rien reu de cet argent. y 2 : Il est évident que les ordres de gouvernements provinciaux et municipaux subissent de fortes pressions en ce qui concerne les services de garde, mais le gouvernement n a pas réduit le financement qu il verse aux provinces pour les services de garde. Au contraire, nous avons augmenté le financement fédéral pour le développement des jeunes enfants.... Are the French paragraphs translations of the English ones, or have nothing to do with it, i.e. P XY P X P Y?
8 Outline 1 RKHS based metric on probability distributions. 2 2-sample testing: Nonparametric. Distance between distribution representations.
9 Outline 1 RKHS based metric on probability distributions. 2 2-sample testing: Nonparametric. Distance between distribution representations. 3 Independence testing: Dependency detection. Distance between joint (P XY ) and product of marginals (P X P Y ).
10 Kernels
11 Kernels on numerous data types Kernels exist on essentially any data type: images, texts, graphs, time series, dynamical systems,... ñ distribution representation, hypothesis testing: on all these domains.
12 Towards representations of distributions: EX Given: 2 Gaussians with different means. Solution: t-test.
13 Towards representations of distributions: EX 2 Setup: 2 Gaussians; same means, different variances. Idea: look at the 2nd-order features of RVs.
14 Towards representations of distributions: EX 2 Setup: 2 Gaussians; same means, different variances. Idea: look at the 2nd-order features of RVs. ϕ x x 2 ñ difference in EX 2.
15 Towards representations of distributions: further moments Setup: a Gaussian and a Laplacian distribution. Challenge: their means and variances are the same. Idea: look at higher-order features. Let us consider feature representations!
16 Kernel: similarity between features Given: x and x 1 P X objects (images, texts,...).
17 Kernel: similarity between features Given: x and x 1 P X objects (images, texts,...). Question: how similar they are?
18 Kernel: similarity between features Given: x and x 1 P X objects (images, texts,...). Question: how similar they are? Define features of the objects: ϕ x : features of x, ϕ x 1 : features of x 1. Kernel: inner product of these features kpx,x 1 q : ϕ x,ϕ x 1 H.
19 Kernel examples X R d : k p px,yq p x,y `γq p, k G px,yq e γ}x y}2 2, k e px,yq e γ}x y} 1 2, kc px,yq 1 ` γ }x y} 2. 2
20 Kernel examples X R d : k p px,yq p x,y `γq p, k G px,yq e γ}x y}2 2, k e px,yq e γ}x y} 1 2, kc px,yq 1 ` γ }x y} 2. 2 X = texts, strings: bag-of-word kernel, r-spectrum kernel: # of common ď r-substrings.
21 Kernel examples X R d : k p px,yq p x,y `γq p, k G px,yq e γ}x y}2 2, k e px,yq e γ}x y} 1 2, kc px,yq 1 ` γ }x y} 2. 2 X = texts, strings: bag-of-word kernel, r-spectrum kernel: # of common ď r-substrings. X = time-series: dynamic time-warping.
22 Two-sample testing
23 Ingredient: maximum mean discrepancy
24 Ingredient: maximum mean discrepancy
25 Ingredient: maximum mean discrepancy
26 Ingredient: maximum mean discrepancy {MMD 2 pp,qq Ě K P,P ` ĘK Q,Q 2 Ę K P,Q (without diagonals in Ě K P,P, Ę K Q,Q )
27 From kernel trick to mean trick Recall: ϕ x P H: feature of x P X. Kernel: kpx,x 1 q ϕ x,ϕ x 1 H.
28 From kernel trick to mean trick Recall: ϕ x P H: feature of x P X. Kernel: kpx,x 1 q ϕ x,ϕ x 1 H. Mean embedding: Feature of P: µ P : E x P rϕ x s P Hpkq. Inner product: µ P,µ Q H E x P,x1 Qkpx,x 1 q.
29 From kernel trick to mean trick Recall: ϕ x P H: feature of x P X. Kernel: kpx,x 1 q ϕ x,ϕ x 1 H. Mean embedding: Feature of P: µ P : E x P rϕ x s P Hpkq. Inner product: µ P,µ Q H E x P,x1 Qkpx,x 1 q. µ P : well-defined for all distributions (bounded k).
30 Maximum mean discrepancy Squared difference between feature means: MMD 2 pp,qq }µ P µ Q } 2 H µ P µ Q,µ P µ Q H µ P,µ P H ` µ Q,µ Q H 2 µ P,µ Q H E P,P kpx,x 1 q `E Q,Q kpy,y 1 q 2E P,Q kpx,yq.
31 Maximum mean discrepancy Squared difference between feature means: MMD 2 pp,qq }µ P µ Q } 2 H µ P µ Q,µ P µ Q H µ P,µ P H ` µ Q,µ Q H 2 µ P,µ Q H E P,P kpx,x 1 q `E Q,Q kpy,y 1 q 2E P,Q kpx,yq. Unbiased empirical estimate for tx i u n i 1 P, ty ju n j 1 Q: {MMD 2 pp,qq Ě K P,P ` ĘK Q,Q 2 Ę K P,Q.
32 Two-sample test using MMD Two hypotheses: H 0 (null hypothesis): P Q. H 1 (alternative hypothesis): P Q.
33 Two-sample test using MMD Two hypotheses: H 0 (null hypothesis): P Q. H 1 (alternative hypothesis): P Q. Observation: tx i u n i 1 P, ty ju n j 1 Q. Decision: if { MMD 2 pp,qq is far from 0 ñ reject H 0.
34 Two-sample test using MMD Two hypotheses: H 0 (null hypothesis): P Q. H 1 (alternative hypothesis): P Q. Observation: tx i u n i 1 P, ty ju n j 1 Q. Decision: if { MMD 2 pp,qq is far from 0 ñ reject H 0. Threshold =? one answer ÝÝÝÝÝÝÑ asymptotic distribution of { MMD 2.
35 Two-sample test using MMD: H 1 Under H 1 (P Q): asymptotic distribution of { MMD 2 is Gaussian.
36 Two-sample test using MMD: H 0 Under H 0 (P Q): asymptotic distribution is n MMD { 8ÿ 2 pp,pq λ i pzi 2 2q, where z i Np0,2q i.i.d., ż kpx,x 1 qv i pxqdppxq λ i v i px 1 q, kpx,x 1 q ϕ x µ P,ϕ x 1 µ P H. X i 1
37 Two-sample test using MMD: threshold To the decision: given that P Q, we want threshold T such that Ppn { MMD 2 ą T q ď 0.05 : α.
38 Two-sample test using MMD: threshold Task: P n MMD { 2 ą T ď α. Solutions: permutation test: below, kernel eigenspectrum estimate: ˆλ i. moment matching: Gamma approximation.
39 Demo: amplitude modulated signals Question: P x P y?
40 Results: AM signals (120kHz) n 10, 000. Average over 4124 trials. Gaussian noise: added.
41 Independence testing
42 Independence testing Given: 2 kernel-endowed domain: px, kq, py, lq, paired samples: tpx i,y i qu n i 1 P XY. Hypotheses: H 0 : P XY P X P Y, H 1 : P XY P X P Y.
43 Independence testing Given: 2 kernel-endowed domain: px, kq, py, lq, paired samples: tpx i,y i qu n i 1 P XY. Hypotheses: H 0 : P XY P X P Y, H 1 : P XY P X P Y. Statistics: HSIC MMD 2 pp XY,P X P Y q }µ PXY µ PX P Y } 2 Hpǩq, ǩppx,yq, px 1,y 1 qq kpx,x 1 qlpy,y 1 q.
44 HSIC in terms of expectations HSICpP XY,P X P Y q E xy E x 1 y 1rkpx,x1 qlpy,y 1 qs Let us consider an example! `E x E x 1rkpx,x 1 qse y E y 1rlpy,y 1 qs 2E x 1 y 1 Ex kpx,x 1 qe y lpy,y 1 q
45 HSIC: intuition. X: images, Y: descriptions.
46 HSIC intuition: Gram matrices
47 HSIC intuition: Gram matrices Empirical estimate: zhsic pp XY,P X P Y q 1 n 2 phkh HLHq``, H I n n 1 11 T.
48 Independence testing: decision Under H 0 : nhsic z Ñ 8-sum of weighted χ 2... Permutation test: 1 Compute HSIC for tx i,y πpiq u n i 1 with many π-s. 2 Estimate the p1 αq-quantile from the empirical CDF.
49 Demo: translation example 5-line extracts. kernel: bag-of-words, r-spectrum (r 5) sample size: n 10. repetitions: 300. Results: r-spectrum: average Type-II error = 0 (α 0.05), bag-of-words: 0.18.
50 Summary Kernels on images, texts, graphs, time series,... RKHS based metric on probability distributions. Applications: 2-sample testing: MMD. independence testing: HSIC. No density estimation.
51 Contents AM signals. Kernel examples. Universal kernel: definition, examples. MMD: IPM representation. HSIC: Where HS is coming from?
52 AM signals s i : i th song. observation (s ÞÑ y): yptq cospω c tqpasptq `o c q `nptq, where nptq: Gaussian noise. The AM signals were sampled at 120kHz.
53 Kernel examples k G pa,bq e }a b} 2 2 2θ 2, k e pa,bq e }a b} 2 2θ 2, 1 1 k C pa,bq, k t pa,bq 1 ` }a b}2 2 1 ` }a b} θ, θ 2 2 k p pa,bq p a,b `θq p, k r pa,bq 1 }a b}2 2 }a b} 2 2 `θ, 1 k i pa,bq, b}a b} 22 `θ2 k M, 3 pa,bq 2 k M, 5 pa,bq 2? 3 }a b}2 1 ` θ? 5 }a b}2 1 ` θ? 3}a b}2 e θ, ` 5 }a b}2 2 3θ 2? 5}a b}2 e θ.
54 Universal kernel: definition Assume X: compact, metric, k : X ˆX Ñ R kernel is continuous. Then Def-1: k is universal if Hpkq is dense in pcpxq, } } 8 q.
55 Universal kernel: definition Assume Then X: compact, metric, k : X ˆX Ñ R kernel is continuous. Def-1: k is universal if Hpkq is dense in pcpxq, } } 8 q. Def-2: k is characteristic, if µ : M `1 pxq Ñ Hpkq is injective.
56 Universal kernel: definition Assume X: compact, metric, k : X ˆX Ñ R kernel is continuous. Then Def-1: k is universal if Hpkq is dense in pcpxq, } } 8 q. Def-2: k is characteristic, if µ : M `1 pxq Ñ Hpkq is injective. universal, if µ is injective on the finite signed measures of X.
57 Universal kernel: examples On compact subsets of R d (β ą 0): kpa,bq e β}a b}2 2, kpa,bq e β}a b} 1, kpa,bq e β a,b, pβ ą 0q, or more generally 8ÿ kpa,bq f p a,b q, f pxq a n x n p@a n ą 0q. Universal ñ characteristic. n 0
58 MMD: IPM represenation Let F : tf P Hpkq : }f } H ď 1u be the unit ball in H. Then MMDpP,Q;Fq : supre x P f pxq E y Q f pyqs, f PF supr f,µ P H f,µ Q H supr f,µ P µ Q H f PF f PF }µ P µ Q } H.
59 HSIC: Where HS is coming from? Players: px,kq, py,lq, P XY, P X, P Y ; C XY : Hplq Ñ Hpkq. C XY E XY rpϕ x µ PX q b pϕ y µ PY qs, f,c XY g Hpkq E XY rf pxq E X f pxqsrgpyq E Y HSIC }C XY } 2 HS.
Kernel methods. MLSS Arequipa, Peru, Arthur Gretton. Gatsby Unit, CSML, UCL
Kernel methods MLSS Arequipa, Peru, 2016 Arthur Gretton Gatsby Unit, CSML, UCL Some motivating questions... Detecting differences in brain signals The problem: Dolocalfieldpotential(LFP)signalschange when
More informationKernel methods for hypothesis testing and inference
Kernel methods for hypothesis testing and inference MLSS Tübingen, 2015 Arthur Gretton Gatsby Unit, CSML, UCL Some motivating questions... Detecting differences in brain signals The problem: Do local field
More informationKernels. B.Sc. École Polytechnique September 4, 2018
CMAP B.Sc. Day @ École Polytechnique September 4, 2018 Inner product Ñ kernel: similarity between features Extension of kpx, yq x, y ř i x iy i : kpx, yq ϕpxq, ϕpyq H. Inner product Ñ kernel: similarity
More informationKernel-based Dependency Measures & Hypothesis Testing
Kernel-based Dependency Measures and Hypothesis Testing CMAP, École Polytechnique Carnegie Mellon University November 27, 2017 Outline Motivation. Diverse set of domains: kernel. Dependency measures: Kernel
More informationLecture 2: Mappings of Probabilities to RKHS and Applications
Lecture 2: Mappings of Probabilities to RKHS and Applications Lille, 214 Arthur Gretton Gatsby Unit, CSML, UCL Detecting differences in brain signals The problem: Dolocalfieldpotential(LFP)signalschange
More informationHilbert Space Representations of Probability Distributions
Hilbert Space Representations of Probability Distributions Arthur Gretton joint work with Karsten Borgwardt, Kenji Fukumizu, Malte Rasch, Bernhard Schölkopf, Alex Smola, Le Song, Choon Hui Teo Max Planck
More informationAn Adaptive Test of Independence with Analytic Kernel Embeddings
An Adaptive Test of Independence with Analytic Kernel Embeddings Wittawat Jitkrittum Gatsby Unit, University College London wittawat@gatsby.ucl.ac.uk Probabilistic Graphical Model Workshop 2017 Institute
More informationTensor Product Kernels: Characteristic Property, Universality
Tensor Product Kernels: Characteristic Property, Universality Zolta n Szabo CMAP, E cole Polytechnique Joint work with: Bharath K. Sriperumbudur Hangzhou International Conference on Frontiers of Data Science
More informationMath Review Sheet, Fall 2008
1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the
More informationLearning Interpretable Features to Compare Distributions
Learning Interpretable Features to Compare Distributions Arthur Gretton Gatsby Computational Neuroscience Unit, University College London Theory of Big Data, 2017 1/41 Goal of this talk Given: Two collections
More informationReview: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.
1 Review: Let X 1, X,..., X n denote n independent random variables sampled from some distribution might not be normal!) with mean µ) and standard deviation σ). Then X µ σ n In other words, X is approximately
More informationSOLUTIONS Math B4900 Homework 9 4/18/2018
SOLUTIONS Math B4900 Homework 9 4/18/2018 1. Show that if G is a finite group and F is a field, then any simple F G-modules is finitedimensional. [This is not a consequence of Maschke s theorem; it s just
More informationAn Adaptive Test of Independence with Analytic Kernel Embeddings
An Adaptive Test of Independence with Analytic Kernel Embeddings Wittawat Jitkrittum 1 Zoltán Szabó 2 Arthur Gretton 1 1 Gatsby Unit, University College London 2 CMAP, École Polytechnique ICML 2017, Sydney
More informationREAL ANALYSIS II TAKE HOME EXAM. T. Tao s Lecture Notes Set 5
REAL ANALYSIS II TAKE HOME EXAM CİHAN BAHRAN T. Tao s Lecture Notes Set 5 1. Suppose that te 1, e 2, e 3,... u is a countable orthonormal system in a complex Hilbert space H, and c 1, c 2,... is a sequence
More informationForefront of the Two Sample Problem
Forefront of the Two Sample Problem From classical to state-of-the-art methods Yuchi Matsuoka What is the Two Sample Problem? X ",, X % ~ P i. i. d Y ",, Y - ~ Q i. i. d Two Sample Problem P = Q? Example:
More informationCS8803: Statistical Techniques in Robotics Byron Boots. Hilbert Space Embeddings
CS8803: Statistical Techniques in Robotics Byron Boots Hilbert Space Embeddings 1 Motivation CS8803: STR Hilbert Space Embeddings 2 Overview Multinomial Distributions Marginal, Joint, Conditional Sum,
More informationEntropy and Ergodic Theory Notes 22: The Kolmogorov Sinai entropy of a measure-preserving system
Entropy and Ergodic Theory Notes 22: The Kolmogorov Sinai entropy of a measure-preserving system 1 Joinings and channels 1.1 Joinings Definition 1. If px, µ, T q and py, ν, Sq are MPSs, then a joining
More informationEntropy and Ergodic Theory Lecture 7: Rate-distortion theory
Entropy and Ergodic Theory Lecture 7: Rate-distortion theory 1 Coupling source coding to channel coding Let rσ, ps be a memoryless source and let ra, θ, Bs be a DMC. Here are the two coding problems we
More informationMIT Spring 2015
Assessing Goodness Of Fit MIT 8.443 Dr. Kempthorne Spring 205 Outline 2 Poisson Distribution Counts of events that occur at constant rate Counts in disjoint intervals/regions are independent If intervals/regions
More informationEntropy and Ergodic Theory Lecture 27: Sinai s factor theorem
Entropy and Ergodic Theory Lecture 27: Sinai s factor theorem What is special about Bernoulli shifts? Our main result in Lecture 26 is weak containment with retention of entropy. If ra Z, µs and rb Z,
More informationIntroductory Econometrics
Session 4 - Testing hypotheses Roland Sciences Po July 2011 Motivation After estimation, delivering information involves testing hypotheses Did this drug had any effect on the survival rate? Is this drug
More informationProbability review. September 11, Stoch. Systems Analysis Introduction 1
Probability review Alejandro Ribeiro Dept. of Electrical and Systems Engineering University of Pennsylvania aribeiro@seas.upenn.edu http://www.seas.upenn.edu/users/~aribeiro/ September 11, 2015 Stoch.
More informationNonparametric Indepedence Tests: Space Partitioning and Kernel Approaches
Nonparametric Indepedence Tests: Space Partitioning and Kernel Approaches Arthur Gretton 1 and László Györfi 2 1. Gatsby Computational Neuroscience Unit London, UK 2. Budapest University of Technology
More informationCS281A/Stat241A Lecture 17
CS281A/Stat241A Lecture 17 p. 1/4 CS281A/Stat241A Lecture 17 Factor Analysis and State Space Models Peter Bartlett CS281A/Stat241A Lecture 17 p. 2/4 Key ideas of this lecture Factor Analysis. Recall: Gaussian
More informationChapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued
Chapter 3 sections Chapter 3 - continued 3.1 Random Variables and Discrete Distributions 3.2 Continuous Distributions 3.3 The Cumulative Distribution Function 3.4 Bivariate Distributions 3.5 Marginal Distributions
More informationLecture 5 September 19
IFT 6269: Probabilistic Graphical Models Fall 2016 Lecture 5 September 19 Lecturer: Simon Lacoste-Julien Scribe: Sébastien Lachapelle Disclaimer: These notes have only been lightly proofread. 5.1 Statistical
More information18.440: Lecture 28 Lectures Review
18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is
More informationSecond-Order Inference for Gaussian Random Curves
Second-Order Inference for Gaussian Random Curves With Application to DNA Minicircles Victor Panaretos David Kraus John Maddocks Ecole Polytechnique Fédérale de Lausanne Panaretos, Kraus, Maddocks (EPFL)
More informationAbrupt change in mean avoiding variance estimation and block bootstrap. Barbora Peštová The Czech Academy of Sciences Institute of Computer Science
Abrupt change in mean avoiding variance estimation and block bootstrap Barbora Peštová The Czech Academy of Sciences Institute of Computer Science Structural changes 2 24 Motivation To know whether a change
More informationMonte Carlo Composition Inversion Acceptance/Rejection Sampling. Direct Simulation. Econ 690. Purdue University
Methods Econ 690 Purdue University Outline 1 Monte Carlo Integration 2 The Method of Composition 3 The Method of Inversion 4 Acceptance/Rejection Sampling Monte Carlo Integration Suppose you wish to calculate
More informationApproximation in the Zygmund Class
Approximation in the Zygmund Class Odí Soler i Gibert Joint work with Artur Nicolau Universitat Autònoma de Barcelona New Developments in Complex Analysis and Function Theory, 02 July 2018 Approximation
More informationA central limit theorem for an omnibus embedding of random dot product graphs
A central limit theorem for an omnibus embedding of random dot product graphs Keith Levin 1 with Avanti Athreya 2, Minh Tang 2, Vince Lyzinski 3 and Carey E. Priebe 2 1 University of Michigan, 2 Johns
More informationChapter 2. Discrete Distributions
Chapter. Discrete Distributions Objectives ˆ Basic Concepts & Epectations ˆ Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric Distributions ˆ Introduction to the Maimum Likelihood Estimation
More informationLecture 19 - Covariance, Conditioning
THEORY OF PROBABILITY VLADIMIR KOBZAR Review. Lecture 9 - Covariance, Conditioning Proposition. (Ross, 6.3.2) If X,..., X n are independent normal RVs with respective parameters µ i, σi 2 for i,..., n,
More informationReview. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda
Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with
More informationKernel Bayes Rule: Nonparametric Bayesian inference with kernels
Kernel Bayes Rule: Nonparametric Bayesian inference with kernels Kenji Fukumizu The Institute of Statistical Mathematics NIPS 2012 Workshop Confluence between Kernel Methods and Graphical Models December
More informationLecture The Sample Mean and the Sample Variance Under Assumption of Normality
Math 408 - Mathematical Statistics Lecture 13-14. The Sample Mean and the Sample Variance Under Assumption of Normality February 20, 2013 Konstantin Zuev (USC) Math 408, Lecture 13-14 February 20, 2013
More informationThis paper is not to be removed from the Examination Halls
~~ST104B ZA d0 This paper is not to be removed from the Examination Halls UNIVERSITY OF LONDON ST104B ZB BSc degrees and Diplomas for Graduates in Economics, Management, Finance and the Social Sciences,
More informationNOTES ON SOME EXERCISES OF LECTURE 5, MODULE 2
NOTES ON SOME EXERCISES OF LECTURE 5, MODULE 2 MARCO VITTURI Contents 1. Solution to exercise 5-2 1 2. Solution to exercise 5-3 2 3. Solution to exercise 5-7 4 4. Solution to exercise 5-8 6 5. Solution
More informationLecture 35: December The fundamental statistical distances
36-705: Intermediate Statistics Fall 207 Lecturer: Siva Balakrishnan Lecture 35: December 4 Today we will discuss distances and metrics between distributions that are useful in statistics. I will be lose
More informationSummary: the confidence interval for the mean (σ 2 known) with gaussian assumption
Summary: the confidence interval for the mean (σ known) with gaussian assumption on X Let X be a Gaussian r.v. with mean µ and variance σ. If X 1, X,..., X n is a random sample drawn from X then the confidence
More information22 : Hilbert Space Embeddings of Distributions
10-708: Probabilistic Graphical Models 10-708, Spring 2014 22 : Hilbert Space Embeddings of Distributions Lecturer: Eric P. Xing Scribes: Sujay Kumar Jauhar and Zhiguang Huo 1 Introduction and Motivation
More informationVariational inequality formulation of chance-constrained games
Variational inequality formulation of chance-constrained games Joint work with Vikas Singh from IIT Delhi Université Paris Sud XI Computational Management Science Conference Bergamo, Italy May, 2017 Outline
More informationPRINCIPLES OF ANALYSIS - LECTURE NOTES
PRINCIPLES OF ANALYSIS - LECTURE NOTES PETER A. PERRY 1. Constructions of Z, Q, R Beginning with the natural numbers N t1, 2, 3,...u we can use set theory to construct, successively, Z, Q, and R. We ll
More informationMath 494: Mathematical Statistics
Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/
More informationDr. Marques Sophie Algebra 1 Spring Semester 2017 Problem Set 9
Dr. Marques Sophie Algebra Spring Semester 207 Office 59 marques@cims.nyu.edu Problem Set 9 Exercise 0 : Prove that every group of order G 28 must contain a normal subgroup of order 7. (Why should it contain
More informationUnderstanding the relationship between Functional and Structural Connectivity of Brain Networks
Understanding the relationship between Functional and Structural Connectivity of Brain Networks Sashank J. Reddi Machine Learning Department, Carnegie Mellon University SJAKKAMR@CS.CMU.EDU Abstract Background.
More informationTHEORY OF PROBABILITY VLADIMIR KOBZAR
THEORY OF PROBABILITY VLADIMIR KOBZAR Lecture 20 - Conditional Expectation, Inequalities, Laws of Large Numbers, Central Limit Theorem This lecture is based on the materials from the Courant Institute
More informationDistribution Regression with Minimax-Optimal Guarantee
Distribution Regression with Minimax-Optimal Guarantee (Gatsby Unit, UCL) Joint work with Bharath K. Sriperumbudur (Department of Statistics, PSU), Barnabás Póczos (ML Department, CMU), Arthur Gretton
More informationStatistics. Statistics
The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,
More informationDistribution Regression: A Simple Technique with Minimax-optimal Guarantee
Distribution Regression: A Simple Technique with Minimax-optimal Guarantee (CMAP, École Polytechnique) Joint work with Bharath K. Sriperumbudur (Department of Statistics, PSU), Barnabás Póczos (ML Department,
More informationLearning features to compare distributions
Learning features to compare distributions Arthur Gretton Gatsby Computational Neuroscience Unit, University College London NIPS 2016 Workshop on Adversarial Learning, Barcelona Spain 1/28 Goal of this
More informationRegression and Statistical Inference
Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF
More informationA Linear-Time Kernel Goodness-of-Fit Test
A Linear-Time Kernel Goodness-of-Fit Test Wittawat Jitkrittum 1; Wenkai Xu 1 Zoltán Szabó 2 NIPS 2017 Best paper! Kenji Fukumizu 3 Arthur Gretton 1 wittawatj@gmail.com 1 Gatsby Unit, University College
More informationReview of Statistics
Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and
More informationProbabilities & Statistics Revision
Probabilities & Statistics Revision Christopher Ting Christopher Ting http://www.mysmu.edu/faculty/christophert/ : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 January 6, 2017 Christopher Ting QF
More informationLearning features to compare distributions
Learning features to compare distributions Arthur Gretton Gatsby Computational Neuroscience Unit, University College London NIPS 2016 Workshop on Adversarial Learning, Barcelona Spain 1/28 Goal of this
More informationDS-GA 1002: PREREQUISITES REVIEW SOLUTIONS VLADIMIR KOBZAR
DS-GA 2: PEEQUISIES EVIEW SOLUIONS VLADIMI KOBZA he following is a selection of questions (drawn from Mr. Bernstein s notes) for reviewing the prerequisites for DS-GA 2. Questions from Ch, 8, 9 and 2 of
More informationINDEPENDENCE MEASURES
INDEPENDENCE MEASURES Beatriz Bueno Larraz Máster en Investigación e Innovación en Tecnologías de la Información y las Comunicaciones. Escuela Politécnica Superior. Máster en Matemáticas y Aplicaciones.
More informationMathematische Methoden der Unsicherheitsquantifizierung
Mathematische Methoden der Unsicherheitsquantifizierung Oliver Ernst Professur Numerische Mathematik Sommersemester 2014 Contents 1 Introduction 1.1 What is Uncertainty Quantification? 1.2 A Case Study:
More informationDetection Theory. Chapter 3. Statistical Decision Theory I. Isael Diaz Oct 26th 2010
Detection Theory Chapter 3. Statistical Decision Theory I. Isael Diaz Oct 26th 2010 Outline Neyman-Pearson Theorem Detector Performance Irrelevant Data Minimum Probability of Error Bayes Risk Multiple
More informationMinimax-optimal distribution regression
Zoltán Szabó (Gatsby Unit, UCL) Joint work with Bharath K. Sriperumbudur (Department of Statistics, PSU), Barnabás Póczos (ML Department, CMU), Arthur Gretton (Gatsby Unit, UCL) ISNPS, Avignon June 12,
More informationECON 4160, Autumn term Lecture 1
ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least
More informationChapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued
Chapter 3 sections 3.1 Random Variables and Discrete Distributions 3.2 Continuous Distributions 3.3 The Cumulative Distribution Function 3.4 Bivariate Distributions 3.5 Marginal Distributions 3.6 Conditional
More informationEmpirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications
Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications Fumiya Akashi Research Associate Department of Applied Mathematics Waseda University
More informationKernel Methods. Lecture 4: Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton
Kernel Methods Lecture 4: Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton Alexander J. Smola Statistical Machine Learning Program Canberra,
More informationBig Hypothesis Testing with Kernel Embeddings
Big Hypothesis Testing with Kernel Embeddings Dino Sejdinovic Department of Statistics University of Oxford 9 January 2015 UCL Workshop on the Theory of Big Data D. Sejdinovic (Statistics, Oxford) Big
More informationExtensive Form Abstract Economies and Generalized Perfect Recall
Extensive Form Abstract Economies and Generalized Perfect Recall Nicholas Butler Princeton University July 30, 2015 Nicholas Butler (Princeton) EFAE and Generalized Perfect Recall July 30, 2015 1 / 1 Motivation
More informationExtending Zagier s Theorem on Continued Fractions and Class Numbers
Extending Zagier s Theorem on Continued Fractions and Class Numbers Colin Weir University of Calgary Joint work with R. K. Guy, M. Bauer, M. Wanless West Coast Number Theory December 2012 The Story of
More informationPhysics 403. Segev BenZvi. Propagation of Uncertainties. Department of Physics and Astronomy University of Rochester
Physics 403 Propagation of Uncertainties Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Maximum Likelihood and Minimum Least Squares Uncertainty Intervals
More information5 Introduction to the Theory of Order Statistics and Rank Statistics
5 Introduction to the Theory of Order Statistics and Rank Statistics This section will contain a summary of important definitions and theorems that will be useful for understanding the theory of order
More informationADVANCE TOPICS IN ANALYSIS - REAL. 8 September September 2011
ADVANCE TOPICS IN ANALYSIS - REAL NOTES COMPILED BY KATO LA Introductions 8 September 011 15 September 011 Nested Interval Theorem: If A 1 ra 1, b 1 s, A ra, b s,, A n ra n, b n s, and A 1 Ě A Ě Ě A n
More informationComputational Statistics and Optimisation. Joseph Salmon Télécom Paristech, Institut Mines-Télécom
Computational Statistics and Optimisation Joseph Salmon http://josephsalmon.eu Télécom Paristech, Institut Mines-Télécom Plan Duality gap and stopping criterion Back to gradient descent analysis Forward-backward
More informationLecture 13: p-values and union intersection tests
Lecture 13: p-values and union intersection tests p-values After a hypothesis test is done, one method of reporting the result is to report the size α of the test used to reject H 0 or accept H 0. If α
More informationPart 6: Multivariate Normal and Linear Models
Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of
More informationThe purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.
Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That
More informationSelf Adaptive Particle Filter
Self Adaptive Particle Filter Alvaro Soto Pontificia Universidad Catolica de Chile Department of Computer Science Vicuna Mackenna 4860 (143), Santiago 22, Chile asoto@ing.puc.cl Abstract The particle filter
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationAsymptotic Statistics-VI. Changliang Zou
Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous
More informationMTMS Mathematical Statistics
MTMS.01.099 Mathematical Statistics Lecture 12. Hypothesis testing. Power function. Approximation of Normal distribution and application to Binomial distribution Tõnu Kollo Fall 2016 Hypothesis Testing
More informationEstimating the accuracy of a hypothesis Setting. Assume a binary classification setting
Estimating the accuracy of a hypothesis Setting Assume a binary classification setting Assume input/output pairs (x, y) are sampled from an unknown probability distribution D = p(x, y) Train a binary classifier
More informationChapter 5. Chapter 5 sections
1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationEC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)
1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For
More informationsimple if it completely specifies the density of x
3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely
More informationMultivariate Regression
Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the
More informationebay/google short course: Problem set 2
18 Jan 013 ebay/google short course: Problem set 1. (the Echange Parado) You are playing the following game against an opponent, with a referee also taking part. The referee has two envelopes (numbered
More informationL E C T U R E 2 1 : M AT R I X T R A N S F O R M AT I O N S. Monday, November 14
L E C T U R E 2 1 : M AT R I X T R A N S F O R M AT I O N S Monday, November 14 In this lecture we want to consider functions between vector spaces. Recall that a function is simply a rule that takes an
More informationEUSIPCO
EUSIPCO 3 569736677 FULLY ISTRIBUTE SIGNAL ETECTION: APPLICATION TO COGNITIVE RAIO Franc Iutzeler Philippe Ciblat Telecom ParisTech, 46 rue Barrault 753 Paris, France email: firstnamelastname@telecom-paristechfr
More information1: PROBABILITY REVIEW
1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following
More informationA Statistical Look at Spectral Graph Analysis. Deep Mukhopadhyay
A Statistical Look at Spectral Graph Analysis Deep Mukhopadhyay Department of Statistics, Temple University Office: Speakman 335 deep@temple.edu http://sites.temple.edu/deepstat/ Graph Signal Processing
More informationQuantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing
Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October
More informationBayesian Social Learning with Random Decision Making in Sequential Systems
Bayesian Social Learning with Random Decision Making in Sequential Systems Yunlong Wang supervised by Petar M. Djurić Department of Electrical and Computer Engineering Stony Brook University Stony Brook,
More informationGround States are generically a periodic orbit
Gonzalo Contreras CIMAT Guanajuato, Mexico Ergodic Optimization and related topics USP, Sao Paulo December 11, 2013 Expanding map X compact metric space. T : X Ñ X an expanding map i.e. T P C 0, Dd P Z`,
More informationProblem 1 (20) Log-normal. f(x) Cauchy
ORF 245. Rigollet Date: 11/21/2008 Problem 1 (20) f(x) f(x) 0.0 0.1 0.2 0.3 0.4 0.0 0.2 0.4 0.6 0.8 4 2 0 2 4 Normal (with mean -1) 4 2 0 2 4 Negative-exponential x x f(x) f(x) 0.0 0.1 0.2 0.3 0.4 0.5
More informationResampling and the Bootstrap
Resampling and the Bootstrap Axel Benner Biostatistics, German Cancer Research Center INF 280, D-69120 Heidelberg benner@dkfz.de Resampling and the Bootstrap 2 Topics Estimation and Statistical Testing
More informationTheorem The simple finite dimensional sl 2 modules Lpdq are indexed by
The Lie algebra sl 2 pcq has basis x, y, and h, with relations rh, xs 2x, rh, ys 2y, and rx, ys h. Theorem The simple finite dimensional sl 2 modules Lpdq are indexed by d P Z ě0 with basis tv`, yv`, y
More informationErdinç Dündar, Celal Çakan
DEMONSTRATIO MATHEMATICA Vol. XLVII No 3 2014 Erdinç Dündar, Celal Çakan ROUGH I-CONVERGENCE Abstract. In this work, using the concept of I-convergence and using the concept of rough convergence, we introduced
More informationNon-specific filtering and control of false positives
Non-specific filtering and control of false positives Richard Bourgon 16 June 2009 bourgon@ebi.ac.uk EBI is an outstation of the European Molecular Biology Laboratory Outline Multiple testing I: overview
More information10-701/ Recitation : Kernels
10-701/15-781 Recitation : Kernels Manojit Nandi February 27, 2014 Outline Mathematical Theory Banach Space and Hilbert Spaces Kernels Commonly Used Kernels Kernel Theory One Weird Kernel Trick Representer
More informationMaximum Mean Discrepancy
Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia
More information