Parameter learning
Will Monroe, August 9, 2017, with materials by Mehran Sahami and Chris Piech. image: Arito


1 Will Monroe, August 9, 2017, with materials by Mehran Sahami and Chris Piech. image: Arito. Parameter learning

2 Announcement: Problem Set #6
Goes out tonight. Due the last day of class, Wednesday, August 16 (before class). Some serious coding: congressional voting, heart disease diagnosis. No late days!

3 Review: Parameter estimation
Sometimes we don't know things like the expectation and variance of a distribution; we have to estimate them from incomplete information.
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$   $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$   $\hat{\theta} = \arg\max_\theta LL(\theta)$

4 Review: Central limit theorem
Sums and averages of IID random variables are normally distributed.
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$
$Y = n\bar{X} = \sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$

5 Easily-confused principles
Constant multiple of a normal: $X \sim N(\mu, \sigma^2)$, so $nX \sim N(n\mu, n^2\sigma^2)$ (exactly).
Sum of identical normals: $X_i \sim N(\mu, \sigma^2)$ (independent & identical), so $\sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$ (exactly).
CLT: $X_i \sim$ ??? (independent & identical), so $\sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$ (approximately, for large $n$).

6 Central limit theorem demo
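Since the demo itself isn't reproduced here, the following is a minimal simulation sketch in the same spirit (the Uniform(0, 1) summands, the seed, and the sample sizes are illustrative assumptions): sums of IID uniforms, which have $\mu = \frac{1}{2}$ and $\sigma^2 = \frac{1}{12}$, should behave like $N(n\mu, n\sigma^2)$.

```python
import numpy as np

rng = np.random.default_rng(109)
n, trials = 50, 100_000

# Each row is one experiment: the sum of n IID Uniform(0, 1) draws.
sums = rng.uniform(size=(trials, n)).sum(axis=1)

# CLT prediction: sum of X_i ~ N(n*mu, n*sigma^2), approximately.
print(sums.mean(), n * 0.5)       # both close to 25
print(sums.var(), n * (1 / 12))   # both close to 4.17
```

A histogram of `sums` shows the familiar bell shape; the printed moments already match the CLT prediction.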

7 Review: Approximating a Poisson with a normal
$X \sim \text{Poi}(\lambda) \;\approx\; Y \sim N(\lambda, \lambda)$ (for large $\lambda$)

8 Parameters
$X \sim \text{Ber}(p)$: $\theta = p$
$X \sim \text{Poi}(\lambda)$: $\theta = \lambda$
$X \sim \text{Uni}(a, b)$: $\theta = [a, b]$
$X \sim N(\mu, \sigma^2)$: $\theta = [\mu, \sigma^2]$

9 Maximum likelihood estimation
Choose parameters that maximize the likelihood (joint probability given the parameters) of the example data.
$\hat{\theta} = \arg\max_\theta LL(\theta)$
[figure: likelihood curve over $\theta$, with data points $x_1, \ldots, x_4$ and maximizer $\theta^*$]

10 How to: MLE
1. Compute the likelihood: $L(\theta) = P(X_1, \ldots, X_n \mid \theta)$.
2. Take its log: $LL(\theta) = \log L(\theta)$.
3. Maximize this as a function of the parameters: $\frac{d}{d\theta} LL(\theta) = 0$.
[figure: likelihood curve over $\theta$, with data points $x_1, \ldots, x_4$ and maximizer $\theta^*$]
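As a concrete illustration of the recipe, here is a minimal sketch on hypothetical Bernoulli data, replacing the calculus of step 3 with a brute-force grid search:

```python
import numpy as np

X = np.array([1, 0, 1, 1, 0, 1, 1, 1])  # hypothetical IID Bernoulli data

def LL(p):
    # Steps 1-2: log of L(p) = prod_i p^X_i * (1 - p)^(1 - X_i)
    return np.sum(X * np.log(p) + (1 - X) * np.log(1 - p))

# Step 3: maximize, here by brute-force grid search instead of calculus.
grid = np.linspace(0.001, 0.999, 999)
p_hat = grid[np.argmax([LL(p) for p in grid])]

print(p_hat, X.mean())  # both 0.75
```

The grid maximizer lands exactly on the sample mean, which is the closed form derived on the next slides.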

11 Maximum likelihood for Bernoulli
The maximum likelihood $p$ for Bernoulli random variables is the sample mean:
$\hat{p} = \frac{1}{n} \sum_{i=1}^{n} X_i$

12 Derivation: MLE for Bernoulli
1. Compute the likelihood. $\theta = p$
$L(\theta) = P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i) = \prod_{i=1}^{n} \begin{cases} p & \text{if } X_i = 1 \\ 1 - p & \text{if } X_i = 0 \end{cases}$
(Don't forget: IID means independent!)

13 Derivation: MLE for Bernoulli
1. Compute the likelihood. $\theta = p$
$L(\theta) = P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i) = \prod_{i=1}^{n} \begin{cases} p & \text{if } X_i = 1 \\ 1 - p & \text{if } X_i = 0 \end{cases} = \prod_{i=1}^{n} p^{X_i} (1 - p)^{1 - X_i}$
(Don't forget: IID means independent!)

14 Derivation: MLE for Bernoulli
2. Take its log. $\theta = p$
$L(\theta) = \prod_{i=1}^{n} p^{X_i} (1 - p)^{1 - X_i}$
$LL(\theta) = \log \prod_{i=1}^{n} p^{X_i} (1 - p)^{1 - X_i} = \sum_{i=1}^{n} \log \left[ p^{X_i} (1 - p)^{1 - X_i} \right] = \sum_{i=1}^{n} \left[ X_i \log p + (1 - X_i) \log (1 - p) \right]$

15 Derivation: MLE for Bernoulli
3. Maximize this as a function of the parameters. $\theta = p$
$LL(\theta) = \sum_{i=1}^{n} \left[ X_i \log p + (1 - X_i) \log (1 - p) \right]$   $\hat{p} = \arg\max_p LL(\theta)$
$\frac{d}{dp} LL(\theta) = \sum_{i=1}^{n} \left[ \frac{X_i}{p} - \frac{1 - X_i}{1 - p} \right] = \frac{1}{p} \sum_{i=1}^{n} X_i - \frac{1}{1 - p} \sum_{i=1}^{n} (1 - X_i) = 0$
$(1 - p) \sum_{i=1}^{n} X_i = p \sum_{i=1}^{n} (1 - X_i)$, so $\sum_{i=1}^{n} X_i = pn$, giving $\hat{p} = \frac{1}{n} \sum_{i=1}^{n} X_i$

16 Maximum likelihood for normal
The maximum likelihood $\mu$ for normal random variables is the sample mean, and the maximum likelihood $\sigma^2$ is the uncorrected mean square deviation:
$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i$   $\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat{\mu})^2$
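A quick sketch of these closed forms on simulated data (the true parameters are assumed for the simulation). Note that numpy's np.var divides by n by default (ddof=0), which is precisely this uncorrected maximum likelihood estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(3.0, 2.0, size=10_000)  # assumed true mu = 3, sigma = 2

mu_hat = X.mean()                # sample mean
sigma2_hat = np.var(X, ddof=0)   # (1/n) * sum (X_i - mu_hat)^2
print(mu_hat, sigma2_hat)        # close to 3 and 4
```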

17 Derivation: MLE for normal
1. Compute the likelihood. 2. Take its log. $\theta = [\mu, \sigma^2]$
$L(\theta) = \prod_{i=1}^{n} \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(X_i - \mu)^2}{2\sigma^2}}$
$LL(\theta) = \sum_{i=1}^{n} \log \left[ \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(X_i - \mu)^2}{2\sigma^2}} \right] = \sum_{i=1}^{n} \left[ -\log \sigma - \frac{1}{2} \log 2\pi - \frac{(X_i - \mu)^2}{2\sigma^2} \right]$

18 Derivation: MLE for normal
3. Maximize this as a function of the parameters. $\theta = [\mu, \sigma^2]$
$LL(\theta) = \sum_{i=1}^{n} \left[ -\log \sigma - \frac{1}{2} \log 2\pi - \frac{(X_i - \mu)^2}{2\sigma^2} \right]$   $[\hat{\mu}, \hat{\sigma}^2] = \hat{\theta} = \arg\max_\theta LL(\theta)$
$\frac{\partial}{\partial \mu} LL(\theta) = \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma^2} = \frac{1}{\sigma^2} \left( \sum_{i=1}^{n} X_i - n\mu \right) = 0$
$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i = \bar{X}$

19 Derivation: MLE for normal
3. Maximize this as a function of the parameters. $\theta = [\mu, \sigma^2]$
$\frac{\partial}{\partial \sigma} LL(\theta) = \sum_{i=1}^{n} \left[ -\frac{1}{\sigma} + \frac{(X_i - \mu)^2}{\sigma^3} \right] = 0$
$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat{\mu})^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$

20 Break time!

21 Maximum likelihood for uniform
The maximum likelihood $a$ and $b$ for uniform random variables are the minimum and maximum of the data:
$\hat{a} = \min_i X_i$   $\hat{b} = \max_i X_i$
[figure: uniform density between $a$ and $b$, with the data lying between $\hat{a}$ and $\hat{b}$]
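A minimal sketch, assuming a true interval of [2, 5]: the MLE is just the sample extremes, and both estimates always fall inside the true interval, so the fitted interval is systematically a little too narrow.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(2.0, 5.0, size=1_000)  # assumed true a = 2, b = 5

a_hat, b_hat = X.min(), X.max()
print(a_hat, b_hat)  # slightly inside the true interval [2, 5]
```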

22 Derivation: MLE for uniform
1. Compute the likelihood. $\theta = [a, b]$
$L(\theta) = \prod_{i=1}^{n} \begin{cases} \frac{1}{b - a} & \text{if } a \le X_i \le b \\ 0 & \text{otherwise} \end{cases}$
2. Take its log.
$LL(\theta) = \sum_{i=1}^{n} \begin{cases} -\log(b - a) & \text{if } a \le X_i \le b \\ -\infty & \text{otherwise} \end{cases}$
3. Maximize this as a function of the parameters.
$[\hat{a}, \hat{b}] = \hat{\theta} = \arg\max_\theta LL(\theta)$

23 Derivation: MLE for uniform
$\theta = [a, b]$   $L(\theta) = \prod_{i=1}^{n} \begin{cases} \frac{1}{b - a} & \text{if } a \le X_i \le b \\ 0 & \text{otherwise} \end{cases}$   $[\hat{a}, \hat{b}] = \hat{\theta} = \arg\max_\theta L(\theta)$
Shrinking the interval increases $\frac{1}{b - a}$, but the likelihood drops to zero as soon as any data point falls outside $[a, b]$. So:
$\hat{a} = \min_i X_i$   $\hat{b} = \max_i X_i$
[figure: likelihood $L(a, b)$ plotted as a function of $a$ and as a function of $b$, maximized at the sample minimum and maximum]

24 Maximum a posteriori estimation
Choose the most likely parameters given the example data. You'll need a prior probability over the parameters.
$\hat{\theta} = \arg\max_\theta P(\theta \mid X_1, \ldots, X_n) = \arg\max_\theta \left[ LL(\theta) + \log P(\theta) \right]$
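A minimal sketch of MAP for a Bernoulli parameter under a Beta(a, b) prior, with hypothetical numbers: the posterior is Beta(a + #heads, b + #tails), and its mode gives the closed-form estimate below.

```python
import numpy as np

X = np.array([1, 1, 1, 1, 1, 1])  # six flips, all heads (hypothetical)
a, b = 2, 2                       # Beta(2, 2) prior: gentle pull toward 1/2
n, heads = len(X), X.sum()

p_mle = heads / n                          # 1.0: says tails is impossible
p_map = (a + heads - 1) / (a + b + n - 2)  # (2+6-1)/(2+2+6-2) = 7/8
print(p_mle, p_map)
```

The prior keeps the estimate away from the extreme 1.0 that maximum likelihood commits to after an all-heads sample.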

25 Review: Multinomial random variable
A multinomial random variable records the number of times each outcome occurs when an experiment with multiple outcomes (e.g., a die roll) is run multiple times.
$(X_1, \ldots, X_m) \sim \text{MN}(n, (p_1, p_2, \ldots, p_m))$ (vector!)
$P(X_1 = c_1, X_2 = c_2, \ldots, X_m = c_m) = \binom{n}{c_1, c_2, \ldots, c_m} p_1^{c_1} p_2^{c_2} \cdots p_m^{c_m}$

26 Roll all of the dice!
A 6-sided die is rolled 7 times. What is the probability we get: 1 one, 1 two, 0 threes, 2 fours, 0 fives, 3 sixes?
$(X_1, \ldots, X_6) \sim \text{MN}\!\left(7, \left(\tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}\right)\right)$
$P(X_1 = 1, X_2 = 1, X_3 = 0, X_4 = 2, X_5 = 0, X_6 = 3) = \binom{7}{1, 1, 0, 2, 0, 3} \left(\tfrac{1}{6}\right)^1 \left(\tfrac{1}{6}\right)^1 \left(\tfrac{1}{6}\right)^0 \left(\tfrac{1}{6}\right)^2 \left(\tfrac{1}{6}\right)^0 \left(\tfrac{1}{6}\right)^3 = \frac{420}{6^7}$
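The same computation, sketched with the multinomial coefficient expanded into factorials (standard library only):

```python
from math import factorial

counts = [1, 1, 0, 2, 0, 3]  # ones, twos, ..., sixes (they sum to 7)
n = sum(counts)

coef = factorial(n)
for c in counts:
    coef //= factorial(c)    # multinomial coefficient 7!/(1!1!0!2!0!3!) = 420

prob = coef * (1 / 6) ** n   # each particular sequence has probability (1/6)^7
print(coef, prob)            # 420, 420 / 6^7, about 0.0015
```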

27 Maximum likelihood with multinomial
A 6-sided die is rolled 7 times. We get: 1 one, 1 two, 0 threes, 2 fours, 0 fives, 3 sixes. What is the MLE for $p_1, \ldots, p_6$?
$(X_1, \ldots, X_6) \sim \text{MN}\!\left(7, \left(\tfrac{1}{7}, \tfrac{1}{7}, 0, \tfrac{2}{7}, 0, \tfrac{3}{7}\right)\right)$
You'll never roll a 3! Not in a million years!

28 Are we doing this backwards?
MLE: $\hat{\theta} = \arg\max_\theta P(X_1, \ldots, X_n \mid \theta)$
MAP: $\hat{\theta} = \arg\max_\theta P(\theta \mid X_1, \ldots, X_n)$

29 Bayes to the rescue
$\hat{\theta} = \arg\max_\theta P(\theta \mid X_1, \ldots, X_n) = \arg\max_\theta \frac{P(X_1, \ldots, X_n \mid \theta) \, P(\theta)}{P(X_1, \ldots, X_n)}$
$= \arg\max_\theta P(X_1, \ldots, X_n \mid \theta) \, P(\theta)$ (the denominator doesn't depend on $\theta$)
$= \arg\max_\theta \left[ \log P(X_1, \ldots, X_n \mid \theta) + \log P(\theta) \right]$

30 Review: Beta random variable
A beta random variable models the probability of a trial's success, given previous trials. The PDF/CDF let you compute probabilities of probabilities!
$X \sim \text{Beta}(a, b)$
$f_X(x) = \begin{cases} C x^{a-1} (1 - x)^{b-1} & \text{if } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$

31 Review: Dirichlet distribution
Beta is the distribution ("conjugate prior") for the $p$ in the Bernoulli and binomial. Dirichlet is the distribution for the $p_1, p_2, \ldots$ in the multinomial.
$(X_1, X_2, \ldots) \sim \text{Dir}(a_1, a_2, \ldots)$
$f_{X_1, X_2, \ldots}(x_1, x_2, \ldots) = \begin{cases} C x_1^{a_1 - 1} x_2^{a_2 - 1} \cdots & \text{if } 0 < x_1, x_2, \ldots < 1 \text{ and } x_1 + x_2 + \cdots = 1 \\ 0 & \text{otherwise} \end{cases}$

32 Laplace smoothing
Also known as add-one smoothing: assume you've seen one imaginary occurrence of each possible outcome. With $m$ possible outcomes, add-$k$ smoothing gives
$p_i = \frac{\#(X = i) + k}{n + km}$
and add-one smoothing gives
$p_i = \frac{\#(X = i) + 1}{n + m}$

33 Maximum likelihood with multinomial
A 6-sided die is rolled 7 times. We get: 1 one, 1 two, 0 threes, 2 fours, 0 fives, 3 sixes. What is the MLE for $p_1, \ldots, p_6$?
$(X_1, \ldots, X_6) \sim \text{MN}\!\left(7, \left(\tfrac{1}{7}, \tfrac{1}{7}, 0, \tfrac{2}{7}, 0, \tfrac{3}{7}\right)\right)$
You'll never roll a 3! Not in a million years!

34 Laplace with multinomial
A 6-sided die is rolled 7 times. We get: 1 one, 1 two, 0 threes, 2 fours, 0 fives, 3 sixes. What is the Laplace estimate for $p_1, \ldots, p_6$?
$(X_1, \ldots, X_6) \sim \text{MN}\!\left(7, \left(\tfrac{2}{13}, \tfrac{2}{13}, \tfrac{1}{13}, \tfrac{3}{13}, \tfrac{1}{13}, \tfrac{4}{13}\right)\right)$
Still a chance!
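A minimal sketch contrasting the raw MLE of slide 33 with the Laplace estimate on the same counts:

```python
import numpy as np

counts = np.array([1, 1, 0, 2, 0, 3])  # 7 rolls of a 6-sided die
n, m = counts.sum(), len(counts)

p_mle = counts / n                  # [1/7, 1/7, 0, 2/7, 0, 3/7]
p_laplace = (counts + 1) / (n + m)  # [2/13, 2/13, 1/13, 3/13, 1/13, 4/13]
print(p_mle)
print(p_laplace)  # every face keeps a nonzero probability
```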

35 Parameter priors
$X \sim \text{Ber}(p)$: $p \sim \text{Beta}(a, b)$
$X \sim \text{Bin}(n, p)$: $p \sim \text{Beta}(a, b)$
$X \sim \text{MN}(p)$: $p \sim \text{Dir}(a)$
$X \sim \text{Poi}(\lambda)$: $\lambda \sim \text{Gamma}(k, \theta)$
$X \sim \text{Exp}(\lambda)$: $\lambda \sim \text{Gamma}(k, \theta)$
$X \sim N(\mu, \sigma^2)$: $\mu \sim N(\mu_0, \sigma_0^2)$, $\sigma^2 \sim \text{InvGamma}(\alpha, \beta)$
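As one worked row of this table, here is a minimal sketch of the Poisson/Gamma pairing. One caveat: the code assumes the shape/rate parameterization $\text{Gamma}(\alpha, \beta)$ rather than the slide's shape/scale $\text{Gamma}(k, \theta)$; under shape/rate, the posterior after $n$ IID Poisson observations is $\text{Gamma}(\alpha + \sum_i X_i, \beta + n)$, and its mode gives a closed-form MAP estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.poisson(4.0, size=50)  # assumed true lambda = 4
n = len(X)

alpha, beta = 2.0, 1.0         # Gamma(alpha, beta) prior with mean alpha/beta = 2
lam_mle = X.mean()
lam_map = (alpha + X.sum() - 1) / (beta + n)  # mode of the Gamma posterior
print(lam_mle, lam_map)        # MAP is pulled slightly toward the prior mean
```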
