Will Monroe August 9, with materials by Mehran Sahami and Chris Piech. image: Arito. Parameter learning
Transcription
1 Will Monroe, August 9, 2017, with materials by Mehran Sahami and Chris Piech. Image: Arito. Parameter learning
2 Announcement: Problem Set #6 goes out tonight. Due the last day of class, Wednesday, August 16 (before class). No late days! Some serious coding: congressional voting, heart disease diagnosis.
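3 Review: Parameter estimation. Sometimes we don't know things like the expectation and variance of a distribution; we have to estimate them from incomplete information.
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$, $\hat{\theta} = \arg\max_{\theta} LL(\theta)$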
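4 Review: Central limit theorem. Sums and averages of IID random variables are approximately normally distributed (for large n).
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, $Y = n\bar{X} = \sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$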
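5 Easily-confused principles
Constant multiple of a normal: if $X \sim N(\mu, \sigma^2)$, then $nX \sim N(n\mu, n^2\sigma^2)$ (exactly).
Sum of identical normals: if $X_i \sim N(\mu, \sigma^2)$ (independent & identical), then $\sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$ (exactly).
CLT: if $X_i \sim\ ???$ (independent & identical, with mean $\mu$ and variance $\sigma^2$), then $\sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$ (approximately, for large n).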
6 Central limit theorem demo
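The demo itself is not preserved in this transcript. As a stand-in, here is a minimal simulation sketch of the central limit theorem using only the Python standard library. The choice of Uni(0, 1) draws, the sample sizes, the seed, and the function name `sample_sums` are all assumptions made for illustration.

```python
import random
import statistics


def sample_sums(n, trials, seed=0):
    """Return `trials` sums, each of n IID Uni(0, 1) draws (hypothetical helper)."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) for _ in range(trials)]


n, trials = 30, 20000
sums = sample_sums(n, trials)

# Uni(0, 1) has mean 1/2 and variance 1/12, so the CLT predicts that
# the sum of n draws is approximately N(n/2, n/12).
predicted_mean = n * 0.5
predicted_var = n / 12

observed_mean = statistics.mean(sums)
observed_var = statistics.pvariance(sums)

# Fraction of sums within one predicted standard deviation of the mean;
# for a normal distribution this fraction is about 0.68.
sd = predicted_var ** 0.5
within_one_sd = sum(abs(s - predicted_mean) <= sd for s in sums) / trials

print("mean:", observed_mean, "predicted:", predicted_mean)
print("variance:", observed_var, "predicted:", predicted_var)
print("within one sd:", within_one_sd)
```

Swapping `rng.random()` for another distribution with finite variance should give the same qualitative picture, which is the point of the theorem.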
7 Review: Approximating a Poisson with a normal. $X \sim \text{Poi}(\lambda)$ is approximated by $Y \sim N(\lambda, \lambda)$ (for large λ).
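To get a feel for how close the approximation is, the sketch below compares an exact Poisson CDF value with the normal one. The values λ = 100 and k = 110 are arbitrary, the helper names are hypothetical, and the half-unit continuity correction is an assumption that the slide does not mention.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poi(lam), summing the PMF term by term."""
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        total += term
    return total


def normal_cdf(x, mu, var):
    """P(Y <= x) for Y ~ N(mu, var), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / math.sqrt(2 * var)))


lam = 100  # "large" lambda, chosen for illustration
k = 110

exact = poisson_cdf(k, lam)
# Continuity correction (an assumption here): P(X <= k) is compared with P(Y <= k + 0.5).
approx = normal_cdf(k + 0.5, lam, lam)

print("exact Poisson:", exact)
print("normal approximation:", approx)
```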
8 Parameters. X ~ Ber(p): $\theta = p$. Poi(λ): $\theta = \lambda$. Uni(a, b): $\theta = [a, b]$. N(μ, σ²): $\theta = [\mu, \sigma^2]$.
9 Maximum likelihood estimation. Choose parameters that maximize the likelihood (joint probability given parameters) of the example data. $\hat{\theta} = \arg\max_{\theta} LL(\theta)$
10 How to: MLE
1. Compute the likelihood: $L(\theta) = P(X_1, \dots, X_m \mid \theta)$
2. Take its log: $LL(\theta) = \log L(\theta)$
3. Maximize this as a function of the parameters: $\frac{d}{d\theta} LL(\theta) = 0$
11 Maximum likelihood for Bernoulli. The maximum likelihood p for Bernoulli random variables is the sample mean. $\hat{p} = \frac{1}{m}\sum_{i=1}^{m} X_i$
12 Derivation: MLE for Bernoulli
1. Compute the likelihood. ($\theta = p$)
$L(\theta) = P(X_1, \dots, X_m \mid \theta) = \prod_{i=1}^{m} P(X_i \mid \theta)$ (don't forget: IID means independent!)
$= \prod_{i=1}^{m} \begin{cases} p & \text{if } X_i = 1 \\ 1 - p & \text{if } X_i = 0 \end{cases}$
13 Derivation: MLE for Bernoulli (continued)
Both cases fit in one expression: $L(\theta) = \prod_{i=1}^{m} p^{X_i} (1 - p)^{1 - X_i}$
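14 Derivation: MLE for Bernoulli
2. Take its log. ($\theta = p$)
$L(\theta) = \prod_{i=1}^{m} p^{X_i} (1 - p)^{1 - X_i}$
$LL(\theta) = \log \prod_{i=1}^{m} p^{X_i} (1 - p)^{1 - X_i} = \sum_{i=1}^{m} \log\left[ p^{X_i} (1 - p)^{1 - X_i} \right] = \sum_{i=1}^{m} \left[ X_i \log p + (1 - X_i) \log(1 - p) \right]$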
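15 Derivation: MLE for Bernoulli
3. Maximize this as a function of the parameters. ($\theta = p$)
$LL(\theta) = \sum_{i=1}^{m} \left[ X_i \log p + (1 - X_i) \log(1 - p) \right]$, $\hat{p} = \arg\max_{p} LL(\theta)$
$\frac{d}{dp} LL(\theta) = \sum_{i=1}^{m} \left[ \frac{X_i}{p} - \frac{1 - X_i}{1 - p} \right] = \frac{1}{p}\sum_{i=1}^{m} X_i - \frac{1}{1 - p}\sum_{i=1}^{m} (1 - X_i) = 0$
$\Rightarrow (1 - p)\sum_{i=1}^{m} X_i = p\left(m - \sum_{i=1}^{m} X_i\right) \Rightarrow \sum_{i=1}^{m} X_i = pm \Rightarrow \hat{p} = \frac{1}{m}\sum_{i=1}^{m} X_i$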
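As a numerical sanity check on this result, the sketch below evaluates the Bernoulli log-likelihood on a grid of candidate p values and confirms that the best grid point coincides with the sample mean. The data set is made up for illustration, and the grid search stands in for the calculus step; it is not how the slides derive the answer.

```python
import math


def bernoulli_log_likelihood(p, data):
    """LL(p) = sum of X_i * log(p) + (1 - X_i) * log(1 - p)."""
    return sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in data)


# Hypothetical observations: 7 successes in 10 trials.
data = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

# Closed-form MLE: the sample mean.
p_hat = sum(data) / len(data)

# Brute-force check: evaluate LL on a grid of p in (0, 1) and keep the best.
grid = [i / 1000 for i in range(1, 1000)]
best_on_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, data))

print("sample mean:", p_hat)
print("argmax on grid:", best_on_grid)
```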
16 Maximum likelihood for normal. The maximum likelihood μ for normal random variables is the sample mean, and the maximum likelihood σ² is the uncorrected mean square deviation.
$\hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} X_i$, $\hat{\sigma}^2 = \frac{1}{m}\sum_{i=1}^{m} (X_i - \hat{\mu})^2$
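17 Derivation: MLE for normal
2. Take its log. ($\theta = [\mu, \sigma^2]$)
$L(\theta) = \prod_{i=1}^{m} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{X_i - \mu}{\sigma}\right)^2}$
$LL(\theta) = \sum_{i=1}^{m} \log\left[ \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{X_i - \mu}{\sigma}\right)^2} \right] = \sum_{i=1}^{m} \left[ -\log\sigma - \log\sqrt{2\pi} - \frac{1}{2}\left(\frac{X_i - \mu}{\sigma}\right)^2 \right]$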
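18 Derivation: MLE for normal
3. Maximize this as a function of the parameters. ($\theta = [\mu, \sigma^2]$)
$LL(\theta) = \sum_{i=1}^{m} \left[ -\log\sigma - \log\sqrt{2\pi} - \frac{1}{2}\left(\frac{X_i - \mu}{\sigma}\right)^2 \right]$, $[\hat{\mu}, \hat{\sigma}^2] = \hat{\theta} = \arg\max_{\theta} LL(\theta)$
$\frac{\partial}{\partial\mu} LL(\theta) = \sum_{i=1}^{m} -\frac{1}{2} \cdot 2\left(\frac{X_i - \mu}{\sigma}\right)\left(-\frac{1}{\sigma}\right) = \frac{1}{\sigma^2}\sum_{i=1}^{m} (X_i - \mu) = \frac{1}{\sigma^2}\left(\sum_{i=1}^{m} X_i - m\mu\right) = 0$
$\Rightarrow \hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} X_i = \bar{X}$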
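19 Derivation: MLE for normal
3. Maximize this as a function of the parameters. ($\theta = [\mu, \sigma^2]$)
$LL(\theta) = \sum_{i=1}^{m} \left[ -\log\sigma - \log\sqrt{2\pi} - \frac{1}{2}\left(\frac{X_i - \mu}{\sigma}\right)^2 \right]$, $[\hat{\mu}, \hat{\sigma}^2] = \hat{\theta} = \arg\max_{\theta} LL(\theta)$
$\frac{\partial}{\partial\sigma} LL(\theta) = \sum_{i=1}^{m} \left[ -\frac{1}{\sigma} - \frac{1}{2} \cdot 2\left(\frac{X_i - \mu}{\sigma}\right)\left(-\frac{X_i - \mu}{\sigma^2}\right) \right] = \sum_{i=1}^{m} \left[ -\frac{1}{\sigma} + \frac{(X_i - \mu)^2}{\sigma^3} \right] = -\frac{m}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{m} (X_i - \mu)^2 = 0$
$\Rightarrow \hat{\sigma}^2 = \frac{1}{m}\sum_{i=1}^{m} (X_i - \hat{\mu})^2 = \frac{1}{m}\sum_{i=1}^{m} (X_i - \bar{X})^2$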
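The two normal estimates are easy to compute directly. The sketch below does so on a small made-up data set, checks that nearby parameter values give a lower log-likelihood, and contrasts the uncorrected estimate (divide by m) with the sample variance S² (divide by m − 1). The data and the function name `normal_log_likelihood` are assumptions for illustration.

```python
import math
import statistics


def normal_log_likelihood(mu, sigma2, data):
    """LL for IID N(mu, sigma2) data: sum of -0.5*log(2*pi*sigma2) - (x - mu)^2 / (2*sigma2)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)
        for x in data
    )


# Hypothetical observations.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
m = len(data)

# MLE for the mean: the sample mean.
mu_hat = sum(data) / m

# MLE for the variance: uncorrected mean square deviation (divide by m).
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / m

# The sample variance S^2 divides by m - 1 instead, so it is slightly larger.
sample_variance = statistics.variance(data)

best_ll = normal_log_likelihood(mu_hat, sigma2_hat, data)

print("mu_hat:", mu_hat)
print("sigma2_hat:", sigma2_hat, "vs S^2:", sample_variance)
print("LL at the MLE:", best_ll)
```

`statistics.pvariance` computes the same divide-by-m quantity as `sigma2_hat`, while `statistics.variance` computes S².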
20 Break time!
21 Maximum likelihood for uniform. The maximum likelihood a and b for uniform random variables are the minimum and maximum of the data. $\hat{a} = \min_i X_i$, $\hat{b} = \max_i X_i$
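22 Derivation: MLE for uniform ($\theta = [a, b]$)
1. Compute the likelihood. $L(\theta) = \prod_{i=1}^{m} \begin{cases} \frac{1}{b - a} & \text{if } a \le X_i \le b \\ 0 & \text{otherwise} \end{cases}$
2. Take its log. $LL(\theta) = \sum_{i=1}^{m} \begin{cases} -\log(b - a) & \text{if } a \le X_i \le b \\ -\infty & \text{otherwise} \end{cases}$
3. Maximize this as a function of the parameters. $[\hat{a}, \hat{b}] = \hat{\theta} = \arg\max_{\theta} LL(\theta)$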
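23 Derivation: MLE for uniform ($\theta = [a, b]$)
$L(\theta) = \prod_{i=1}^{m} \begin{cases} \frac{1}{b - a} & \text{if } a \le X_i \le b \\ 0 & \text{otherwise} \end{cases}$, $[\hat{a}, \hat{b}] = \hat{\theta} = \arg\max_{\theta} L(\theta)$
The likelihood is zero if any data point falls outside [a, b], and otherwise grows as the interval shrinks, so the best interval is the narrowest one that still contains all of the data: $\hat{a} = \min_i X_i$, $\hat{b} = \max_i X_i$.
[Figure: two likelihood plots, one as a function of a with b held fixed, one as a function of b with a held at 0.]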
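The uniform case has no derivative to set to zero, so a direct comparison of likelihood values is a helpful check. This sketch uses made-up data and a hypothetical helper `uniform_likelihood`; it compares the tightest interval with one that is wider and one that is too narrow.

```python
def uniform_likelihood(a, b, data):
    """L(a, b) for IID Uni(a, b) data: (1 / (b - a))^m if all points lie in [a, b], else 0."""
    if b <= a or any(x < a or x > b for x in data):
        return 0.0
    return (1.0 / (b - a)) ** len(data)


# Hypothetical observations.
data = [0.31, 0.75, 0.12, 0.58, 0.90]

# MLE: the tightest interval that still covers every data point.
a_hat, b_hat = min(data), max(data)

tight = uniform_likelihood(a_hat, b_hat, data)
wider = uniform_likelihood(0.0, 1.0, data)       # covers the data, but wider than needed
too_narrow = uniform_likelihood(0.2, 0.9, data)  # excludes the point 0.12

print("a_hat, b_hat:", a_hat, b_hat)
print("likelihoods (tight, wider, too narrow):", tight, wider, too_narrow)
```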
24 Maximum a posteriori estimation. Choose the most likely parameters given the example data. You'll need a prior probability over the parameters.
$\hat{\theta} = \arg\max_{\theta} P(\theta \mid X_1, \dots, X_n) = \arg\max_{\theta} \left[ LL(\theta) + \log P(\theta) \right]$
25 Review: Multinomial random variable. A multinomial random variable records the number of times each outcome occurs, when an experiment with multiple outcomes (e.g. die roll) is run multiple times.
$X_1, \dots, X_m \sim MN(n, p_1, p_2, \dots, p_m)$ (vector!)
$P(X_1 = c_1, X_2 = c_2, \dots, X_m = c_m) = \binom{n}{c_1, c_2, \dots, c_m} p_1^{c_1} p_2^{c_2} \cdots p_m^{c_m}$
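26 Roll all of the dice! A 6-sided die is rolled 7 times. What is the probability we get 1 one, 1 two, 0 threes, 2 fours, 0 fives, and 3 sixes?
$X_1, \dots, X_6 \sim MN\left(7, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}\right)$
$P(X_1 = 1, X_2 = 1, X_3 = 0, X_4 = 2, X_5 = 0, X_6 = 3) = \binom{7}{1, 1, 0, 2, 0, 3} \left(\frac{1}{6}\right)^1 \left(\frac{1}{6}\right)^1 \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^2 \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^3 = 420 \left(\frac{1}{6}\right)^7$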
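The arithmetic in this example can be checked with exact rational numbers. The sketch below is one way to do it with the standard library; the function name `multinomial_pmf` and its return shape are choices made here for illustration.

```python
import math
from fractions import Fraction


def multinomial_pmf(counts, probs):
    """Return (multinomial coefficient, exact probability) for the given outcome counts."""
    n = sum(counts)
    # n! / (c_1! * c_2! * ... * c_m!), dividing one factorial at a time.
    coeff = math.factorial(n)
    for c in counts:
        coeff //= math.factorial(c)
    prob = Fraction(coeff)
    for c, p in zip(counts, probs):
        prob *= Fraction(p) ** c
    return coeff, prob


# 7 rolls of a fair 6-sided die: 1 one, 1 two, 0 threes, 2 fours, 0 fives, 3 sixes.
counts = [1, 1, 0, 2, 0, 3]
probs = [Fraction(1, 6)] * 6

coeff, prob = multinomial_pmf(counts, probs)

print("coefficient:", coeff)
print("probability:", prob, "which is about", float(prob))
```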
27 Maximum likelihood with multinomial. A 6-sided die is rolled 7 times. We get 1 one, 1 two, 0 threes, 2 fours, 0 fives, and 3 sixes. What is the MLE for p₁, …, p₆?
$X_1, \dots, X_6 \sim MN\left(7, \frac{1}{7}, \frac{1}{7}, 0, \frac{2}{7}, 0, \frac{3}{7}\right)$
You'll never roll a 3! Not in a million years!
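28 Are we doing this backwards? MLE picks the parameters under which the data are most probable: $\hat{\theta} = \arg\max_{\theta} P(X_1, \dots, X_n \mid \theta)$. What we would like is the most probable parameters given the data: $\hat{\theta} = \arg\max_{\theta} P(\theta \mid X_1, \dots, X_n)$.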
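29 Bayes to the rescue
$\hat{\theta} = \arg\max_{\theta} P(\theta \mid X_1, \dots, X_n) = \arg\max_{\theta} \frac{P(X_1, \dots, X_n \mid \theta)\, P(\theta)}{P(X_1, \dots, X_n)} = \arg\max_{\theta} P(X_1, \dots, X_n \mid \theta)\, P(\theta) = \arg\max_{\theta} \left[ \log P(X_1, \dots, X_n \mid \theta) + \log P(\theta) \right]$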
30 Review: Beta random variable. A beta random variable models the probability of a trial's success, given previous trials. The PDF/CDF let you compute probabilities of probabilities!
$X \sim \text{Beta}(a, b)$, $f_X(x) = \begin{cases} C x^{a-1} (1 - x)^{b-1} & \text{if } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$
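31 Review: Dirichlet distribution. Beta is the distribution ("conjugate prior") for the p in the Bernoulli and binomial. Dirichlet is the distribution for the p₁, p₂, … in the multinomial.
$X_1, X_2, \dots \sim \text{Dir}(a_1, a_2, \dots)$, $f_{X_1, X_2, \dots}(x_1, x_2, \dots) = \begin{cases} C x_1^{a_1 - 1} x_2^{a_2 - 1} \cdots & \text{if } 0 < \{x_1, x_2, \dots\} < 1,\ x_1 + x_2 + \cdots = 1 \\ 0 & \text{otherwise} \end{cases}$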
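32 Laplace smoothing. Also known as add-one smoothing: assume you've seen one imaginary occurrence of each possible outcome.
$p_i = \frac{\#(X = i) + 1}{n + m}$; more generally, with k imaginary occurrences of each of the m outcomes, $p_i = \frac{\#(X = i) + k}{n + mk}$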
33 Maximum likelihood with multinomial. A 6-sided die is rolled 7 times. We get 1 one, 1 two, 0 threes, 2 fours, 0 fives, and 3 sixes. What is the MLE for p₁, …, p₆?
$X_1, \dots, X_6 \sim MN\left(7, \frac{1}{7}, \frac{1}{7}, 0, \frac{2}{7}, 0, \frac{3}{7}\right)$
You'll never roll a 3! Not in a million years!
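34 Laplace with multinomial. A 6-sided die is rolled 7 times. We get 1 one, 1 two, 0 threes, 2 fours, 0 fives, and 3 sixes. What is the Laplace estimate for p₁, …, p₆?
$X_1, \dots, X_6 \sim MN\left(7, \frac{2}{13}, \frac{2}{13}, \frac{1}{13}, \frac{3}{13}, \frac{1}{13}, \frac{4}{13}\right)$
Still a chance!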
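Both die estimates come from the same add-k formula, with k = 0 giving the MLE and k = 1 giving the Laplace estimate. A minimal sketch with exact fractions follows; the function name `smoothed_estimate` is hypothetical.

```python
from fractions import Fraction


def smoothed_estimate(counts, k=1):
    """Add-k estimate p_i = (count_i + k) / (n + m*k) for m outcomes and n observations."""
    n = sum(counts)
    m = len(counts)
    return [Fraction(c + k, n + m * k) for c in counts]


# 7 rolls: 1 one, 1 two, 0 threes, 2 fours, 0 fives, 3 sixes.
counts = [1, 1, 0, 2, 0, 3]

mle = smoothed_estimate(counts, k=0)      # plain relative frequencies
laplace = smoothed_estimate(counts, k=1)  # one imaginary occurrence per outcome

print("MLE:    ", [str(p) for p in mle])
print("Laplace:", [str(p) for p in laplace])
```

With k = 1 every outcome keeps a nonzero probability, which is the point of the "still a chance" remark.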
35 Parameter priors
X ~ Ber(p): p ~ Beta(a, b)
X ~ Bin(n, p): p ~ Beta(a, b)
X ~ MN(p): p ~ Dir(a)
X ~ Poi(λ): λ ~ Gamma(k, θ)
X ~ Exp(λ): λ ~ Gamma(k, θ)
X ~ N(μ, σ²): μ ~ N(μ₀, σ₀²), σ² ~ InvGamma(α, β)
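To connect the prior table back to MAP estimation, here is a minimal sketch for the Bernoulli case with a Beta(a, b) prior. It maximizes log-likelihood plus log-prior on a grid and compares the result with the closed form (successes + a − 1) / (m + a + b − 2), which is the mode of the Beta posterior when a, b > 1. The data, the choice a = b = 2, and the function name are assumptions for illustration.

```python
import math


def log_posterior(p, data, a, b):
    """Log-likelihood plus log Beta(a, b) prior, dropping the prior's constant C."""
    log_likelihood = sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in data)
    log_prior = (a - 1) * math.log(p) + (b - 1) * math.log(1 - p)
    return log_likelihood + log_prior


# Hypothetical observations: 4 successes in 5 trials.
data = [1, 1, 1, 0, 1]
a, b = 2, 2  # assumed prior hyperparameters

successes, m = sum(data), len(data)

# Grid search for the MAP estimate over p in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
map_on_grid = max(grid, key=lambda p: log_posterior(p, data, a, b))

# Closed-form mode of the Beta(a + successes, b + failures) posterior.
map_closed_form = (successes + a - 1) / (m + a + b - 2)

# For comparison, the MLE ignores the prior entirely.
mle = successes / m

print("MAP (grid):", map_on_grid)
print("MAP (closed form):", map_closed_form)
print("MLE:", mle)
```

With a Beta(2, 2) prior the closed form reduces to (successes + 1) / (m + 2), which matches add-one smoothing for two outcomes.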