CS 3710 Advanced Topics in AI, Lecture 17: Density estimation. CS 3710 Probabilistic graphical models.


CS 3710 Advanced Topics in AI
Lecture 17: Density estimation
Milos Hauskrecht
milos@cs.pitt.edu
5329 Sennott Square
CS 3710 Probabilistic graphical models

Administration
Midterm: a take-home exam
- due on Wednesday, November 5, before the class
- depends on the material covered so far:
  - exact inferences
  - Monte-Carlo sampling
  - variational approximation
- You will be evaluated on the correctness and clarity of your answers. Be neat and explain your notations and solutions clearly.

Density estimation
Data: D = {D_1, D_2, ..., D_n}, where D_i = x_i is a vector of attribute values.
Attributes: modeled by random variables X = {X_1, X_2, ..., X_d} with:
- continuous values
- discrete values
E.g., blood pressure with numerical values, or chest pain with discrete values [no-pain, mild, moderate, strong].
Underlying true probability distribution: p(X).

Density estimation
Data: D = {D_1, D_2, ..., D_n}, where D_i = x_i is a vector of attribute values.
Objective: try to estimate the underlying true probability distribution over the variables X, p(X), using the examples in D:
    true distribution p(X)  →  samples D = {D_1, ..., D_n}  →  estimate p̂(X)
Standard assumptions: the samples
- are independent of each other
- come from the same, identical distribution (a fixed p(X)).
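To make the i.i.d. sampling picture concrete, here is a minimal sketch (my own construction, not from the slides; the blood-pressure numbers are hypothetical) that draws independent samples from a fixed, known p(X) and forms a crude histogram estimate p̂(X):

import numpy as np

rng = np.random.default_rng(0)
true_mean, true_std = 120.0, 15.0                      # hypothetical "blood pressure" p(X)
samples = rng.normal(true_mean, true_std, size=1000)   # i.i.d. draws from the fixed p(X)

# A histogram serves as a crude density estimate p_hat(X).
density, edges = np.histogram(samples, bins=30, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
print("p_hat(X) peaks near:", centers[np.argmax(density)])   # should be close to 120

With more samples the estimate concentrates around the true density, which is the sense in which the examples in D carry information about p(X).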

Density estimation
Types of density estimation:
- Parametric: the distribution is modeled using a set of parameters Θ, p(X | Θ). Example: the mean and covariances of a multivariate normal. Estimation: find the parameters Θ̂ that fit the data best.
- Non-parametric: the model of the distribution utilizes all examples in D, as if all examples were parameters of the distribution. Example: nearest-neighbor methods.
- Semi-parametric.

Parametric density estimation
Basic settings:
- a set of random variables X = {X_1, X_2, ..., X_d}
- a model of the distribution over the variables X with parameters Θ: p(X | Θ)
- data D = {D_1, D_2, ..., D_n}
Objective: find the parameters Θ̂ that describe p(X | Θ) best.
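As an illustration of the parametric case above, this sketch (mine; all numbers are made up) estimates the mean and covariance of a multivariate normal from data, which is exactly the "find Θ̂ that fits the data best" step for that model:

import numpy as np

rng = np.random.default_rng(1)
true_mu = np.array([0.0, 2.0])
true_cov = np.array([[1.0, 0.3], [0.3, 0.5]])
D = rng.multivariate_normal(true_mu, true_cov, size=500)   # data D = {D_1, ..., D_n}

mu_hat = D.mean(axis=0)              # estimated mean (part of Theta_hat)
cov_hat = np.cov(D, rowvar=False)    # estimated covariance (part of Theta_hat)
print(mu_hat)
print(cov_hat)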

Parameter learning
What is the best set of parameters?
- Maximum likelihood (ML) estimate: maximize p(D | Θ, ξ), where ξ represents prior (background) knowledge.
- Maximum a posteriori probability (MAP) estimate: maximize p(Θ | D, ξ), i.e., select the mode of the posterior
    p(Θ | D, ξ) = p(D | Θ, ξ) p(Θ | ξ) / p(D | ξ).

Parameter learning
Both ML and MAP pick one parameter value. Is that always the best solution?
Bayesian approach:
- remedies the limitation of a single choice
- keeps and uses the complete posterior distribution p(Θ | D)
- optimization is replaced with integration.
How is it used? Assume we want p(x | D). Consider all parameter settings and average the result:
    p(x | D) = ∫ p(x | Θ) p(Θ | D) dΘ
Example: predict the result of the next outcome x.
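A toy numeric illustration of replacing optimization with integration (my construction; it uses a coin-flip model and a uniform prior, which the slides introduce only later): approximate p(x = 1 | D) = ∫ p(x = 1 | θ) p(θ | D) dθ on a grid of θ values.

import numpy as np

D = np.array([1, 0, 1, 1, 0, 1, 1, 1])         # hypothetical observed outcomes
thetas = np.linspace(1e-3, 1 - 1e-3, 1000)     # grid over the parameter
dtheta = thetas[1] - thetas[0]

# Likelihood of the data for each theta; with a uniform prior the posterior
# is proportional to the likelihood.
likelihood = thetas ** D.sum() * (1 - thetas) ** (len(D) - D.sum())
posterior = likelihood / (likelihood.sum() * dtheta)

# Average the prediction p(x=1 | theta) = theta over the whole posterior.
p_next = (thetas * posterior).sum() * dtheta
print("Bayesian predictive p(x=1|D):", p_next)   # compare with the ML value D.mean()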

Bernoulli distribution
Outcomes: x with values 0 or 1 (e.g., head or tail).
Data: a sequence of outcomes x_i.
Model: θ is the probability of outcome 1, and (1 − θ) the probability of outcome 0, so the probability of an outcome x is
    P(x | θ) = θ^x (1 − θ)^(1−x)   (the Bernoulli distribution).

Maximum likelihood (ML) estimate
Likelihood of the data:
    P(D | θ) = ∏_{i=1..n} θ^{x_i} (1 − θ)^{1−x_i}
Maximum likelihood estimate:
    θ_ML = argmax_θ P(D | θ)
Optimize the log-likelihood instead:
    l(D, θ) = log P(D | θ) = Σ_{i=1..n} [ x_i log θ + (1 − x_i) log(1 − θ) ]
            = N_1 log θ + N_2 log(1 − θ)
where N_1 is the number of 1s seen and N_2 is the number of 0s seen.
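A small sketch (assumed coin-flip data; the variable names are mine) that evaluates this counting form of the log-likelihood on a grid; the closed-form maximizer is derived on the next slide.

import numpy as np

x = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 1])   # hypothetical coin-flip data
N1 = x.sum()                                   # number of 1s seen
N2 = len(x) - N1                               # number of 0s seen

def log_likelihood(theta):
    return N1 * np.log(theta) + N2 * np.log(1 - theta)

grid = np.linspace(0.01, 0.99, 99)
theta_best = grid[np.argmax(log_likelihood(grid))]
print("numeric maximizer:", theta_best)        # 0.7 for these counts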

Maximum likelihood (ML) estimate (cont.)
Optimize the log-likelihood
    l(D, θ) = N_1 log θ + N_2 log(1 − θ).
Set the derivative to zero:
    ∂l(D, θ)/∂θ = N_1/θ − N_2/(1 − θ) = 0
Solving for θ gives the ML solution:
    θ_ML = N_1 / (N_1 + N_2).

Maximum a posteriori estimate
Selects the mode of the posterior distribution:
    θ_MAP = argmax_θ p(θ | D)
Via Bayes rule,
    p(θ | D) = P(D | θ) p(θ) / P(D),
where P(D | θ) = θ^{N_1} (1 − θ)^{N_2} is the likelihood of the data and p(θ) is the prior probability on θ.
How should the prior probability be chosen?
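A quick numerical check of the derivative condition above (my sketch; scipy is assumed available): the root of ∂l/∂θ = N_1/θ − N_2/(1 − θ) is indeed N_1/(N_1 + N_2).

from scipy.optimize import brentq

N1, N2 = 7, 3                               # example counts
dl = lambda th: N1 / th - N2 / (1 - th)     # derivative of the log-likelihood
theta_ml = brentq(dl, 1e-6, 1 - 1e-6)       # locate the zero of the derivative
print(theta_ml, "vs", N1 / (N1 + N2))       # both are 0.7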

Prior distribution p(θ)
Choice of prior: the Beta distribution,
    Beta(θ | α, β) = [Γ(α + β) / (Γ(α) Γ(β))] θ^{α−1} (1 − θ)^{β−1}.
Why? The Beta distribution "fits" binomial sampling: it is the conjugate choice.
MAP solution:
    θ_MAP = (N_1 + α − 1) / (N_1 + N_2 + α + β − 2).

Beta distribution
[Figure: Beta(θ | α, β) densities plotted for several settings of α and β.]
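To sanity-check the MAP formula (my sketch; the α and β values are arbitrary, and scipy is assumed), compare it against a numerically located mode of the conjugate Beta posterior used on the next slide:

import numpy as np
from scipy.stats import beta

a, b = 2.0, 2.0                                    # hypothetical Beta prior parameters
N1, N2 = 7, 3                                      # example counts
theta_map = (N1 + a - 1) / (N1 + N2 + a + b - 2)   # MAP formula from above

grid = np.linspace(1e-4, 1 - 1e-4, 100_000)
posterior = beta(a + N1, b + N2)                   # posterior Beta(alpha+N1, beta+N2)
print(theta_map, "vs numeric mode:", grid[np.argmax(posterior.pdf(grid))])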

Bayesian approach
Posterior probability:
    p(θ | D) = P(D | θ) p(θ) / P(D).
With the Beta(θ | α, β) prior and the Bernoulli likelihood, the posterior is again a Beta distribution:
    p(θ | D) = Beta(θ | α + N_1, β + N_2).
Probability of an outcome x = 1 in the next trial:
    P(x = 1 | D) = ∫ P(x = 1 | θ) p(θ | D) dθ = ∫ θ p(θ | D) dθ = E[θ | D]
This is equivalent to the expected value of the parameter θ, where the expectation is taken with regard to the posterior distribution p(θ | D).

Bayesian learning
Expected value of the parameter: for θ ~ Beta(α, β),
    E[θ] = α / (α + β).
Note: Γ(x + 1) = x Γ(x).
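A short check of the identity above (my example numbers; scipy assumed): the posterior mean of Beta(α + N_1, β + N_2) matches the closed form α'/(α' + β') with α' = α + N_1 and β' = β + N_2.

from scipy.stats import beta

a, b, N1, N2 = 2.0, 2.0, 7, 3
posterior = beta(a + N1, b + N2)              # posterior Beta(alpha+N1, beta+N2)
print(posterior.mean())                       # numeric E[theta | D]
print((a + N1) / (a + b + N1 + N2))           # closed form; both equal 9/14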

Expected value
Predictive probability of an outcome x = 1 in the next trial:
    P(x = 1 | D) = E[θ | D].
Substituting the result for the posterior, p(θ | D) = Beta(θ | α + N_1, β + N_2), we get
    P(x = 1 | D) = (α + N_1) / (α + β + N_1 + N_2).
Instead of the MAP or ML choice of the parameter we can use the expected value of the parameter:
    θ̂ = E[θ | D].
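The same predictive probability can also be approximated by Monte-Carlo sampling (my sketch, in the spirit of the sampling methods listed for the exam): draw θ from the posterior and average P(x = 1 | θ) = θ.

import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(2)
a, b, N1, N2 = 2.0, 2.0, 7, 3
theta_samples = beta(a + N1, b + N2).rvs(size=100_000, random_state=rng)
print(theta_samples.mean())                   # ≈ (a + N1)/(a + b + N1 + N2) = 9/14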