Introduction. Modeling Data. Approach. Quality of Fit. Likelihood. Probabilistic Approach


Introduction: Modeling Data

Given a set of observations, we wish to fit a mathematical model. The model depends on adjustable parameters:

    Straight line: y = m x + c
    Polynomial:    y = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n

The choice of model depends upon the problem.

Approach

Design a figure-of-merit (or fitness) function that measures the agreement between the model and the data, with small values meaning a good fit. Adjust the model parameters to minimize the fitness function; this yields the best-fit parameters.

Quality of Fit

If the observations are noisy, we can never fit them exactly. We therefore wish to estimate both the quality of the fit of the model and the uncertainty in the estimates of the best-fit parameters.

Probabilistic Approach

Concentrate on 1-D output for now.

    Input observations: {x_i}, {y_i}, i = 1..N
    Model form:         y_i = f(x_i; a) + e_i

Here a denotes the model parameters and e_i is a residual noise term with distribution p_e(e).

Likelihood

The probability of one observation given the model parameters is

    p(y_i | a) = p_e(y_i - f(x_i; a)),

and the probability of all the data is

    p(y | a) = Π_i p_e(y_i - f(x_i; a)),

the likelihood of the data given the parameters.
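The Gaussian-residual likelihood described above can be sketched in a few lines of Python. This is an illustration, not part of the original slides; the helper names (`gaussian_log_pdf`, `log_likelihood`) and the toy data are assumptions.

```python
import math

def gaussian_log_pdf(e, sigma):
    # log N(e; 0, sigma^2): log-density of a zero-mean normal residual
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - e ** 2 / (2 * sigma ** 2)

def log_likelihood(xs, ys, m, c, sigma):
    # log p(y | a) = sum_i log p_e(y_i - f(x_i; a)), with f(x; {m, c}) = m*x + c
    return sum(gaussian_log_pdf(y - (m * x + c), sigma) for x, y in zip(xs, ys))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.9, 4.1, 5.9]
# A line close to the data scores a much higher log-likelihood than a poor one:
print(log_likelihood(xs, ys, 2.0, 0.0, 0.5))
print(log_likelihood(xs, ys, 0.0, 3.0, 0.5))
```

Working in log-likelihood rather than raw likelihood avoids the underflow that comes from multiplying many small probabilities.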

Model Parameter Estimation

We would like the parameters which are most likely given the data:

    â = argmax_a p(a | data),    p(a | data) = p(data | a) p(a) / p(data).

p(data) is unknown, but fixed, so we maximize p(data | a) p(a), where p(a) is the prior PDF of the parameters.

Example: Line Fitting

Fit a straight line to data {x_i, y_i}:

    y = m x + c,    f(x; {m, c}) = m x + c.

Assume normal errors on y with s.d. σ, i.e. p_e(e) = N(e; 0, σ^2), so

    p(data | m, c) = Π_i (1 / sqrt(2 π σ^2)) exp( -(y_i - (m x_i + c))^2 / (2 σ^2) ).

Taking logs,

    log p(data | m, c) = -Σ_i (y_i - (m x_i + c))^2 / (2 σ^2) + const,

so maximizing the likelihood amounts to minimizing the sum of squared residuals:

    {m, c} = argmin F(m, c),    F(m, c) = Σ_i (y_i - (m x_i + c))^2.

Line Fitting

Setting dF/dm = 0 and dF/dc = 0 gives the normal equations

    Σ_i x_i (y_i - m x_i - c) = 0
    Σ_i     (y_i - m x_i - c) = 0,

which can be solved directly for m and c.

Uncertainty

The parameters that maximize p(data | a) only give us a single value: no information about uncertainty. Often we need the full pdf

    p(a | data) = p(data | a) p(a) / p(data),

which is hard to compute!

Distribution of Parameters

    p(a | data) = (1/Z) p(data | a) p(a),

where Z is a normalization constant,

    Z = ∫ p(data | a) p(a) da.

Z can be computed exactly in some cases. Otherwise it can be approximated by sampling where necessary:

    Z ≈ (1/K) Σ_{k=1..K} p(data | a_k),

where the a_k are random samples from p(a).
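Solving the two normal equations above gives the familiar closed-form slope and intercept. The sketch below is an assumed helper (`fit_line`) that implements exactly that algebra; the test data are illustrative.

```python
def fit_line(xs, ys):
    # Solve dF/dm = 0, dF/dc = 0 for F(m, c) = sum_i (y_i - (m*x_i + c))^2:
    #   sum_i x_i (y_i - m*x_i - c) = 0
    #   sum_i     (y_i - m*x_i - c) = 0
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = (sy - m * sx) / n
    return m, c

m, c = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(m, c)  # data lie exactly on y = 2x + 1, so this recovers m = 2, c = 1
```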

Mean

The mean of p(a | data) is

    μ = ∫ a p(a | data) da = (1/Z) ∫ a p(data | a) p(a) da,

which can be estimated by sampling:

    μ ≈ Σ_k a_k p(data | a_k) / Σ_k p(data | a_k),

where the a_k are random samples from p(a).

Example: Line Fitting

    log p(data | m, c) = -Σ_i (y_i - (m x_i + c))^2 / (2 σ^2) + const

is quadratic in m and c. If the negative log posterior has the form K (a - â)^2 + const, then

    p(a | data) = N(a; â, 1/(2K)),

i.e. the posterior is Gaussian about the best-fit value.

Estimating the Parameter PDF

It is not always easy to compute the distribution exactly. Common approximations:
- error propagation;
- Monte-Carlo estimates.

Error Propagation

Useful when we have an analytic form for computing the parameters from the data:

    a = F({x_i, y_i}).

For instance, for line fitting, m and c are explicit functions of the data. If y_i varies by δy_i, then a varies by approximately (∂F/∂y_i) δy_i. Thus Gaussian noise on y_i with variance σ_i^2 causes noise on a with total variance

    σ_a^2 = Σ_i (∂F/∂y_i)^2 σ_i^2.

Monte-Carlo Method

Used when no analytic function is available. Algorithm: for k = 1..M,
- add noise to each original measurement;
- estimate the optimal parameters a_k.

The parameter PDF is then approximated by the PDF of {a_k}.
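The Monte-Carlo recipe above can be sketched as follows: perturb the measurements, re-fit, and take the spread of the refitted parameters as the parameter uncertainty. The least-squares helper, the function names, the noise level, and the data are all assumptions for illustration.

```python
import random
import statistics

def ls_slope_intercept(xs, ys):
    # Ordinary least-squares line fit via the normal equations.
    n, sx, sy = len(xs), sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return m, (sy - m * sx) / n

def monte_carlo_sd(xs, ys, sigma, trials=2000, seed=1):
    # For k = 1..trials: add Gaussian noise of s.d. sigma to each measurement,
    # re-estimate (m_k, c_k); the sample spread of {m_k}, {c_k} approximates
    # the uncertainty on the fitted parameters.
    rng = random.Random(seed)
    fits = [ls_slope_intercept(xs, [y + rng.gauss(0.0, sigma) for y in ys])
            for _ in range(trials)]
    return (statistics.stdev(m for m, _ in fits),
            statistics.stdev(c for _, c in fits))

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
print(monte_carlo_sd(xs, ys, sigma=0.3))
```

For this linear model the Monte-Carlo spread should agree with the error-propagation result (e.g. sd(m) = σ / sqrt(Σ_i (x_i - x̄)^2)), which is a useful sanity check.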

[Plot: example data with a candidate model fit]

Quality of Fit

How well does the model actually fit the data? Probably not a very good fit. It depends on the uncertainty on the measurements.

Quality of Fit

Model form: y_i = f(x_i; a) + e_i, with p(y | a) = Π_i p_e(y_i - f(x_i; a)). Assume normal errors on y with s.d. σ_i:

    log p(y | a) = -Σ_i (y_i - f(x_i; a))^2 / (2 σ_i^2) + const.

Chi-squared Distribution

If each z_i is drawn from a zero-mean, unit-normal PDF, then the PDF of

    M = Σ_{i=1..N} z_i^2

is known as χ²(M; N). M has a mean of N and a variance of 2N. The chi-square PDF is chi2pdf(M, N) in MatLab.

[Plot: χ² PDF against its normal approximation]

For large N,

    χ²(M; N) ≈ N(M; N, 2N),

a Gaussian with mean N and variance 2N.

Chi-square and the Gamma Function

Gamma function:

    Γ(z) = ∫_0^∞ t^{z-1} e^{-t} dt,    Γ(1 + n) = n!.

Incomplete gamma function:

    P(a, x) = (1/Γ(a)) ∫_0^x t^{a-1} e^{-t} dt.

CDF of chi-square:

    P(χ² < M) = P(N/2, M/2).
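The relation P(χ² < M) = P(N/2, M/2) can be checked in plain Python. The slides point to MATLAB's chi2cdf; this stdlib sketch instead evaluates the regularized lower incomplete gamma function by its series expansion (adequate for moderate M; function names are assumed).

```python
import math

def reg_lower_gamma(a, x, terms=200):
    # Regularized lower incomplete gamma P(a, x), via the series
    # P(a, x) = x^a e^{-x} / Gamma(a) * sum_{n>=0} x^n / (a (a+1) ... (a+n)).
    # Assumes x > 0; converges quickly for x not much larger than a.
    total, term = 0.0, 1.0 / a
    for n in range(1, terms):
        total += term
        term *= x / (a + n)
    total += term
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def chi2_cdf(M, N):
    # P(chi^2 < M) with N degrees of freedom = P(N/2, M/2)
    return reg_lower_gamma(N / 2.0, M / 2.0)

print(chi2_cdf(3.0, 3))
```

For N = 2 the chi-square CDF has the closed form 1 - e^{-M/2}, which makes a convenient cross-check of the series.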

Quality of Fit

    log p(data | a) = -Σ_i (y_i - f(x_i; a))^2 / (2 σ_i^2) + const = -χ²/2 + const,

where

    M = Σ_i (y_i - f(x_i; a))^2 / σ_i^2

is χ²-distributed with N degrees of freedom. Let P = P(χ² < M), the CDF of χ² at M:

    if P < 0.99, the model is a reasonable fit;
    if P > 0.9999, the model is a poor fit.

The chi-square CDF is chi2cdf(M, N) in MatLab.

Marginalization

    p(a_1 | data) = ∫ p(a_1, a_2 | data) da_2.

Example: line fitting. If we are only interested in the gradient m,

    p(m | data) = ∫ p(m, c | data) dc = N(m; m̂, σ_m^2).

General Linear Model

Consider models of the form

    f(x) = Σ_{j=1..M} w_j f_j(x),    e ~ N(0, σ^2).

Line fitting: M = 2, f_1(x) = 1, f_2(x) = x.
Polynomial fitting of degree d: M = d + 1, f_j(x) = x^{j-1}.

Given data {x_i, y_i}, construct the N × M design matrix D with

    D_ij = f_j(x_i) / σ_i

and the target vector r_i = y_i / σ_i. The optimal weights w are given by the least-squares solution of D w = r.

General Linear Model

Perform an SVD on D: D = U W Vᵀ. Then

    ŵ = V W⁻¹ Uᵀ r.

Since the covariance of r is I, the covariance of the weights is

    Cov(ŵ) = V W⁻¹ Uᵀ I U W⁻¹ Vᵀ = V W⁻² Vᵀ.
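A design-matrix fit as described above can be sketched without any linear-algebra library. Note one deliberate substitution: the slides solve D w = r via the SVD, which is better conditioned; this dependency-free sketch solves the normal equations (DᵀD) w = Dᵀr by Gaussian elimination instead. All names and the toy data are assumptions.

```python
def fit_general_linear(xs, ys, sigmas, basis):
    # Design matrix D_ij = f_j(x_i) / sigma_i, target r_i = y_i / sigma_i.
    M = len(basis)
    D = [[f(x) / s for f in basis] for x, s in zip(xs, sigmas)]
    r = [y / s for y, s in zip(ys, sigmas)]
    # Normal equations A w = b with A = D^T D, b = D^T r.
    A = [[sum(D[i][j] * D[i][k] for i in range(len(xs))) for k in range(M)]
         for j in range(M)]
    b = [sum(D[i][j] * r[i] for i in range(len(xs))) for j in range(M)]
    # Gaussian elimination with partial pivoting.
    for col in range(M):
        piv = max(range(col, M), key=lambda row: abs(A[row][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for row in range(col + 1, M):
            factor = A[row][col] / A[col][col]
            for k in range(col, M):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    w = [0.0] * M
    for j in reversed(range(M)):
        w[j] = (b[j] - sum(A[j][k] * w[k] for k in range(j + 1, M))) / A[j][j]
    return w

# Quadratic fit: basis functions 1, x, x^2 (M = d + 1 = 3)
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 5.0, 10.0, 17.0]   # exactly y = x^2 + 1
w = fit_general_linear(xs, ys, [1.0] * 5, basis)
print(w)  # close to [1, 0, 1]
```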

Model Selection

Suppose we have several possible models, e.g. linear, quadratic, cubic, etc. How do we select the best? P(data | params) alone is insufficient: more complex models give lower residuals.

Akaike Information Criterion

A simple method of model selection. For each model, find the optimal parameters â and compute

    AIC = -log p(data | â) + d,

where d is the number of independent parameters. Select the model giving the smallest AIC.

Bayesian Model Selection

Suppose M indexes the various possible models. We seek the most probable model to fit to some data, i.e. we wish to find M to maximize

    P(M | data) = P(data | M) P(M) / P(data).

If all models are equally likely, maximize P(data | M), since P(data) is constant.

Suppose model M has parameters a. Then

    P(data | M) = ∫ p(data | a, M) p(a | M) da,

the "evidence" for model M. Unfortunately, this is often impossible to compute analytically; various approximations have been proposed (see the literature).
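The AIC comparison can be sketched for the simplest pair of nested models: a constant versus a straight line. With a fixed Gaussian noise level, -log p(data | â) equals the half chi-square up to a model-independent constant, so the criterion reduces to χ²/2 + d. The data, noise level, and function names below are assumptions for illustration.

```python
def aic(rss, d, sigma=1.0):
    # With fixed Gaussian noise sigma, -log p(data | a^) = rss / (2 sigma^2) + const,
    # so up to a model-independent constant: AIC = chi^2 / 2 + d.
    return rss / (2 * sigma ** 2) + d

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.2, 1.1, 1.9, 3.2, 3.9, 5.1]   # roughly y = x

# Model 1: constant y = c (d = 1 parameter)
c1 = sum(ys) / len(ys)
rss1 = sum((y - c1) ** 2 for y in ys)

# Model 2: straight line y = m x + c (d = 2 parameters)
n, sx, sy = len(xs), sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
c2 = (sy - m * sx) / n
rss2 = sum((y - (m * x + c2)) ** 2 for x, y in zip(xs, ys))

# The line has an extra parameter but a far smaller residual, so its AIC is lower:
print(aic(rss1, 1), aic(rss2, 2))
```

The penalty term d is what stops the more complex model from winning automatically; without it, the cubic would always beat the quadratic, which would always beat the line.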