ST2352. Working backwards with conditional probability. Week 8



Roll two regular dice. One is 6; Pr(other is 6)?
AR simulation gives $Y_t = 3$. Distribution of $Y_{t-1}$?
$Y_t = \alpha Y_{t-1} + \varepsilon_t$; $\varepsilon_t \sim N(0, 1)$ (parameter settings: = ?, = 0.5)
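The dice question can be settled by listing all 36 outcomes. The sketch below is illustrative only: the slide does not say how we learn that "one is 6", so both readings are computed, and the variable names are invented here.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Reading 1 (assumption): we are told only that at least one die shows a 6.
at_least_one_six = [o for o in outcomes if 6 in o]
p_at_least_one = Fraction(
    sum(1 for o in at_least_one_six if o == (6, 6)), len(at_least_one_six)
)

# Reading 2 (assumption): we are told that a named die (the first) shows a 6.
first_is_six = [o for o in outcomes if o[0] == 6]
p_first = Fraction(sum(1 for o in first_is_six if o[1] == 6), len(first_is_six))

print(p_at_least_one)  # 1/11
print(p_first)         # 1/6
```

The two answers differ because the conditioning events differ: 11 outcomes contain at least one 6, but only 6 outcomes have a 6 on the first die.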

Sally Clark
Pr(innocent | 2 SIDS deaths) = (1 in 8700)^2
2 SIDS deaths: Pr(not innocent)?

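Sample survey
n = 1000; precision $\pm 0.03$; approx 95% CI: $\hat{p} \pm 0.03$
RedC: Prop(Satisfied by Kenny) = 0.43
$\Pr(0.40 \le p \le 0.46) = 0.95$ ??

Theory
$\hat{p} = X/n$; $X$ = Number(Yes); $X \sim B(n, p)$
CLT: $\hat{p} \overset{\text{app}}{\sim} N\big(p, SE(\hat{p})^2\big)$, $SE(\hat{p}) = \sqrt{p(1-p)/n} \approx 0.5/\sqrt{n}$ (n = 1000, p ≈ 0.5)
$\Pr\big(p - 2(0.5)/\sqrt{n} \le \hat{p} \le p + 2(0.5)/\sqrt{n}\big) \approx 0.95$
$\Pr\big(\hat{p} - 2(0.5)/\sqrt{n} \le p \le \hat{p} + 2(0.5)/\sqrt{n}\big) \approx 0.95$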
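A quick simulation can check the "forward" statement that about 95% of sample proportions fall within 2(0.5)/sqrt(n) of p. This is a sketch under assumptions: the true proportion `p_true = 0.43` and the number of repeated surveys are chosen here for illustration and are not from the slides.

```python
import math
import random

random.seed(0)
n = 1000           # survey size from the slide
p_true = 0.43      # assumed "true" proportion, for illustration only
n_surveys = 2000   # number of simulated surveys (arbitrary choice)

# Half-width 2 * SE evaluated at p = 0.5, roughly 0.03.
half_width = 2 * 0.5 / math.sqrt(n)

covered = 0
for _ in range(n_surveys):
    # X ~ B(n, p): count the "Yes" answers in one simulated survey.
    x = sum(random.random() < p_true for _ in range(n))
    p_hat = x / n
    if abs(p_hat - p_true) <= half_width:
        covered += 1

coverage = covered / n_surveys
print(round(half_width, 4), coverage)
```

The event |p_hat - p| <= half_width is the same event whether it is read as "p_hat is near p" or "p is near p_hat", which is why the forward statement turns into an interval for p.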

Squirrels
2 squirrels. Nut poisonous if eat > half
First eats random proportion U(0,1)
Second eats random proportion U(0,1) of remainder
One survives: Pr(other survives)?
First eats 0.05: Pr(second eats > 0.5)?
Second eats 0.05: Pr(first eats > 0.5)?

Squirrels: simulation
Simple: both uniform: $p_1 = z_1$, $p_2 = z_2(1 - z_1)$

  i   Z1     Z2     P1     P2     1 fatal?  2 fatal?
  1   0.057  0.591  0.057  0.557  FALSE     TRUE
  2   0.870  0.226  0.870  0.029  TRUE      FALSE
  3   0.679  0.467  0.679  0.150  TRUE      FALSE
  4   0.555  ?      0.555  ?      TRUE      FALSE
  5   0.909  0.985  0.909  0.090  TRUE      FALSE
  6   0.539  0.893  0.539  0.412  TRUE      FALSE

                  2 fatal TRUE   2 fatal FALSE   Total
  1 fatal TRUE    0.00           0.53            0.53
  1 fatal FALSE   0.15           0.32            0.48
  Total           0.15           0.85            1.00

                P1     P2
  Simulation
    avg         0.5    0.25
    var         0.09   0.05
    cov(P1,P2)  -0.04
  Theory
    E[.]        0.50   0.25
    Var         0.08   0.05
    Cov         -0.04

[Figures: marginal cdf; joint pdf via scatterplot]
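The spreadsheet simulation can be reproduced in a few lines. This is a minimal sketch assuming the construction P1 = Z1, P2 = Z2(1 - Z1) with Z1, Z2 independent U(0,1); the sample size and variable names are chosen here for illustration.

```python
import math
import random

random.seed(1)
n = 100_000  # number of simulated nut-sharing episodes (arbitrary)

# First squirrel eats P1 ~ U(0,1); second eats a U(0,1) share of the remainder.
p1 = [random.random() for _ in range(n)]
p2 = [random.random() * (1 - a) for a in p1]

mean1 = sum(p1) / n
mean2 = sum(p2) / n
var1 = sum((a - mean1) ** 2 for a in p1) / n
var2 = sum((b - mean2) ** 2 for b in p2) / n
cov12 = sum(a * b for a, b in zip(p1, p2)) / n - mean1 * mean2

# Eating more than half the nut is fatal.
pr_fatal1 = sum(a > 0.5 for a in p1) / n
pr_fatal2 = sum(b > 0.5 for b in p2) / n
pr_both = sum(a > 0.5 and b > 0.5 for a, b in zip(p1, p2)) / n
pr_neither = sum(a <= 0.5 and b <= 0.5 for a, b in zip(p1, p2)) / n

print(round(mean1, 3), round(mean2, 3), round(var1, 3), round(var2, 3), round(cov12, 3))
print(round(pr_fatal1, 3), round(pr_fatal2, 3), pr_both, round(pr_neither, 3))
```

Both squirrels can never die, since P1 + P2 < 1 rules out both proportions exceeding one half.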

Squirrels: conditional distributions
[Figures: conditional cdf of P1 given value of P2; conditional cdf of P2 given value of P1]
Both straight lines (uniform). Why?

Squirrels: theory
Forward simpler than backward
Joint dist; marginal dists; conditional dists
[Figure: joint pdf via scatterplot]

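Squirrels: theory (joint and marginal distributions)
$P_1 \sim U(0,1)$: $f_{P_1}(p_1) = 1$, $0 < p_1 < 1$
$P_2 \mid P_1 = p_1 \sim U(0, 1 - p_1)$: $f_{P_2 \mid P_1}(p_2 \mid p_1) = \dfrac{1}{1 - p_1}$, $0 < p_2 < 1 - p_1$
Joint: $f_{P_1 P_2}(p_1, p_2) = \dfrac{1}{1 - p_1}$, $0 < p_1 < 1$; $0 < p_2 < 1 - p_1$
$f_{P_1}(p_1) = \int_0^{1 - p_1} f_{P_1 P_2}(p_1, p_2)\, dp_2 = 1$, of course; $F_{P_1}(p_1) = p_1$, $0 < p_1 < 1$, of course
$f_{P_2}(p_2) = \int_0^{1 - p_2} f_{P_1 P_2}(p_1, p_2)\, dp_1 = -\ln p_2$, $0 < p_2 < 1$
$F_{P_2}(p_2) = p_2 - p_2 \ln p_2$, $0 < p_2 < 1$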

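Squirrels: theory (expected values, variances, covariance)
$P_1 \sim U(0,1)$: $E[P_1] = \tfrac{1}{2}$; $\mathrm{Var}(P_1) = \tfrac{1}{12}$
$E[P_2] = \int_0^1 p_2(-\ln p_2)\, dp_2 = \Big[-\tfrac{p_2^2}{2}\ln p_2 + \tfrac{p_2^2}{4}\Big]_0^1 = \tfrac{1}{4}$
$E[P_2^2] = \int_0^1 p_2^2(-\ln p_2)\, dp_2 = \Big[-\tfrac{p_2^3}{3}\ln p_2 + \tfrac{p_2^3}{9}\Big]_0^1 = \tfrac{1}{9}$; $\mathrm{Var}(P_2) = \tfrac{1}{9} - \tfrac{1}{16} = \tfrac{7}{144}$
$E[P_1 P_2] = \iint_{0 < p_1 < 1;\ 0 < p_2 < 1 - p_1} \dfrac{p_1 p_2}{1 - p_1}\, dp_2\, dp_1 = \int_0^1 \dfrac{p_1(1 - p_1)}{2}\, dp_1 = \tfrac{1}{12}$
$\mathrm{Cov}(P_1, P_2) = \tfrac{1}{12} - \tfrac{1}{2}\cdot\tfrac{1}{4} = -\tfrac{1}{24}$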

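Squirrels: theory (conditional distribution)
$\Pr(P_1 > \tfrac12) = \tfrac12$; $\Pr(P_2 > \tfrac12) = \tfrac12 + \tfrac12 \ln 0.5 \approx 0.15$
Since necessarily $(P_1 > \tfrac12) \Rightarrow (P_2 < \tfrac12)$ and vice versa,
$\Pr(P_1 > \tfrac12, P_2 > \tfrac12) = 0$ and $\Pr(P_1 < \tfrac12, P_2 < \tfrac12) \approx 0.35$
Cond dist of $P_1$ given $P_2$:
$f_{P_1 \mid P_2}(p_1 \mid p_2) = \dfrac{f_{P_1 P_2}(p_1, p_2)}{f_{P_2}(p_2)} = \dfrac{-1}{(1 - p_1)\ln p_2}$, $0 < p_1 < 1 - p_2$
$F_{P_1 \mid P_2}(p_1 \mid p_2) = \dfrac{\ln(1 - p_1)}{\ln p_2}$, $0 < p_1 < 1 - p_2$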
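As a worked illustration, the opening squirrel questions can be answered from the squirrel model $P_1 \sim U(0,1)$, $P_2 \mid P_1 = p_1 \sim U(0, 1 - p_1)$, which gives $F_{P_1 \mid P_2}(p_1 \mid p_2) = \ln(1 - p_1)/\ln p_2$ and $\Pr(P_1 < \tfrac12, P_2 < \tfrac12) = -\tfrac12 \ln 0.5$. The value 0.05 is the one quoted in those questions; reading "one survives" as conditioning on a named squirrel is an assumption made here.

```latex
\begin{align*}
\Pr(P_2 > 0.5 \mid P_1 = 0.05) &= 1 - \frac{0.5}{1 - 0.05} = \frac{0.45}{0.95} \approx 0.47
  && \text{(forward: } P_2 \mid P_1 \sim U(0,\, 0.95)\text{)} \\
\Pr(P_1 > 0.5 \mid P_2 = 0.05) &= 1 - \frac{\ln(1 - 0.5)}{\ln 0.05} \approx 1 - 0.231 = 0.769
  && \text{(backward: uses } F_{P_1 \mid P_2}\text{)} \\
\Pr(P_2 < \tfrac12 \mid P_1 < \tfrac12) &= \frac{-\tfrac12 \ln 0.5}{\tfrac12} \approx 0.69 \\
\Pr(P_1 < \tfrac12 \mid P_2 < \tfrac12) &= \frac{-\tfrac12 \ln 0.5}{\tfrac12 - \tfrac12 \ln 0.5} \approx 0.41
\end{align*}
```

The forward answer needs only the uniform conditional; the backward answer needs the marginal of $P_2$ as a normalising constant.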

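Normalising constant
$P_1$: easy, $U(0,1)$
$P_2 \mid P_1 = p_1$: easy, $U(0, 1 - p_1)$
$\Pr(A \mid B)\Pr(B) = \Pr(A \text{ and } B) = \Pr(B \mid A)\Pr(A)$
$f_{P_1 \mid P_2}(p_1 \mid p_2) = \dfrac{f_{P_1 P_2}(p_1, p_2)}{f_{P_2}(p_2)} = \dfrac{f_{P_2 \mid P_1}(p_2 \mid p_1)\, f_{P_1}(p_1)}{f_{P_2}(p_2)}$
$f_{P_2}(p_2) = \int_{\text{all } p_1} f_{P_2 \mid P_1}(p_2 \mid p_1)\, f_{P_1}(p_1)\, dp_1$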

SIR: Sampling - Importance Resampling
Objective: sample Y from continuous pdf $f_Y(y) \propto h(y)$
Y not simply available; but simple to sample X from suitable $f_X(x)$
1 Sample n values $x_i$ of X
2 For each compute $r(x_i) = \dfrac{h(x_i)}{f_X(x_i)}$; $s = \sum_i r(x_i)$; $w(x_i) = \dfrac{r(x_i)}{s}$
3 Re-sample m values $Y^*$ from discrete dist: possible values $x_i$, probs $\Pr(Y^* = x_i) = w(x_i)$
Typically x, y multivariate. Illustrated by univariate
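The three SIR steps can be sketched as follows. The function name `sir` and its arguments are hypothetical, and the target (unnormalised half-normal, h(y) = exp(-y^2/2) on y >= 0) with Exp(1) proposals is an assumption chosen for illustration.

```python
import math
import random

def sir(h, sample_x, pdf_x, n, m):
    """Sampling - importance resampling: return m draws approximately from f_Y ∝ h."""
    xs = [sample_x() for _ in range(n)]        # step 1: sample n values of X
    r = [h(x) / pdf_x(x) for x in xs]          # step 2: ratios h / f_X
    s = sum(r)
    w = [ri / s for ri in r]                   #         normalised weights
    return random.choices(xs, weights=w, k=m)  # step 3: resample with probs w

random.seed(2)
ys = sir(
    h=lambda y: math.exp(-0.5 * y * y),        # assumed target, up to a constant
    sample_x=lambda: random.expovariate(1.0),  # assumed proposal: Exp(1)
    pdf_x=lambda x: math.exp(-x),
    n=50_000,
    m=20_000,
)
sample_mean = sum(ys) / len(ys)
print(round(sample_mean, 3))  # half-normal mean is sqrt(2/pi), about 0.798
```

Resampling is done with replacement, so m may exceed the number of distinct x values; the approximation improves as n grows relative to m.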

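LLN
Recall Law of Large Numbers: expected values and sample means
$x_i$ iid samples of random variable X
$\bar{x}_n = \dfrac{1}{n}\sum_i x_i \to E[X] = \int x f_X(x)\, dx$ as $n \to \infty$
$\dfrac{1}{n}\sum_i g(x_i) \to E[g(X)] = \int g(x) f_X(x)\, dx$ as $n \to \infty$
CLT: as $n \to \infty$, $\bar{x}_n \sim N\big(E[X], \mathrm{Var}(X)/n\big)$
95% of values of $\bar{x}_n$ within $E[X] \pm 2\,SD(X)/\sqrt{n}$
95% of values of $\tfrac{1}{n}\sum_i g(x_i)$ within $E[g(X)] \pm 2\,SD(g(X))/\sqrt{n}$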

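Normalising constant by ratios
$f_Y(y) \propto h(y)$; normalising constant $\int h(y)\, dy$ not available explicitly
Suitable $f_X(x)$ available; simple to sample
1 Sample n values $x_i$ of X
2 For each compute $r(x_i) = \dfrac{h(x_i)}{f_X(x_i)}$
3 $s = \sum_i r(x_i)$; normalising constant $\approx \dfrac{s}{n} = \text{avg } r(x_i)$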

Normalising constant by ratios: proof
$\int_{\text{all } y} h(y)\, dy = \int_{\text{all } y} \dfrac{h(y)}{f_X(y)} f_X(y)\, dy = \int r(x) f_X(x)\, dx = E[r(X)] \approx \dfrac{1}{n}\sum_i r(x_i)$ by LLN (if well defined)
Properties of convergence to $E[r(X)]$ depend on $\mathrm{Var}(r(X))$
Ratios all equal: simpler methods available

Normalising constant: discrete
Discrete by SIR (illustration, never needed)

  Possible Y       0      1      2      3      Sum
  h(y)             2      1      3      0.5    6.5  (normalising constant)
  f(x)             0.25   0.25   0.25   0.25
  ratio h/f        8      4      12     2

n = 1000 samples: sum of ratios 6574; avg ratio 6.574
[Spreadsheet extract: columns i, X, ratio, w, cumulative weights (for VLOOKUP); rows i = 1, ..., 1000]

Normalising constant: continuous
E.g. $f_Y(y) \propto e^{-y^2/2}$, $y \ge 0$ (quadratic)
Using e.g. $f_X(x) = e^{-x}$, $x \ge 0$
$r(x) = \dfrac{e^{-x^2/2}}{e^{-x}}$
Here can show $\int_0^\infty e^{-y^2/2}\, dy = \sqrt{\pi/2} \approx 1.253$

  Theory 1.253    Avg ratio 1.267    Sum ratio 1267.3

  i      X      f_X(x)  f_Y(x)  Ratio  Wts
  1      0.279  0.757   0.962   1.271  0.001
  2      1.263  0.283   0.450   1.593  0.001
  3      0.584  0.558   0.843   1.512  0.001
  4      0.359  0.698   0.937   1.343  0.001
  5      4.944  0.007   0.000   0.001  0.000
  ...
  1000   0.973  0.378   0.623   1.648  0.001

(The $f_Y(x)$ column holds the unnormalised values $e^{-x^2/2}$.)
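The average-of-ratios estimate for this example can be sketched as below, assuming the half-normal target h(y) = exp(-y^2/2) on y >= 0 and Exp(1) proposals. The seed and variable names are arbitrary, so the numbers will not match the spreadsheet values exactly.

```python
import math
import random

random.seed(3)
n = 1000  # same number of samples as the spreadsheet illustration

# Sample X ~ Exp(1) and form ratios r(x) = h(x) / f_X(x).
xs = [random.expovariate(1.0) for _ in range(n)]
ratios = [math.exp(-0.5 * x * x) / math.exp(-x) for x in xs]

estimate = sum(ratios) / n                # avg ratio estimates the integral of h
var_r = sum((r - estimate) ** 2 for r in ratios) / (n - 1)
std_err = math.sqrt(var_r / n)            # CLT: spread of the estimate
theory = math.sqrt(math.pi / 2)           # exact value, about 1.2533

print(round(estimate, 3), round(std_err, 3), round(theory, 3))
```

The standard error shows why the quality of the estimate depends on Var(r(X)): here the ratios are bounded above by exp(0.5), so the estimate is stable even with n = 1000.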

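SIR: theory
$Y^*$ discrete dist: $\Pr(Y^* = x_i) = \dfrac{r(x_i)}{\sum_j r(x_j)}$, $i = 1, \dots, n$
$\Pr(Y^* \le a) = \dfrac{\sum_{x_i \le a} r(x_i)}{\sum_i r(x_i)} = \dfrac{\sum_i r(x_i) I_a(x_i)}{\sum_i r(x_i)}$ with $I_a(x) = 1$ if $x \le a$; else 0
$= \dfrac{\frac{1}{n}\sum_i r(x_i) I_a(x_i)}{\frac{1}{n}\sum_i r(x_i)} \to \dfrac{E[r(X) I_a(X)]}{E[r(X)]}$ as $n \to \infty$
$= \dfrac{\int r(x) I_a(x) f_X(x)\, dx}{\int r(x) f_X(x)\, dx} = \dfrac{\int_{x \le a} r(x) f_X(x)\, dx}{\int r(x) f_X(x)\, dx} = \dfrac{\int_{x \le a} h(x)\, dx}{\int h(x)\, dx} = \int_{y \le a} f_Y(y)\, dy$, as required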

SIR - Weighted bootstrap
Squirrels: to study dist of $P_1$ given $P_2 = 0.7$
  Generate many $P_1$
  Resample, preferentially those $P_1$ for which $P_2 = 0.7$ is likely
More generally: to think backwards given evidence/data
  Generate many potential predecessors/causes
  Resample, preferring those for which data is likely
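A minimal sketch of the weighted bootstrap for the squirrels, assuming the model P1 ~ U(0,1) and P2 | P1 = p1 ~ U(0, 1 - p1). The weight for each generated p1 is the conditional density of P2 at the observed value; the checkpoint `a = 0.15` and all variable names are chosen here for illustration.

```python
import math
import random

random.seed(4)
p2_obs = 0.7            # observed share eaten by the second squirrel
n, m = 100_000, 20_000  # generated and resampled counts (arbitrary)

# Generate many candidate P1 values from the forward model.
p1 = [random.random() for _ in range(n)]

# Weight = density of P2 at p2_obs given P1 = p1: 1/(1 - p1) if p2_obs < 1 - p1, else 0.
w = [1.0 / (1.0 - a) if a < 1.0 - p2_obs else 0.0 for a in p1]

# Resample, preferring candidates under which the observation is likely.
p1_given = random.choices(p1, weights=w, k=m)

a = 0.15
empirical = sum(v <= a for v in p1_given) / m
theory = math.log(1 - a) / math.log(p2_obs)   # F(a | p2) = ln(1 - a) / ln(p2)
print(round(empirical, 3), round(theory, 3))
```

Candidates with p1 >= 0.3 receive weight zero, because the second squirrel could not then have eaten 0.7 of the nut.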

AR model: $Y_t$ given $Y_{t+1} = 3$
Sample $Y_t$, e.g. by sampling recursively
Fit distribution: pdf($Y_t$)
For each $Y_t$ compute pdf($Y_{t+1} = 3 \mid Y_t$)
Form ratios
Resample, preferring $Y_t$ for which $Y_{t+1} = 3$ is likely

alpha = 0.9
  Summaries          Theory
  Mean   -0.0        E[Y]    0
  Var    5.78        Var[Y]  5.26

But given that everything is Gaussian ...

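AR model: $Y_t$ given $Y_{t+1} = 3$
pdf($Y_t$): $Y_t \sim N(0, \sigma_y^2)$, $\sigma_y^2 = \dfrac{1}{1 - \alpha^2}$; $f(y) \propto \exp\Big(-\dfrac{y^2}{2\sigma_y^2}\Big)$
pdf($Y_{t+1} = 3 \mid Y_t = y$): $N(\alpha y, 1)$ evaluated at 3; $r(y) \propto \exp\Big(-\dfrac{(3 - \alpha y)^2}{2}\Big)$
Ratio times pdf: $r(y) f(y) \propto \exp\Big(-\dfrac{(y - 3\alpha)^2}{2}\Big)$
So pdf is $N(3\alpha, 1)$ and prop const available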

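AR model: $Y_t$ given $Y_{t+1} = 3$
Simpler, given Gaussian:
$\begin{pmatrix} Y_t \\ Y_{t+1} \end{pmatrix} \sim BVN\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},\ \sigma_y^2 \begin{pmatrix} 1 & \alpha \\ \alpha & 1 \end{pmatrix} \right)$; $\sigma_y^2 = \dfrac{1}{1 - \alpha^2}$
$Y_t \mid Y_{t+1} = 3 \sim N\big(3\alpha,\ \sigma_y^2(1 - \alpha^2)\big) = N(3\alpha, 1)$
cf. $Y_{t+1} \mid Y_t = y_t \sim N(\alpha y_t, 1)$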
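The resampling recipe for the AR model can be checked against the Gaussian answer, which for a stationary AR(1) with unit innovation variance is Y_t | Y_{t+1} = 3 ~ N(3 alpha, 1). This is a sketch under assumptions: alpha = 0.9, Y_t is drawn directly from its stationary N(0, 1/(1 - alpha^2)) distribution rather than by recursion, and the sample sizes are arbitrary.

```python
import math
import random

random.seed(5)
alpha, y_next = 0.9, 3.0
n, m = 200_000, 50_000

# Stationary distribution of Y_t: N(0, 1 / (1 - alpha^2)) for unit innovation variance.
sd_y = math.sqrt(1.0 / (1.0 - alpha ** 2))
y_t = [random.gauss(0.0, sd_y) for _ in range(n)]

# Ratio for each candidate: N(alpha * y, 1) density at y_next, up to a constant.
ratios = [math.exp(-0.5 * (y_next - alpha * y) ** 2) for y in y_t]

# Resample, preferring Y_t for which Y_{t+1} = 3 is likely.
y_post = random.choices(y_t, weights=ratios, k=m)

post_mean = sum(y_post) / m
post_var = sum((y - post_mean) ** 2 for y in y_post) / m
print(round(post_mean, 2), round(post_var, 2))  # theory: mean 2.7, variance 1
```

`random.choices` accepts unnormalised weights, so the ratios do not need to be divided by their sum first.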

SIR and conditional distributions
Recall: properties of convergence to $E[r(X)]$ depend on $\mathrm{Var}(r(X))$
Natural for conditional distributions

Working backwards
Data: y; (aspects of) process that gave rise to y?
Model: y realisation of rv Y; Y can be simulated; data generating system (DGS): Z, transform $Z \to Y$, pdf $f_Y(y; \theta)$
Seek: aspects of system: prob dist $f_Y(y; \theta)$; values of parameters $\theta$, given y
Procedure:
  Simulate random $\theta$ from $f(\theta)$: knowledge, absent data
  Prefer values for which likelihood of these data is high: $L(\theta; y) = f_Y(y; \theta) = \Pr(Y = y; \theta)$
  OR equivalent, if algebra/models simple