MFM Practitioner Module: Risk & Asset Allocation. John Dodson. January 28, 2015


Bayesian Estimator

Review: Bayes' rule tells us how to reverse the roles in conditional probability,

$$ f_{Y \mid X \in \{x\}}(y) \propto f_Y(y)\, f_{X \mid Y \in \{y\}}(x) $$

and it tells us how to update the model for $Y$, combining data with the prior density to yield the posterior density.

Bayesian Statistics: When we interpret $Y$ as the (unknown) parameter associated with the characterization of $X$, and the event $X \in \{x\}$ as an observation, Bayes' rule provides an important foundation for estimation.

In Bayesian estimation we do not endow the sample with a characterization; rather, we endow the parameters with a characterization, described by hyper-parameters. This is the prior characterization. We then update the characterization by conditioning on the observed data using Bayes' rule, which leads to the posterior characterization from which we can build estimates:

$$ f_{\theta \mid Y^{(N)}}(\theta) \propto f_\theta(\theta)\, f_{Y^{(N)} \mid \theta}\!\left(y^{(N)}\right) $$

The approach is inherently biased: the prior is ideally based on beliefs about the results formed before any data have been observed. The approach can be appropriate when the statistician is also a subject-matter expert (such as you).
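To make the update concrete, the sketch below approximates the posterior of a normal mean on a grid by multiplying an unnormalized prior by the likelihood of the sample. The simulated data, the $\mathcal{N}(0,1)$ prior, and the grid are hypothetical choices for illustration, not part of the slides.

```python
import numpy as np

# Hypothetical illustration: posterior for the mean mu of X ~ N(mu, 1),
# computed on a grid with a N(0, 1) prior and a small simulated sample.
rng = np.random.default_rng(0)
y = rng.normal(0.5, 1.0, size=20)              # observed sample y^(N)

grid = np.linspace(-3.0, 3.0, 601)             # candidate values of mu
dx = grid[1] - grid[0]
prior = np.exp(-0.5 * grid**2)                 # unnormalized N(0, 1) prior
# log-likelihood of the whole sample at each grid point (constants dropped)
loglik = -0.5 * ((y[:, None] - grid[None, :])**2).sum(axis=0)
post = prior * np.exp(loglik - loglik.max())   # prior times likelihood
post /= post.sum() * dx                        # normalize to a density

print("posterior mean of mu:", (grid * post).sum() * dx)
```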

In principle, the characterization of the parameter prior can be completely arbitrary.

Conjugate: A judicious choice exists, however, which is to choose a prior from a family that is closed under updates. Such a prior is termed a conjugate prior. With a conjugate prior, updating can be expressed as an algebraic transformation of the hyper-parameters, involving the prior values, the sufficient statistics, and the sample size.

Improper: It can be useful to consider a prior that carries no information, for example a prior whose density is uniform over the space of parameter values. This is termed an improper prior.

Important Example: There is a conjugate prior for the univariate normal, $X \sim \mathcal{N}(\mu, \sigma^2)$. It has a somewhat complicated form, but it is nonetheless very useful. It is termed the normal-inverse-gamma distribution, with hyper-parameters $\mu_0$, $\sigma_0^2$, $\lambda_0$, and $\nu_0$, and it is defined by the mixture

$$ \mu \mid \sigma^2 \sim \mathcal{N}\!\left(\mu_0, \frac{\sigma^2}{\lambda_0}\right) \qquad \sigma^2 \sim \mathcal{G}^{-1}\!\left(\frac{\nu_0}{2}, \frac{\nu_0}{2}\sigma_0^2\right) $$

The posterior is in the same family, with updated hyper-parameters based on the sufficient statistics $t_1 = \sum_i x_i$ and $t_2 = \sum_i x_i^2$:

$$ \lambda_N = \lambda_0 + N \qquad \nu_N = \nu_0 + N $$

$$ \mu_N = \frac{\lambda_0 \mu_0 + t_1}{\lambda_N} \qquad \sigma_N^2 = \frac{\nu_0 \sigma_0^2 + \lambda_0 \mu_0^2 + t_2 - \lambda_N \mu_N^2}{\nu_N} $$
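Because the update is purely algebraic, it is only a few lines of code. The following is a minimal sketch of the formulas above; the function name and interface are assumptions, not from the slides.

```python
import numpy as np

def nig_update(mu0, sigma0_sq, lam0, nu0, x):
    """Posterior hyper-parameters of the normal-inverse-gamma prior
    given an i.i.d. sample x, via the sufficient statistics
    t1 = sum(x_i) and t2 = sum(x_i**2)."""
    x = np.asarray(x, dtype=float)
    N = x.size
    t1, t2 = x.sum(), (x**2).sum()

    lam_N = lam0 + N
    nu_N = nu0 + N
    mu_N = (lam0 * mu0 + t1) / lam_N
    sigma_N_sq = (nu0 * sigma0_sq + lam0 * mu0**2 + t2
                  - lam_N * mu_N**2) / nu_N
    return mu_N, sigma_N_sq, lam_N, nu_N
```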

Important Example: The preceding can be generalized to the multivariate normal, $X \sim \mathcal{N}(\mu, \Sigma)$. We call the conjugate prior for the parameters $(\mu, \Sigma)$ the normal-inverse-Wishart,

$$ \mu \mid \Sigma \sim \mathcal{N}\!\left(\mu_0, \frac{1}{\lambda_0}\Sigma\right) \qquad \Sigma \sim \mathcal{W}^{-1}(\nu_0, \nu_0 \Sigma_0) $$

with hyper-parameters $\mu_0 \in \mathbb{R}^M$, $\Sigma_0 \in \mathbb{R}^{M \times M}$ positive definite, $\lambda_0 > 0$, and $\nu_0 > M - 1$, where $M = \dim X$. The hyper-parameters conditional on a sample $x \in \mathbb{R}^{N \times M}$ are

$$ \lambda_N = \lambda_0 + N \qquad \nu_N = \nu_0 + N $$

$$ \mu_N = \frac{\lambda_0 \mu_0 + x' \mathbf{1}}{\lambda_N} \qquad \Sigma_N = \frac{\nu_0 \Sigma_0 + \lambda_0 \mu_0 \mu_0' + x' x - \lambda_N \mu_N \mu_N'}{\nu_N} $$
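A matching sketch for the multivariate case, with the same caveats: $x'\mathbf{1}$ becomes a column sum and $x'x$ a cross-product matrix.

```python
import numpy as np

def niw_update(mu0, Sigma0, lam0, nu0, x):
    """Posterior hyper-parameters of the normal-inverse-Wishart prior
    given a sample x of shape (N, M), one observation per row."""
    x = np.asarray(x, dtype=float)
    N = x.shape[0]

    lam_N = lam0 + N
    nu_N = nu0 + N
    mu_N = (lam0 * mu0 + x.sum(axis=0)) / lam_N          # x' 1 term
    Sigma_N = (nu0 * Sigma0 + lam0 * np.outer(mu0, mu0)
               + x.T @ x - lam_N * np.outer(mu_N, mu_N)) / nu_N
    return mu_N, Sigma_N, lam_N, nu_N
```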

Sampling from the Posterior: We will see in the next chapter that a point estimate for the market vector parameters is not adequate. We will need to work with a range of possible values for the market vector parameter $\theta$ that is consistent with our objective data and our subjective views.

Gibbs Sampling: Conjugate distributions for the parameters of a characterization may be easy to update, but they can be difficult to sample from. It turns out that if the conditional characterizations of the individual parameters can be easily sampled, a nested iteration scheme converges to the joint characterization:

$$ \theta_j^{(i)} \sim \theta_j \mid \theta_1^{(i-1)}, \dots, \theta_{j-1}^{(i-1)}, \theta_{j+1}^{(i-1)}, \dots, Y^{(N)} $$

where $\lim_{i \to \infty} \theta^{(i)} \sim \theta \mid Y^{(N)}$.
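For the univariate normal, both conditional characterizations are standard, so a two-block Gibbs sampler takes only a few lines. The sketch below assumes the improper Jeffreys prior $p(\mu, \sigma^2) \propto 1/\sigma^2$, under which $\mu \mid \sigma^2$ is normal and $\sigma^2 \mid \mu$ is inverse-gamma; the sample and the iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.5, size=100)        # hypothetical sample
N, xbar = x.size, x.mean()

draws = np.empty((5000, 2))
mu, sigma_sq = xbar, x.var()              # initial state theta^(0)
for i in range(draws.shape[0]):
    # mu | sigma^2, data ~ N(xbar, sigma^2 / N)
    mu = rng.normal(xbar, np.sqrt(sigma_sq / N))
    # sigma^2 | mu, data ~ Inv-Gamma(N/2, sum((x - mu)^2) / 2),
    # sampled as the reciprocal of a gamma variate
    sigma_sq = 1.0 / rng.gamma(N / 2.0, 2.0 / ((x - mu)**2).sum())
    draws[i] = mu, sigma_sq               # converges to theta | Y^(N)
```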

Pseudo-Data: A useful application of the conjugate prior is to imagine that the prior itself is a posterior with respect to some imaginary (random) dataset $\tilde{Y}^{(\tilde{N})} = (\tilde{X}_1, \dots, \tilde{X}_{\tilde{N}})$, where each $\tilde{X}_i$ is drawn independently from a known distribution representing our beliefs:

$$ f_{\theta \mid Y^{(N)}}(\theta) \propto f_{Y^{(N)} \mid \theta}(y)\, f_\theta(\theta) = f_{Y^{(N)} \mid \theta}(y)\, f_{\tilde{Y}^{(\tilde{N})} \mid \theta}(\tilde{y})\, f_\theta^{\mathrm{im}}(\theta) = f_{Y^{(N+\tilde{N})} \mid \theta}\!\left((y, \tilde{y})\right) f_\theta^{\mathrm{im}}(\theta) $$

Application: If we want to simulate from the posterior, we simply append to the actual sample a pseudo-sample of variates drawn from the characterization of $X$, then proceed with non-Bayesian estimation (e.g. MLE).
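A minimal sketch of this recipe, with hypothetical choices throughout: beliefs encoded as $\mathcal{N}(0.05, 0.10^2)$ and $\tilde{N} = 20$. Each pass appends a fresh pseudo-sample to the actual data and records the plain MLE of $(\mu, \sigma^2)$.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(0.03, 0.12, size=60)       # actual sample (hypothetical)
N_tilde = 20                              # pseudo-sample size (assumed)

draws = np.empty((1000, 2))
for i in range(draws.shape[0]):
    # fresh pseudo-sample drawn from the believed characterization of X
    x_tilde = rng.normal(0.05, 0.10, size=N_tilde)
    combined = np.concatenate([x, x_tilde])
    draws[i] = combined.mean(), combined.var()   # non-Bayesian MLE step
```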

Bayesian Estimator: Once we know the posterior distribution $f_{\theta \mid Y^{(N)}}$, we still need to provide a result. Some naïve approaches include

- mode and modal dispersion, $\hat{\theta} = \arg\max f_{\theta \mid Y^{(N)}}$
- mean and covariance, $\hat{\theta} = \mathrm{E}\!\left(\theta \mid Y^{(N)}\right)$

A more sophisticated approach is to find the minimum-measure region for a given level of confidence $1 - \alpha$. In the univariate setting, this is the interval $(\theta_0, \theta_1)$ for

$$ \arg\min_{\theta_0}\; \theta_1(\theta_0) - \theta_0 \quad \text{where} \quad \theta_1(\theta_0) = Q_{\theta \mid Y^{(N)}}\!\left(F_{\theta \mid Y^{(N)}}(\theta_0) + 1 - \alpha\right) $$

Loss Function: The ideal approach is to find a parameter value that minimizes the expected value of some loss function customized to the subsequent application.
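The minimum-measure interval can be found numerically by scanning candidate lower endpoints exactly as the formula suggests. The sketch below substitutes a hypothetical Gamma posterior for $f_{\theta \mid Y^{(N)}}$.

```python
import numpy as np
from scipy import stats

# Minimum-length credible interval: scan theta_0 and set
# theta_1 = Q(F(theta_0) + 1 - alpha), then minimize theta_1 - theta_0.
post = stats.gamma(a=3, scale=2)             # stand-in posterior (assumed)
alpha = 0.05

p0 = np.linspace(1e-6, alpha - 1e-6, 2000)   # F(theta_0) must lie in (0, alpha)
theta0 = post.ppf(p0)                        # candidate lower endpoints
theta1 = post.ppf(p0 + 1 - alpha)            # matching upper endpoints
best = np.argmin(theta1 - theta0)
print("minimum-measure interval:", (theta0[best], theta1[best]))
```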

The first step toward determining our subjective model for the markets is to codify our experience. Two indirect ways of approaching this are to analyze

- model portfolio allocations
- constraints on acceptable allocations

Allocation-Implied Characterization: The clearest expression we can observe of the opinion of a professional asset manager is the allocation he chooses. If we can interpret this allocation as the solution to a satisfaction-optimization problem with respect to some measure of performance, we can attempt to invert that solution to determine the implied market characterization; a sketch of one such inversion follows.
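One concrete version of this inversion assumes the manager solved the mean-variance problem $\max_w\, w'\mu - \tfrac{\lambda}{2}\, w' \Sigma w$, whose first-order condition $\mu = \lambda \Sigma w^*$ recovers the implied expected returns from the observed allocation. The covariance matrix and risk-aversion coefficient below are illustrative assumptions.

```python
import numpy as np

# Reverse optimization under an assumed mean-variance objective:
# the implied expected returns are mu = lambda * Sigma @ w_star.
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])     # assumed market covariance
w_star = np.array([0.6, 0.4])        # observed model-portfolio allocation
risk_aversion = 3.0                  # assumed coefficient lambda

mu_implied = risk_aversion * Sigma @ w_star
print("allocation-implied expected returns:", mu_implied)
```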