Geometry of Transformations of Random Variables

Univariate distributions

We are interested in the problem of finding the distribution of $Y = h(X)$ when the transformation $h$ is one-to-one, so that there is a unique $x = h^{-1}(y)$ for each $x$ and $y$ with positive probability or density. In the case of discrete random variables, the transformation is simple:

$$P(Y = y) = P(h(X) = y) = P\big(X = h^{-1}(y)\big).$$

In contrast, for absolutely continuous random variables, the density $f_Y(y)$ is in general not equal to $f_X(h^{-1}(y))$. The reason is that the geometry of the transformation becomes more complex as the dimension increases. For discrete distributions, probability is located at zero-dimensional points, and transformations do not affect the size of points. For univariate absolutely continuous distributions, however, probability is associated with the integral of a density over a one-dimensional line segment. Transformations can change the lengths of intervals, as shown here, where an interval of length $dx$ is transformed to a smaller interval of length $dy$.

[Figure 1: Transformation $Y = h(X)$. Axis marks: $x$ and $x + dx$; $y$ and $y + dy$.]

The figure shows $Y = h(X)$ over a very small interval so that $h$ appears to be essentially linear. For small $dx$, the probability in the interval $(x, x + dx)$ is approximately $f_X(x)\,dx$. The density at $y = h(x)$ is the limit of the ratio of this probability to the length of the interval between $h(x)$ and $h(x + dx)$, which is $|h(x + dx) - h(x)|$. (If $h'(x) < 0$, then $h(x + dx) < h(x)$, so the absolute value is needed.) As $h$ is differentiable, the approximation $h(x + dx) \approx h(x) + h'(x)\,dx$ is accurate for very small $dx$, and it follows that the transformed interval has approximate length $|h'(x)|\,dx$. The density at $y$ is then

$$f_Y(y) = \frac{f_X(x)\,dx}{|h'(x)|\,dx} = \frac{f_X(h^{-1}(y))}{|h'(h^{-1}(y))|}$$

after applying $x = h^{-1}(y)$.

Multivariate Distributions

We would like to extend this idea to joint densities. If random variables $X = (X_1, \dots, X_n)$ have joint density $f_X$, we aim to find the joint density $f_Y$ of the random variables $Y = (Y_1, \dots, Y_n)$, where

we write $Y = h(X)$ to mean $Y_i = h_i(X_1, \dots, X_n)$ for $i = 1, \dots, n$. We will assume that $h$ is a differentiable bijection, which means that all partial derivatives $\partial h_i / \partial x_j$ exist and that the vector equation $(y_1, \dots, y_n) = h(x_1, \dots, x_n)$ has a unique solution (such that $f_X > 0$) with $(x_1, \dots, x_n) = h^{-1}(y_1, \dots, y_n)$.

Bivariate Distributions.

We motivate the general answer by examining the bivariate case $(Y_1, Y_2) = h(X_1, X_2)$. The density at $(x_1, x_2)$ is the limiting ratio of the probability in a rectangle with a corner at $(x_1, x_2)$ and sides of length $dx_1$ and $dx_2$ to the area of the rectangle, $dx_1\,dx_2$. The density at $(y_1, y_2) = h(x_1, x_2)$ will depend on the geometry of the transformation of the corners of this rectangle.

[Figure 2: Rectangle before transformation, with corners $(x_1, x_2)$, $(x_1 + dx_1, x_2)$, $(x_1 + dx_1, x_2 + dx_2)$, and $(x_1, x_2 + dx_2)$.]

By the partial differentiability of $h$ in each dimension, the following approximations hold:

$$h_1(x_1 + dx_1, x_2 + dx_2) \approx h_1(x_1, x_2) + \frac{\partial h_1}{\partial x_1}\,dx_1 + \frac{\partial h_1}{\partial x_2}\,dx_2$$

$$h_2(x_1 + dx_1, x_2 + dx_2) \approx h_2(x_1, x_2) + \frac{\partial h_2}{\partial x_1}\,dx_1 + \frac{\partial h_2}{\partial x_2}\,dx_2$$

To simplify the expressions, let $y_1 = h_1(x_1, x_2)$, $y_2 = h_2(x_1, x_2)$, $a = \frac{\partial h_1}{\partial x_1}\,dx_1$, $b = \frac{\partial h_2}{\partial x_1}\,dx_1$, $c = \frac{\partial h_1}{\partial x_2}\,dx_2$, and $d = \frac{\partial h_2}{\partial x_2}\,dx_2$. With this notation, the four corners of the rectangle are mapped approximately as follows:

$(x_1, x_2) \mapsto (y_1, y_2)$
$(x_1 + dx_1, x_2) \mapsto (y_1 + a, y_2 + b)$
$(x_1 + dx_1, x_2 + dx_2) \mapsto (y_1 + a + c, y_2 + b + d)$
$(x_1, x_2 + dx_2) \mapsto (y_1 + c, y_2 + d)$

These points will not be arranged as a rectangle in general, but will form a parallelogram. The parallelogram can be understood geometrically as being formed by the two vectors $(a, b)$ and $(c, d)$

extending from $(y_1, y_2)$ to form two adjacent sides, with the other sides then being parallel and equal in length to these. The proper scaling of the density $f_Y(y)$ will then depend on the area of this parallelogram relative to the original rectangle.

The following figure shows a parallelogram where the lower left corner corresponds to the point $(y_1, y_2)$ and the two adjacent sides are described by the vectors $(a, b)$ and $(c, d)$. In this figure, $a, b, c, d > 0$, which corresponds to all of the partial derivatives $\partial h_i / \partial x_j$ being positive. In addition, $a > c$ and $d > b$, so that $ad > bc$. The following geometric argument relies on these choices, but the result will be true in general.

[Figure 3: Parallelogram after transformation, with the lengths $a$, $b$, $c$, and $d$ marked.]

There are two rectangles with dashed lines added to the figure. The larger of these rectangles has width $a$ and height $d$, and the smaller one has width $c$ and height $b$. In addition, there are two small dotted lines added to the figure which create six triangles and two larger polygons. Notice that the six triangles come in pairs which are the same size and orientation. Each pair includes one shaded triangle within the parallelogram and one outside. When the shading of the triangles is reversed, we get the following figure. The total shaded area is the same and is equal to $ad - bc$, as it is the difference in the areas of the rectangles.

Thus, the area of the parallelogram depends only on the lengths and orientations of the vectors $(a, b)$ and $(c, d)$. When these vectors are combined to form a matrix, we see that the area is equal to the absolute value of the determinant of this matrix:

$$|ad - bc| = \left| \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right|.$$

In fact, the absolute value of this determinant measures the area of the corresponding parallelogram for any real $a, b, c, d$, which can be shown by working through all cases. If we substitute back in our original expressions, we see that the area of the parallelogram is

$$|ad - bc| = \left| \left( \frac{\partial h_1}{\partial x_1}\,dx_1 \right) \left( \frac{\partial h_2}{\partial x_2}\,dx_2 \right) - \left( \frac{\partial h_2}{\partial x_1}\,dx_1 \right) \left( \frac{\partial h_1}{\partial x_2}\,dx_2 \right) \right| = |J|\,dx_1\,dx_2$$

where

$$J = \det \begin{pmatrix} \frac{\partial h_1}{\partial x_1} & \frac{\partial h_1}{\partial x_2} \\ \frac{\partial h_2}{\partial x_1} & \frac{\partial h_2}{\partial x_2} \end{pmatrix}$$

is called the Jacobian or Jacobian derivative of the transformation.

[Figure 4: Equal area. The area of the parallelogram is equal to the difference in the areas of the rectangles.]

The ratio of the area of the parallelogram to the area of the original rectangle is $|J|$, and it follows that the joint density of the random variables $Y_1$ and $Y_2$ is

$$f_Y(y_1, y_2) = \frac{1}{|J(h^{-1}(y_1, y_2))|}\, f_X(h^{-1}(y_1, y_2)).$$

More than two dimensions.

It is natural to then ask how this extends to joint distributions of $n$ random variables. The answer is that the density requires a rescaling which is found by calculating the reciprocal of the absolute value of the Jacobian derivative for this larger transformation, which is simply the determinant of a larger matrix of partial derivatives.

The derivation above found the Jacobian derivative by computing $\partial y_j / \partial x_i$ for each $i, j$, but it is also possible to take derivatives of the inverse relationships, $\partial x_i / \partial y_j$, and find the corresponding Jacobian derivative. The value of this second Jacobian derivative is the reciprocal of the first. In order to better distinguish these cases, it is useful to introduce a different notation that makes the direction of differentiation clear. We can define the Jacobian derivative as follows.

$$\frac{\partial(y_1, \dots, y_n)}{\partial(x_1, \dots, x_n)} = \det \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_n}{\partial x_1} & \cdots & \frac{\partial y_n}{\partial x_n} \end{pmatrix}$$
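The claim that a small rectangle is stretched by a factor of $|J|$ can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the polar-to-Cartesian map as an illustrative transformation (chosen only because its Jacobian is known to equal $r$), and the function names are hypothetical. It maps the four corners of a small rectangle, measures the area of the image quadrilateral with the shoelace formula, and compares the area ratio with a finite-difference estimate of the Jacobian.

```python
import math


def h(r, theta):
    """Illustrative transformation (an assumption, not from the notes):
    polar coordinates (r, theta) to Cartesian coordinates (y1, y2)."""
    return r * math.cos(theta), r * math.sin(theta)


def shoelace_area(points):
    """Area of a simple polygon whose vertices are listed in order."""
    total = 0.0
    for i, (x0, y0) in enumerate(points):
        x1, y1 = points[(i + 1) % len(points)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def numerical_jacobian(func, x1, x2, eps=1e-6):
    """Determinant of the 2x2 matrix of partial derivatives of func,
    estimated with central differences."""
    f1_p, f2_p = func(x1 + eps, x2)
    f1_m, f2_m = func(x1 - eps, x2)
    d11 = (f1_p - f1_m) / (2 * eps)  # d h1 / d x1
    d21 = (f2_p - f2_m) / (2 * eps)  # d h2 / d x1
    g1_p, g2_p = func(x1, x2 + eps)
    g1_m, g2_m = func(x1, x2 - eps)
    d12 = (g1_p - g1_m) / (2 * eps)  # d h1 / d x2
    d22 = (g2_p - g2_m) / (2 * eps)  # d h2 / d x2
    return d11 * d22 - d12 * d21


def area_ratio(func, x1, x2, dx1, dx2):
    """Area of the image of the rectangle's corners divided by dx1 * dx2."""
    corners = [(x1, x2), (x1 + dx1, x2), (x1 + dx1, x2 + dx2), (x1, x2 + dx2)]
    image = [func(u, v) for (u, v) in corners]
    return shoelace_area(image) / (dx1 * dx2)


if __name__ == "__main__":
    r, theta = 2.0, 0.5
    print("|J| (finite differences):", abs(numerical_jacobian(h, r, theta)))
    print("area ratio of small rectangle:", area_ratio(h, r, theta, 1e-3, 1e-3))
```

Under these assumptions both printed values should be close to $r = 2$, and the agreement should improve as the rectangle shrinks.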

The density of $Y = (Y_1, \dots, Y_n)$ can then be computed by finding one of two Jacobian derivatives:

$$f_Y(y_1, \dots, y_n) = \frac{1}{\left| \frac{\partial(y_1, \dots, y_n)}{\partial(x_1, \dots, x_n)} \right|}\, f_X(h^{-1}(y_1, \dots, y_n)) = \left| \frac{\partial(x_1, \dots, x_n)}{\partial(y_1, \dots, y_n)} \right| f_X(h^{-1}(y_1, \dots, y_n)).$$

If you simply memorize the expression

$$f_Y(y_1, \dots, y_n)\, |\partial(y_1, \dots, y_n)| = f_X(x_1, \dots, x_n)\, |\partial(x_1, \dots, x_n)|,$$

you can rearrange this algebraically to find either Jacobian and then properly use it or its reciprocal to find the desired density after the transformation.
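As an illustration of using either Jacobian derivative, here is a small sketch under assumptions that are not in the original notes: $X_1$ and $X_2$ are taken to be independent exponential random variables with rate 1, and the transformation is $Y_1 = X_1 + X_2$, $Y_2 = X_1 / (X_1 + X_2)$, with inverse $x_1 = y_1 y_2$, $x_2 = y_1 (1 - y_2)$. The function names are hypothetical. For this choice the forward Jacobian is $-1/(x_1 + x_2)$ and the inverse Jacobian is $-y_1$, so the two are reciprocals, and both routes give $f_Y(y_1, y_2) = y_1 e^{-y_1}$ for $y_1 > 0$ and $0 < y_2 < 1$.

```python
import math


def h_inv(y1, y2):
    """Inverse of the assumed transformation y1 = x1 + x2, y2 = x1 / (x1 + x2)."""
    return y1 * y2, y1 * (1.0 - y2)


def f_x(x1, x2):
    """Joint density of two independent rate-1 exponentials (an assumption)."""
    if x1 <= 0 or x2 <= 0:
        return 0.0
    return math.exp(-(x1 + x2))


def jac_forward(x1, x2):
    """d(y1, y2) / d(x1, x2): determinant of the matrix with rows
    (dy1/dx1, dy1/dx2) = (1, 1) and (dy2/dx1, dy2/dx2) = (x2/s^2, -x1/s^2)."""
    s = x1 + x2
    return 1.0 * (-x1 / s**2) - 1.0 * (x2 / s**2)


def jac_inverse(y1, y2):
    """d(x1, x2) / d(y1, y2): determinant of the matrix with rows
    (dx1/dy1, dx1/dy2) = (y2, y1) and (dx2/dy1, dx2/dy2) = (1 - y2, -y1)."""
    return y2 * (-y1) - y1 * (1.0 - y2)


def f_y_via_forward(y1, y2):
    """Divide f_X by the absolute value of the forward Jacobian."""
    x1, x2 = h_inv(y1, y2)
    return f_x(x1, x2) / abs(jac_forward(x1, x2))


def f_y_via_inverse(y1, y2):
    """Multiply f_X by the absolute value of the inverse Jacobian."""
    x1, x2 = h_inv(y1, y2)
    return f_x(x1, x2) * abs(jac_inverse(y1, y2))


if __name__ == "__main__":
    y1, y2 = 1.5, 0.3
    x1, x2 = h_inv(y1, y2)
    print("product of the two Jacobians:", jac_forward(x1, x2) * jac_inverse(y1, y2))
    print("density via forward Jacobian:", f_y_via_forward(y1, y2))
    print("density via inverse Jacobian:", f_y_via_inverse(y1, y2))
    print("closed form y1 * exp(-y1):", y1 * math.exp(-y1))
```

The inverse route is usually the more convenient one in practice, since the density has to be written in terms of $y$ anyway.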