Error Bars in both X and Y

Error Bars in both X and Y

Wrong ways to fit a line:
1. y(x) = a x + b  (assumes σ_X = 0)
2. x(y) = c y + d  (assumes σ_Y = 0)
3. Split the difference between 1 and 2.
Example: primordial He abundance. Fit [He/H] vs [O/H] and extrapolate the fit line to [O/H] = 0. (Figure: [He/H] vs [O/H] data with extrapolated fit line.)
The correct method is to minimise:
χ²(a, b) = Σ_i ( Y_i − (a X_i + b) )² / ( σ²(Y_i) + a² σ²(X_i) )
Let's see why.
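A quick numerical illustration (not from the slides; the data, noise levels, and variable names below are made up) of why the first two "wrong ways" disagree: when both coordinates scatter, regressing y on x biases the slope low, while regressing x on y biases the implied slope high.

```python
import numpy as np

# Synthetic data: both X and Y carry comparable scatter (values are made up).
rng = np.random.default_rng(1)
true_a, true_b = 2.0, 1.0
t = np.linspace(0.0, 10.0, 200)
X = t + rng.normal(0.0, 0.8, t.size)                     # sigma_X = 0.8
Y = true_a * t + true_b + rng.normal(0.0, 0.8, t.size)   # sigma_Y = 0.8

a_yx, b_yx = np.polyfit(X, Y, 1)   # wrong way 1: regress y on x -> slope biased low
c_xy, d_xy = np.polyfit(Y, X, 1)   # wrong way 2: regress x on y -> implied slope 1/c biased high
print(a_yx, 1.0 / c_xy)            # neither recovers the true slope 2.0
```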

Vector Space Perspective

N data points, M parameters. The model µ(α) defines a parameterised M-dimensional surface in the N-dimensional data space. χ²(α) is the squared distance from the observed data X to the model surface, so the best-fit model is the one closest to the data. For linear models (scaling patterns), the model surface is a flat M-dimensional hyper-plane.

Review: Vector Spaces

Vectors have a direction and a length. Addition of two vectors gives another vector. Scaling a vector stretches its length.
Dot product: a · b = |a| |b| cos θ, where θ is the "angle" between the vectors a and b.
Length of a vector: |a|² = a · a (distance from base to tip).
Distance between vectors: |a − b|.

Ortho-normal Basis Vectors

Ortho-normal basis vectors e_i satisfy e_i · e_j = δ_ij (1 if i = j, 0 if i ≠ j).
Any vector a is a linear combination of the basis vectors e_i, with scale factors a_i = a · e_i:
a = Σ_i a_i e_i = Σ_i (a · e_i) e_i
Example: in 2-D, a = 4 e_x + 3 e_y, so a · e_x = 4 and a · e_y = 3.
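A minimal NumPy check of the expansion a = Σ_i (a · e_i) e_i, using the slide's 2-D example with components 4 and 3 (variable names are mine):

```python
import numpy as np

# Orthonormal basis e_x, e_y and the slide's example vector a = 4 e_x + 3 e_y.
e_x, e_y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
a = np.array([4.0, 3.0])

# The scale factors are the dot products a . e_i ...
a_x, a_y = a @ e_x, a @ e_y        # -> 4.0, 3.0

# ... and summing (a . e_i) e_i rebuilds the vector.
assert np.allclose(a_x * e_x + a_y * e_y, a)
```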

Data Space is a Vector Space

N data points define a vector in the N-dimensional data space:
x = {x_1, x_2, ..., x_N} = x_1 e_1 + x_2 e_2 + ... + x_N e_N
with basis vectors e_1 = {1, 0, ..., 0}, e_2 = {0, 1, ..., 0}, ..., e_N = {0, 0, ..., 1}.
The basis is ortho-normal if e_i · e_j = δ_ij.
The basis vector e_i selects data point x_i: x · e_i = x_i, i.e. data point x_i is the projection of the data vector x along the basis vector e_i.

Non-orthogonal Basis Vectors

In the non-orthogonal case, e_1 · e_2 = cos θ ≠ 0. There are two ways to measure coordinates:
Contravariant coordinates (index high), x^i: project parallel to the basis vectors, x = x^1 e_1 + x^2 e_2 + ... + x^N e_N.
Covariant coordinates (index low), x_i: project perpendicular to the basis vectors, e.g. x_1 = x^1 + x^2 cos θ and x_2 = x^2 + x^1 cos θ.
Metric tensor: g_ij = e_i · e_j; for two unit basis vectors g = [[1, cos θ], [cos θ, 1]], and x_i = Σ_j g_ij x^j.
Dot product: x · y = Σ_ij x^i y^j e_i · e_j = Σ_ij g_ij x^i y^j = Σ_i x_i y^i = Σ_j x^j y_j.
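A small NumPy sketch of these relations; the angle, the component values, and the array names are arbitrary examples of mine:

```python
import numpy as np

theta = np.deg2rad(60)                  # angle between the two basis vectors (example value)
e1 = np.array([1.0, 0.0])
e2 = np.array([np.cos(theta), np.sin(theta)])
E = np.stack([e1, e2])                  # rows are the basis vectors

g = E @ E.T                             # metric tensor g_ij = e_i . e_j = [[1, cosθ], [cosθ, 1]]

x_contra = np.array([2.0, 1.0])         # contravariant components x^i (example values)
y_contra = np.array([0.5, 3.0])
x_cov = g @ x_contra                    # covariant components x_i = sum_j g_ij x^j

# The dot product can be written in several equivalent ways:
x_vec = E.T @ x_contra                  # the Cartesian vector x = x^1 e_1 + x^2 e_2
y_vec = E.T @ y_contra
dot_cartesian = x_vec @ y_vec
dot_metric = x_contra @ g @ y_contra    # sum_ij g_ij x^i y^j
dot_mixed = x_cov @ y_contra            # sum_i x_i y^i
assert np.allclose([dot_metric, dot_mixed], dot_cartesian)
```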

Metric for non-orthonormal Basis Vectors

g_ij = e_i · e_j = [[ e_1·e_1, e_1·e_2 ], [ e_2·e_1, e_2·e_2 ]] = [[ |e_1|², |e_1||e_2| cos θ ], [ |e_1||e_2| cos θ, |e_2|² ]]
The metric is symmetric: g_ij = g_ji. The off-diagonal terms vanish if the basis vectors are orthogonal. The diagonal terms define the (squared) lengths of the basis vectors.

Data Sets and Functions as Vector Spaces

A data set X_i, i = 1, ..., N, is also an N-component vector (X_1, X_2, ..., X_N), one dimension for each data point. The data vector is a single point in the N-dimensional data space.
A function f(t) is a vector in an infinite-dimensional vector space, one dimension for each value of t. The dot product between functions depends on a weighting function w(t):
⟨f, g⟩ = ∫ f(t) g(t) w(t) dt
Each weighting function w(t) gives a different dot product, a different distance measure, a different vector space. Which w(t) should we use for data analysis?
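A sketch of this weighted inner product evaluated numerically with a Riemann sum; the functions sin and cos, the grid, and the uniform weight w(t) = 1 are illustrative choices, not from the slides:

```python
import numpy as np

# Weighted inner product <f, g> = ∫ f(t) g(t) w(t) dt, approximated on a grid.
t = np.linspace(0.0, 1.0, 10000)
dt = t[1] - t[0]
f = np.sin(2 * np.pi * t)
g = np.cos(2 * np.pi * t)
w = np.ones_like(t)                        # w(t) = 1 gives the ordinary L2 inner product

inner_fg = np.sum(f * g * w) * dt          # ≈ 0: sin and cos are orthogonal over one period
norm_f = np.sqrt(np.sum(f * f * w) * dt)   # |f| = sqrt(<f, f>) ≈ sqrt(1/2)
```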

χ² as (distance)² in function space

The (absolute value)² of a function f(t): |f|² = ⟨f, f⟩ = ∫ f(t)² w(t) dt.
The (distance)² between f(t) and g(t): |f − g|² = ⟨f − g, f − g⟩ = ∫ ( f(t) − g(t) )² w(t) dt.
A dataset (X_i ± σ_i) at t = t_i defines a specific weighting function:
w(t) = Σ_i δ(t − t_i) / σ_i²
With this w(t), the (distance)² from the data X(t) to the model µ(t) is χ²:
|X − µ|² = Σ_i ( X_i − µ(t_i) )² / σ_i² = χ²
Each dataset defines its own weighting function.

The Data-Space Metric: σ is the unit of distance, χ² is (distance)²

Define the data-space dot product with inverse-variance weights w_i = 1/σ_i²:
a · b = Σ_i a_i b_i w_i = Σ_i a_i b_i / σ_i²
Then the (distance)² between the data X and the parameterised model µ(α) is:
χ² = Σ_i [ ( X_i − µ_i(α) ) / σ_i ]² = | X − µ(α) |²
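A minimal sketch of this metric in code; the helper names wdot and chi2 are hypothetical:

```python
import numpy as np

def wdot(a, b, sigma):
    """Data-space dot product with inverse-variance weights: sum_i a_i b_i / sigma_i^2."""
    return np.sum(a * b / sigma**2)

def chi2(X, mu, sigma):
    """chi^2 = |X - mu|^2, the squared distance under the inverse-variance metric."""
    r = X - mu
    return wdot(r, r, sigma)
```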

Optimal Scaling in Vector Space Notation

Minimise χ² → pick the model closest to the data.
Scaling a pattern: µ(α) = α P, i.e. µ_i(α) = α P_i. The pattern P is a vector in data space, and the model α P is a line in data space (all multiples of P). The best fit is the point along that line closest to the data X:
α̂ = ( Σ_i X_i P_i / σ_i² ) / ( Σ_i P_i² / σ_i² ) = ( X · P ) / ( P · P )
µ(α̂) = α̂ P = [ ( X · P ) / ( P · P ) ] P = ( X · e_P ) e_P,  where e_P = P / |P| is the unit vector along P.
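A sketch of the optimal-scaling formula in code; the helper name optimal_scale and the example numbers are mine:

```python
import numpy as np

def optimal_scale(X, P, sigma):
    """Best-fit amplitude for the model mu = alpha * P:
    alpha_hat = (X . P) / (P . P) with inverse-variance weights."""
    w = 1.0 / sigma**2
    return np.sum(X * P * w) / np.sum(P * P * w)

# Example use (illustrative numbers): recover the amplitude of a known pattern.
rng = np.random.default_rng(0)
P = np.array([1.0, 2.0, 3.0, 4.0])
sigma = np.array([0.1, 0.2, 0.1, 0.3])
X = 2.5 * P + rng.normal(0.0, sigma)
alpha_hat = optimal_scale(X, P, sigma)   # ≈ 2.5
```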

Stretching the Basis Vectors

In vector notation,
α̂ = ( X · P ) / ( P · P ) = ( Σ_ij X_i P_j g_ij ) / ( Σ_ij P_i P_j g_ij ) = ( Σ_i X_i P_i / σ_i² ) / ( Σ_i P_i² / σ_i² )
so the data-space metric is g_ij = e_i · e_j = δ_ij / σ_i². The basis vectors e_i are orthogonal but not unit length, i.e. σ_i is the natural unit of distance on the i-th axis of data space!
We can stretch axis i by the factor σ_i to define a new set of ortho-normal basis vectors b_i = σ_i e_i, with b_i · b_j = δ_ij:
e_1 = {1, 0, ..., 0}, e_2 = {0, 1, ..., 0}, ..., e_N = {0, 0, ..., 1}
b_1 = {σ_1, 0, ..., 0}, b_2 = {0, σ_2, ..., 0}, ..., b_N = {0, 0, ..., σ_N}

Stretch basis vectors to make χ² ellipses become circles

Old basis vectors: x = Σ_i x_i e_i, with g_ij = e_i · e_j = δ_ij / σ_i². Orthogonal, but not normalised: the χ² contours are ellipses.
Stretched basis vectors are orthonormal: b_i = σ_i e_i, g_ij = b_i · b_j = δ_ij. In this basis x = Σ_i (x_i / σ_i) b_i, i.e. the components are x_i / σ_i, and the χ² contours are circles.
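A numerical check (with made-up numbers) that dividing each component by its σ_i, i.e. working in the stretched b_i basis, turns the inverse-variance distance into an ordinary Euclidean one:

```python
import numpy as np

sigma = np.array([0.5, 2.0, 1.0])
X = np.array([1.0, 4.0, -2.0])
mu = np.array([0.8, 5.0, -1.5])

chi2_weighted = np.sum((X - mu)**2 / sigma**2)               # chi^2 in the e_i basis
chi2_euclidean = np.sum((X / sigma - mu / sigma)**2)         # plain distance^2 in the b_i basis
assert np.isclose(chi2_weighted, chi2_euclidean)
```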

Error Bars in both X and Y

Wrong ways to fit a line:
1. y(x) = a x + b  (assumes σ_X = 0)
2. x(y) = c y + d  (assumes σ_Y = 0)
3. Split the difference between 1 and 2.
Example: primordial He abundance, extrapolating the fit line to [O/H] = 0. (Figure: [He/H] vs [O/H].)
Key concept: X_i ± σ_X(i) and Y_i ± σ_Y(i) are independent dimensions of the 2N-dimensional data space.

Line Fit with error bars in both X and Y

Data: X_i ± σ_X(i), Y_i ± σ_Y(i). Model: y = a x + b.
For σ_X ≠ σ_Y, where is the point of closest approach between a data point and the line? Not obvious.
Vertical and horizontal offsets from the line:
Δy = Y − (a X + b),  Δx = X − (Y − b) / a
A horizontal stretch by the factor σ_Y / σ_X makes the probability cloud round (a circle of radius σ_Y). It also changes the slope, a → a′:
Δx′ = Δx σ_Y / σ_X,  a′ = Δy / Δx′ = a σ_X / σ_Y = tan θ
Closest approach is at R = Δy cos θ:
R² = Δy² cos²θ = Δy² cos²θ / (cos²θ + sin²θ) = Δy² / (1 + tan²θ) = Δy² σ_Y² / (σ_Y² + a² σ_X²)
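A numerical check of this geometry with made-up values: apply the stretch, measure the perpendicular distance to the stretched line, and compare with the closed form above.

```python
import numpy as np

a, b = 1.3, 0.4                 # line y = a*x + b (example values)
X0, Y0 = 2.0, 2.2               # one data point
sx, sy = 0.3, 0.5               # its errors in X and Y

dy = Y0 - (a * X0 + b)          # vertical offset Δy

# Stretch horizontally by sy/sx: the error cloud becomes a circle of radius sy,
# and the line's slope becomes a' = a*sx/sy.
a_prime = a * sx / sy
x0s = X0 * sy / sx              # stretched x-coordinate of the point

# Perpendicular distance from the stretched point to the stretched line:
R = abs(Y0 - (a_prime * x0s + b)) / np.sqrt(1 + a_prime**2)

# It agrees with the slide's closed form R^2 = Δy^2 σ_Y^2 / (σ_Y^2 + a^2 σ_X^2):
assert np.isclose(R**2, dy**2 * sy**2 / (sy**2 + a**2 * sx**2))
```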

Defining χ² for errors in both X and Y

The horizontal stretch makes the probability cloud round, a circle of radius σ_Y, and the distance R at closest approach is:
R² = σ_Y² Δy² / (σ_Y² + a² σ_X²),  with Δy = Y − (a X + b)
Note: a different stretch is needed for each data point.
The total (distance)² in the 2N-dimensional data space is:
χ² = Σ_i [ ε(Y_i)² / σ²(Y_i) + ε(X_i)² / σ²(X_i) ] = Σ_i R_i² / σ²(Y_i) = Σ_i ( Y_i − (a X_i + b) )² / ( σ²(Y_i) + a² σ²(X_i) )
where ε(X_i) and ε(Y_i) are the offsets from data point i to its closest point on the line.
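Putting it together: a sketch that minimises this χ² over (a, b) numerically. The function name, the Nelder-Mead method, and the y-on-x polyfit starting guess are my own choices; SciPy is assumed to be available.

```python
import numpy as np
from scipy.optimize import minimize

def fit_line_xy_errors(X, Y, sx, sy):
    """Fit y = a*x + b by minimising
        chi^2(a, b) = sum_i (Y_i - (a X_i + b))^2 / (sy_i^2 + a^2 sx_i^2),
    the chi^2 defined above for errors in both X and Y."""
    def chi2(params):
        a, b = params
        r = Y - (a * X + b)
        return np.sum(r**2 / (sy**2 + a**2 * sx**2))

    a0, b0 = np.polyfit(X, Y, 1)               # ordinary y-on-x fit as a starting guess
    res = minimize(chi2, x0=[a0, b0], method="Nelder-Mead")
    return res.x                               # best-fit (a, b)
```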