Strong Lens Modeling (I): Principles and Basic Methods


Strong Lens Modeling (I): Principles and Basic Methods
Chuck Keeton, Rutgers, the State University of New Jersey

(I) Principles and Basic Methods
- least-squares fitting
- solving the lens equation
- constraints (point data)
- parametric mass models
(II) Statistical Methods
- Bayesian statistics
- Monte Carlo Markov Chains
- nested sampling
(III) Advanced Techniques
- case studies: composite models, astrophysical priors, substructure
- extended sources
- non-parametric lens models

Strong lens modeling
goal: use strong lensing data to learn about...
- mass model
- source
- other parameters (e.g., H_0)
focus:
- galaxy-scale lensing
- point data (for now)

Simple examples
forward problem: fix the lens model, solve the lens equation to find image positions (and other data)
inverse problem: fix the lens data, (re)interpret the lens equation as a constraint equation, solve for the model parameters

[diagram: double-lens image configuration]
double lens, point-mass model; convention: θ_1 > θ_2 > 0
β = θ_1 - θ_E²/θ_1
-β = θ_2 - θ_E²/θ_2
(-β for image #2 because image and source are on opposite sides of the lens)
adding the two equations:
θ_1 + θ_2 = θ_E² (1/θ_1 + 1/θ_2)   so   θ_E = (θ_1 θ_2)^{1/2}

double lens, constant-deflection (isothermal) model; again θ_1 > θ_2 > 0
β = θ_1 - θ_E
-β = θ_2 - θ_E
then θ_E = (θ_1 + θ_2)/2

Model dependence: Einstein radius
remark: from the same data we can get different answers depending on what we assume about the models
however... suppose θ_1 = θ_0 + δ and θ_2 = θ_0 - δ, and δ is small:
- point mass: θ_E = (θ_1 θ_2)^{1/2} ≈ θ_0 - δ²/(2θ_0) + O(δ⁴)
- isothermal: θ_E = (θ_1 + θ_2)/2 = θ_0
the result for the Einstein radius is not very sensitive to the choice of model
(this may not be true of other parameters!)
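
A quick numerical check of this insensitivity (not part of the lecture; the image offsets are invented for illustration):

    # point-mass vs. isothermal Einstein-radius estimates for nearly symmetric images
    import numpy as np

    theta0, delta = 1.0, 0.05                  # hypothetical image geometry (arcsec)
    theta1, theta2 = theta0 + delta, theta0 - delta

    thetaE_ptmass = np.sqrt(theta1 * theta2)   # point mass: (theta1*theta2)^(1/2)
    thetaE_iso = 0.5 * (theta1 + theta2)       # isothermal: (theta1+theta2)/2

    print(thetaE_ptmass)   # 0.99875 = theta0 - delta^2/(2*theta0) + O(delta^4)
    print(thetaE_iso)      # exactly 1.0 = theta0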

lens equation, now in Cartesian angular coordinates (isothermal lens plus external shear):
u = x - θ_E x̂ - (γx, -γy)
cross quad: u = v = 0, with images at (±x_1, 0) and (0, ±y_2):
0 = (1 - γ) x_1 - θ_E
0 = (1 + γ) y_2 - θ_E
[diagram: cross-quad image configuration]

rewriting:
θ_E + γ x_1 = x_1
θ_E - γ y_2 = y_2
then
[ 1    x_1 ] [ θ_E ]   [ x_1 ]
[ 1   -y_2 ] [ γ   ] = [ y_2 ]
solution:
θ_E = 2 x_1 y_2 / (x_1 + y_2)   and   γ = (x_1 - y_2) / (x_1 + y_2)
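
A minimal sketch of solving this 2x2 system numerically (the image coordinates x_1 and y_2 are made up):

    import numpy as np

    x1, y2 = 1.2, 0.9                  # hypothetical image radii (arcsec)
    A = np.array([[1.0,  x1],
                  [1.0, -y2]])
    rhs = np.array([x1, y2])
    thetaE, gamma = np.linalg.solve(A, rhs)

    # closed-form check: theta_E = 2*x1*y2/(x1+y2), gamma = (x1-y2)/(x1+y2)
    print(thetaE, 2 * x1 * y2 / (x1 + y2))
    print(gamma, (x1 - y2) / (x1 + y2))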

Least-squares fitting
usually we cannot solve the constraint equations exactly:
- more constraints than parameters
- noise
- wrong model
general goal: minimize the difference between the model and the data
quantify goodness of fit:
χ² = Σ (model - data)² / (uncertainties)²
idea:
- find the best fit (minimum χ²)
- explore the range of allowed models (region where χ² is acceptable)

What is good enough?
quantify degrees of freedom: ν = (# constraints) - (# free parameters)
if the errors are random, there is a probability distribution for χ²:
p(χ²|ν) = 1 / (2^{ν/2} Γ(ν/2)) · (χ²)^{ν/2-1} e^{-χ²/2}
[plot: p(χ²|ν)]

average: ⟨χ²⟩ = ν
peak: χ²_peak = max(ν - 2, 0)
as a rule of thumb, we expect χ² ≈ ν for a good fit; but given statistical scatter, this is not a strict condition!
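
To put numbers on this rule of thumb, one can query the χ² distribution directly; a small sketch with an assumed ν = 10, using scipy.stats.chi2 for p(χ²|ν):

    from scipy.stats import chi2

    nu = 10                      # hypothetical number of degrees of freedom
    print(chi2.mean(nu))         # <chi^2> = nu = 10
    print(max(nu - 2, 0))        # peak of p(chi^2|nu) = 8
    print(chi2.sf(15.0, nu))     # P(chi^2 > 15 | nu) ~ 0.13: worse than nu, but still acceptable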

generalize the notion of uncertainties...
if uncertainties are correlated, introduce the covariance
Cov(x, y) = ⟨(x - ⟨x⟩)(y - ⟨y⟩)⟩ = ⟨xy⟩ - ⟨x⟩⟨y⟩ - ⟨x⟩⟨y⟩ + ⟨x⟩⟨y⟩ = ⟨xy⟩ - ⟨x⟩⟨y⟩
for an array of data d = (d_1, d_2, d_3, ...), the covariance matrix is
C = [ σ_1²            Cov(d_1, d_2)   Cov(d_1, d_3)   ... ]
    [ Cov(d_2, d_1)   σ_2²            Cov(d_2, d_3)   ... ]
    [ Cov(d_3, d_1)   Cov(d_3, d_2)   σ_3²            ... ]
    [ ...                                                  ]

[scatter plot of correlated data]
example:
C = [ 0.775   0.375 ]
    [ 0.375   0.340 ]
ρ_12 = 0.731
aside: correlation coefficient (dimensionless, |ρ| ≤ 1):
ρ_ij = Cov(d_i, d_j) / (σ_i σ_j)

generalized goodness of fit:
χ² = (d_mod - d_obs)^T C⁻¹ (d_mod - d_obs)
if the data are independent, then
C = diag(σ_1², σ_2², ...)
and χ² reduces to what you expect:
χ² = Σ_i (d_i^mod - d_i^obs)² / σ_i²
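
A minimal sketch of the generalized χ², reusing the example covariance matrix from above with invented residuals:

    import numpy as np

    d_obs = np.array([1.02, 0.48])
    d_mod = np.array([1.00, 0.50])
    C = np.array([[0.775, 0.375],
                  [0.375, 0.340]])             # covariance matrix from the example above

    resid = d_mod - d_obs
    chi2 = resid @ np.linalg.solve(C, resid)   # (d_mod - d_obs)^T C^{-1} (d_mod - d_obs)
    print(chi2)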

Linear parameters
example: x is some independent variable (which we can know); measure d_i^obs and postulate a straight line:
d_i^mod = m x_i + b
[plot: data points with a straight-line model]

χ² = Σ_i (m x_i + b - d_i^obs)² / σ_i²
this is a parabola in both m and b; find the minimum by solving
0 = ∂χ²/∂m = 2 Σ_i x_i (m x_i + b - d_i^obs) / σ_i²
0 = ∂χ²/∂b = 2 Σ_i (m x_i + b - d_i^obs) / σ_i²
may look complicated, but it is just a pair of linear equations:
[ Σ x_i²/σ_i²   Σ x_i/σ_i² ] [ m ]   [ Σ x_i d_i^obs/σ_i² ]
[ Σ x_i/σ_i²    Σ 1/σ_i²   ] [ b ] = [ Σ d_i^obs/σ_i²     ]
solve by matrix inversion

[plot: best-fit straight line through the data]
(can generalize to an arbitrary number of linear parameters)
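
A sketch of the straight-line fit via these normal equations, on synthetic data with an assumed true slope 0.5 and intercept 0.7:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 20)
    sigma = 0.05 * np.ones_like(x)
    d_obs = 0.5 * x + 0.7 + rng.normal(0.0, sigma)

    w = 1.0 / sigma**2
    A = np.array([[np.sum(w * x**2), np.sum(w * x)],
                  [np.sum(w * x),    np.sum(w)]])
    rhs = np.array([np.sum(w * x * d_obs), np.sum(w * d_obs)])
    m, b = np.linalg.solve(A, rhs)         # solve the 2x2 system
    print(m, b)                            # should recover roughly (0.5, 0.7)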

Non-linear parameters
must explicitly search parameter space
use established algorithms to search for the minimum of a function in multiple dimensions
challenges:
- computational effort
- local minima
- long, narrow valleys
- degeneracies

Downhill simplex method ("amoeba")
http://www.cs.usfca.edu/~brooks/papers/amoeba.pdf
also Numerical Recipes
[figure: simplex moves: original simplex, reflection, expansion, contraction, multi-dimensional contraction]
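
The lecture does not prescribe an implementation; as one option, SciPy ships a downhill-simplex minimizer. A toy sketch on a long, narrow valley (the Rosenbrock function, standing in for a χ² surface):

    from scipy.optimize import minimize

    def chi2(p):
        a, b = p
        return (1.0 - a)**2 + 100.0 * (b - a**2)**2   # banana-shaped valley

    res = minimize(chi2, x0=[-1.0, 2.0], method="Nelder-Mead")
    print(res.x, res.fun)                  # converges near (1, 1) with chi^2 ~ 0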

Mixed linear and non-linear parameters
suppose we have parameters a and b such that
d_i^mod = a f_i(b)
then
χ²(a, b) = Σ_i [a f_i(b) - d_i^obs]² / σ_i²
optimal value of a:
0 = ∂χ²/∂a = 2 Σ_i f_i(b) [a f_i(b) - d_i^obs] / σ_i²
so a_opt = [Σ_i f_i(b) d_i^obs / σ_i²] / [Σ_i f_i(b)² / σ_i²]
then χ²(b) = χ²(a_opt(b), b)
we can still optimize the linear parameters analytically
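
A sketch of this trick with an invented model d_i^mod = a exp(-b x_i): the amplitude a is profiled out analytically while b is searched numerically:

    import numpy as np
    from scipy.optimize import minimize_scalar

    x = np.linspace(0.0, 1.0, 30)
    sigma = 0.1 * np.ones_like(x)
    d_obs = 2.0 * np.exp(-3.0 * x) + np.random.default_rng(1).normal(0.0, sigma)

    def f(b):
        return np.exp(-b * x)              # non-linear part of the model

    def chi2_of_b(b):
        a_opt = np.sum(f(b) * d_obs / sigma**2) / np.sum(f(b)**2 / sigma**2)
        return np.sum((a_opt * f(b) - d_obs)**2 / sigma**2)

    res = minimize_scalar(chi2_of_b, bounds=(0.1, 10.0), method="bounded")
    print(res.x)                           # recovers b ~ 3; a_opt then follows from the formula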

Likelihood
1-d Gaussian:
χ² = (x - d)² / σ²
L ∝ e^{-χ²/2}
±1σ: χ² = 1 (68%)
±2σ: χ² = 4 (95%)
[plot: 1-d Gaussian; central region = 68% of the probability, each tail = 16%]

2-d Gaussian:
f = ∫∫_{<χ²} 1/(2π σ_x σ_y) exp(-x²/(2σ_x²) - y²/(2σ_y²)) dx dy
  = (1/2π) ∫∫_{x²+y²<χ²} exp(-(x²+y²)/2) dx dy = ∫_0^χ e^{-r²/2} r dr = 1 - e^{-χ²/2}
68%: χ² = 2.3
95%: χ² = 6.2
[contour plot of the 2-d Gaussian]
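
Quick check of the quoted 2-d levels from 1 - e^{-χ²/2}:

    import numpy as np
    print(1 - np.exp(-2.3 / 2))   # ~0.68
    print(1 - np.exp(-6.2 / 2))   # ~0.95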

Solving the lens equation
challenges:
- usually non-linear, often transcendental
- we may not even know how many solutions there are!
- mathematical theorems bound the maximum number of images... but we need the actual number
- global caustic structure may be informative... but difficult to find and analyze
solution: read the lens equation backwards
- mapping from image position x to the unique source position u(x) = x - α(x)
- tile the image plane
- map each tile back to the source plane
- the number of tiles that cover the source reveals the number of images
- the tiles themselves give estimates of the image positions
(a rough illustration follows)
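
A rough sketch of this tiling idea (not the gravlens implementation): tile the image plane with a Cartesian grid of triangles, map every vertex to the source plane with u = x - α(x), and count the mapped triangles that cover the source. The deflection here is a singular isothermal sphere and all numbers are made up; the singular center is simply masked, whereas a real code resolves it with a polar grid (next slide).

    import numpy as np

    thetaE = 1.0
    src = np.array([0.3, 0.0])            # source inside the cut -> expect 2 images

    def alpha(x):                         # SIS deflection: alpha = thetaE * x / |x|
        r = np.hypot(x[..., 0], x[..., 1])[..., None]
        return thetaE * x / np.where(r > 0, r, np.inf)

    n, half = 200, 2.5                    # Cartesian grid; two triangles per cell
    g = np.linspace(-half, half, n)
    X, Y = np.meshgrid(g, g)
    pts = np.stack([X, Y], axis=-1)
    u = pts - alpha(pts)                  # map every vertex to the source plane

    def side(a, b, p):                    # which side of edge a->b is point p on?
        return (b[..., 0] - a[..., 0]) * (p[1] - a[..., 1]) - (b[..., 1] - a[..., 1]) * (p[0] - a[..., 0])

    def contains(a, b, c, p):             # does triangle (a, b, c) contain point p?
        s1, s2, s3 = side(a, b, p), side(b, c, p), side(c, a, p)
        return ((s1 >= 0) & (s2 >= 0) & (s3 >= 0)) | ((s1 <= 0) & (s2 <= 0) & (s3 <= 0))

    ll, lr, ur, ul = u[:-1, :-1], u[:-1, 1:], u[1:, 1:], u[1:, :-1]

    # tiles straddling the singular lens center map to huge spurious triangles;
    # mask them (a real code resolves this region with a polar grid instead)
    ok = np.hypot(X, Y)[:-1, :-1] > 2 * (g[1] - g[0])

    hits = (contains(ll, lr, ur, src) & ok).sum() + (contains(ll, ur, ul, src) & ok).sum()
    print(hits)                           # 2 tiles cover the source -> two images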


Image plane tiling
- background Cartesian grid: basic coverage
- polar grid centered on each galaxy: resolve key regions
- adaptive subgridding near critical curves

Quadrilaterals vs. triangles
quadrilaterals can be problematic; triangles are fine
[figure: examples of a well-behaved and a badly distorted mapped quadrilateral, versus two well-behaved triangles]

Triangulation
start with points in a plane, connect them with triangles
(Google "triangulation"; I use http://www.cs.cmu.edu/~quake/triangle.html)

Gridding in gravlens
[figure: image-plane grid used by gravlens]

Magnification and time delay
deflection: α(x) = ∇φ(x) = (φ_x, φ_y)
magnification:
μ = [ det ( [ 1-φ_xx   -φ_xy ; -φ_xy   1-φ_yy ] ) ]⁻¹ = [ (1-φ_xx)(1-φ_yy) - φ_xy² ]⁻¹
special case of circular symmetry, α(r):
μ = [ 1 - α(r)/r ]⁻¹ [ 1 - dα/dr ]⁻¹
time delay:
t(x; u) = t_0 [ ½ |x - u|² - φ(x) ]
t_0 = (1 + z_l)/c · D_l D_s / D_ls
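
A sketch evaluating the circular-symmetry formulas for a singular isothermal sphere, where α(r) = θ_E and φ = θ_E r (the geometry is invented):

    import numpy as np

    thetaE, beta = 1.0, 0.3                           # Einstein radius, source offset (arcsec)
    r = np.array([thetaE + beta, thetaE - beta])      # image radii, on opposite sides of the lens

    # mu = [1 - alpha/r]^(-1) [1 - d(alpha)/dr]^(-1); d(alpha)/dr = 0 for the SIS
    mu = 1.0 / (1.0 - thetaE / r)
    print(mu)                                         # [4.33, -2.33]; negative = flipped parity

    # dimensionless delay tau = 0.5*|x - u|^2 - phi(x); |x - u| = r - beta or r + beta
    tau = 0.5 * (r - np.array([beta, -beta]))**2 - thetaE * r
    print(tau[1] - tau[0])                            # differential delay = 2*thetaE*beta = 0.6 (in units of t_0)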

Point sources
data:
- image positions
- fluxes
- time delays
source parameters:
- position
- flux
- time scale
(extended sources on Thursday)

Position constraints
exact position χ²:
χ²_pos = Σ_images (x_i^mod - x_i^obs)^T S_i⁻¹ (x_i^mod - x_i^obs)
astrometric uncertainties: error ellipse with axes (σ_1, σ_2) and position angle θ_σ (East of North); covariance matrix
S = R [ σ_1²   0    ] R^T,   R = [  sin θ_σ   cos θ_σ ]
      [ 0      σ_2² ]            [ -cos θ_σ   sin θ_σ ]
if the uncertainties are symmetric:
S = [ σ²   0  ]
    [ 0    σ² ]
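
A small sketch of building S from the error ellipse (the angle convention and the numbers are assumptions):

    import numpy as np

    sigma1, sigma2 = 0.005, 0.003                 # arcsec, hypothetical
    th = np.deg2rad(30.0)                         # position angle theta_sigma (East of North)
    R = np.array([[np.sin(th),  np.cos(th)],
                  [-np.cos(th), np.sin(th)]])
    S = R @ np.diag([sigma1**2, sigma2**2]) @ R.T
    print(S)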

note: define the source position associated with each observed image
u^obs = x^obs - α(x^obs)
u^mod = x^mod - α(x^mod)
also subtract:
δu = δx - [ α(x^mod) - α(x^obs) ] ≈ μ⁻¹ δx
provided that the model is decent, so that δx and δu are small
then δx ≈ μ δu yields the approximate position χ²:
χ²_pos ≈ Σ_i (u^mod - u_i^obs)^T μ_i^T S_i⁻¹ μ_i (u^mod - u_i^obs)

χ²_pos ≈ Σ_i (u^mod - u_i^obs)^T μ_i^T S_i⁻¹ μ_i (u^mod - u_i^obs)
advantages:
- don't need to solve the lens equation
- u^mod is a linear parameter, so optimize it analytically:
  u^mod = A⁻¹ b,   where A = Σ_i μ_i^T S_i⁻¹ μ_i   and   b = Σ_i μ_i^T S_i⁻¹ μ_i u_i^obs
concerns:
- the approximation is valid only when the residuals are small... but χ²_pos yields a large value (i.e., a bad fit) in either case
- since we do not solve the lens equation, we cannot check that the model predicts the correct number of images... only worry about models yielding too many images
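
A sketch of the approximate position χ² and the analytic optimum for u^mod; the per-image source positions, magnification tensors, and covariances below are all invented:

    import numpy as np

    u_obs = [np.array([0.01, 0.00]), np.array([-0.01, 0.02])]
    mu    = [np.array([[4.0, 0.0], [0.0, 1.5]]),
             np.array([[-2.0, 0.3], [0.3, 1.2]])]
    S     = [np.diag([0.003**2, 0.003**2]), np.diag([0.003**2, 0.003**2])]

    # optimal model source position: u_mod = A^{-1} b
    A = sum(m.T @ np.linalg.inv(s) @ m for m, s in zip(mu, S))
    b = sum(m.T @ np.linalg.inv(s) @ m @ u for m, s, u in zip(mu, S, u_obs))
    u_mod = np.linalg.solve(A, b)

    chi2_pos = sum(float((u_mod - u) @ m.T @ np.linalg.inv(s) @ m @ (u_mod - u))
                   for m, s, u in zip(mu, S, u_obs))
    print(u_mod, chi2_pos)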

Flux constraints
χ²_flux = Σ_i (F_i^obs - μ_i F^src)² / σ_{f,i}²
if desired, include parity by letting F_i^obs and μ_i be signed
the optimal source flux can be found analytically:
F^src = [ Σ_i F_i^obs μ_i / σ_{f,i}² ] / [ Σ_i μ_i² / σ_{f,i}² ]
if desired, it is straightforward to switch to magnitudes: m_i^mod = m^src - 2.5 log μ_i
note: photometric units are arbitrary (absolute fluxes or magnitudes, or relative values)
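
A minimal sketch with invented fluxes and magnifications (signed, to carry parity):

    import numpy as np

    F_obs = np.array([1.05, -0.48, 0.33, -0.30])   # observed (signed) fluxes
    mu    = np.array([4.2, -2.1, 1.4, -1.2])       # model (signed) magnifications
    sig_f = np.array([0.05, 0.05, 0.04, 0.04])

    F_src = np.sum(F_obs * mu / sig_f**2) / np.sum(mu**2 / sig_f**2)
    chi2_flux = np.sum((F_obs - mu * F_src)**2 / sig_f**2)
    print(F_src, chi2_flux)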

Time delay constraints
predicted time delay: t_i^mod = t_0 τ_i^mod + T_0
model: τ_i^mod = ½ |x_i^mod - u^mod|² - φ(x_i^mod)
cosmology: t_0 = (1 + z_l)/c · D_l D_s / D_ls = H_0⁻¹ f(Ω_M, Ω_Λ; z_l, z_s)
note: the time zeropoint T_0 does not affect differential time delays; but let's make the framework general
then
χ²_tdel = Σ_i (t_i^obs - t_0 τ_i^mod - T_0)² / σ_{t,i}²

χ²_tdel = Σ_i (t_i^obs - t_0 τ_i^mod - T_0)² / σ_{t,i}²
if we have priors on the cosmological parameters (including H_0), the prior t_{0,prior} ± σ_{t0} gives an additional term:
χ²_t0 = (t_0 - t_{0,prior})² / σ_{t0}²
optimal values of t_0 and T_0:
[ Σ_i (τ_i^mod)²/σ_{t,i}² + 1/σ_{t0}²   Σ_i τ_i^mod/σ_{t,i}² ] [ t_0 ]   [ Σ_i τ_i^mod t_i^obs/σ_{t,i}² + t_{0,prior}/σ_{t0}² ]
[ Σ_i τ_i^mod/σ_{t,i}²                  Σ_i 1/σ_{t,i}²       ] [ T_0 ] = [ Σ_i t_i^obs/σ_{t,i}²                               ]
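
A sketch of solving that 2x2 system for the optimal (t_0, T_0); the delays, uncertainties, and prior are invented numbers:

    import numpy as np

    tau   = np.array([0.0, 0.21, 0.35])      # model dimensionless delays tau_i
    t_obs = np.array([0.0, 11.0, 18.5])      # observed delays (days)
    sig_t = np.array([0.5, 0.8, 1.0])
    t0_prior, sig_t0 = 52.0, 5.0             # prior on t_0 (days), e.g. from an H_0 prior

    A = np.array([[np.sum(tau**2 / sig_t**2) + 1.0 / sig_t0**2, np.sum(tau / sig_t**2)],
                  [np.sum(tau / sig_t**2),                      np.sum(1.0 / sig_t**2)]])
    rhs = np.array([np.sum(tau * t_obs / sig_t**2) + t0_prior / sig_t0**2,
                    np.sum(t_obs / sig_t**2)])
    t0, T0 = np.linalg.solve(A, rhs)

    chi2 = np.sum((t_obs - t0 * tau - T0)**2 / sig_t**2) + (t0 - t0_prior)**2 / sig_t0**2
    print(t0, T0, chi2)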

Parametric mass models
postulate: the mass distribution can be described by a function with a modest number of parameters
example: Singular Isothermal Ellipsoid (SIE):
κ = b / { 2 [ (x - x_0)² + (y - y_0)²/q² ]^{1/2} }   (+ rotation)
pros:
- easy to find the best fit and assess its quality
- build in astrophysical knowledge (assumptions and priors)
- good enough for many applications
cons:
- can only get out what you put in
- real galaxies may be more complex

Counting
# constraints:
           x_gal   x_i       F_i   Δt_i   total
  quad     2       2·4 = 8   4     3      17
  double   2       2·2 = 4   2     1      9
# parameters:
  u_src   F_src   x_gal   q_gal   q_env   t_0   total
  2       1       2       3       2       1     11

softened power law ellipsoid:
κ = b^{2-α} / { 2 (s² + x² + y²/q²)^{1-α/2} }
where M(r) ∝ r^α:
- α < 1: steeper than isothermal
- α = 1: isothermal
- α > 1: shallower than isothermal
gravlens has many other model classes: point mass, pseudo-Jaffe, de Vaucouleurs, Hernquist, Sersic, NFW, Nuker, exponential disk, ...
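
A one-function sketch of this convergence profile (parameter values are arbitrary; α = 1, s = 0 recovers the SIE of the previous slide):

    import numpy as np

    def kappa_spl(x, y, b=1.0, s=0.0, q=0.8, alpha=1.0, x0=0.0, y0=0.0):
        # kappa = b^(2-alpha) / (2 * (s^2 + (x-x0)^2 + (y-y0)^2/q^2)^(1-alpha/2))
        rsq = s**2 + (x - x0)**2 + (y - y0)**2 / q**2
        return 0.5 * b**(2.0 - alpha) / rsq**(1.0 - alpha / 2.0)

    print(kappa_spl(0.5, 0.3))               # isothermal (SIE) value at (0.5, 0.3)
    print(kappa_spl(0.5, 0.3, alpha=0.5))    # steeper-than-isothermal profile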

Composite models
can combine multiple components to obtain models that are more complicated but still parametric
for example:
- stellar component (e.g., Hernquist)
- dark matter halo (e.g., NFW)
(composite models can be as fancy as you want)

Environmental effects
few lens galaxies are isolated; they have neighbors, and may be embedded in groups or clusters of galaxies
environments can affect the light bending by an amount larger than the measurement uncertainties
if the neighboring galaxies are far from the lens (compared with the Einstein radius), make a Taylor series expansion:
φ_env = φ_0 + a·x + (κ_c/2) r² + (γ/2) r² cos 2(θ - θ_γ) + (σ/4) r³ cos(θ - θ_σ) + (δ/6) r³ cos 3(θ - θ_δ) + ...
structures along the line of sight can also affect the light bending... more complicated
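
A small sketch evaluating this expansion at a point (all coefficients invented; the constant and gradient terms are omitted here):

    import numpy as np

    kappa_c, gamma, th_g = 0.05, 0.08, np.deg2rad(20.0)   # convergence, shear, shear PA
    sigma, th_s = 0.01, np.deg2rad(60.0)                  # third-order coefficients
    delta, th_d = 0.005, np.deg2rad(-40.0)

    def phi_env(r, th):
        return (0.5 * kappa_c * r**2
                + 0.5 * gamma * r**2 * np.cos(2 * (th - th_g))
                + 0.25 * sigma * r**3 * np.cos(th - th_s)
                + (delta / 6.0) * r**3 * np.cos(3 * (th - th_d)))

    print(phi_env(1.2, np.deg2rad(75.0)))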

Searching parameter space
searching parameter space may or may not require a strategic approach...

Hands-on exercises... step I
pick some mass model, then:
- plot the grid
- plot critical curves and caustics
- find images

Step II
I generated some mock lenses; now you try to fit them
the main lens galaxy is a power law ellipsoid
I may have varied:
- mass
- ellipticity/PA
- power law index
- environment: shear/PA, or a perturber
all generated with z_l = 0.3, z_s = 2.0, Ω_M = 0.27, Ω_Λ = 0.73, and some fixed value of H_0

Sample quads
recall: z_l = 0.3, z_s = 2.0, Ω_M = 0.27, Ω_Λ = 0.73
what are the model parameters? what is H_0?
[figure: six mock quad lenses, panels sampquad1 through sampquad6, each with axes from -2 to 2]

Sample doubles
recall: z_l = 0.3, z_s = 2.0, Ω_M = 0.27, Ω_Λ = 0.73
what are the model parameters? what is H_0?
[figure: six mock double lenses, panels sampdoub1 through sampdoub6, each with axes from -2 to 2]