LSST Pipelines and Data Products
Jim Bosch / LSST PST / January 30, 2018
Context
This is the first of six talks on LSST's data products:
1. Pipelines and Data Products Overview (January 30, Jim Bosch)
2. Photometric Data Products (February 27, Robert Lupton)
3. Astrometric Data Products (March 27, ??)
4. Alert Stream (April 24, Eric Bellm)
5. DM's Approach to Blended Sources (May 29, ??)
6. Moving Object Pipelines (June 26, Mario Juric)
(All dates and speakers are tentative.)
Nomenclature
- Source: measurements at a single epoch
- Object: measurements utilizing multiple (typically all) epochs
- PVI: Processed Visit Image (aka calexp, Calibrated Exposure)
- DIA: Difference Imaging Analysis
- MOPS: Moving Object Processing System
- AP: Alert Production (aka Prompt, Nightly, Level 1)
- DRP: Data Release Production (aka Annual, Yearly, Level 2)
- User Generated: data products created by science users (Level 3)
Overview
- Data Release: all datasets for static sky science and most time domain science
- Prompt: "pre-release" single-epoch data products + alerts for transient and time domain science that requires fast follow-up
Data Release Overview
Not shown: MOPS (produces SSObjects from DIASources).
Image Characterization and Calibration
PVIs
One image for every Visit+Sensor combination, with:
- Photometric Calibration
  - wavelength-dependent
- Astrometric Calibration / WCS
  - includes distortions (not a simple FITS WCS)
  - not wavelength-dependent
- Point Spread Function
  - may include shifts
  - wavelength-dependent
- Background Model
  - estimated/subtracted using flats appropriate for the spectrum of the sky
Sources
A table of detections and measurements derived from PVIs. Measurements include:
- Centroids
- Aperture and PSF photometry
- Adaptive-moment shapes
- Simple morphological star/galaxy separation
Sources are probably most valuable for diagnostic purposes; if you're interested in...
- ...the static sky or stellar astrometry, use Object instead.
- ...variables, use Forced Source instead.
- ...transients or solar system objects, use DIASource and/or SSObject instead.
Coaddition and Image Differencing
Coadds
Direct: combine warps with no special treatment of PSFs
- Used for deblending and most measurements.
- Mathematically similar to just taking longer exposures.
- Minimizes correlated noise.
- Effective coadd PSF is a weighted average of per-epoch PSFs.
Matched: convolve warps to yield the same constant PSF
- Used for artifact detection and masking.
- Used as templates for image differencing (may be a different set).
- Used for measurements that don't include a PSF correction (e.g. aperture fluxes).
- Considerable correlated noise.
- Only the PSF-matched coadds used as templates are retained.
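The statement that a direct coadd's effective PSF is a weighted average of the per-epoch PSFs can be illustrated with a minimal sketch. The function name, the toy PSF arrays, and the weights below are all hypothetical; in practice the weights would be whatever weights were used to combine the warps.

```python
def coadd_psf(psfs, weights):
    """Weighted average of per-epoch PSF images (each a list of rows)."""
    if not psfs or len(psfs) != len(weights):
        raise ValueError("need one weight per PSF image")
    total = float(sum(weights))
    ny, nx = len(psfs[0]), len(psfs[0][0])
    out = [[0.0] * nx for _ in range(ny)]
    for psf, w in zip(psfs, weights):
        for y in range(ny):
            for x in range(nx):
                out[y][x] += w * psf[y][x] / total
    return out


# Two toy 1x3 PSFs, each normalised to unit sum: one sharp, one broad.
sharp = [[0.1, 0.8, 0.1]]
broad = [[0.3, 0.4, 0.3]]

# Hypothetical weights: the sharp epoch counts three times as much.
effective = coadd_psf([sharp, broad], weights=[3.0, 1.0])
```

Because every input PSF sums to one and the weights are normalised, the effective PSF also sums to one; its width lies between those of the inputs.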
More Coadds
- Deep: direct coadds with data from almost all epochs
- Best-Seeing: direct coadds with data from the best-seeing epochs
- Short-Period: direct or detection coadds with data from epochs within a particular date range. Not retained.
- Optimal Coadds: not part of the current baseline. If implemented, would replace both Deep and Best-Seeing.
DIA Data Products
Difference Images: (science image) - (template)
- The template is warped, PSF-matched, and DCR-corrected to match the properties of the science image.
DIASources: detections and measurements on Difference Images
- Should require minimal deblending (nearly all DIA detections will be point sources, and the static sky will have been removed).
- We hope to remove false positives primarily with better image processing, but will use machine-learning classifiers if necessary.
DIA Data Products are the only ones we guarantee to be high-quality in crowded stellar fields.
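A toy, one-dimensional sketch of (science image) - (template) follows. It assumes noiseless images and Gaussian PSFs of known width, so that the matching kernel is itself a Gaussian; warping and DCR correction are omitted. All names and numbers are hypothetical and are not taken from the LSST pipelines.

```python
import math


def gaussian_kernel(sigma, half_width):
    """Sampled 1D Gaussian, normalised to unit sum."""
    vals = [math.exp(-0.5 * (x / sigma) ** 2)
            for x in range(-half_width, half_width + 1)]
    total = sum(vals)
    return [v / total for v in vals]


def convolve_same(signal, kernel):
    """Direct same-size convolution with zero padding (odd-length kernel)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = i + half - j
            if 0 <= idx < len(signal):
                acc += k * signal[idx]
        out.append(acc)
    return out


def render(n_pix, sources, sigma):
    """Image of point sources (position, flux) seen through a Gaussian PSF."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [sum(f * norm * math.exp(-0.5 * ((i - p) / sigma) ** 2)
                for p, f in sources)
            for i in range(n_pix)]


n_pix = 61
static_sky = [(30, 100.0)]   # one constant star
transient = [(40, 5.0)]      # present only in the science image

template = render(n_pix, static_sky, sigma=1.5)             # sharper template
science = render(n_pix, static_sky + transient, sigma=2.5)  # broader seeing

# For Gaussian PSFs the matching kernel has sigma^2 = 2.5^2 - 1.5^2.
kernel = gaussian_kernel(math.sqrt(2.5 ** 2 - 1.5 ** 2), half_width=10)
matched_template = convolve_same(template, kernel)
difference = [s - t for s, t in zip(science, matched_template)]
```

In this idealised case the static star cancels almost exactly and only the transient remains, appearing as a point source with the science image's PSF.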
Coadd Processing
Defining Objects in DRP
- DIAObjects are defined by associating DIASources.
- Objects are defined by combining DIAObjects and detections from Coadds.
  - This can help resolve ambiguities in the original DIASource association.
  - The set of DIAObjects is a strict subset of the set of Objects.
- Sources are not used to define Objects at all. They are associated with Objects after Objects are defined, and that matching will in general be ambiguous.
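As a rough illustration of what "associating DIASources" into DIAObjects could look like, here is a greedy positional matcher on a flat sky. This is only a sketch: the function, the record layout, and the match radius are hypothetical, and the actual association algorithm is not specified in these slides.

```python
import math


def associate(sources, match_radius):
    """Greedy association: each source joins the nearest existing object
    within match_radius; otherwise it starts a new object."""
    objects = []  # each object: {"x": mean x, "y": mean y, "members": [ids]}
    for source_id, x, y in sources:
        best, best_dist = None, match_radius
        for obj in objects:
            dist = math.hypot(x - obj["x"], y - obj["y"])
            if dist <= best_dist:
                best, best_dist = obj, dist
        if best is None:
            objects.append({"x": x, "y": y, "members": [source_id]})
        else:
            n = len(best["members"])
            # Update the running mean position of the object.
            best["x"] = (best["x"] * n + x) / (n + 1)
            best["y"] = (best["y"] * n + y) / (n + 1)
            best["members"].append(source_id)
    return objects


# Hypothetical detections (id, x, y) in arbitrary flat-sky units.
dia_sources = [
    ("s1", 10.0, 10.0),
    ("s2", 10.1, 9.9),
    ("s3", 50.0, 50.0),
    ("s4", 9.95, 10.05),
]
dia_objects = associate(dia_sources, match_radius=0.5)
```

A greedy scheme like this depends on the order of the input and on the radius chosen, which is one way the ambiguities mentioned above can arise.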
Objects: Coadd Measurements
- Centroids [1,2,3]
- Adaptive-moment shapes [1,2,3,4]
- Aperture photometry [1,4]
- Kron and Petrosian photometry [1,4]
- Standard colors (algorithm TBD) [2,?]
Notes:
1. Each band measured independently.
2. All bands measured consistently together.
3. Measured on Direct Coadds.
4. Measured on PSF-Matched Coadds.
Multi-Epoch Object Characterization
Multi-Epoch Object Measurements
Moving Point Source
- Fit PSF to all epochs, with parameters for absolute position, proper motion, parallax, and constant per-band flux.
Restricted Bulge+Disk Model
- Fit a PSF-convolved linear combination of an elliptical exponential profile (Sersic n=1) and a de Vaucouleurs profile (Sersic n=4).
- Components will have the same ellipticity and center.
- Fit independently in each band.
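Because the bulge+disk model is a linear combination of two fixed profile shapes, the two amplitudes can be solved for by linear least squares once the shapes are fixed. The sketch below shows only that linear step, on noiseless 1D radial profiles. It omits PSF convolution and ellipticity, and the effective radii, sample points, and function names are hypothetical.

```python
import math

# Standard approximate Sersic b_n constants for n=1 and n=4.
B_N = {1: 1.678, 4: 7.669}


def sersic(r, r_eff, n):
    """Sersic profile normalised to 1 at the effective radius r_eff."""
    return math.exp(-B_N[n] * ((r / r_eff) ** (1.0 / n) - 1.0))


def fit_amplitudes(radii, data, r_disk, r_bulge):
    """Least-squares amplitudes (disk, bulge) for fixed profile shapes."""
    d = [sersic(r, r_disk, 1) for r in radii]    # exponential disk
    b = [sersic(r, r_bulge, 4) for r in radii]   # de Vaucouleurs bulge
    sdd = sum(x * x for x in d)
    sbb = sum(x * x for x in b)
    sdb = sum(x * y for x, y in zip(d, b))
    sdy = sum(x * y for x, y in zip(d, data))
    sby = sum(x * y for x, y in zip(b, data))
    det = sdd * sbb - sdb * sdb
    # Solve the 2x2 normal equations by Cramer's rule.
    a_disk = (sdy * sbb - sdb * sby) / det
    a_bulge = (sdd * sby - sdb * sdy) / det
    return a_disk, a_bulge


# Synthetic data with hypothetical true amplitudes 3.0 (disk) and 1.5 (bulge).
radii = [0.25 * k for k in range(1, 41)]
data = [3.0 * sersic(r, 4.0, 1) + 1.5 * sersic(r, 1.5, 4) for r in radii]
a_disk, a_bulge = fit_amplitudes(radii, data, r_disk=4.0, r_bulge=1.5)
```

In a full fit the radii, ellipticity, and center are nonlinear parameters; the amplitudes can still be solved this way inside each nonlinear iteration.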
Forced Photometry
We will measure aperture-corrected PSF forced photometry for all Objects [1] on both PVIs and Difference Images:
- Difference Images should be less affected by blending.
- PVIs should have less noise.
We will have to "project" deblending results from the coadd to the measurement image for at least PVI forced photometry (how is TBD).
[1] This does not include Solar System Objects, but it does include DIAObjects.
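The core of PSF forced photometry can be sketched as a one-parameter linear fit: the position is held fixed (here, at the Object's position), the PSF model is evaluated there, and only the flux is solved for with inverse-variance weights. This is a minimal sketch with hypothetical names and toy numbers; aperture correction and deblending are not included.

```python
def forced_psf_flux(pixels, psf, variance):
    """Inverse-variance weighted PSF flux at a fixed position.

    pixels:   background-subtracted pixel values around the position
    psf:      PSF model evaluated on the same pixels (unit sum)
    variance: per-pixel variance
    Returns (flux, flux_error).
    """
    num = sum(p * d / v for p, d, v in zip(psf, pixels, variance))
    den = sum(p * p / v for p, v in zip(psf, variance))
    return num / den, (1.0 / den) ** 0.5


# Toy 1D cutout: a source of flux 50 seen through a 5-pixel PSF model.
psf = [0.1, 0.2, 0.4, 0.2, 0.1]
pixels = [50.0 * p for p in psf]
variance = [4.0] * len(psf)

flux, flux_err = forced_psf_flux(pixels, psf, variance)
```

The same estimator applies whether the pixels come from a PVI or from a Difference Image; only the input image changes.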
Probable Changes
- We will probably fit a bulge+disk model to all data from all bands simultaneously.
  - This may be in addition to per-band fits, or we might parameterize per-band differences in the joint fit.
- We will probably tweak the parameterization of the bulge+disk model.
  - We'd like to relax the constraint that both components have the same ellipticity without adding too many degrees of freedom.
- We will probably run a modern nonparametric shear estimation algorithm like BFD or Metacalibration as well, instead of relying on the bulge+disk fits for shear measurements.
Not Improbable Changes
If we can get per-object optimal coadds working, we could drop multi-epoch object characterization entirely.
- Moving Point Source models would be constrained by data from Short-Period coadds and per-epoch centroid derivatives measured in Forced Photometry.
- Bulge+Disk models would be fit to the per-object optimal coadds.
Other DRP Data Products
- Variability characterization (derived from forced photometry)
- Photometric redshifts (derived from Standard Colors)
- Survey Geometry / Masks / Completeness information:
  - Definition of these data products is currently in progress.
  - Probably defined via a hierarchical pixelization of the sky (e.g. HEALPix/MOC, HTM).
  - We will ensure that software tools exist for working with whatever format we choose.
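To show what a hierarchical pixelization offers for masks and survey geometry, here is a small sketch that assumes HEALPix NESTED indexing, one of the examples named above; the final format is not decided. In that scheme a pixel at order k has four children at order k+1 with indices 4p to 4p+3, so a coverage map can be compressed MOC-style by merging complete sets of siblings. The function names are hypothetical.

```python
def npix(order):
    """Number of HEALPix pixels on the whole sky at a given order."""
    return 12 * 4 ** order


def parent(ipix):
    """NESTED index of the parent pixel, one order coarser."""
    return ipix >> 2


def children(ipix):
    """NESTED indices of the four child pixels, one order finer."""
    return [4 * ipix + k for k in range(4)]


def compress(cells):
    """Merge complete sets of four siblings into their parent, repeatedly.

    cells is a set of (order, ipix) pairs describing covered sky.
    """
    cells = set(cells)
    changed = True
    while changed:
        changed = False
        for order, ipix in sorted(cells, reverse=True):
            if order == 0 or (order, ipix) not in cells:
                continue
            siblings = {(order, s) for s in children(parent(ipix))}
            if siblings <= cells:
                cells -= siblings
                cells.add((order - 1, parent(ipix)))
                changed = True
    return cells


# Order-2 pixels 4..7 are the four children of order-1 pixel 1; pixel 9 is alone.
coverage = compress({(2, 4), (2, 5), (2, 6), (2, 7), (2, 9)})
```

The compressed map describes the same sky area with fewer cells, which is what makes this kind of representation compact for large, irregular footprints.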
Prompt Processing
- No Object table or Coadd images.
- Uses Coadds and reference catalog from last DR as inputs.
- Simpler pipelines: no iterative refinement or self-calibration.
- Three pieces:
  - Produce transient alerts within 60s.
  - Refresh Solar System Object (SSObject) catalog daily.
  - Perform forced photometry on new and existing DIAObjects.
Overview: Data Release (diagram)
Overview: Prompt (diagram)
Image Characterization and Calibration: Data Release (diagram)
Image Characterization and Calibration: Prompt (diagram)
Coaddition and Image Differencing: Data Release (diagram)
Coaddition and Image Differencing: Prompt (diagram)
Prompt Data Products
PVIs
- Very similar to DR, but with less sophisticated image characterization (e.g. PSF models).
Difference Images & DIASources
- Extremely similar to DR; DR might be able to do a better job masking some kinds of artifacts, and might have less noise in the templates at the start of the survey.
Prompt Data Products
DIAObjects
- Prompt: created directly from associations of DIASources.
- DR: created from associations of DIASources, then potentially refined by comparison with coadd detections.
Forced Photometry
- Prompt: only on Difference Images, using DIAObject positions.
- DR: on both Difference Images and Calibrated Images, using Object positions.
Prompt Data Products
Solar System Objects
- Created incrementally by linking DIASources that were not associated with an existing high-confidence DIAObject.
- DR version uses only LSST inputs (e.g. no MPC orbits).
Alerts
- A packet of information about every DIASource (including its relationships to DIAObjects and SSObjects).
- Do not exist at all in DRP.
More Information
- Data Products Definition Document (LSE-163): http://ls.st/dpdd
  - Concise, focuses on data products.
  - Polished, but still uses confusing Level-N nomenclature.
- Science Pipelines Design Document (LDM-151): http://ls.st/ldm-151
  - Much more detailed, focuses on algorithms.
  - Rougher and sometimes speculative.
- Post questions at https://community.lsst.org/c/sci/data
- Future talks!