Nonparametric Estimation of Luminosity Functions


Nonparametric Estimation of Luminosity Functions. Chad Schafer, Department of Statistics, Carnegie Mellon University. cschafer@stat.cmu.edu

Luminosity Functions. The luminosity function gives the number of objects (quasars, galaxies, galaxy clusters, ...) as a function of luminosity (L) or absolute magnitude (M). The 2-d luminosity function allows for dependence on redshift (z). Schechter form (Schechter, 1976):

\phi_S(L)\,dL = n^* \left(\frac{L}{L^*}\right)^{\alpha} \exp\left(-\frac{L}{L^*}\right) d\!\left(\frac{L}{L^*}\right)

or, in absolute magnitudes,

\phi_S(M)\,dM \propto n^*\, 10^{0.4(\alpha+1)(M^*-M)} \exp\left(-10^{0.4(M^*-M)}\right) dM

[Figure: from Schechter (1976).]

From Binney and Merrifield (1998), pages 163-164: "This formula was initially motivated by a simple model of galaxy formation (Press and Schechter, 1974), but it has proved to have a wider range of application than originally envisaged..." With larger, deeper surveys, the limitations of the simple Schechter function start to become apparent: particular problems at the faint end, and with evolution of the luminosity function with redshift. Blanton et al. (2003) model the 2-d galaxy luminosity function as

\Phi(M, z) = n^*\, 10^{0.4(z - z_0)P} \sum_{k=1}^{K} \frac{\Phi_k}{\sqrt{2\pi\sigma_M^2}} \exp\left[-\frac{(M - M_k + (z - z_0)Q)^2}{2\sigma_M^2}\right]

Quasars: among the furthest observable objects. [Image: quasar PKS 2349, observed by the Hubble Space Telescope in 1996. Image courtesy of NASA.]

[Figure: absolute magnitude versus redshift.] Sample of 15,057 quasars from the Sloan Digital Sky Survey (Richards et al. 2006).

Luminosity Functions as Probability Density Functions. Integrating a luminosity function over a bin gives the count in that bin:

\text{count with } -22 \le M \le -21 = \int_{-22}^{-21} \phi(M)\, dM

This is proportional to the probability that a randomly chosen such object satisfies -22 \le M \le -21. Hence, estimating \phi(M) is analogous to density estimation.

Density Estimation. Basic problem: estimate f, where we assume that X_1, X_2, \ldots, X_n \sim f, i.e., for a \le b,

P(a \le X_i \le b) = \int_a^b f(x)\, dx

Parametric approach: assume f belongs to some family of densities (e.g., Gaussian), and estimate the unknown parameters via maximum likelihood. Nonparametric approach: the estimate is built by smoothing the available data; slower rate of convergence, but less bias. The histogram is a simple example of a nonparametric density estimator.

Density Estimation. Sample of size 303: 7.2, 8.3, 8.6, 8.8, 8.9, 9.0, 9.0, 9.0, 9.5, 9.5, 9.6, 9.7, 9.7, 9.8, 9.8, 9.8, 10.0, 10.1, 10.1, 10.2, 10.4, 10.4, 10.4, 10.4, ... Note that \bar{x} = 20.18 and s = 7.93.

Parametric Estimator. [Figure: fitted density over the latitude of initial point.] Assuming a Gaussian, the estimates are \hat{\mu} = 20.18 and \hat{\sigma} = 7.93.

Parametric Estimator. [Figure] The fitted Gaussian compared with the histogram estimate of the density.

Advantages of Parametric Form: relatively simple to fit, via maximum likelihood; compact functional form; easy calculation of probabilities; smaller errors, on average, provided the assumption regarding the density is correct.

Errors in Density Estimators. Error at one value x: (\hat{f}(x) - f(x))^2. Error accumulated over all x:

\mathrm{ISE} = \int (\hat{f}(x) - f(x))^2\, dx

Mean integrated squared error (MISE):

\mathrm{MISE} = E\left[\int (\hat{f}(x) - f(x))^2\, dx\right]

The Bias-Variance Tradeoff. Can write

\mathrm{MISE} = \int b^2(x)\, dx + \int v(x)\, dx

where the bias at x is b(x) = E(\hat{f}(x)) - f(x) and the variance at x is v(x) = V(\hat{f}(x)).

The Bias-Variance Tradeoff. [Figure: MSE, squared bias, and variance as functions of the amount of smoothing; the MSE is minimized between too little and too much smoothing.]

Histograms. Create bins B_1, B_2, \ldots, B_m of width h. Define

\hat{f}_n(x) = \sum_{j=1}^{m} \frac{\hat{p}_j}{h}\, I(x \in B_j)

where \hat{p}_j is the proportion of observations in B_j. Note that \int \hat{f}_n(x)\, dx = 1. A histogram is an example of a nonparametric density estimator; it is controlled by the tuning parameter h, the bin width.
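The histogram estimator above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk; the function name, the bin origin x0, and the bin width h are my own choices.

```python
import numpy as np

def histogram_density(x, data, h, x0=0.0):
    """Histogram estimate: f_hat(x) = p_hat_j / h for the bin B_j containing x."""
    data = np.asarray(data)
    j = np.floor((x - x0) / h)                    # index of the bin containing x
    in_bin = np.floor((data - x0) / h) == j
    p_hat = in_bin.mean()                         # proportion of observations in B_j
    return p_hat / h

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=1000)

# Estimate near the mode of a standard normal (true density there is ~0.3989).
est = histogram_density(0.0, sample, h=0.5, x0=-5.0)
```

Note the tradeoff the slides describe: shrinking h reduces bias (narrower bins track f more closely) but increases variance (fewer points per bin).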

Histograms. [Figure] Seems like too many bins, i.e., h is too small.

Histograms. [Figure] Seems like too few bins, i.e., h is too large.

Histograms. [Figure] Seems better.

Histograms. [Figure] Clear that the normal curve is not a good fit.

Histograms. For histograms,

\mathrm{MISE} = E\left[\int (\hat{f}(x) - f(x))^2\, dx\right] \approx \frac{h^2}{12} \int (f'(u))^2\, du + \frac{1}{nh}

The value h^* that minimizes this is

h^* = \frac{1}{n^{1/3}} \left(\frac{6}{\int (f'(u))^2\, du}\right)^{1/3}

and then \mathrm{MISE} \approx C_2\, n^{-2/3}.

Kernel Density Estimation. We can improve on histograms by introducing smoothing:

\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)

Here, K(\cdot) is the kernel function, itself a smooth density.

Kernels. A kernel is any smooth function K such that K(x) \ge 0 and

\int K(x)\, dx = 1, \quad \int x K(x)\, dx = 0, \quad \sigma_K^2 \equiv \int x^2 K(x)\, dx > 0

Some commonly used kernels are the following:
the boxcar kernel: K(x) = \frac{1}{2} I(|x| < 1),
the Gaussian kernel: K(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2},
the Epanechnikov kernel: K(x) = \frac{3}{4}(1 - x^2) I(|x| < 1),
the tricube kernel: K(x) = \frac{70}{81}(1 - |x|^3)^3 I(|x| < 1).
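A minimal sketch of the kernel density estimator \hat{f}(x) = \frac{1}{nh}\sum_i K((x - X_i)/h), using two of the kernels listed above. Illustrative only; the function and argument names are my own.

```python
import numpy as np

def kde(x, data, h, kernel="gaussian"):
    """Kernel density estimate f_hat(x) = (1/(n h)) * sum_i K((x - X_i) / h)."""
    data = np.asarray(data)
    u = (x - data) / h
    if kernel == "gaussian":
        k = np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)
    elif kernel == "epanechnikov":
        k = 0.75 * (1 - u**2) * (np.abs(u) < 1)
    else:
        raise ValueError(kernel)
    return k.mean() / h

rng = np.random.default_rng(1)
sample = rng.normal(size=2000)
est = kde(0.0, sample, h=0.3)   # the true standard-normal density at 0 is ~0.3989
```

Changing the kernel changes the shape of the "lumps" placed on each data point; changing h changes their width, which matters far more in practice.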

Kernels. [Figure: the four kernels plotted on (-3, 3).]

Kernel Density Estimation. The estimator places a smoothed-out lump of mass of size 1/n over each data point. [Figure] The shape of the lumps is controlled by K(\cdot); their width is controlled by h.

Kernel Density Estimation. [Figure] h chosen too large.

Kernel Density Estimation. [Figure] h chosen too small.

Kernel Density Estimation. [Figure] Just right.

Kernel Density Estimation.

\mathrm{MSE} \approx \frac{1}{4} c_1^2 h^4 \int (f''(x))^2\, dx + \frac{\int K^2(x)\, dx}{nh}

The optimal bandwidth is

h^* = \left(\frac{c_2}{c_1^2 A(f)\, n}\right)^{1/5}

where c_1 = \int x^2 K(x)\, dx, c_2 = \int K(x)^2\, dx, and A(f) = \int (f''(x))^2\, dx. Then \mathrm{MSE} \approx C_3\, n^{-4/5}.
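Plugging the Gaussian kernel and a Gaussian f into the formula for h^* recovers the classic normal-reference rule of thumb h^* \approx 1.06\, \sigma\, n^{-1/5}, since c_1 = 1, c_2 = 1/(2\sqrt{\pi}), and A(f) = 3/(8\sqrt{\pi}\,\sigma^5). A quick numerical check:

```python
import math

def rule_of_thumb_h(sigma, n):
    """h* = (c2 / (c1^2 * A(f) * n))^(1/5), Gaussian kernel, f = N(0, sigma^2)."""
    c1 = 1.0                                         # integral of x^2 K(x) dx
    c2 = 1.0 / (2 * math.sqrt(math.pi))              # integral of K(x)^2 dx
    A = 3.0 / (8 * math.sqrt(math.pi) * sigma**5)    # integral of (f''(x))^2 dx
    return (c2 / (c1**2 * A * n)) ** 0.2

h = rule_of_thumb_h(sigma=1.0, n=1000)   # (4/3)^(1/5) * 1000^(-1/5), roughly 0.27
```

The algebra collapses to h^* = (4/3)^{1/5}\, \sigma\, n^{-1/5} \approx 1.06\, \sigma\, n^{-1/5}, the form usually quoted.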

The Bias-Variance Tradeoff. For many smoothers,

\mathrm{MISE} \approx c_1 h^4 + \frac{c_2}{nh}

which is minimized at h^* = O(n^{-1/5}). Hence \mathrm{MISE} = O(n^{-4/5}), whereas for parametric problems \mathrm{MISE} = O(1/n).

Comparisons. For a parametric form, if it is correct, \mathrm{MISE} \approx C_1/n. For histograms, \mathrm{MISE} \approx C_2\, n^{-2/3}. For estimators based on smoothing, \mathrm{MISE} \approx C_3\, n^{-4/5}.

Comparisons. [Figure: truth versus closest normal.] Truth is standard normal.

Comparisons. [Figure: MSE versus sample size n for the normal and kernel estimators.] Assuming normality has its advantages.

Comparisons. [Figure: truth versus closest normal.] Truth is the t-distribution with 5 degrees of freedom.

Comparisons. [Figure: MSE versus sample size n.] The normal assumption costs, even for moderate sample sizes.

Cross-Validation. An objective, data-driven choice of h is possible:

\mathrm{MISE} = E\left[\int (\hat{f}(x) - f(x))^2\, dx\right] = E\left[\int \hat{f}(x)^2\, dx - 2 \int \hat{f}(x) f(x)\, dx\right] + \int f(x)^2\, dx = J(h) + \int f(x)^2\, dx

An unbiased estimator of J(h) is

\hat{J}(h) = \int \left(\hat{f}_n(x)\right)^2 dx - \frac{2}{n} \sum_{i=1}^{n} \hat{f}_{(-i)}(X_i)

where \hat{f}_{(-i)} is the density estimator obtained after removing the i-th observation.
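The criterion \hat{J}(h) can be computed exactly for the Gaussian kernel, since \int \hat{f}_n^2 is a sum of N(0, 2h^2) densities over all pairs of points. The sketch below (my own names; a direct, unoptimized implementation) minimizes \hat{J}(h) over a grid:

```python
import numpy as np

def gauss(u):
    return np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)

def lscv_score(h, data):
    """Unbiased risk estimate J_hat(h) = int f_hat^2 - (2/n) sum_i f_hat_(-i)(X_i)."""
    data = np.asarray(data)
    n = len(data)
    d = data[:, None] - data[None, :]          # pairwise differences
    # Closed form for int f_hat(x)^2 dx with the Gaussian kernel:
    int_f2 = gauss(d / (h * np.sqrt(2))).sum() / (n**2 * h * np.sqrt(2))
    # Leave-one-out estimates at each X_i (subtract the diagonal term K(0)):
    K = gauss(d / h)
    loo = (K.sum(axis=1) - gauss(0.0)) / ((n - 1) * h)
    return int_f2 - 2 * loo.mean()

rng = np.random.default_rng(2)
sample = rng.normal(size=500)
hs = np.linspace(0.05, 1.0, 40)
h_best = hs[np.argmin([lscv_score(h, sample) for h in hs])]
```

For standard-normal data of this size, the minimizer lands near the theoretical optimum 1.06\, n^{-1/5} \approx 0.3, though least-squares cross-validation is known to be variable from sample to sample.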

Local Density Estimation. The idea of local modelling (recall local polynomial regression) extends to density estimation. The local likelihood density estimator, developed independently by Loader (1996) and Hjort and Jones (1996), allows for fitting of nonparametric models of this type.

Local Density Estimation. The model assumes that, for u near x,

\log(f(u)) \approx a_0 + a_1 (u - x) + \cdots + a_p (u - x)^p

Thus

\log(\hat{f}(u)) = \hat{a}_0 + \hat{a}_1 (u - x) + \cdots + \hat{a}_p (u - x)^p

is the local (near x) estimate of \log(\hat{f}(\cdot)). The estimates are obtained by maximizing a kernel-weighted likelihood, repeated for a grid of values x.

[Figure: log density.] A few local linear estimates of \log(\hat{f}(\cdot)).

[Figure: log density.] Smoothed together to get \log(\hat{f}(\cdot)).

Back to Luminosity Functions... Simple parametric forms are not realistic for large data sets. One could build more complex parametric forms, e.g., Blanton et al. (2003):

\Phi(M, z) = n^*\, 10^{0.4(z - z_0)P} \sum_{k=1}^{K} \frac{\Phi_k}{\sqrt{2\pi\sigma_M^2}} \exp\left[-\frac{(M - M_k + (z - z_0)Q)^2}{2\sigma_M^2}\right]

Nonparametric approaches need to account for observational limitations, and to distinguish physical variations from observational variations.

Truncation. Suppose that X_1, X_2, \ldots, X_m \sim f, but the data are truncated: we only get to see those X_i such that X_i \le k. Formally, we observe X_1, X_2, \ldots, X_n \sim f_k, where

f_k(x) = \begin{cases} \dfrac{f(x)}{\int_{-\infty}^{k} f(u)\, du}, & \text{if } x \le k \\[1ex] 0, & \text{if } x > k \end{cases}

Clearly, we must take truncation into account when using the observed data to estimate f.
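A small parametric illustration of why truncation must enter the likelihood: fit a Gaussian to one-sided truncated data by maximizing the truncated likelihood f(x) / P(X \le k). This is my own toy sketch (it assumes SciPy is available), not a method from the talk.

```python
import numpy as np
from scipy import optimize, stats

# Simulate X ~ N(2, 1), but only values with X <= k are observed.
rng = np.random.default_rng(3)
k = 2.5
full = rng.normal(loc=2.0, scale=1.0, size=5000)
obs = full[full <= k]

def neg_loglik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    # Truncated log-density: log f(x) - log P(X <= k), valid for x <= k.
    logf = stats.norm.logpdf(obs, mu, sigma)
    lognorm = stats.norm.logcdf(k, mu, sigma)
    return -(logf - lognorm).sum()

res = optimize.minimize(neg_loglik, x0=[0.0, 0.0], method="Nelder-Mead",
                        options={"maxiter": 2000})
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])

naive_mean = obs.mean()   # ignoring truncation: biased well below the true mean of 2
```

The corrected estimate \hat{\mu} recovers the true location, while the naive sample mean of the truncated data is substantially too small.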

[Figure: absolute magnitude versus redshift.] Sample of 15,057 quasars from the Sloan Digital Sky Survey (Richards et al. 2006).

The 1/V_max Estimator (Schmidt, 1968; Eales, 1993). Observe objects of absolute magnitudes M_1, M_2, \ldots, M_n. Divide the absolute magnitude range into bins. Estimate \phi over the j-th bin as

\int_{\mathrm{bin}\, j} \phi(M)\, dM = \sum_{i \in \mathrm{bin}\, j} \frac{1}{V_{\max}(i)}

where V_{\max}(i) is the volume over which an object of magnitude M_i can be detected by the survey. This is a weighted histogram.
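The weighted-histogram form of the 1/V_max estimator is short to write down. The magnitudes, maximum volumes, and bin edges below are toy values of my own, purely for illustration; in practice V_max(i) comes from the survey's flux limit.

```python
import numpy as np

def vmax_estimate(M, Vmax, bin_edges):
    """1/Vmax estimate: for each bin, sum 1/Vmax(i) over the objects falling in it.
    (Divide by the bin width to convert to a per-magnitude density.)"""
    M = np.asarray(M)
    Vmax = np.asarray(Vmax)
    phi = np.zeros(len(bin_edges) - 1)
    idx = np.digitize(M, bin_edges) - 1    # bin index of each object
    for j in range(len(phi)):
        phi[j] = (1.0 / Vmax[idx == j]).sum()
    return phi

# Toy data: absolute magnitudes and hypothetical maximum detection volumes.
M = np.array([-22.1, -21.8, -21.7, -21.2, -20.9, -20.4])
Vmax = np.array([9.0, 8.0, 8.0, 4.0, 2.0, 1.0])
edges = np.array([-22.5, -21.5, -20.5, -19.5])
phi_hat = vmax_estimate(M, Vmax, edges)
```

Bright objects (large V_max) get small weights, while faint objects, visible only in a small volume, are upweighted to compensate for the ones the survey missed.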

Nonparametric Maximum Likelihood (NPMLE). Lynden-Bell (1971): considers the case of one-sided truncation. Efron and Petrosian (1999): two-sided (double) truncation. We know that the NPMLE will assign all probability to observed values, but the probabilities will not be equal. NPMLE estimators for this case are implemented in the R package DTDA.

NPMLE Examples. [Figure] Vertical lines are truncation ranges, dots are data values.

NPMLE Examples. [Figure] The probability assigned by the NPMLE is 0.143 (= 1/7) for each observation.

NPMLE Examples. [Figure] Differential truncation.

NPMLE Examples. [Figure: NPMLE probabilities 0.125, 0.125, 0.125, 0.125, 0.167, 0.167, 0.167.] Both of the intervals (-\infty, 2) and (2, \infty) have probability 0.5.

NPMLE Examples. [Figure] An example where the NPMLE performs poorly.

NPMLE Examples. [Figure: NPMLE probabilities 0, 0, 0, 0, 0.333, 0.333, 0.333.]

NPMLE Examples. [Figure] Example from Efron and Petrosian (1999).

NPMLE Examples. [Figure: NPMLE probabilities 0.137, 0.081, 0.095, 0.091, 0.182, 0.182, 0.232.]

Efron and Petrosian (1999) Quasar Example. [Figures from Efron and Petrosian (1999).]

Efron and Petrosian (1999) Quasar Example. The SEF fit is a special exponential family model that assumes

\log f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3

This model can be fit via maximum likelihood. BUT: each of these fits assumes independence between redshift and luminosity, i.e., no evolution of the luminosity function with redshift.

[Figure: absolute magnitude versus redshift.] Sample of 15,057 quasars from the Sloan Digital Sky Survey (Richards et al. 2006).

The Challenge: Model Selection. Goal: given the truncated data shown in the scatter plot, estimate the bivariate density \phi(z, M). Too strong: assume independence, \phi(z, M) = f(z)\, g(M), or assume a parametric form \phi(z, M, \theta). Too weak: use a fully nonparametric approach.

A Semiparametric Model (Schafer, 2007). We utilize a semiparametric model which places minimal assumptions on the form of the bivariate density and greatly diminishes bias due to truncation and boundary effects. Decompose the bivariate density \phi(z, M) as

\log(\phi(z, M)) = f(z) + g(M) + h(z, M, \theta)

where f(z) and g(M) are estimated using nonparametric methods and h(z, M, \theta) takes an assumed parametric form; it is intended to model the dependence between the two random variables. We use h(z, M, \theta) = \theta z M, but another choice could be made, e.g., one motivated by a physical theory.

Fitting the Model: The Motivating (Naive) Idea. Assume independence and write \phi(z, M) = f(z)\, g(M), where f(\cdot) is the density for redshift and g(\cdot) is the density for absolute magnitude. [Figure: the truncation region A in the (z, M) plane, with the cross-section A(4.0, \cdot) marked.] Let A denote the region outside of which the data are truncated, and let A(z, \cdot) denote the cross-section of A at z. Assume, for the moment, that g(\cdot) is known.

The available data allow for estimation of

f^*(z) \equiv \int_{A(z, \cdot)} \phi(z, M)\, dM = f(z) \int_{A(z, \cdot)} g(M)\, dM

for all z, since f^* is the density of the observable redshifts. Assuming for the moment that g(\cdot) were known, it would be possible to estimate f(\cdot) using

\hat{f}(z) = \hat{f}^*(z) \Big/ \int_{A(z, \cdot)} g(M)\, dM

Starting with an initial guess at g(\cdot), we could iterate between assuming g(\cdot) is known and estimating f(\cdot), and vice versa. But the variance of this estimator could be huge: consider the behavior of \hat{f}(z) for z where \int_{A(z, \cdot)} g(M)\, dM is small.
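The alternating scheme just described can be illustrated on a toy, noise-free grid. Everything below is hypothetical: a small (z, M) grid, made-up marginal densities, a triangular truncation region A (faint objects observable only at low redshift), and expected rather than sampled counts. It is a sketch of the naive iteration only, not of the paper's local-likelihood estimator.

```python
import numpy as np

# Hypothetical marginals on a 5 x 6 grid of (z, M) cells.
f_true = np.array([0.10, 0.20, 0.30, 0.25, 0.15])        # redshift density
g_true = np.array([0.05, 0.10, 0.20, 0.30, 0.20, 0.15])  # magnitude density
A = np.array([[1, 1, 1, 1, 1, 1],    # observable cells: fainter columns
              [1, 1, 1, 1, 1, 0],    # drop out as redshift increases
              [1, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 0],
              [1, 1, 0, 0, 0, 0]], dtype=float)

counts = np.outer(f_true, g_true) * A   # expected observed counts (no noise)

# Alternate: f_hat(z) = f_star(z) / integral of g_hat over A(z, .), then the
# symmetric update for g_hat, renormalizing after each step.
f = np.full(5, 1 / 5)
g = np.full(6, 1 / 6)
for _ in range(1000):
    f = counts.sum(axis=1) / (A @ g)
    f /= f.sum()
    g = counts.sum(axis=0) / (A.T @ f)
    g /= g.sum()
```

With this connected truncation pattern the iteration recovers f_true and g_true; the slide's warning applies when \int_{A(z,\cdot)} g is nearly zero for some z, where the division blows up the variance of the estimate.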

[Figure: absolute magnitude versus redshift, depicting A(1.5, \cdot).]

[Figure: density of absolute magnitude, split into 76% unobserved and 24% observed.] 24% of the area under the assumed g(\cdot) falls in A(1.5, \cdot).

[Figure] Percent of the area under g(\cdot) in A(z, \cdot), as a function of z.

[Figure: estimated redshift density.] A smooth \hat{f}^*(\cdot) does not lead to a smooth \hat{f}(\cdot).

Local Density Estimation. Instead, we write

\log(f^*(z)) = \log(f(z)) + \log\left(\int_{A(z, \cdot)} g(M)\, dM\right)

and treat \log\left(\int_{A(z, \cdot)} g(M)\, dM\right) as an offset term in a model for \log(f^*(z)), considering only forms for f(\cdot) which are controlled by smoothing parameters chosen by the user. The local likelihood density estimator developed independently by Loader (1996) and Hjort and Jones (1996) allows for fitting of nonparametric models of this type; the offset is easily incorporated.

M-Estimators. This formulation allows the estimation problem to be couched as M-estimation, i.e., choose the parameters of the local models to maximize the criterion

\sum_{i=1}^{n} \psi(z_i, M_i)

This leads to estimates of standard errors, etc. The parametric component h(z, M, \theta) is also easily included in the offset. Weights can be used, allowing for a varying selection function:

\sum_{i=1}^{n} w_i\, \psi(z_i, M_i)

Bandwidth Selection via Cross-Validation. One ideal choice of the smoothing parameters would minimize the integrated mean squared error

E\left[\int_A \left(\hat{\phi}(z, M) - \phi(z, M)\right)^2 dz\, dM\right]

An unbiased estimator of (the tuning-dependent part of) the integrated mean squared error is

\int_A \hat{\phi}^2(z, M)\, dz\, dM - \frac{2}{n} \sum_{j=1}^{n} \hat{\phi}_{(-j)}(z_j, M_j)

where \hat{\phi}_{(-j)}(z_j, M_j) denotes the estimate found when omitting the j-th data pair from the analysis; this is efficiently estimated here.

[Figure: absolute magnitude versus redshift.] Sample of 15,057 quasars from the Sloan Digital Sky Survey.

[Figure] The same plot, with dots scaled by the value of the selection function for each observation.

Key Aspects of the Approach: allows for the removal of the independence assumption, without assuming a parametric form; the iterative algorithm is not computationally burdensome and must converge to a unique estimate; statistical theory leads to good approximations to standard errors for all estimated quantities; the cross-validation estimator of the integrated mean squared error is easily calculated, leading to good bandwidth selection; natural incorporation of the selection function.

[Figure: LSCV criterion versus \lambda_z, for \lambda_M = 0.15, 0.16, 0.17, 0.18, 0.19.] Finding the optimal bandwidths.

[Figure: absolute magnitude versus redshift.] Estimate of the bivariate density, i.e., the 2-D luminosity function.

[Figure: \log(\rho(z)) for M < -23.075, in \mathrm{Mpc}^{-3}, versus redshift.] Marginal density for redshift.

[Figure: \Phi(M_i)\ (\mathrm{Mpc}^{-3}\,\mathrm{mag}^{-1}) versus M_i in six redshift slices: z = 0.49, 1.25, 2.01, 2.8, 3.75, 4.75.] Slices of the bivariate estimate, compared with Richards et al. (2006).

[Figure: \Phi(M)\ (\mathrm{Mpc}^{-3}\,\mathrm{mag}^{-1}) versus M at z = 1, 2, 3, 4.] Estimates based on simulated data.

Nonparametric Bayesian Approach. Kelly et al. (2008): model the 2-d luminosity function as a mixture of Gaussians. The parameterized form allows for Bayesian inference, with natural uncertainty measures for any function of the luminosity function.

[Figure 3(a) from Kelly et al. (2008).]

Conclusion. Luminosity functions as probability densities. The potential of nonparametric density estimation. Moving beyond the independence assumption and binning.

References.
Binney and Merrifield (1998). Galactic Astronomy. Princeton University Press.
Blanton et al. (2003). ApJ, 592, 819-838.
Eales (1993). ApJ, 404, 51-62.
Efron and Petrosian (1999). JASA, 94, 824-834.
Kelly et al. (2008). ApJ, 682, 874-895.
Lynden-Bell (1971). MNRAS, 155, 95-118.
Press and Schechter (1974). ApJ, 187, 425-438.
Richards et al. (2006). AJ, 131, 2766.
Schafer (2007). ApJ, 661, 703-713.
Schechter (1976). ApJ, 203, 297-306.
Schmidt (1968). ApJ, 151, 393-409.
Takeuchi et al. (2000). ApJS, 129, 1-31.

Acknowledgements. Peter Freeman, Chris Genovese, and Larry Wasserman, Department of Statistics, CMU. Sloan Digital Sky Survey. Research supported by NSF grants #0434343 and #0240019. For more information, see http://www.stat.cmu.edu/~cschafer/bivtrunc or contact Chad at cschafer@stat.cmu.edu.