Minimax Goodness-of-Fit Testing in Ill-Posed Inverse Problems with Partially Unknown Operators


Clément Marteau, Institut Camille Jordan, Université Lyon I - Claude Bernard, 43 boulevard du 11 novembre 1918, 69622 Villeurbanne, France. marteau@math.univ-lyon1.fr

and

Theofanis Sapatinas, Department of Mathematics and Statistics, University of Cyprus, P.O. Box 20537, CY 1678 Nicosia, Cyprus. fanis@ucy.ac.cy

May 2016

Abstract

We consider a Gaussian sequence model that contains ill-posed inverse problems as special cases. We assume that the associated operator is partially unknown in the sense that its singular functions are known and the corresponding singular values are unknown but observed with Gaussian noise. For the considered model, we study the minimax goodness-of-fit testing problem. Working with certain ellipsoids in the space of square-summable sequences of real numbers, with a ball of positive radius removed, we obtain lower and upper bounds for the minimax separation radius in the non-asymptotic framework, i.e., for fixed values of the involved noise levels. Examples of mildly and severely ill-posed inverse problems with ellipsoids of ordinary-smooth and super-smooth sequences are examined in detail and minimax rates of goodness-of-fit testing are obtained for illustrative purposes.

AMS 2000 subject classifications: 62G05, 62K20

Keywords and phrases: Ellipsoids; compact operators; Gaussian sequence model; Gaussian white noise model; ill-posed inverse problems; minimax goodness-of-fit testing; minimax signal detection; singular value decomposition.

1 Introduction

We consider the following Gaussian sequence model (GSM), introduced in 9:

Y_j = b_j θ_j + ε ξ_j,  j ∈ N,
X_j = b_j + σ η_j,  j ∈ N,        (1.1)

where N = {1, 2, ...} is the set of natural numbers, b = {b_j}_{j∈N}, with b_j > 0, is an unknown sequence, θ = {θ_j}_{j∈N} ∈ ℓ2(N) is the unknown signal of interest, ξ = {ξ_j}_{j∈N} and η = {η_j}_{j∈N} are sequences of independent standard Gaussian random variables (and independent of each other), and ε, σ > 0 are known parameters (the noise levels). The observations are given by the sequence (Y, X) = {(Y_j, X_j)}_{j∈N} from the GSM (1.1) and their joint law is denoted by P_{θ,b}. Here, ℓ2(N) denotes the space of square-summable sequences of real numbers, i.e., ℓ2(N) = { θ ∈ R^N : ||θ||^2 := Σ_{j∈N} θ_j^2 < +∞ }.

The GSM (1.1) arises in the case of ill-posed inverse problems with noisy operators. Indeed, consider the Gaussian white noise model (GWNM)

dY^ε(t) = Af(t) dt + ε dW(t),  t ∈ V,        (1.2)

where A is a linear bounded operator acting on a Hilbert space H_1 with values in another Hilbert space H_2, f(·) ∈ H_1 is the unknown response function that one wants to detect or estimate, W(·) is a standard Wiener process on V ⊆ R, and ε > 0 is a known parameter (the noise level). For the sake of simplicity, we only consider the case when A is injective (meaning that A has a trivial nullspace) and assume that V = [0, 1], H_1 = L^2([0, 1]), U ⊆ R and H_2 = L^2(U). In most cases of interest, A is a compact operator (see, e.g., Chapter 2 of 3). In particular, it admits a singular value decomposition (SVD) (b_j, ψ_j, ϕ_j)_{j∈N}, in the sense that

A ϕ_j = b_j ψ_j,   A* ψ_j = b_j ϕ_j,   j ∈ N,        (1.3)

where A* denotes the adjoint operator of A, and (b_j^2)_{j∈N} and (ϕ_j)_{j∈N} are, respectively, the eigenvalues and the eigenfunctions of A*A. Thus, the (first equation in the) GSM (1.1) arises with

Y_j = ∫_0^1 ψ_j(t) dY^ε(t),   θ_j = ∫_0^1 ϕ_j(t) f(t) dt,   ξ_j = ∫_0^1 ψ_j(t) dW(t),   j ∈ N,

and b_j > 0 (since A is injective). In this case, the GWNM (1.2) corresponds to a so-called ill-posed inverse problem since the inversion of A*A is not bounded. Possible examples of such decompositions arise with, e.g., convolution or Radon-transform operators; see, e.g., 3.
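To fix ideas, the sequence model (1.1) is straightforward to simulate. The sketch below (in Python, assuming numpy is available) draws one realization of (Y, X) for illustrative choices of b, θ, ε and σ; the function name simulate_gsm and all numerical values are ours, not taken from the paper.

```python
import numpy as np

def simulate_gsm(theta, b, eps, sigma, rng):
    """One draw of the GSM (1.1): Y_j = b_j*theta_j + eps*xi_j and
    X_j = b_j + sigma*eta_j, with independent standard Gaussian xi, eta."""
    xi = rng.standard_normal(theta.size)
    eta = rng.standard_normal(theta.size)
    return b * theta + eps * xi, b + sigma * eta

j = np.arange(1, 1001)
b = 1.0 / j            # mildly ill-posed: polynomially decaying singular values
theta = 1.0 / j**2     # smooth (rapidly decaying) signal of interest
Y, X = simulate_gsm(theta, b, eps=0.01, sigma=0.01, rng=np.random.default_rng(0))
```

A severely ill-posed instance is obtained by replacing b = 1/j with an exponentially decaying sequence such as np.exp(-j).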
The effect of the ill-posedness of the model is clearly seen in the decay of the singular values b_j as j → +∞. As j → +∞, the signal b_j θ_j gets weaker, and it is then more difficult to perform inference on the sequence θ = {θ_j}_{j∈N}.

In order to motivate our investigation, we stress that ill-posed inverse problems have been at the core of several contributions over the last two decades. In the early literature, the associated compact operator A (and, hence, its sequence b = {b_j}_{j∈N} of singular values) was supposed to be fully known. (Note that, in this case, the second equation in the GSM (1.1) does not appear.) We refer, e.g., to 6, 8, 7, 0 in the minimax estimation context. In the goodness-of-fit paradigm, we mention for instance 0, 7, where minimax separation radii/minimax separation rates (signal detection or goodness-of-fit testing) were established for ill-posed inverse problems with various smoothness conditions on the sequence of interest. Specific situations have also been recently investigated in, e.g., 6 (tomography) or 4 (biophotonic imaging).

The case of an unknown compact operator A that is observed with Gaussian noise has also been recently treated in the estimation literature, especially the situation where A is partially unknown; see, e.g., 9, 4, 9. In these contributions, it is assumed, for the corresponding SVD (1.3), that the sequence of singular functions (ψ, ϕ) = (ψ_j, ϕ_j)_{j∈N} is known, while the sequence of singular values b = {b_j}_{j∈N} is unknown but observed with some Gaussian noise. In other words, the following sequence model is considered:

X_j = b_j + σ η_j,  j ∈ N,

where η = {η_j}_{j∈N} is a sequence of independent standard Gaussian random variables (independent of the standard Gaussian sequence ξ = {ξ_j}_{j∈N}), and σ > 0 is a known parameter (the noise level). Therefore, the second equation in the GSM (1.1) is also readily available.

To motivate the GSM (1.1), consider the following deconvolution model (see also 9 for a complete discussion on this subject)

dY^ε(t) = g ⋆ f(t) dt + ε dW(t),  t ∈ [0, 1],        (1.4)

where

g ⋆ f(t) = ∫_0^1 g(t − x) f(x) dx,  t ∈ [0, 1],

is the convolution between g(·) and f(·), g(·) is an unknown 1-periodic (convolution) kernel in L^2([0, 1]), f(·) is an unknown 1-periodic signal in L^2([0, 1]), dY^ε(·) is observed, W(·) is a standard Wiener process, and ε > 0 is the noise level. Let φ_j(·), j ∈ N, be the usual real trigonometric basis on V. The model (1.4) is equivalent to the (first equation in the) GSM (1.1) by a projection on the trigonometric basis φ_j(·), j ∈ N. In the case where the kernel g(·) is unknown (i.e., the sequence (b_k)_{k∈N} = (⟨g, φ_k⟩)_{k∈N} is unknown), suppose that we can pass the trigonometric basis φ_j(·), j ∈ N, through the convolution kernel, i.e., send each φ_j(·), j ∈ N, as an input function f(·) and observe the corresponding dY^{ε,j}(·), j ∈ N. In other words, we are able to obtain training data for the estimation of the unknown convolution kernel g(·) in this setting. In particular, we obtain exactly the two sequences of observations Y_j and X_j, j ∈ N, in the GSM (1.1). In this case, the corresponding noise levels coincide, i.e., ε = σ.
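The training-data idea above can be illustrated numerically: feeding a basis function through a (circular) convolution operator returns it multiplied by the corresponding singular value, which can then be read off. The sketch below uses the complex exponential basis e_k(t) = exp(2πikt) instead of the real trigonometric basis of the text (an illustrative simplification) together with a hypothetical kernel g.

```python
import numpy as np

n = 256
t = np.arange(n) / n
g = 1.0 + 0.5 * np.cos(2 * np.pi * t)   # hypothetical 1-periodic kernel

def convolve_periodic(g, f):
    """Grid approximation of (g * f)(t) = int_0^1 g(t - x) f(x) dx."""
    return np.fft.ifft(np.fft.fft(g) * np.fft.fft(f)) / n

k = 1
e_k = np.exp(2j * np.pi * k * t)        # basis function sent through the operator
out = convolve_periodic(g, e_k)         # equals b_k * e_k: e_k is an eigenfunction
b_k = (out / e_k).mean().real           # recovered singular value (here 0.25)
```

For this trigonometric-polynomial kernel the recovery is exact: b_1 equals the Fourier coefficient of g at frequency 1, namely 0.25.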
To the best of our knowledge, there is no research work on minimax goodness-of-fit testing in ill-posed inverse problems with partially unknown operators. Our aim is to fill this gap. In particular, considering the GSM (1.1) and working with certain ellipsoids in the space of square-summable sequences of real numbers, with a ball of positive radius removed, we obtain lower and upper bounds for the minimax separation radius in the non-asymptotic framework, i.e., for fixed values of ε and σ. Examples of mildly and severely ill-posed inverse problems with ellipsoids of ordinary-smooth and super-smooth sequences are examined in detail and minimax rates of goodness-of-fit testing are obtained for illustrative purposes.

The paper is organized as follows. Section 2 presents the considered statistical setting and a brief overview of the main results. Section 3 is devoted to the construction of the suggested testing procedure. A general upper bound on the maximal second kind error is then displayed and special benchmark examples are presented for illustrative purposes. The corresponding lower bounds are proposed in Section 4. In Section 5 we discuss some open questions and present a short selection of applied work that motivated our investigation. Finally, all proofs and technical arguments are gathered in Section 6.

Throughout the paper we set the following notation. For all x, y ∈ R, δ_x(y) = 1 if x = y and δ_x(y) = 0 if x ≠ y. Also, x ∧ y := min{x, y} and x ∨ y := max{x, y}. Given two sequences (c_j)_{j∈N} and (d_j)_{j∈N} of real numbers, c_j ≍ d_j means that there exist 0 < κ_0 ≤ κ_1 < ∞ such that κ_0 ≤ c_j / d_j ≤ κ_1 for all j ∈ N. Let ν be either ε or σ or (ε, σ), and let V be either R_+ := (0, +∞) or R_+ × R_+. Given two collections (c_ν)_{ν∈V} and (d_ν)_{ν∈V} of positive real numbers, c_ν ≲ d_ν means that there exists 0 < κ_0 < +∞ such that c_ν ≤ κ_0 d_ν for all ν ∈ V. Similarly, c_ν ≳ d_ν means that there exists 0 < κ_1 < +∞ such that c_ν ≥ κ_1 d_ν for all ν ∈ V.

2 Minimax Goodness-of-Fit Testing

2.1 The Statistical Setting

Given observations (Y, X) = {(Y_j, X_j)}_{j∈N} from the GSM (1.1), the aim is to compare the underlying (unknown) signal θ ∈ ℓ2(N) to a (known) benchmark signal θ_0, i.e., to test

H_0 : θ = θ_0  versus  H_1 : θ − θ_0 ∈ F,        (2.1)

for some given θ_0 and a given subspace F. The statistical setting (2.1) is known as goodness-of-fit testing (when θ_0 ≠ 0) or signal detection (when θ_0 = 0).

Remark 2.1 Given observations from the GWNM (1.2), the testing problem (2.1) is related to the test

H_0 : f = f_0  versus  H_1 : f − f_0 ∈ F,        (2.2)

for a given benchmark function f_0 and a given subspace F. In most cases, F contains functions f ∈ L^2([0, 1]) that admit a Fourier series expansion with Fourier coefficients θ belonging to F (see, e.g., 8, Section 3.2). In these cases, the tests (2.1) and (2.2) are equivalent.

The choice of the set F is important. Indeed, it should be rich enough in order to contain the true θ. At the same time, if it is too rich, it will not be possible to control the performance of a given test, due to the complexity of the problem. The common approach for such problems is to impose both a regularity condition (which characterizes the smoothness of the underlying signal) and an energy condition (which measures the amount of the underlying signal). Concerning the regularity condition, we will work with certain ellipsoids in ℓ2(N).
In particular, we assume that θ ∈ E_a(R), the set E_a(R) being defined as

E_a(R) = { θ ∈ ℓ2(N) : Σ_{j∈N} a_j^2 θ_j^2 ≤ R^2 },        (2.3)

where a = (a_j)_{j∈N} denotes a non-decreasing sequence of positive real numbers with a_j → +∞ as j → +∞, and R > 0 is a constant. The set E_a(R) can be seen as a condition on the decay of θ. The cases where a_j increases very fast correspond to θ with a small number of non-negligible coefficients. In such a case, the corresponding signal can be considered as being smooth. Without loss of generality, in what follows, we set R = 1 and write E_a instead of E_a(1).

Regarding the energy condition, it will be measured in the ℓ2(N)-norm. In particular, given r_{ε,σ} > 0 (called the radius), which is allowed to depend on the noise levels ε, σ > 0, we will consider alternatives such that ||θ − θ_0|| > r_{ε,σ}. Given a smoothness sequence a and a radius r_{ε,σ} > 0, the set F can thus be defined as

F := Θ_a(r_{ε,σ}) = { ν ∈ E_a : ||ν|| ≥ r_{ε,σ} }.        (2.4)
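The set (2.4) thus combines the two conditions just described. A minimal numerical membership check (with illustrative sequences; the helper in_alternative is ours, not from the paper) reads:

```python
import numpy as np

def in_alternative(theta, a, r):
    """Check the regularity condition (theta in E_a, with R = 1) and the
    energy condition ||theta|| >= r that together define the set (2.4)."""
    regularity = np.sum(a**2 * theta**2) <= 1.0
    energy = np.sqrt(np.sum(theta**2)) >= r
    return bool(regularity and energy)

j = np.arange(1, 501)
a = j.astype(float)        # ordinary-smooth ellipsoid, a_j = j (i.e. s = 1)
theta = 0.05 / j**2        # illustrative candidate signal
in_alternative(theta, a, r=0.01)   # True: smooth enough, with enough energy
```

Raising r above the norm of theta (here about 0.052) makes the same signal leave the alternative, which is the tension the separation radius quantifies.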

In other words, the set F is an ellipsoid in ℓ2(N) with a ball of radius r_{ε,σ} > 0 removed. In many cases of interest, the set F provides constraints on the Fourier coefficients of f ∈ L^2([0, 1]) in the model (1.2) (see, e.g., 8, Section 3.1).

We consider below the hypothesis testing setting (2.1) with θ_0 ≠ 0 (i.e., goodness-of-fit testing). More formally, given observations from the GSM (1.1), for any given θ_0 ∈ E_a \ {0}, we will be dealing with the following goodness-of-fit testing problem

H_0 : θ = θ_0  versus  H_1 : θ − θ_0 ∈ Θ_a(r_{ε,σ}),        (2.5)

where Θ_a(r_{ε,σ}) is defined in (2.4). The sequence a being fixed, the main issue for the problem (2.5) is then to characterize the values of r_{ε,σ} > 0 for which both hypotheses H_0 (called the null hypothesis) and H_1 (called the alternative hypothesis) are separable (in a sense which will be made precise later on).

Remark 2.2 We would like to stress at this point that in the standard GSM (i.e., (1.1) with σ = 0), the signal detection (i.e., θ_0 = 0) and goodness-of-fit testing (i.e., θ_0 ≠ 0) problems are equivalent as soon as the involved operator is injective. Indeed, without loss of generality, we can then replace the observed sequence (Y_j)_{j∈N} by (Ỹ_j)_{j∈N} := (Y_j − b_j θ_{0,j})_{j∈N}. This is no longer the case in the GSM (1.1) since the sequence (b_j)_{j∈N} is unknown. Signal detection and goodness-of-fit problems should therefore be treated in a different manner. In this work, we only address the goodness-of-fit testing problem (2.5).

Remark 2.3 In the same spirit, we point out that the benchmark θ_0 is assumed to belong to E_a \ {0} in the testing problem (2.5). Such an assumption is not necessary in the classical setting where σ = 0. Given observations from the GSM (1.1), the requirement θ_0 ∈ E_a \ {0} allows a sharp control of the additional uncertainty generated by the noise in the operator. This requirement could be replaced by θ_0 ∈ ℓ2(N) \ {0}, leading to the minimax separation rates described in 7, 0 with max(ε, σ) instead of ε.
In the following, a (non-randomized) test Ψ := Ψ(Y, X) will be defined as a measurable function of the observation (Y, X) = (Y_j, X_j)_{j∈N} from the GSM (1.1), taking values in the set {0, 1}. By convention, H_0 is rejected if Ψ = 1 and H_0 is not rejected if Ψ = 0. Then, given a test Ψ, we can investigate the first kind (type I) error probability, defined as

α_{ε,σ}(Ψ) := P_{θ_0,b}(Ψ = 1),        (2.6)

which measures the probability of rejecting H_0 when H_0 is true (i.e., θ = θ_0); it is often constrained to be bounded by a prescribed level α ∈ (0, 1); and the maximal second kind (type II) error probability, defined as

β_{ε,σ}(Θ_a(r_{ε,σ}), Ψ) := sup_{θ : θ − θ_0 ∈ Θ_a(r_{ε,σ})} P_{θ,b}(Ψ = 0),        (2.7)

which measures the worst possible probability of not rejecting H_0 when H_0 is not true (i.e., when θ − θ_0 ∈ Θ_a(r_{ε,σ})); one would like to ensure that it is bounded by a prescribed level β ∈ (0, 1). For simplicity in our exposition, we will restrict ourselves to α-level tests, i.e., tests Ψ_α satisfying α_{ε,σ}(Ψ_α) ≤ α, for any fixed value α ∈ (0, 1). Let α, β ∈ (0, 1) be given, and let Ψ_α be an α-level test.

Definition 2.1 The separation radius of the α-level test Ψ_α over the class E_a is defined as

r_{ε,σ}(E_a, Ψ_α, β) := inf { r_{ε,σ} > 0 : β_{ε,σ}(Θ_a(r_{ε,σ}), Ψ_α) ≤ β },

where the maximal second kind error probability β_{ε,σ}(Θ_a(r_{ε,σ}), Ψ_α) is defined in (2.7).

In some sense, the separation radius r_{ε,σ}(E_a, Ψ_α, β) corresponds to the smallest possible value of the available signal ||θ − θ_0|| for which H_0 and H_1 can be separated by the α-level test Ψ_α with maximal second kind error probability bounded by a prescribed level β ∈ (0, 1).

Definition 2.2 The minimax separation radius r̄_{ε,σ} := r̄_{ε,σ}(E_a, α, β) > 0 over the class E_a is defined as

r̄_{ε,σ} := inf_{Ψ_α : α_{ε,σ}(Ψ_α) ≤ α} r_{ε,σ}(E_a, Ψ_α, β).        (2.8)

The minimax separation radius r̄_{ε,σ} corresponds to the smallest radius r_{ε,σ} > 0 such that there exists some α-level test Ψ_α for which the maximal second kind error probability β_{ε,σ}(Θ_a(r_{ε,σ}), Ψ_α) is not greater than β.

2.2 Summary of the Results

In this section, we present an overview of the main results obtained in the subsequent sections. The main result, Theorem 2.1 presented below, combines the upper bound of Section 3 with the lower bound of Section 4. Moreover, the minimax rates displayed in Tables 2.1 and 2.2 correspond, respectively, to the rates established in Sections 3 and 4 for various smoothness and ill-posedness conditions.

We emphasize that the aim of this work is to establish optimal separation conditions for the goodness-of-fit testing problem (2.5). This task requires, in particular, precise (non-asymptotic) controls of the first kind error probability α_{ε,σ}(Ψ_α) and the maximal second kind error probability β_{ε,σ}(Θ_a(r_{ε,σ}), Ψ_α) of a specific test Ψ_α (that will be made precise in Section 3) by prescribed levels α, β ∈ (0, 1), respectively. Such controls allow us to derive both upper and lower bounds on the minimax separation radius r̄_{ε,σ}, as summarized in the following theorem.

Theorem 2.1 Let α, β ∈ (0, 1) be fixed, such that α ≤ β. Consider the goodness-of-fit testing problem (2.5).
Then, there exist explicit positive constants C(α, β) > 0, C_{α,β} > 0, c_{α,β} > 0 and σ_0 ∈ (0, 1) such that, for all 0 < σ ≤ σ_0 and for each ε > 0,

(i) r̄^2_{ε,σ} ≤ inf_{D∈N} C(α, β) [ ε^2 ( Σ_{j=1}^{D} b_j^{−4} )^{1/2} ln(2/α) ∨ σ^2 ln^{3/2}(1/σ) ∨ a_D^{−2} ],

and, for all ε, σ > 0,

(ii) r̄^2_{ε,σ} ≥ { (C_{α,β}/4) σ^2 max_{1 ≤ j ≤ M̄} b_j^{−2} a_j^{−2} } ∨ sup_{D∈N} { c_{α,β} ε^2 ( Σ_{j=1}^{D} b_j^{−4} )^{1/2} ∧ a_D^{−2} },

where the bandwidths D_0, M_1 and M̄ depend on both (b_j)_{j∈N} and σ. For the sake of brevity, these quantities are made precise in the subsequent sections. See, respectively, (3.12), (4.5), (4.11) for the constants C(α, β), C_{α,β}, c_{α,β}, and (3.3), (4.4) for the terms D_0, M_1, M̄.

Theorem 2.1 provides a precise description of the behavior of the minimax separation radius r̄_{ε,σ} in terms of the sequences (a_j)_{j∈N} and (b_j)_{j∈N} and of the noise levels ε and σ. It is worth pointing out that this control is non-asymptotic. There is indeed a technical constraint on the value of σ (0 < σ ≤ σ_0, σ_0 ∈ (0, 1)), but we do not assume its convergence towards 0, i.e., we work with fixed values of the noise levels ε and σ. We also note that the restriction α ≤ β on the levels is only necessary for the upper bound (item (i)); see Propositions 3.2 and 4.2.

Then, we apply the above result to specific problems. Namely, we consider various behaviors for both sequences (a_j)_{j∈N} and (b_j)_{j∈N}, and discuss the properties of the associated minimax separation radii r̄_{ε,σ}. Concerning the eigenvalues (b_j^2)_{j∈N} of the operator A*A, we will alternatively consider situations where b_j ≍ j^{−t} or b_j ≍ exp(−tj), j ∈ N, for some t > 0. The first case corresponds to so-called mildly ill-posed problems while the second one corresponds to severely ill-posed problems. Concerning the ellipsoids E_a, i.e., the sequence (a_j)_{j∈N}, two different kinds of smoothness will be investigated, namely, a_j ≍ j^{s} or a_j ≍ exp(sj), j ∈ N, for some s > 0, the so-called ordinary-smooth and super-smooth cases, respectively.

In the above scenarios, we apply Theorem 2.1 and describe the associated upper and lower bounds on the minimax separation radius r̄_{ε,σ}. They are, respectively, displayed in Table 2.1 and Table 2.2.

Goodness-of-Fit Testing Problem       | ordinary-smooth: a_j ≍ j^s                                  | super-smooth: a_j ≍ exp{sj}
mildly ill-posed: b_j ≍ j^{−t}        | ε^{4s/(2s+2t+1/2)} ∨ (σ ln^{3/4}(1/σ))^{2s/t}               | ε^2 (ln(1/ε))^{2t+1/2} ∨ σ^2 ln^{3/2}(1/σ)
severely ill-posed: b_j ≍ exp{−tj}    | (ln(1/ε))^{−2s} ∨ (ln(1/(σ ln^{1/2}(1/σ))))^{−2s}           | ε^{2s/(s+t)} ∨ (σ ln^{1/2}(1/σ))^{2s/t}

Table 2.1: Minimax goodness-of-fit testing with unknown singular values: upper bounds on the minimax separation radius r̄^2_{ε,σ} for 0 < ε ≤ ε_0, ε_0 ∈ (0, 1), and 0 < σ ≤ σ_0, σ_0 ∈ (0, 1), for all t, s > 0.
Goodness-of-Fit Testing Problem       | ordinary-smooth: a_j ≍ j^s                  | super-smooth: a_j ≍ exp{sj}
mildly ill-posed: b_j ≍ j^{−t}        | ε^{4s/(2s+2t+1/2)} ∨ σ^{2s/t}               | ε^2 (ln(1/ε))^{2t+1/2} ∨ σ^2
severely ill-posed: b_j ≍ exp{−tj}    | (ln(1/ε))^{−2s} ∨ (ln(1/σ))^{−2s}           | ε^{2s/(s+t)} ∨ σ^{2s/t}

Table 2.2: Minimax goodness-of-fit testing with unknown singular values: lower bounds on the minimax separation radius r̄^2_{ε,σ} for 0 < ε ≤ ε_0, ε_0 ∈ (0, 1), and 0 < σ ≤ σ_0, σ_0 ∈ (0, 1), for all t, s > 0.

Looking at these tables, the lower and upper bounds coincide in every considered case, up to a logarithmic term that depends on the noise level σ. Hence, Theorem 2.1 provides a sharp control on the minimax separation radius r̄_{ε,σ} in various settings. An interesting property of these minimax separation radii is that they have the same structure whatever the considered

situation: a maximum between two terms depending, respectively, on the noise levels ε and σ. It is also worth pointing out that the first term, depending on ε, corresponds to the minimax separation radius in the case where the operator is known (i.e., σ = 0), as displayed in Table 2.3.

Goodness-of-Fit Testing Problem       | ordinary-smooth: a_j ≍ j^s | super-smooth: a_j ≍ exp{sj}
mildly ill-posed: b_j ≍ j^{−t}        | ε^{4s/(2s+2t+1/2)}         | ε^2 (ln(1/ε))^{2t+1/2}
severely ill-posed: b_j ≍ exp{−tj}    | (ln(1/ε))^{−2s}            | ε^{2s/(s+t)}

Table 2.3: Minimax goodness-of-fit testing with known singular values: the separation rates r̄^2_ε for 0 < ε ≤ ε_0, ε_0 ∈ (0, 1), for all t, s > 0.

The results displayed in Theorem 2.1 and Tables 2.1, 2.2 can also be understood as follows. Two problems are at hand: detection of the underlying signal (with a minimax separation radius that only depends on ε) and detection of the frequencies for which the terms b_j can be replaced by the observations X_j without loss of precision (with a minimax separation radius that depends only on σ). The final minimax separation radius is then the maximum of these two terms, i.e., the signal detection hardness is related to the most difficult underlying problem. We stress that such a phenomenon has already been discussed in the minimax estimation framework; see, e.g., 9.

3 Upper Bound on the Minimax Separation Radius

In this section, we first propose an α-level testing procedure. Then, we investigate its maximal second kind error probability and establish a non-asymptotic upper bound on the minimax separation radius (which corresponds to item (i) of Theorem 2.1). Finally, in Section 3.3, we provide a control of the upper bounds for minimax separation radii for the specific cases displayed in Table 2.1.

3.1 The Spectral Cut-Off Test

For a given θ_0 ≠ 0, the aim of the goodness-of-fit testing problem (2.5) is to determine whether or not θ = θ_0. In particular, for any given j ∈ N, one would like to infer the corresponding value θ_j from the observation (Y, X) = (Y_j, X_j)_{j∈N} from the GSM (1.1).
Typically, for any given j ∈ N, one may use the naive estimate θ̂_j of θ_j, defined by

θ̂_j := Y_j / X_j = (b_j / X_j) θ_j + (ε / X_j) ξ_j,  j ∈ N.

In order to ensure a good approximation of θ_j by θ̂_j (in a sense which will be made precise later on), a precise control of the ratio b_j / X_j is required. To this end, we want to avoid coefficients for which X_j ≲ σ, namely, those for which the observation X_j is of the order of the corresponding noise level σ and hence does not have discriminatory power. Therefore, we will restrict ourselves to coefficients X_j with indices j ≤ M, where the bandwidth M is defined by

M := inf { j ∈ N : X_j ≤ σ h_j },        (3.1)

where, for all j ∈ N,

κ h_j^2 = 16 [ ln(j^2/α) + ln^{3/2}(j^2/α) ],        (3.2)

for some κ > exp(1). In some sense, the terms σ h_j can hence be considered as thresholds for the variables X_j. We stress that, unlike classical thresholding procedures, the terms h_j here depend on the level α and allow a control of the first kind error.

Remark 3.1 The value of κ is, in some sense, related to the value of the first kind error probability of the suggested testing procedure. We will see below that the value κ = (53π + 1)/16 is convenient for our purposes. We stress that κ is not a regularization parameter: an optimal value of κ only allows one to get optimal constants in the final results but will not change the order of the corresponding minimax separation rates. Finding optimal constants is outside the scope of this work.

The bandwidth M is a random variable but can be controlled in the sense that D_0 ≤ M ≤ M_1 with high probability (see Lemma 6.1 for precise computations and Figure 3.1 for a graphical illustration), where the bandwidths D_0 and M_1 are defined by

D_0 := inf { j ∈ N : b_j ≤ σ h_{0,j} },
M_1 := inf { j ∈ N : b_j ≤ σ h_{1,j} },        (3.3)

and the sequences h_0 = (h_{0,j})_{j∈N} and h_1 = (h_{1,j})_{j∈N} satisfy

κ h_{0,j}^2 = 28 [ ln(j^2/α) + ln^{3/2}(j^2/α) ],        (3.4)
κ h_{1,j}^2 = 6 [ ln(j^2/α) + ln^{3/2}(j^2/α) ],        (3.5)

for all j ∈ N. The sequences h = (h_j)_{j∈N}, h_0 and h_1 in the definitions of M, D_0 and M_1 allow a uniform control of the standard Gaussian sequence η = (η_j)_{j∈N} (associated with X = (X_j)_{j∈N}), for all j ≤ M (see Lemmas 6.1, 6.2 and 6.3 in Section 6).

We are now in the position to construct a (spectral cut-off) testing procedure. According to the methodology proposed earlier in the literature (see, e.g., 8), our test will be based on an estimation of ||θ − θ_0||^2. For any fixed D ∈ N, consider the test statistic

T_{D,M} := Σ_{j=1}^{D∧M} [ (Y_j / X_j) − θ_{0,j} ]^2.        (3.6)

Given a prescribed level α ∈ (0, 1) for the first kind error probability, the associated spectral cut-off test is then defined as

Ψ_{D,M} := 1_{ { T_{D,M} > t_α(D, X) } },        (3.7)

where

t_α(D, X) := ε^2 Σ_{j=1}^{D∧M} X_j^{−2} + C(α) ε^2 ( Σ_{j=1}^{D∧M} X_j^{−4} )^{1/2} x_{α/2} + 2 σ^2 ln^{3/2}(1/σ) a_D^{−2},        (3.8)

and

C(α) = 3 ( x_{α/2} + √(2 x_{α/2}) ) > 0,   x_γ := ln(1/γ),  γ ∈ (0, 1).        (3.9)
In other words, if the estimator T_{D,M} of ||θ − θ_0||^2 is greater than the random threshold t_α(D, X), then θ and θ_0 are very unlikely to be close to each other, and we will reject H_0.

Figure 3.1: An illustration of the spatial positions of the bandwidths D_0 and M_1, defined in (3.3). The decreasing solid curve corresponds to the values of the sequence b = (b_j)_{j∈N} with respect to the index j ∈ N, while the oscillating curve displays one realization of the random sequence X = (X_j)_{j∈N} according to the GSM (1.1). The increasing dashed curve draws the behavior of the sequence σ h_j. For the corresponding random bandwidth M defined in (3.1), Lemma 6.1 shows that D_0 ≤ M ≤ M_1 with high probability.

Remark 3.2 Under H_0, Y_j = b_j θ_{0,j} + ε ξ_j, j ∈ N, and, hence,

T_{D,M} = Σ_{j=1}^{D∧M} [ ((b_j / X_j) − 1) θ_{0,j} + (ε / X_j) ξ_j ]^2.

Therefore, the law of T_{D,M} is not available and, thus, its corresponding (1 − α)-quantile is not computable in practice, since the sequence b = (b_j)_{j∈N} is unknown. However, Proposition 3.1 below ensures that the threshold t_α(D, X) defined in (3.8) provides a computable upper bound on this quantile.

First, we focus on the first kind error probability. The following proposition states that the spectral cut-off test Ψ_{D,M}, defined in (3.7)-(3.8), is an α-level test.

Proposition 3.1 Let α ∈ (0, 1) be fixed. Consider the goodness-of-fit testing problem (2.5). Then, setting κ = (53π + 1)/16, there exists σ_0 ∈ (0, 1) such that, for all 0 < σ ≤ σ_0 and for each ε > 0, the spectral cut-off test Ψ_{D,M}, defined in (3.7)-(3.8), is an α-level test, i.e.,

α_{ε,σ}(Ψ_{D,M}) ≤ α.        (3.10)

The proof is postponed to Section 6.1.

Remark 3.3 In order to shed light on the term σ_0, we provide below a heuristic argument. Note that, under H_0, thanks to a (rough) Taylor expansion,

T_{D,M} ≈ Σ_{j=1}^{D∧M} [ ε b_j^{−1} ξ_j + σ b_j^{−1} θ_{0,j} η_j ]^2.

Compared to the noise-free case (i.e., σ = 0), we have in some sense to deal with the additional term σ^2 Σ_j b_j^{−2} θ_{0,j}^2 η_j^2. Two scenarios are at hand:

- If sup_{j∈N} (b_j a_j)^{−1} ≤ C_0, the expected amount of additional signal is σ^2 Σ_j b_j^{−2} θ_{0,j}^2 ≤ σ^2 C_0^2 Σ_j a_j^2 θ_{0,j}^2 ≤ σ^2 C_0^2, which is of the order of the classical parametric rate σ^2. However, since C_0 is unknown, we use a rough standard-deviation control on this additional term, which requires a logarithmic term (i.e., ln^{3/2}(1/σ)) in the right-hand side of (3.8). We stress that this logarithmic term can be removed if the knowledge of C_0 is assumed.

- If sup_{j∈N} (b_j a_j)^{−1} > C_0, we can prove that σ b_j^{−1} η_j (see Lemma 6.4) is bounded with controlled probability, according to the construction of the bandwidth M given in (3.1). In such a case, the additional term can be controlled by the bias a_D^{−2}.

Due to the additional logarithmic term mentioned above, the first kind error probability can be controlled as soon as σ is small enough (i.e., 0 < σ ≤ σ_0 for some σ_0 ∈ (0, 1)). Unsurprisingly, it is impossible to retrieve any kind of information from the observations if the noise level σ is too large.

3.2 A Non-Asymptotic Upper Bound

We now turn our attention to the maximal second kind error probability. The following proposition provides, for each noise level ε > 0 and for a noise level σ small enough, an upper bound on the separation radius r_{ε,σ}(E_a, Ψ_{D_0,M}, β) of the spectral cut-off test defined in (3.6)-(3.8).

Proposition 3.2 Let α, β ∈ (0, 1) be fixed, such that α ≤ β. Consider the goodness-of-fit testing problem (2.5). Let Ψ_{D_0,M} be the spectral cut-off test, defined in (3.7)-(3.8). Then, there exists σ_0 ∈ (0, 1) such that, for all 0 < σ ≤ σ_0 and for each ε > 0,

r^2_{ε,σ}(E_a, Ψ_{D_0,M}, β) ≤ C(α, β) [ ε^2 ( Σ_{j=1}^{D_0} b_j^{−4} )^{1/2} x_{α/2} ∨ σ^2 ln^{3/2}(1/σ) ∨ a_{D_0}^{−2} ],        (3.11)

where

C(α, β) = 6 C(α) + 3 x_{β/2}.        (3.12)

The proof is postponed to Section 6.2.

Note that the upper bound on the separation radius r^2_{ε,σ}(E_a, Ψ_{D_0,M}, β) given in (3.11) depends on antagonistic terms, namely, the variance term in ε^2 and the term σ^2 ln^{3/2}(1/σ) ∨ a_{D_0}^{−2}. Ideally, one would like to make this upper bound as small as possible, i.e., to obtain the weakest possible condition on ||θ − θ_0|| such that, for any fixed β ∈ (0, 1), β_{ε,σ}(Θ_a(r_{ε,σ}), Ψ_{D,M}) ≤ β.
Therefore, one would like to select D := D* such that

D* := arg min_{D∈N} C(α, β) [ ε^2 ( Σ_{j=1}^{D} b_j^{−4} )^{1/2} x_{α/2} ∨ σ^2 ln^{3/2}(1/σ) ∨ a_D^{−2} ],

where C(α, β) is defined in (3.12). However, this optimal bandwidth is not available in practice since the sequence b = (b_j)_{j∈N} is not assumed to be known. To this end, we use instead the bandwidth D := D̃, defined as

D̃ := arg min_{D∈N} C(α, β) [ ε^2 ( Σ_{j=1}^{D∧M} X_j^{−4} )^{1/2} x_{α/2} ∨ σ^2 ln^{3/2}(1/σ) ∨ a_{D∧M}^{−2} ].        (3.13)
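A data-driven bandwidth in the spirit of (3.13) can be sketched as follows; the criterion below, a variance proxy built from the observed X against the bias proxy a_D^{-2}, is an illustrative simplification of ours, not the exact expression of the text.

```python
import numpy as np

def select_bandwidth(X, a, eps, M):
    """Minimise over D <= M a criterion of the form max(variance proxy,
    bias proxy), built from the observed X instead of the unknown b."""
    best_D, best_val = 1, np.inf
    for D in range(1, M + 1):
        variance = eps**2 * np.sqrt(D) * np.max(1.0 / X[:D] ** 2)
        bias = 1.0 / a[D - 1] ** 2
        val = max(variance, bias)
        if val < best_val:
            best_D, best_val = D, val
    return best_D

j = np.arange(1, 101)
X, a = 1.0 / j, j.astype(float)
select_bandwidth(X, a, eps=0.0, M=100)   # returns 100: no variance, so take the largest D
```

As expected, the selected bandwidth shrinks as eps grows, reflecting the variance/bias trade-off behind D* and D-tilde.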

The following theorem illustrates the performance of the corresponding spectral cut-off test Ψ_{D̃,M}, defined in (3.7), with D := D̃ defined in (3.13).

Theorem 3.1 Let α, β ∈ (0, 1) be fixed, such that α ≤ β. Consider the goodness-of-fit testing problem (2.5). Let Ψ_{D̃,M} be the spectral cut-off test, defined in (3.7) with D := D̃ defined in (3.13). Then, there exists σ_0 ∈ (0, 1) such that, for all 0 < σ ≤ σ_0 and for each ε > 0,

α_{ε,σ}(Ψ_{D̃,M}) ≤ α        (3.14)

and

r^2_{ε,σ}(E_a, Ψ_{D̃,M}, β) ≤ inf_{D∈N} C(α, β) [ ε^2 ( Σ_{j=1}^{D} b_j^{−4} )^{1/2} x_{α/2} ∨ σ^2 ln^{3/2}(1/σ) ∨ a_D^{−2} ],        (3.15)

where the constant C(α, β) has been introduced in (3.12). The proof of Theorem 3.1 is postponed to Section 6.

Remark 3.4 According to Theorem 3.1, given a radius r_{ε,σ} > 0 such that

r^2_{ε,σ} ≥ inf_{D∈N} C(α, β) [ ε^2 ( Σ_{j=1}^{D} b_j^{−4} )^{1/2} x_{α/2} ∨ σ^2 ln^{3/2}(1/σ) ∨ a_D^{−2} ],

then β_{ε,σ}(Θ_a(r_{ε,σ}), Ψ_{D̃,M}) ≤ β, and, hence,

r̄^2_{ε,σ} ≤ inf_{D∈N} C(α, β) [ ε^2 ( Σ_{j=1}^{D} b_j^{−4} )^{1/2} x_{α/2} ∨ σ^2 ln^{3/2}(1/σ) ∨ a_D^{−2} ].        (3.16)

This upper bound corresponds to item (i) of Theorem 2.1.

Remark 3.5 Although we considered the spectral cut-off test (3.7), we stress that several alternative testing strategies could be investigated. Since our aim is to propose an accurate description of the minimax separation radii/rates associated to the goodness-of-fit problem (2.1), the proposed testing procedure should be considered as a tool that allows us to propose an upper bound. Nevertheless, any testing procedure designed for the goodness-of-fit testing problem (2.5) depends on the sequence (b_k)_{k∈N}. The uncertainty on the operator, however, requires a specific treatment and cannot be managed by an immediate application of classical procedures (see, e.g., 7).

3.3 Upper Bounds: Specific Cases

Our aim in this section is to determine an explicit value (in terms of the noise levels ε and σ) for the upper bounds on the minimax separation radius r̄_{ε,σ} obtained in Theorem 3.1 above. To this end, we will consider well-known specific cases regarding the behavior of both sequences (a_j)_{j∈N} and (b_j)_{j∈N}.
According to the existing literature, we will essentially deal with mildly and severely ill-posed problems with ellipsoids of ordinary-smooth and super-smooth functions (see also Section 2.2 for formal definitions).

Theorem 3.2 Consider the goodness-of-fit testing problem (2.5) when observations are given by (1.1), and the signal of interest has smoothness governed by (2.3).

(i) If b_j ≍ j^{−t}, t > 0, and a_j ≍ j^{s}, s > 0, for all j ∈ N, then there exist ε_0, σ_0 ∈ (0, 1) such that, for all 0 < ε ≤ ε_0 and 0 < σ ≤ σ_0, the minimax separation radius r̄_{ε,σ} satisfies

r̄^2_{ε,σ} ≲ ε^{4s/(2s+2t+1/2)} ∨ ( σ ln^{3/4}(1/σ) )^{2s/t}.

(ii) If b_j ≍ j^{−t}, t > 0, and a_j ≍ exp{sj}, s > 0, for all j ∈ N, then there exist ε_0, σ_0 ∈ (0, 1) such that, for all 0 < ε ≤ ε_0 and 0 < σ ≤ σ_0, the minimax separation radius r̄_{ε,σ} satisfies

r̄^2_{ε,σ} ≲ ε^2 ( ln(1/ε) )^{2t+1/2} ∨ σ^2 ln^{3/2}(1/σ).

(iii) If b_j ≍ exp{−tj}, t > 0, and a_j ≍ j^{s}, s > 0, for all j ∈ N, then there exist ε_0, σ_0 ∈ (0, 1) such that, for all 0 < ε ≤ ε_0 and 0 < σ ≤ σ_0, the minimax separation radius r̄_{ε,σ} satisfies

r̄^2_{ε,σ} ≲ ( ln(1/ε) )^{−2s} ∨ ( ln( 1/(σ ln^{1/2}(1/σ)) ) )^{−2s}.

(iv) If b_j ≍ exp{−tj}, t > 0, and a_j ≍ exp{sj}, s > 0, for all j ∈ N, then there exist ε_0, σ_0 ∈ (0, 1) such that, for all 0 < ε ≤ ε_0 and 0 < σ ≤ σ_0, the minimax separation radius r̄_{ε,σ} satisfies

r̄^2_{ε,σ} ≲ ε^{2s/(s+t)} ∨ ( σ ln^{1/2}(1/σ) )^{2s/t}.

The minimax rates displayed above are summarized in Table 2.1. The proof is postponed to Section 6.3. The main task is to compute the asymptotic trade-off between the antagonistic terms in ε and in σ in the upper bound on the minimax separation radius r̄_{ε,σ} displayed in (3.16).

4 Lower Bounds on the Minimax Separation Radius

We establish a non-asymptotic lower bound on the minimax separation radius (which corresponds to item (ii) of Theorem 2.1). In order to do this, we consider two special cases of the GSM (1.1), namely the situations where (a) ε = 0: the signal is observed without noise but the eigenvalues of the operator at hand are still noisy, and (b) σ = 0: the classical model (see, e.g., 7 or 0) where the eigenvalues of the operator at hand are known. Both models (a) and (b) correspond to extreme situations but provide, in some sense, a benchmark for the problem at hand. We first establish a lower bound for case (a) in Section 4.1, and recall the lower bound for case (b), which has already been discussed in the literature (e.g., 0), in Section 4.2.
Then, we establish in Section 4.3 that the minimax separation radius associated to the goodness-of-fit testing problem (2.5) is always greater than the maximum of the minimax separation radii associated to the cases (a) and (b). Finally, in Section 4.4, we provide a control of the lower bounds for minimax separation radii for the specific cases displayed in Table 2.2.

4.1 Lower Bounds for a GSM with ε = 0

We consider the GSM (1.1) with b̃ = b and ε = 0, i.e.,

Y_j = b̃_j θ_j, j ∈ N,    X_j = b̃_j + σ η_j, j ∈ N.  (4.1)

For a given sequence b = (b_j)_{j∈N}, define

B(b) = { ν ∈ l^2(N) : C_0 b_j ≤ ν_j ≤ C_1 b_j, j ∈ N, 0 < C_0 ≤ C_1 < +∞ }.

Given observations from the GSM (4.1), for any given θ_0 ∈ E_a \ {0} and b̃ ∈ B(b), we consider the following goodness-of-fit testing problem

H_0 : θ = θ_0 versus H_1 : θ − θ_0 ∈ Θ_a(r_σ), b̃ ∈ B(b),  (4.2)

where Θ_a(r_σ) = { µ ∈ E_a : ‖µ‖ ≥ r_σ }.

Our aim below is to provide a lower bound on the minimax separation radius r_{0,σ}, defined as

r_{0,σ} := inf_{Ψ_α : α_{0,σ}(Ψ_α) ≤ α} r_{0,σ}(E_a, Ψ_α, β),

where r_{0,σ}(E_a, Ψ_α, β) is the separation radius of any given α-level test Ψ_α, defined as

r_{0,σ}(E_a, Ψ_α, β) := inf { r_σ > 0 : β_{0,σ,b}(Θ_a(r_σ), B(b), Ψ_α) ≤ β },

and β_{0,σ,b}(Θ_a(r_σ), B(b), Ψ_α) is the associated maximal second kind error probability, defined as

β_{0,σ,b}(Θ_a(r_σ), B(b), Ψ_α) := sup_{θ : θ−θ_0 ∈ Θ_a(r_σ), b̃ ∈ B(b)} P_{θ,b̃}(Ψ_α = 0).

The following proposition states a lower bound for the minimax separation radius r_{0,σ} of the goodness-of-fit testing problem (4.2).

Proposition 4.1 Assume that (Y, X) = (Y_j, X_j)_{j∈N} are observations from the GSM (4.1) and consider the goodness-of-fit testing problem (4.2). Let α ∈ (0,1) and β ∈ (0, 1−α) be given. Then, for every σ > 0, the minimax separation radius r_{0,σ} is lower bounded by

r^2_{0,σ} ≥ C^2_{α,β} σ^2 max_{1≤j≤M̃} b_j^{-2} a_j^{-2},  (4.3)

where

M̃ := sup { j ∈ N : C_{α,β} σ ≤ b_j and G_j(C_0, C_1) ≥ 1 − 2(α + β) }  (4.4)

with

C_{α,β} = [ ln(1 + 2(1−α−β)^2) ]^{1/2} > 0 and G_j(C_0, C_1) = (1/(σ√(2π))) ∫_{C_0 b_j}^{C_1 b_j} exp{ −(t − b_j)^2/(2σ^2) } dt,  (4.5)

for some constants 0 < C_0 ≤ C_1 < +∞. The proof is postponed to Section 6.4.
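The cutoff M̃ in (4.4) admits a simple closed form in the mildly ill-posed case b_j = j^{-t}. The sketch below treats the constants entering (4.4) as a single generic constant (a simplifying assumption for illustration, not the calibrated choice of the proposition):

```python
import math

def m_tilde(sigma, t, const=1.0, j_max=10**7):
    """Largest index j at which the singular value b_j = j^{-t} still
    exceeds const * sigma, i.e. j <= (const * sigma)^{-1/t}.
    const stands in for the constants of (4.4) (illustrative only)."""
    j = int((const * sigma) ** (-1.0 / t))
    return min(j, j_max)

# Halving the operator-noise level enlarges the set of usable frequencies.
m1 = m_tilde(sigma=1e-4, t=2.0)
m2 = m_tilde(sigma=5e-5, t=2.0)
```

As the remark that follows points out, no logarithmic term enters this cutoff, in contrast with the quantities of (3.3).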

Remark 4.1 Note that

G_j(C_0, C_1) = Φ((C_1 − 1) b_j/σ) − Φ((C_0 − 1) b_j/σ),

where Φ(·) is the cumulative distribution function of the standard Gaussian distribution. Hence, G_j(C_0, C_1) ≥ 1 − 2(α + β) as soon as b_j ≥ σ K̃, where K̃ := K̃(C_0, C_1, α, β) > 0. Then M̃ in (4.4) can be re-expressed as

M̃ := sup { j ∈ N : b_j ≥ σ (K̃ ∨ C_{α,β}) }.  (4.6)

This expression for M̃ in (4.6) can be compared to the respective expressions of D and M defined in (3.3). In particular, we point out that there is no logarithmic term involved in M̃.

4.2 Lower Bounds for the GSM when σ = 0

We consider the GSM (1.1) with σ = 0, i.e.,

Y_j = b_j θ_j + ε ξ_j, j ∈ N,    X_j = b_j, j ∈ N.  (4.7)

Note that, in this case, the above model can be re-expressed as

Y_j = b_j θ_j + ε ξ_j, j ∈ N,  (4.8)

where b = (b_j)_{j∈N} is a known positive sequence.

The following proposition states a lower bound for the minimax separation radius r_{ε,0}, defined as in Section 2 with σ = 0, of the following goodness-of-fit testing problem

H_0 : θ = θ_0 versus H_1 : θ − θ_0 ∈ Θ_a(r_{ε,0}),  (4.9)

where θ_0 ∈ E_a \ {0} and Θ_a(r_{ε,0}) is defined in (2.4) with σ = 0.

Proposition 4.2 Assume that Y = (Y_j)_{j∈N} are observations from the GSM (4.8) and consider the goodness-of-fit testing problem (4.9). Let α ∈ (0,1) and β ∈ (0, 1−α) be given. Then, for every ε > 0, the minimax separation radius r_{ε,0} is lower bounded by

r^2_{ε,0} ≥ sup_{D∈N} min{ c^2_{α,β} ε^2 (Σ_{j=1}^{D} b_j^{-4})^{1/2}, a_D^{-2} },  (4.10)

where

c_{α,β} = [ ln(1 + 4(1−α−β)^2) ]^{1/4} > 0.  (4.11)

The proof of Proposition 4.2, with detailed arguments and related discussion, can be found in the literature cited above.

4.3 A Combined Lower Bound

The following result provides a lower bound on the minimax separation radius r_{ε,σ} for the goodness-of-fit testing problem (2.5). This lower bound corresponds to item ii) of Theorem 2.1.

Theorem 4.1 Consider the GSMs (1.1), (4.1) and (4.7). Denote by r_{ε,σ}, r_{0,σ} and r_{ε,0} the corresponding minimax separation radii. Then, for every ε > 0 and σ > 0,

r_{ε,σ} ≥ r_{0,σ} ∨ r_{ε,0}.  (4.12)

In particular,

r^2_{ε,σ} ≥ max{ (C^2_{α,β}/4) σ^2 max_{1≤j≤M̃} b_j^{-2} a_j^{-2}, sup_{D∈N} min( c^2_{α,β} ε^2 (Σ_{j=1}^{D} b_j^{-4})^{1/2}, a_D^{-2} ) },  (4.13)

where C_{α,β} is given in (4.5), M̃ is given in (4.4) and c_{α,β} is given in (4.11). The proof of Theorem 4.1 is postponed to Section 6.4.

Remark 4.2 At first sight, the upper and lower bounds respectively displayed in items i) and ii) of Theorem 2.1 do not exactly match. However, a closer look at the involved formulas indicates that both quantities contain terms that have similar behaviors. This is, in some sense, confirmed in Section 4.4 below, where specific sequences (a_j)_{j∈N} and (b_j)_{j∈N} are treated.

4.4 Lower Bounds: Specific Cases

Our aim in this section is to determine an explicit value (in terms of the noise levels ε and σ) for the lower bounds on the minimax separation radius r_{ε,σ} obtained in Theorem 4.1 above, for the specific sequences (a_j)_{j∈N} and (b_j)_{j∈N} considered in Section 3.3.

Theorem 4.2 Consider the goodness-of-fit testing problem (2.5) when observations are given by (1.1) and the signal of interest has smoothness governed by (2.3).

i) If b_j ∼ j^{-t}, t > 0, and a_j ∼ j^{s}, s > 0, for all j ∈ N, then there exist ε_0, σ_0 ∈ (0,1) such that, for all 0 < ε ≤ ε_0 and 0 < σ ≤ σ_0, the minimax separation radius r_{ε,σ} satisfies

r^2_{ε,σ} ≳ ε^{4s/(2s+2t+1/2)} + σ^{2s/t}.

ii) If b_j ∼ j^{-t}, t > 0, and a_j ∼ exp{sj}, s > 0, for all j ∈ N, then there exist ε_0, σ_0 ∈ (0,1) such that, for all 0 < ε ≤ ε_0 and 0 < σ ≤ σ_0, the minimax separation radius r_{ε,σ} satisfies

r^2_{ε,σ} ≳ ε^2 (ln(1/ε))^{2t+1/2} + σ^2.

iii) If b_j ∼ exp{-jt}, t > 0, and a_j ∼ j^{s}, s > 0, for all j ∈ N, then there exist ε_0, σ_0 ∈ (0,1) such that, for all 0 < ε ≤ ε_0 and 0 < σ ≤ σ_0, the minimax separation radius r_{ε,σ} satisfies

r^2_{ε,σ} ≳ (ln(1/ε))^{-2s} + (ln(1/σ))^{-2s}.
iv) If b_j ∼ exp{-jt}, t > 0, and a_j ∼ exp{sj}, s > 0, for all j ∈ N, then there exist ε_0, σ_0 ∈ (0,1) such that, for all 0 < ε ≤ ε_0 and 0 < σ ≤ σ_0, the minimax separation radius r_{ε,σ} satisfies

r^2_{ε,σ} ≳ ε^{2s/(s+t)} + σ^{2s/t}.

The minimax rates displayed above are summarized in Table 1. The proof is postponed to Section 6.5. As in the case of the upper bounds, the main task is to compute the trade-off between the two antagonistic terms involved in the lower bound on the minimax separation radius r_{ε,σ} displayed in (4.13).
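Since the table of rates referred to above is not reproduced in this transcription, the four regimes can be collected side by side (a reconstructed summary of the upper- and lower-bound theorems above, not the original table; the logarithmic factors in σ are the only gaps between the two columns):

```latex
% Separation rates r^2_{eps,sigma}: upper vs. lower bounds, four regimes.
\begin{tabular}{llll}
\hline
$b_j$ & $a_j$ & upper bound & lower bound \\
\hline
$j^{-t}$ & $j^{s}$ &
  $\varepsilon^{\frac{4s}{2s+2t+1/2}} + (\sigma\ln^{3/4}(1/\sigma))^{2s/t}$ &
  $\varepsilon^{\frac{4s}{2s+2t+1/2}} + \sigma^{2s/t}$ \\
$j^{-t}$ & $e^{sj}$ &
  $\varepsilon^{2}(\ln(1/\varepsilon))^{2t+1/2} + \sigma^{2}\ln^{3/2}(1/\sigma)$ &
  $\varepsilon^{2}(\ln(1/\varepsilon))^{2t+1/2} + \sigma^{2}$ \\
$e^{-jt}$ & $j^{s}$ &
  $(\ln(1/\varepsilon))^{-2s} + \bigl(\ln\bigl(\sigma^{-1}\ln^{-1/2}(1/\sigma)\bigr)\bigr)^{-2s}$ &
  $(\ln(1/\varepsilon))^{-2s} + (\ln(1/\sigma))^{-2s}$ \\
$e^{-jt}$ & $e^{sj}$ &
  $\varepsilon^{\frac{2s}{s+t}} + (\sigma\ln^{1/2}(1/\sigma))^{2s/t}$ &
  $\varepsilon^{\frac{2s}{s+t}} + \sigma^{2s/t}$ \\
\hline
\end{tabular}
```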

5 Concluding Remarks

5.1 Theoretical Outcomes

The main conclusion of this work is that goodness-of-fit testing in an inverse problem setting is feasible, even in the specific situation where some uncertainty is observed on the operator at hand in the GSM (1.1). We have established optimal separation conditions for the goodness-of-fit testing problem (2.5) via a sharp control of the associated minimax separation radius. We stress that several outcomes and open questions are still of interest. We can mention, among others:

Adaptivity: As proved in Theorem 3.1, the test Ψ_{D,M̂} introduced in (3.6)-(3.9), with D defined in (3.3), is powerful in the sense that its separation radius is equal (up to a constant) to the minimax one. However, this test strongly depends on the sequence a = (a_j)_{j∈N} that characterizes the smoothness of the signal of interest. In practice, this sequence is unknown and adaptive procedures are necessary. A classical way to proceed with this adaptation step is to build an (α/#D)-level test Ψ_{α/#D, D} for each candidate D in a given set D. Then, one defines the adaptive (Bonferroni-type) test Ψ̄_α as

Ψ̄_α = max_{D ∈ D} Ψ_{α/#D, D},

where #D denotes the cardinality of the set D. It can be proved that, provided the set D is well chosen, the test Ψ̄_α is of level α and attains the minimax rate, up to an additional logarithmic term. We refer to, e.g., 8 or 7 for more details on adaptation in the GSM (1.1) (with σ = 0).

Non-Gaussian models: In the GSM (1.1), the observations are assumed to be Gaussian. However, it may sometimes be of practical (or mathematical) interest to consider alternative distributions. The upper bounds obtained earlier are established using a sharp control of the deviations of sums of independent squared Gaussian random variables; see Lemma 6.4. The main ingredients for the proof of this lemma are refined computations and a control of the Laplace transform of such variables (see, e.g., 3, 0). We conjecture that such a lemma could be generalized to alternative distributions.
On the other hand, the Gaussian structure of the measurements is at the heart of the lower bounds proposed above. Proposing lower bounds under generic conditions on the noise distribution appears to be more challenging.

Signal detection: We have already mentioned in Remark 2.1 that signal detection is different from goodness-of-fit testing (2.5) when the GSM (1.1) is at hand. In this work, we were concerned with the case where θ_0 ≠ 0 (goodness-of-fit testing). However, some attention should also be paid in the future to the case where θ_0 = 0 (signal detection). In particular, the corresponding testing methodologies and related minimax separation radii are quite different from those presented above.

All these topics require special attention that is beyond the scope of this paper. Nevertheless, they provide an avenue for future research.

5.2 Applied Outcomes

As discussed above, minimax testing in an ill-posed inverse problem context has been at the core of several theoretical investigations over the last decade. In this section, we present a short selection of applied examples that involve goodness-of-fit approaches in an inverse problem setting and that motivated our investigation.
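Before turning to the applied examples, the Bonferroni-type aggregation described in the adaptivity paragraph of Section 5.1 can be sketched as follows; the individual statistic and threshold are illustrative stand-ins, not the calibrated test of Section 3:

```python
import math, random

random.seed(0)

def chi2_type_test(y, eps, D, level):
    """Reject (return 1) when the first D squared coefficients exceed a
    rough Gaussian-quantile-style threshold (illustrative only)."""
    stat = sum(v * v for v in y[:D]) - D * eps * eps
    thresh = eps * eps * math.sqrt(2 * D * math.log(1 / level))
    return 1 if stat > thresh else 0

def bonferroni_test(y, eps, grid, alpha):
    """Each candidate dimension D gets level alpha/#grid; the aggregated
    test rejects as soon as one individual test rejects."""
    level = alpha / len(grid)
    return max(chi2_type_test(y, eps, D, level) for D in grid)

# Pure-noise observations: the aggregated test should rarely reject.
eps = 0.1
y_null = [eps * random.gauss(0, 1) for _ in range(256)]
decision = bonferroni_test(y_null, eps, grid=[2, 4, 8, 16, 32, 64], alpha=0.05)
```

The division of the level by the cardinality of the grid is exactly what produces the additional logarithmic term in the adaptive separation rate.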

Blind Deconvolution for Ophthalmoscopy. In 4, the authors considered images I following the model

I = O ∗ PSF + N,

where O denotes the object of interest, PSF (not exactly known in practice) the point spread function (which models deformations of the object of interest), and N represents stochastic noise. In practice, the signal-to-noise ratio is not very high: one has to find a solution in order to improve the quality of the data. A possible way is to gather several images pointing at the same object and to average them. Nevertheless, one has to be sure that a given image can be added to the existing sample. Such a situation can be modeled as follows: suppose one has at his/her disposal N images, say I_1, I_2, ..., I_N, which correspond to observations of the same object f_0. Writing these observations in the frequency domain,

I_{j,k} = b_k θ_{k,0} + ε ξ_{j,k}, j ∈ {1, 2, ..., N}, k ∈ N.

Summing up over j ∈ {1, 2, ..., N}, we obtain the average image, say Ī, satisfying in the frequency domain

Ī_k = b_k θ_{k,0} + ε ξ̄_k with ξ̄_k = (1/N) Σ_{j=1}^{N} ξ_{j,k}, k ∈ N.

Then, one might want to compare a new observed image, say Ĩ, to the average image Ī, in order to decide whether or not it can be included in the sample. In particular, the idea is to determine whether a non-expected movement has occurred during the acquisition of the data. In other words, one wants to determine whether or not θ̃ = θ_0. Provided N is large enough, one can (in an idealized framework) assume that the noise ε ξ̄_k has no real influence on the considered task, so that it can be taken equal to 0, in which case we recover the GSM (1.1). A similar discussion is provided in 4 in a biophotonic imaging setting.

The Instrumental Variable (IV) Regression Model. This setting deals with a sample (Y_j, X_j, W_j)_{j=1,...,n} satisfying

Y_j = T g(W_j) + U_j, j ∈ {1, 2, ..., n},  (5.1)

where g is an unknown function of interest, the U_j are independent and identically distributed random variables such that E(U_j | W_j) = 0, and T is an unknown operator that can be estimated using the couples (X_j, W_j).
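Returning to the averaging step in the ophthalmoscopy example: its noise-reduction effect can be checked numerically. Averaging N independent noisy copies shrinks the noise standard deviation by a factor √N (a toy simulation on a single frequency-domain coefficient; all names are illustrative):

```python
import math, random

random.seed(1)
N, eps, trials = 100, 1.0, 2000

# Each trial: average N independent noise draws on one coefficient.
avg_noise = [sum(random.gauss(0, eps) for _ in range(N)) / N
             for _ in range(trials)]

# Empirical standard deviation of the averaged noise; should be close
# to eps / sqrt(N) = 0.1.
sd = math.sqrt(sum(x * x for x in avg_noise) / trials)
```

This is the sense in which, for N large, the residual noise ε ξ̄_k on the average image becomes negligible compared with the uncertainty on the operator.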
Econometricians have been interested in testing specific values of the unknown function g. For instance, 5 considered a parametric model. The generalization of our investigations to this instrumental regression setting may allow one to handle a completely non-parametric framework.

ECG Data Analysis. In Bigot (2013), shifted observations are at hand. More precisely, one observes N one-dimensional signals that can be modeled as

Y_j = f(· − τ_j) + ε ξ_j, j ∈ {1, 2, ..., N},

where the τ_j denote a random sample obtained from a random variable τ having density g. Writing the above observations in the frequency domain and summing up over j ∈ {1, ..., N}, one gets

Ȳ_k = θ_k (1/N) Σ_{j=1}^{N} e^{−2iπkτ_j} + (ε/√N) ζ_k, k ∈ N.

The law of large numbers guarantees that (1/N) Σ_{j=1}^{N} e^{−2iπkτ_j} converges to the k-th Fourier coefficient of g. Provided that g is known, we can then recover θ.
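The law-of-large-numbers step above can be illustrated directly: for shifts drawn uniformly on [0,1], the empirical mean of e^{−2iπkτ_j} converges to the k-th Fourier coefficient of the uniform density, which is 0 for k ≠ 0 and 1 for k = 0:

```python
import cmath, random

random.seed(2)
N, k = 20000, 3
taus = [random.random() for _ in range(N)]

# Empirical mean of exp(-2*i*pi*k*tau_j) for k = 3 and for k = 0.
emp = sum(cmath.exp(-2j * cmath.pi * k * tau) for tau in taus) / N
emp0 = sum(cmath.exp(-2j * cmath.pi * 0 * tau) for tau in taus) / N
```

For a non-uniform, known density g one would instead recover its (generally non-zero) Fourier coefficients, which is what allows θ to be deconvolved.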

19 Now, assume that these N observed signals correspond to normal ECG measure for a patient in a good health, or in a particular period. Then, imagine that given another hearth cycle measurement, one would like to detect a possible arrhythmic cycle. In this case, we get a new signal Y = f. τ ) + ɛξ, which in the frequency domain is written as Y k = θ k e iπkτ + ɛz k, k N. In such case, one might want to test whether or not θ = θ. Note, in particular, that the operator at hand its eigenvalues) is random. Errors-in-variables model. ensity model with measurement errors have been at the core of several statistical studies in the past decades see, e.g., 3 for an overview). Formally, given a sample of independent and identical distributed random variables Y ),,...,n satisfying Y = X + ɛ =,,..., n, the aim is to produce some inference on the unknown density of the X i, denoted by f, the ɛ i corresponding to some error, with known density η. This appears to be an inverse deconvolution) problem since the Y i are associated to the convolved density f η. In a goodness-of-fit testing task, this model has been discussed in 5 and minimax separation rates in the asymptotic minimax testing framework) have been established in various settings. In the spirit of our contribution, it could be interesting to propose goodness-of-fit testing methods taking into account some possible uncertainty on the density η at hand. The above examples are somehow considered in slightly different settings compared to the GSM.) studied in this paper. However, the related questions might provide an avenue for future research. In any case, determining the separation rates in these or similar settings will require at some step a careful investigation of the GSM.). 6 Appendix: Proofs 6. Useful Lemmas The constant C > 0 and 0 < τ < below will vary from place to place. The following lemma is inspired by Lemma 6. of 9. Lemma 6. Let M,, M be defined as in 3.)-3.5) where α 0, and κ exp) are fixed values. 
Define the event

M = { M_0 ≤ M̂ < M_1 }.  (6.1)

Then, for any σ ∈ (0,1),

P(M^c) ≤ α/10 + α π^2/(6κ).  (6.2)

Proof of Lemma 6.1. It is easily seen that

P(M̂ ≥ M_1) = P( ∩_{j=1}^{M_1} { X_j > σ h_j } ) ≤ P( X_{M_1} > σ h_{M_1} ) = P( b_{M_1} + σ η_{M_1} > σ h_{M_1} ) ≤ P( η_{M_1} > h_{M_1} − h_{1,M_1} ) = P( η_{M_1} > √(2 ln(10/α)) ),

where the sequences (h_j)_{j∈N} and (h_{1,j})_{j∈N} are defined in (3.2) and (3.5), respectively. Using the bound

(1/√(2π)) ∫_x^{+∞} e^{−s^2/2} ds ≤ (1/√(2π)) e^{−x^2/2}/x, x > 0,  (6.3)

we get

P(M̂ ≥ M_1) ≤ α/(20 √(π ln(10/α))) ≤ α/10,  (6.4)

since ln(10/α) > 1 for all α ∈ (0,1). In the same spirit,

P(M̂ < M_0) = P( ∪_{j=1}^{M_0} { X_j ≤ σ h_j } ) ≤ Σ_{j=1}^{M_0} P( X_j ≤ σ h_j ) ≤ Σ_{j=1}^{M_0} P( b_j + σ η_j ≤ σ h_j ) ≤ Σ_{j=1}^{M_0} P( η_j ≥ h_{0,j} − h_j ).

According to the respective definitions of (h_j)_{j∈N} and (h_{0,j})_{j∈N} (see (3.2) and (3.4)), and using again inequality (6.3), we obtain

P(M̂ < M_0) ≤ (1/√(2π)) Σ_{j∈N} exp{ −(h_{0,j} − h_j)^2/2 } / (h_{0,j} − h_j) ≤ Σ_{j∈N} α/(2κ j^2) ≤ α π^2/(6κ),  (6.5)

on noting that h_{0,j} − h_j ≥ √(2 ln(2κ j^2/α)) and Σ_{j∈N} j^{-2} = π^2/6. Since

P(M^c) ≤ P(M̂ < M_0) + P(M̂ ≥ M_1),

the lemma follows, thanks to (6.4) and (6.5).

Lemma 6.2 Let M̂ be defined as in (3.1) and (3.4), where α ∈ (0,1) and κ ≥ exp(1) are fixed values. Define the event

B = ∩_{j=1}^{M̂} { σ |η_j| ≤ b_j/2 }.  (6.6)

Then, for any σ ∈ (0,1),

P(B^c) ≤ α/10 + α π^2/(3κ).  (6.7)
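The standard Gaussian tail bound (6.3), used repeatedly in these proofs, can be checked numerically via the error function:

```python
import math

def gauss_tail(x):
    """Exact Gaussian tail P(eta > x) via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def mills_bound(x):
    """Right-hand side of (6.3): exp(-x^2/2) / (x * sqrt(2*pi))."""
    return math.exp(-x * x / 2) / (x * math.sqrt(2 * math.pi))

xs = [0.5, 1.0, 2.0, 4.0]
ok = all(gauss_tail(x) <= mills_bound(x) for x in xs)
```

The bound is valid for every x > 0 and is sharp up to a factor tending to 1 as x grows, which is exactly what makes it suitable for the union-bound arguments above.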

Proof of Lemma 6.2. Using the definitions of M̂ and M, simple calculations give

P(B^c) = P(B^c ∩ M) + P(B^c ∩ M^c) ≤ P( ∪_{j=1}^{M_1} { σ |η_j| > b_j/2 } ) + P(M^c) ≤ Σ_{j=1}^{M_1} P( |η_j| > h_{1,j}/2 ) + P(M^c).

Using (3.5), Lemma 6.1 and (6.3), we obtain

P(B^c) ≤ Σ_{j∈N} (4/(h_{1,j} √(2π))) exp{ −h^2_{1,j}/8 } + α/10 + α π^2/(6κ) ≤ α π^2/(6κ) + α/10 + α π^2/(6κ) = α/10 + α π^2/(3κ).  (6.8)

Hence, the lemma holds true.

Lemma 6.3 Let θ_0 ∈ E_a be given, and let M̂ be defined as in (3.1) and (3.4), where α ∈ (0,1) and κ ≥ exp(1) are fixed values. Then, for any σ ∈ (0,1) and for any D ∈ N,

P( Σ_{j=1}^{D∧M̂} (1 − b_j/X_j)^2 θ^2_{0,j} ≥ σ^2 ln^{3/2}(1/σ) ) ≤ α/5 + α π^2/(2κ) + C exp{ −ln^{1+τ}(1/σ) },

for some C > 0 and 0 < τ < 1.

Proof of Lemma 6.3. Using Lemma 6.1, Lemma 6.2 and a Taylor expansion as in Lemma 6.6 of 9, we get, for all j ≤ M̂,

1 − b_j/X_j = 1 − b_j/(b_j + σ η_j) = σ (η_j/b_j) − σ^2 (η^2_j/b^2_j) ζ_j, with |ζ_j| ≤ 2

on the event B defined in (6.6). Hence

P( Σ_j (1 − b_j/X_j)^2 θ^2_{0,j} ≥ σ^2 ln^{3/2}(1/σ) ) ≤ P( 2σ^2 Σ_j b_j^{-2} θ^2_{0,j} η^2_j ≥ (σ^2/2) ln^{3/2}(1/σ) ) + P( 2σ^4 Σ_j ζ^2_j b_j^{-4} θ^2_{0,j} η^4_j ≥ (σ^2/2) ln^{3/2}(1/σ) ).

Therefore

P( Σ_j (1 − b_j/X_j)^2 θ^2_{0,j} ≥ σ^2 ln^{3/2}(1/σ) ) ≤ P( {2σ^2 Σ_j b_j^{-2} θ^2_{0,j} η^2_j ≥ (σ^2/2) ln^{3/2}(1/σ)} ∩ (B ∩ M) ) + P( {2σ^4 Σ_j ζ^2_j b_j^{-4} θ^2_{0,j} η^4_j ≥ (σ^2/2) ln^{3/2}(1/σ)} ∩ (B ∩ M) ) + P( (B ∩ M)^c ) := T_1 + T_2 + P( (B ∩ M)^c ).  (6.9)

We concentrate below our attention on the term T_1, defined as

T_1 := P( {2σ^2 Σ_j b_j^{-2} θ^2_{0,j} η^2_j ≥ (σ^2/2) ln^{3/2}(1/σ)} ∩ (B ∩ M) ).

We consider the two following possible scenarios: i) a_j b_j^{-1} ≤ C_0 as j → +∞, for some C_0 > 0, and ii) a_j b_j^{-1} → +∞ as j → +∞.

Consider first scenario i). Then, using again (6.3),

T_1 ≤ P( {C^2_0 σ^2 max_{1≤j≤M̂} η^2_j ≥ (σ^2/4) ln^{3/2}(1/σ)} ∩ (B ∩ M) ) ≤ Σ_{j=1}^{M_1} P( |η_j| ≥ ln^{3/4}(1/σ)/(2C_0) ) ≤ (C M_1 C_0 / ln^{3/4}(1/σ)) exp{ −ln^{3/2}(1/σ)/(8C^2_0) } ≤ C exp{ −ln^{1+τ}(1/σ) },  (6.10)

for some constants C, τ ∈ R_+. A similar bound occurs for the term T_2 in this scenario.

Consider now the second scenario ii). Then

T_1 ≤ P( {σ^2 max_{1≤j≤M̂} (b_j^{-2} a_j^{-2} η^2_j) ≥ (1/4) ln^{3/2}(1/σ)} ∩ (B ∩ M) ) ≤ Σ_{j=1}^{M_1} P( σ^2 η^2_j ≥ b^2_{M_1}/4 ),

since the sequence (b_j)_{j∈N} is non-increasing. Using (6.3), we get

T_1 ≤ M_1 P( |η_{M_1}| ≥ h_{1,M_1}/2 ) ≤ (4 M_1/(h_{1,M_1} √(2π))) exp{ −h^2_{1,M_1}/8 } ≤ M_1 · α/(2κ M^2_1) ≤ α/κ,  (6.11)

on noting that h_{1,M_1} ≥ √(8 ln(2κ M^2_1/α)). By similar computations, we get

T_2 := P( {2σ^4 Σ_j ζ^2_j b_j^{-4} θ^2_{0,j} η^4_j ≥ (σ^2/2) ln^{3/2}(1/σ)} ∩ (B ∩ M) ) ≤ P( {8σ^4 max_{1≤j≤M̂} (η^4_j/b^4_j) θ^2_{0,j} a^2_j ≥ (σ^2/2) ln^{3/2}(1/σ)} ∩ (B ∩ M) ) ≤ Σ_{j=1}^{M_1} P( |η_j| ≥ h_{1,j}/2 ) ≤ α/κ.  (6.12)

Hence, the lemma follows from Lemmas 6.1, 6.2 and (6.9)-(6.12).

Lemma 6.4 Let

Z_j = ν_j + v_j ω_j, j ∈ N,

where ω = (ω_j)_{j∈N} is a sequence of independent standard Gaussian random variables, and (ν_j)_{j∈N}, (v_j)_{j∈N} denote two given real sequences. For all D ∈ N, define

T_D = Σ_{j=1}^{D} Z^2_j and Σ_D = Σ_{j=1}^{D} v^4_j + 2 Σ_{j=1}^{D} v^2_j ν^2_j.

Then, for all x > 0,

P( T_D − E(T_D) > 2 √(Σ_D x) + 2 x sup_{1≤j≤D} v^2_j ) ≤ exp(−x),  (6.13)

P( T_D − E(T_D) < −2 √(Σ_D x) ) ≤ exp(−x).  (6.14)

Proof of Lemma 6.4. The proof is given in Lemma 1 of 0.
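The deviation inequality (6.13) can be sanity-checked by Monte Carlo, with the normalization Σ_D = Σ v_j^4 + 2 Σ v_j^2 ν_j^2 and the Laurent-Massart-type constants reconstructed above:

```python
import math, random

random.seed(4)
D, x = 50, 1.0
nu = [0.3] * D
v = [0.5] * D

ET = sum(n * n + s * s for n, s in zip(nu, v))                    # E[T_D]
Sigma = sum(s ** 4 for s in v) + 2 * sum(s * s * n * n for n, s in zip(nu, v))
thresh = ET + 2 * math.sqrt(Sigma * x) + 2 * x * max(s * s for s in v)

trials = 4000
exceed = sum(
    sum((n + s * random.gauss(0, 1)) ** 2 for n, s in zip(nu, v)) > thresh
    for _ in range(trials)
) / trials
# The empirical exceedance frequency should stay below exp(-x) ~ 0.368.
```

The bound is far from tight for moderate x, as the simulation shows; its value lies in holding uniformly in D, (ν_j) and (v_j), which is what the proofs of Section 6.2 need.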

6.2 Non-Asymptotic Upper Bounds

6.2.1 Proof of Proposition 3.1

By definition,

α_{ε,σ}(Ψ_{D,M̂}) := P_{θ_0,b}(Ψ_{D,M̂} = 1) = P_{θ_0,b}( T_{D,M̂} > t_α(D, X) ).

Conditionally on the sequence X = (X_j)_{j∈N}, for each j ≤ D ∧ M̂, the random variable (Y_j/X_j) − θ_{j,0} is Gaussian with mean ν_j = (b_j/X_j − 1) θ_{j,0} and standard deviation v_j = ε/X_j. In particular,

E_{θ_0,b}( T_{D,M̂} | X ) = Σ_{j=1}^{D∧M̂} (b_j/X_j − 1)^2 θ^2_{j,0} + ε^2 Σ_{j=1}^{D∧M̂} X_j^{-2}.  (6.15)

For all D ∈ N, define

Σ_{D,M̂} := ε^4 Σ_{j=1}^{D∧M̂} X_j^{-4} + 2 ε^2 Σ_{j=1}^{D∧M̂} X_j^{-2} (b_j/X_j − 1)^2 θ^2_{j,0}.

Applying Lemma 6.4 with T = T_{D,M̂}, Σ = Σ_{D,M̂} and x = x_{α/2} := ln(2/α), we get

P_{θ_0,b}( T_{D,M̂} − E_{θ_0,b}(T_{D,M̂} | X) > 2 √(Σ_{D,M̂} x_{α/2}) + 2 ε^2 x_{α/2} max_{1≤j≤D∧M̂} X_j^{-2} | X ) ≤ α/2.  (6.16)

Using the inequalities √(a + b) ≤ √a + √b and 2ab ≤ a^2 + b^2 for a, b > 0, it is easily seen that

2 √(Σ_{D,M̂} x_{α/2}) ≤ 2 ε^2 √(x_{α/2}) ( Σ_j X_j^{-4} )^{1/2} + 2 ε √(2 x_{α/2}) ( Σ_j X_j^{-2} (b_j/X_j − 1)^2 θ^2_{j,0} )^{1/2} ≤ Σ_j (b_j/X_j − 1)^2 θ^2_{j,0} + 2 ε^2 √(x_{α/2}) ( Σ_j X_j^{-4} )^{1/2} + 2 ε^2 x_{α/2} max_j X_j^{-2}.  (6.17)

According to (6.15)-(6.17), we obtain the following bound:

P_{θ_0,b}( T_{D,M̂} > (1 + 2√(x_{α/2})) [ Σ_j (1 − b_j/X_j)^2 θ^2_{j,0} + ε^2 Σ_j X_j^{-2} ] + C(α) ε^2 ( Σ_j X_j^{-4} )^{1/2} | X ) ≤ α/2,

where the constant C(α) is defined in (3.9). Since E[E(V | W)] = E(V) for any random variables V and W, the previous inequality leads to

P_{θ_0,b}( T_{D,M̂} > (1 + 2√(x_{α/2})) [ Σ_j (1 − b_j/X_j)^2 θ^2_{j,0} + ε^2 Σ_j X_j^{-2} ] + C(α) ε^2 ( Σ_j X_j^{-4} )^{1/2} ) ≤ α/2.

Then, by defining

A = { Σ_j (1 − b_j/X_j)^2 θ^2_{j,0} < σ^2 ln^{3/2}(1/σ) },

and applying Lemma 6.3, we immediately get

α_{ε,σ}(Ψ_{D,M̂}) ≤ P_{θ_0,b}( {T_{D,M̂} > t_α(D, X)} ∩ A ) + P(A^c) ≤ α/2 + α/5 + α π^2/(2κ) + C exp{ −ln^{1+τ}(1/σ) } = 7α/10 + α π^2/(2κ) + C exp{ −ln^{1+τ}(1/σ) },

for some C > 0 and 0 < τ < 1. In particular, setting κ = 5π^2/2, there exists σ_0 ∈ (0,1) such that, for all σ ≤ σ_0 and for each ε > 0,

α_{ε,σ}(Ψ_{D,M̂}) ≤ α.

This concludes the proof of the proposition.

6.2.2 Proof of Proposition 3.2

Let θ, θ_0 ∈ E_a and θ − θ_0 ∈ Θ_a(r_{ε,σ}). Then

P_{θ,b}(Ψ_{D,M̂} = 0) = P_{θ,b}( {Ψ_{D,M̂} = 0} ∩ (B ∩ M) ) + P_{θ,b}( {Ψ_{D,M̂} = 0} ∩ (B ∩ M)^c ) := T_1 + T_2.  (6.18)

Control of T_2: Using Lemma 6.1, Lemma 6.2 and elementary probabilistic arguments, we get

T_2 := P_{θ,b}( {Ψ_{D,M̂} = 0} ∩ (B ∩ M)^c ) ≤ P( (B ∩ M)^c ) ≤ P(B^c) + P(M^c) ≤ α/5 + α π^2/(2κ) ≤ β/5 + β π^2/(2κ),  (6.19)

since β > α.

Control of T_1: Define t_{β/2}(D, θ, X) to be the β/2-quantile of T_{D,M̂}, conditionally on X, i.e.,

P_{θ,b}( T_{D,M̂} ≤ t_{β/2}(D, θ, X) | X ) ≤ β/2.

Then, by elementary probabilistic arguments, we get

T_1 := P_{θ,b}( {Ψ_{D,M̂} = 0} ∩ {B ∩ M} ) = E[ P_{θ,b}( {Ψ_{D,M̂} = 0} | X ) 1_{B∩M} ] = E[ P_{θ,b}( T_{D,M̂} ≤ t_α(D, X) | X ) 1_{B∩M} ]

≤ (β/2) E[ 1{ t_α(D, X) ≤ t_{β/2}(D, θ, X) } 1_{B∩M} ] + E[ 1{ t_α(D, X) > t_{β/2}(D, θ, X) } 1_{B∩M} ]

≤ β/2 + P_{θ,b}( { t_α(D, X) > t_{β/2}(D, θ, X) } ∩ (B ∩ M) ).  (6.20)
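The testing procedure analyzed in these proofs can be sketched end to end: estimate each b_j by the noisy X_j, rescale Y_j, and aggregate squared deviations from θ_0. All tuning choices below (the dimension D, the centering) are illustrative stand-ins, not the calibrated ones of Section 3:

```python
import math, random

random.seed(5)
D, eps, sigma, t = 30, 0.01, 0.001, 1.0
b = [j ** (-t) for j in range(1, D + 1)]   # mildly ill-posed singular values
theta0 = [0.0] * D                         # null hypothesis theta = theta0 = 0

def statistic(theta):
    """Chi-square-type statistic: conditionally on X, Y_j/X_j - theta0_j is
    Gaussian with mean (b_j/X_j - 1)*theta0_j + (b_j/X_j)*theta_j and
    standard deviation eps/X_j; the centering removes its null variance."""
    Y = [bj * th + eps * random.gauss(0, 1) for bj, th in zip(b, theta)]
    X = [bj + sigma * random.gauss(0, 1) for bj in b]
    return sum((y / x - t0) ** 2 - (eps / x) ** 2
               for y, x, t0 in zip(Y, X, theta0))

t_null = statistic([0.0] * D)   # data generated under H0
t_alt = statistic([0.5] * D)    # a signal well separated from theta0
```

Under the null the statistic fluctuates around 0, while a well-separated alternative inflates it by roughly the squared distance to θ_0, which is exactly the mechanism quantified by (6.15) and Lemma 6.4.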

Minimax goodness-of-fit testing in ill-posed inverse problems with partially unknown operators. Annales de l'Institut Henri Poincaré - Probabilités et Statistiques, 2017, Vol. 53, No. 4, 1675-1718. DOI: 10.1214/16-AIHP768. Association des Publications de l'Institut Henri Poincaré.


More information

C m Extension by Linear Operators

C m Extension by Linear Operators C m Extension by Linear Operators by Charles Fefferman Department of Mathematics Princeton University Fine Hall Washington Road Princeton, New Jersey 08544 Email: cf@math.princeton.edu Partially supported

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang

More information

Stability of an abstract wave equation with delay and a Kelvin Voigt damping

Stability of an abstract wave equation with delay and a Kelvin Voigt damping Stability of an abstract wave equation with delay and a Kelvin Voigt damping University of Monastir/UPSAY/LMV-UVSQ Joint work with Serge Nicaise and Cristina Pignotti Outline 1 Problem The idea Stability

More information

Duration-Based Volatility Estimation

Duration-Based Volatility Estimation A Dual Approach to RV Torben G. Andersen, Northwestern University Dobrislav Dobrev, Federal Reserve Board of Governors Ernst Schaumburg, Northwestern Univeristy CHICAGO-ARGONNE INSTITUTE ON COMPUTATIONAL

More information

Performance Evaluation of Generalized Polynomial Chaos

Performance Evaluation of Generalized Polynomial Chaos Performance Evaluation of Generalized Polynomial Chaos Dongbin Xiu, Didier Lucor, C.-H. Su, and George Em Karniadakis 1 Division of Applied Mathematics, Brown University, Providence, RI 02912, USA, gk@dam.brown.edu

More information

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Yingying Dong and Arthur Lewbel California State University Fullerton and Boston College July 2010 Abstract

More information

Inverse Statistical Learning

Inverse Statistical Learning Inverse Statistical Learning Minimax theory, adaptation and algorithm avec (par ordre d apparition) C. Marteau, M. Chichignoud, C. Brunet and S. Souchet Dijon, le 15 janvier 2014 Inverse Statistical Learning

More information

A note on stability in three-phase-lag heat conduction

A note on stability in three-phase-lag heat conduction Universität Konstanz A note on stability in three-phase-lag heat conduction Ramón Quintanilla Reinhard Racke Konstanzer Schriften in Mathematik und Informatik Nr. 8, März 007 ISSN 1430-3558 Fachbereich

More information

ASYMPTOTIC EQUIVALENCE OF DENSITY ESTIMATION AND GAUSSIAN WHITE NOISE. By Michael Nussbaum Weierstrass Institute, Berlin

ASYMPTOTIC EQUIVALENCE OF DENSITY ESTIMATION AND GAUSSIAN WHITE NOISE. By Michael Nussbaum Weierstrass Institute, Berlin The Annals of Statistics 1996, Vol. 4, No. 6, 399 430 ASYMPTOTIC EQUIVALENCE OF DENSITY ESTIMATION AND GAUSSIAN WHITE NOISE By Michael Nussbaum Weierstrass Institute, Berlin Signal recovery in Gaussian

More information

Breakage, Coalescence and Volume Scattering in Dispersed Phase Systems

Breakage, Coalescence and Volume Scattering in Dispersed Phase Systems Breakage, Coalescence and Volume Scattering in Dispersed Phase Systems Antonio Fasano and Fabio Rosso Dip. Matematica Ulisse Dini, Università di Firenze, Viale Morgagni 67/a, 5134 Firenze, Italia Abstract.

More information

Minimax Rates. Homology Inference

Minimax Rates. Homology Inference Minimax Rates for Homology Inference Don Sheehy Joint work with Sivaraman Balakrishan, Alessandro Rinaldo, Aarti Singh, and Larry Wasserman Something like a joke. Something like a joke. What is topological

More information

The Stein hull. Clément Marteau* Institut de Mathématiques, Université de Toulouse, INSA - 135, Avenue de Rangueil, F Toulouse Cedex 4, France

The Stein hull. Clément Marteau* Institut de Mathématiques, Université de Toulouse, INSA - 135, Avenue de Rangueil, F Toulouse Cedex 4, France Journal of Nonparametric Statistics Vol. 22, No. 6, August 2010, 685 702 The Stein hull Clément Marteau* Institut de Mathématiques, Université de Toulouse, INSA - 135, Avenue de Rangueil, F-31 077 Toulouse

More information

Remarks on Bronštein s root theorem

Remarks on Bronštein s root theorem Remarks on Bronštein s root theorem Guy Métivier January 23, 2017 1 Introduction In [Br1], M.D.Bronštein proved that the roots of hyperbolic polynomials (1.1) p(t, τ) = τ m + m p k (t)τ m k. which depend

More information

ON WEAKLY NONLINEAR BACKWARD PARABOLIC PROBLEM

ON WEAKLY NONLINEAR BACKWARD PARABOLIC PROBLEM ON WEAKLY NONLINEAR BACKWARD PARABOLIC PROBLEM OLEG ZUBELEVICH DEPARTMENT OF MATHEMATICS THE BUDGET AND TREASURY ACADEMY OF THE MINISTRY OF FINANCE OF THE RUSSIAN FEDERATION 7, ZLATOUSTINSKY MALIY PER.,

More information

STAT 200C: High-dimensional Statistics

STAT 200C: High-dimensional Statistics STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 57 Table of Contents 1 Sparse linear models Basis Pursuit and restricted null space property Sufficient conditions for RNS 2 / 57

More information

EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018

EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 While these notes are under construction, I expect there will be many typos. The main reference for this is volume 1 of Hörmander, The analysis of liner

More information

BOUNDARY CONDITIONS FOR SYMMETRIC BANDED TOEPLITZ MATRICES: AN APPLICATION TO TIME SERIES ANALYSIS

BOUNDARY CONDITIONS FOR SYMMETRIC BANDED TOEPLITZ MATRICES: AN APPLICATION TO TIME SERIES ANALYSIS BOUNDARY CONDITIONS FOR SYMMETRIC BANDED TOEPLITZ MATRICES: AN APPLICATION TO TIME SERIES ANALYSIS Alessandra Luati Dip. Scienze Statistiche University of Bologna Tommaso Proietti S.E.F. e ME. Q. University

More information

An Operator Theoretical Approach to Nonlocal Differential Equations

An Operator Theoretical Approach to Nonlocal Differential Equations An Operator Theoretical Approach to Nonlocal Differential Equations Joshua Lee Padgett Department of Mathematics and Statistics Texas Tech University Analysis Seminar November 27, 2017 Joshua Lee Padgett

More information

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 1 Entropy Since this course is about entropy maximization,

More information

Vector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms.

Vector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. Vector Spaces Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. For each two vectors a, b ν there exists a summation procedure: a +

More information

Reproducing Kernel Hilbert Spaces

Reproducing Kernel Hilbert Spaces 9.520: Statistical Learning Theory and Applications February 10th, 2010 Reproducing Kernel Hilbert Spaces Lecturer: Lorenzo Rosasco Scribe: Greg Durrett 1 Introduction In the previous two lectures, we

More information

Lecture 12 November 3

Lecture 12 November 3 STATS 300A: Theory of Statistics Fall 2015 Lecture 12 November 3 Lecturer: Lester Mackey Scribe: Jae Hyuck Park, Christian Fong Warning: These notes may contain factual and/or typographic errors. 12.1

More information

Solving the Poisson Disorder Problem

Solving the Poisson Disorder Problem Advances in Finance and Stochastics: Essays in Honour of Dieter Sondermann, Springer-Verlag, 22, (295-32) Research Report No. 49, 2, Dept. Theoret. Statist. Aarhus Solving the Poisson Disorder Problem

More information

An Asymptotically Optimal Algorithm for the Max k-armed Bandit Problem

An Asymptotically Optimal Algorithm for the Max k-armed Bandit Problem An Asymptotically Optimal Algorithm for the Max k-armed Bandit Problem Matthew J. Streeter February 27, 2006 CMU-CS-06-110 Stephen F. Smith School of Computer Science Carnegie Mellon University Pittsburgh,

More information

Characterization of half-radial matrices

Characterization of half-radial matrices Characterization of half-radial matrices Iveta Hnětynková, Petr Tichý Faculty of Mathematics and Physics, Charles University, Sokolovská 83, Prague 8, Czech Republic Abstract Numerical radius r(a) is the

More information

We denote the space of distributions on Ω by D ( Ω) 2.

We denote the space of distributions on Ω by D ( Ω) 2. Sep. 1 0, 008 Distributions Distributions are generalized functions. Some familiarity with the theory of distributions helps understanding of various function spaces which play important roles in the study

More information

ON KRONECKER PRODUCTS OF CHARACTERS OF THE SYMMETRIC GROUPS WITH FEW COMPONENTS

ON KRONECKER PRODUCTS OF CHARACTERS OF THE SYMMETRIC GROUPS WITH FEW COMPONENTS ON KRONECKER PRODUCTS OF CHARACTERS OF THE SYMMETRIC GROUPS WITH FEW COMPONENTS C. BESSENRODT AND S. VAN WILLIGENBURG Abstract. Confirming a conjecture made by Bessenrodt and Kleshchev in 1999, we classify

More information

SPDEs, criticality, and renormalisation

SPDEs, criticality, and renormalisation SPDEs, criticality, and renormalisation Hendrik Weber Mathematics Institute University of Warwick Potsdam, 06.11.2013 An interesting model from Physics I Ising model Spin configurations: Energy: Inverse

More information

Microlocal Methods in X-ray Tomography

Microlocal Methods in X-ray Tomography Microlocal Methods in X-ray Tomography Plamen Stefanov Purdue University Lecture I: Euclidean X-ray tomography Mini Course, Fields Institute, 2012 Plamen Stefanov (Purdue University ) Microlocal Methods

More information

The deterministic Lasso

The deterministic Lasso The deterministic Lasso Sara van de Geer Seminar für Statistik, ETH Zürich Abstract We study high-dimensional generalized linear models and empirical risk minimization using the Lasso An oracle inequality

More information

The Learning Problem and Regularization

The Learning Problem and Regularization 9.520 Class 02 February 2011 Computational Learning Statistical Learning Theory Learning is viewed as a generalization/inference problem from usually small sets of high dimensional, noisy data. Learning

More information

A LOWER BOUND ON BLOWUP RATES FOR THE 3D INCOMPRESSIBLE EULER EQUATION AND A SINGLE EXPONENTIAL BEALE-KATO-MAJDA ESTIMATE. 1.

A LOWER BOUND ON BLOWUP RATES FOR THE 3D INCOMPRESSIBLE EULER EQUATION AND A SINGLE EXPONENTIAL BEALE-KATO-MAJDA ESTIMATE. 1. A LOWER BOUND ON BLOWUP RATES FOR THE 3D INCOMPRESSIBLE EULER EQUATION AND A SINGLE EXPONENTIAL BEALE-KATO-MAJDA ESTIMATE THOMAS CHEN AND NATAŠA PAVLOVIĆ Abstract. We prove a Beale-Kato-Majda criterion

More information

Partial Density Functions and Hele-Shaw Flow

Partial Density Functions and Hele-Shaw Flow Partial Density Functions and Hele-Shaw Flow Jackson Van Dyke University of California, Berkeley jayteeveedee@berkeley.edu August 14, 2017 Jackson Van Dyke (UCB) Partial Density Functions August 14, 2017

More information

Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals

Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals Acta Applicandae Mathematicae 78: 145 154, 2003. 2003 Kluwer Academic Publishers. Printed in the Netherlands. 145 Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals M.

More information

DS-GA 1002 Lecture notes 11 Fall Bayesian statistics

DS-GA 1002 Lecture notes 11 Fall Bayesian statistics DS-GA 100 Lecture notes 11 Fall 016 Bayesian statistics In the frequentist paradigm we model the data as realizations from a distribution that depends on deterministic parameters. In contrast, in Bayesian

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

The L p -dissipativity of first order partial differential operators

The L p -dissipativity of first order partial differential operators The L p -dissipativity of first order partial differential operators A. Cialdea V. Maz ya n Memory of Vladimir. Smirnov Abstract. We find necessary and sufficient conditions for the L p -dissipativity

More information

Solving Classification Problems By Knowledge Sets

Solving Classification Problems By Knowledge Sets Solving Classification Problems By Knowledge Sets Marcin Orchel a, a Department of Computer Science, AGH University of Science and Technology, Al. A. Mickiewicza 30, 30-059 Kraków, Poland Abstract We propose

More information

LAN property for sde s with additive fractional noise and continuous time observation

LAN property for sde s with additive fractional noise and continuous time observation LAN property for sde s with additive fractional noise and continuous time observation Eulalia Nualart (Universitat Pompeu Fabra, Barcelona) joint work with Samy Tindel (Purdue University) Vlad s 6th birthday,

More information

Linear convergence of iterative soft-thresholding

Linear convergence of iterative soft-thresholding arxiv:0709.1598v3 [math.fa] 11 Dec 007 Linear convergence of iterative soft-thresholding Kristian Bredies and Dirk A. Lorenz ABSTRACT. In this article, the convergence of the often used iterative softthresholding

More information

Reconstruction from Anisotropic Random Measurements

Reconstruction from Anisotropic Random Measurements Reconstruction from Anisotropic Random Measurements Mark Rudelson and Shuheng Zhou The University of Michigan, Ann Arbor Coding, Complexity, and Sparsity Workshop, 013 Ann Arbor, Michigan August 7, 013

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

Empirical Risk Minimization as Parameter Choice Rule for General Linear Regularization Methods

Empirical Risk Minimization as Parameter Choice Rule for General Linear Regularization Methods Empirical Risk Minimization as Parameter Choice Rule for General Linear Regularization Methods Frank Werner 1 Statistical Inverse Problems in Biophysics Group Max Planck Institute for Biophysical Chemistry,

More information

On Acceleration with Noise-Corrupted Gradients. + m k 1 (x). By the definition of Bregman divergence:

On Acceleration with Noise-Corrupted Gradients. + m k 1 (x). By the definition of Bregman divergence: A Omitted Proofs from Section 3 Proof of Lemma 3 Let m x) = a i On Acceleration with Noise-Corrupted Gradients fxi ), u x i D ψ u, x 0 ) denote the function under the minimum in the lower bound By Proposition

More information

Does l p -minimization outperform l 1 -minimization?

Does l p -minimization outperform l 1 -minimization? Does l p -minimization outperform l -minimization? Le Zheng, Arian Maleki, Haolei Weng, Xiaodong Wang 3, Teng Long Abstract arxiv:50.03704v [cs.it] 0 Jun 06 In many application areas ranging from bioinformatics

More information

Class 2 & 3 Overfitting & Regularization

Class 2 & 3 Overfitting & Regularization Class 2 & 3 Overfitting & Regularization Carlo Ciliberto Department of Computer Science, UCL October 18, 2017 Last Class The goal of Statistical Learning Theory is to find a good estimator f n : X Y, approximating

More information

An inverse scattering problem in random media

An inverse scattering problem in random media An inverse scattering problem in random media Pedro Caro Joint work with: Tapio Helin & Matti Lassas Computational and Analytic Problems in Spectral Theory June 8, 2016 Outline Introduction and motivation

More information

Lecture 3: More on regularization. Bayesian vs maximum likelihood learning

Lecture 3: More on regularization. Bayesian vs maximum likelihood learning Lecture 3: More on regularization. Bayesian vs maximum likelihood learning L2 and L1 regularization for linear estimators A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting

More information

SPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS

SPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS SPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS TSOGTGEREL GANTUMUR Abstract. After establishing discrete spectra for a large class of elliptic operators, we present some fundamental spectral properties

More information

Linear Regression Linear Regression with Shrinkage

Linear Regression Linear Regression with Shrinkage Linear Regression Linear Regression ith Shrinkage Introduction Regression means predicting a continuous (usually scalar) output y from a vector of continuous inputs (features) x. Example: Predicting vehicle

More information

arxiv: v1 [math.st] 26 Mar 2012

arxiv: v1 [math.st] 26 Mar 2012 POSTERIOR CONSISTENCY OF THE BAYESIAN APPROACH TO LINEAR ILL-POSED INVERSE PROBLEMS arxiv:103.5753v1 [math.st] 6 Mar 01 By Sergios Agapiou Stig Larsson and Andrew M. Stuart University of Warwick and Chalmers

More information

Some Background Material

Some Background Material Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important

More information

FORMATION OF TRAPPED SURFACES II. 1. Introduction

FORMATION OF TRAPPED SURFACES II. 1. Introduction FORMATION OF TRAPPED SURFACES II SERGIU KLAINERMAN, JONATHAN LUK, AND IGOR RODNIANSKI 1. Introduction. Geometry of a null hypersurface As in [?] we consider a region D = D(u, u ) of a vacuum spacetime

More information

More Empirical Process Theory

More Empirical Process Theory More Empirical Process heory 4.384 ime Series Analysis, Fall 2008 Recitation by Paul Schrimpf Supplementary to lectures given by Anna Mikusheva October 24, 2008 Recitation 8 More Empirical Process heory

More information