Inverse problems in statistics


Laurent Cavalier (Université Aix-Marseille 1, France)
Yale, May

Introduction

There exist many fields where inverse problems appear: astronomy (the Hubble satellite), econometrics (instrumental variables), financial mathematics (model calibration), medical image processing (X-rays). These are problems where we have indirect observations of an object (a function) that we want to reconstruct.

Inverse problems

Let $H$ and $G$ be Hilbert spaces, and let $A$ be a continuous linear operator from $H$ into $G$. Given $g \in G$, find $f \in H$ such that
$$Af = g.$$
Solving an inverse problem means inverting the operator $A$. If $A^{-1}$ is not continuous, the problem is called ill-posed: if we observe a noisy version $g_\varepsilon$ of $g$, then $f_\varepsilon = A^{-1} g_\varepsilon$ could be far from $f$. Hence the importance of the notion of noise or error.

Linear inverse problems

Let $H$ and $G$ be two separable Hilbert spaces, and let $A$ be a known linear bounded operator from $H$ into $G$. Consider the model
$$Y = Af + \varepsilon\xi,$$
where $Y$ is the observation, $f \in H$ is unknown, $A$ is a continuous linear operator from $H$ into $G$, $\xi$ is a white noise, and $\varepsilon$ corresponds to the noise level. The goal is to reconstruct (estimate) $f$ from the observation $Y$.

Projecting a white noise on any orthonormal basis $\{\psi_k\}$ gives a sequence of i.i.d. standard Gaussian random variables.

Discrete model of inverse problems

The standard discrete-sample statistical model for linear inverse problems is
$$Y_i = Af(X_i) + \xi_i, \quad i = 1, \dots, n,$$
where $(X_1, Y_1), \dots, (X_n, Y_n)$ are observed (we may assume $X_i \in [0,1]$), $f$ is an unknown function in $L^2(0,1)$, $A$ is an operator from $L^2(0,1)$ into $L^2(0,1)$, and the $\xi_i$ are i.i.d. zero-mean Gaussian random variables with variance $\sigma^2$. The noise level is related to the number of observations by $\varepsilon \sim 1/\sqrt{n}$.

Singular value decomposition

A major property of compact operators is that they have a discrete spectrum. Suppose $A^*A$ is a compact operator with a known basis of eigenfunctions in $H$:
$$A^*A\varphi_k = b_k^2 \varphi_k.$$
The singular value decomposition (SVD) of $A$ is
$$A\varphi_k = b_k \psi_k, \qquad A^*\psi_k = b_k \varphi_k,$$
where the $b_k > 0$ are the singular values, $\{\varphi_k\}$ is an o.n.b. of $H$, and $\{\psi_k\}$ an o.n.b. of $G$. A linear bounded compact operator between two Hilbert spaces may really be seen as an infinite matrix.

Projection on $\{\psi_k\}$

Projecting $Y$ on $\{\psi_k\}$:
$$\langle Y, \psi_k\rangle = \langle Af, \psi_k\rangle + \varepsilon\langle \xi, \psi_k\rangle = \langle f, A^*\psi_k\rangle + \varepsilon\langle \xi, \psi_k\rangle = b_k\langle f, \varphi_k\rangle + \varepsilon\xi_k,$$
where $\{\xi_k\}$ is an i.i.d. standard Gaussian sequence, obtained by projecting the white noise $\xi$ on the o.n.b. $\{\psi_k\}$.

Sequence space model

This yields the equivalent sequence space model
$$y_k = b_k \theta_k + \varepsilon \xi_k, \quad k = 1, 2, \dots,$$
where the $\theta_k$ are the coefficients of $f$, the $\xi_k \sim N(0,1)$ are i.i.d., and the singular values satisfy $b_k \to 0$. We estimate $\theta = \{\theta_k\}$ from the observations $y = \{y_k\}$; under the $L^2$ risk, this is equivalent to estimating $f$. Remark that $b_k \to 0$ weakens the signal $\theta_k$: the problem is ill-posed.
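As a toy illustration (not from the talk), here is a minimal simulation of the sequence space model; the polynomial decay $b_k = k^{-\beta}$ and the coefficients $\theta_k$ are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

K, beta, eps = 1000, 1.0, 0.01                # truncation level, degree of ill-posedness, noise level
k = np.arange(1, K + 1)
b = k ** (-beta)                              # singular values b_k -> 0 (mildly ill-posed)
theta = k ** (-1.5)                           # hypothetical coefficients of a smooth f
y = b * theta + eps * rng.standard_normal(K)  # y_k = b_k * theta_k + eps * xi_k
```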

Inversion

We have to invert, in some sense, the operator $A$. We thus obtain the model
$$X_k = b_k^{-1} y_k = \theta_k + \varepsilon \sigma_k \xi_k, \quad k = 1, 2, \dots,$$
where $\sigma_k = b_k^{-1}$. When the problem is ill-posed, the variance term grows to infinity. In this model the aim is to estimate $\{\theta_k\}$ from $\{X_k\}$; when $k$ is large the noise in $X_k$ may be very large, making the estimation difficult (see Donoho (1995), Mair and Ruymgaart (1996), Johnstone (1999), Cavalier and Tsybakov (2002)).
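Continuing the sketch above, inverting by $b_k$ amplifies the noise by $\sigma_k = b_k^{-1}$, which grows without bound:

```python
sigma = 1.0 / b                          # sigma_k = b_k^{-1} = k^beta
X = y / b                                # X_k = theta_k + eps * sigma_k * xi_k
print(eps * sigma[[0, 9, 99, 999]])      # effective noise levels: 0.01, 0.1, 1.0, 10.0 for beta = 1
```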

Difficulty of inverse problems

$\sigma_k \asymp 1$: direct problem.

$\sigma_k \asymp k^\beta$, $\beta > 0$: mildly ill-posed problem.

$\sigma_k \asymp \exp(\beta k)$, $\beta > 0$: severely ill-posed problem.

The parameter $\beta$ is called the degree of ill-posedness.

Examples

There exist many examples of operators for which the SVD is known: convolution, tomography, instrumental variables.

Circular convolution

Deconvolution is perhaps the best-known inverse problem. It is used in many applications, such as econometrics, physics, astronomy, and medical image processing. For example, it corresponds to the problem of a blurred signal that one wants to recover from indirect data.

Consider the following convolution operator:
$$Af(t) = r \star f(t) = \int_0^1 r(t - x) f(x)\,dx, \quad t \in [0, 1],$$
where $r$ is a known 1-periodic symmetric real convolution kernel in $L^2[0,1]$. In this model, $A$ is a linear bounded self-adjoint operator from $L^2[0,1]$ to $L^2[0,1]$.

Blurred cameraman

[Figure: cameraman image, panels (a) and (b).]

Convolution model

Define the following model:
$$Y(t) = r \star f(t) + \varepsilon\,\xi(t), \quad t \in [0, 1],$$
where $Y$ is observed, $f$ is an unknown periodic function in $L^2[0,1]$, and $\xi(t)$ is a white noise on $L^2[0,1]$.

The SVD basis here is clearly the Fourier basis $\{\varphi_k(t)\}$. Projecting on $\{\varphi_k(t)\}$, i.e. passing to the Fourier domain, we obtain
$$y_k = b_k \theta_k + \varepsilon \xi_k,$$
where $b_k = \int_0^1 r(x)\cos(2\pi k x)\,dx$ for even $k$, the $\theta_k$ are the Fourier coefficients of $f$, and the $\xi_k$ are i.i.d. $N(0,1)$.
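Numerically, the $b_k$ can be read off from the kernel's Fourier-cosine coefficients. A minimal check with a hypothetical kernel (not from the talk); a Riemann-sum approximation of the integral is the only ingredient:

```python
import numpy as np

m = 1024
x = np.arange(m) / m
r = np.exp(-10 * np.abs(x - 0.5))      # a 1-periodic symmetric kernel on [0, 1]

# b_k = integral of r(x) cos(2*pi*k*x) dx, approximated by a Riemann sum
b = np.array([np.mean(r * np.cos(2 * np.pi * k * x)) for k in range(1, 11)])
print(b)                               # magnitudes decay asymptotically like k^(-2), i.e. beta = 2
```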

Tomography scan

[Figure: tomography scan.]

Instrumental variables

An economic relationship between a response variable $Y$ and a vector $X$ of explanatory variables is represented by
$$Y_i = f(X_i) + U_i, \quad i = 1, \dots, n,$$
where $f$ has to be estimated and the $U_i$ are the errors. This model does not characterize the function $f$ if $U$ is not constrained; the problem is solved if $E(U \mid X) = 0$. However, in many structural econometric models some components of $X$ are endogenous.

For example, let $Y$ denote wages and let $X$ include the level of education, among other variables. The error $U$ includes ability, which is not observed but influences wages. People of high ability tend to have a high level of education, so education and ability are correlated, and thus $X$ and $U$ are too.

Instrumental variables

Nevertheless, suppose that we observe another set of data $W_i$, where $W$ is called an instrumental variable, for which
$$E(U \mid W) = E(Y - f(X) \mid W) = 0.$$
This equation characterizes $f$ by a Fredholm equation of the first kind: estimation of the function $f$ is in fact an ill-posed inverse problem. This is not exactly our Gaussian white noise model, but it is closely related. Inverse problems have been the topic of many articles in the econometrics literature; see Florens (2003), Hall and Horowitz (2005), Chen and Reiss (2009).

Inverse problem and sequence space

Consider the model
$$Y = Af + \varepsilon\xi,$$
where $Y$ is the observation, $f \in H$ is unknown, $A$ is a continuous linear compact operator from $H$ into $G$, $\xi$ is a white noise, and $\varepsilon$ is the noise level. Using the SVD, we obtain the equivalent sequence space model
$$X_k = \theta_k + \varepsilon \sigma_k \xi_k, \quad k = 1, 2, \dots,$$
where $\sigma_k \to \infty$. The aim is to estimate (reconstruct) the function $f$ (or the sequence $\{\theta_k\}$) from the observations.

Linear estimators

Consider here a specific family of estimators. Let $\lambda = (\lambda_1, \lambda_2, \dots)$ be a sequence of nonrandom weights. Every sequence $\lambda$ defines a linear estimator $\hat\theta(\lambda) = (\hat\theta_1, \hat\theta_2, \dots)$ with $\hat\theta_k = \lambda_k X_k$, and
$$\hat f(\lambda) = \sum_{k=1}^\infty \hat\theta_k \varphi_k.$$
The $L^2$ risk of a linear estimator is
$$E\|\hat f(\lambda) - f\|^2 = R(\theta, \lambda) = \sum_{k=1}^\infty (1 - \lambda_k)^2 \theta_k^2 + \varepsilon^2 \sum_{k=1}^\infty \sigma_k^2 \lambda_k^2.$$
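The two terms of $R(\theta, \lambda)$ (squared bias and variance) are straightforward to evaluate numerically; a small helper used in the sketches below:

```python
import numpy as np

def risk(theta, lam, sigma, eps):
    """L2 risk R(theta, lam): squared bias plus variance."""
    bias2 = np.sum((1.0 - lam) ** 2 * theta ** 2)
    var = eps ** 2 * np.sum(sigma ** 2 * lam ** 2)
    return bias2 + var
```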

Classes of linear estimators

Projection estimators (spectral cut-off): $\lambda_k = I(k \le N)$, $N > 0$.

Tikhonov regularization (penalized): $\lambda_k = (1 + \gamma\sigma_k^{2\alpha})^{-1}$, $\alpha \ge 1$, $\gamma > 0$.

Landweber iteration: $\lambda_k = 1 - (1 - \sigma_k^{-2})^n$, $n \in \mathbb{N}$.

How should one choose $N$, $\gamma$, or $n$?
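The three weight families in code (the Tikhonov and Landweber formulas on the slide are partly garbled, so the standard forms above are used; treat them as an assumption):

```python
import numpy as np

k = np.arange(1, 1001)
sigma = k.astype(float)                        # sigma_k = k^beta with beta = 1, as in the earlier sketch

def projection(N):
    return (k <= N).astype(float)              # lambda_k = I(k <= N)

def tikhonov(gamma, alpha=1.0):
    return 1.0 / (1.0 + gamma * sigma ** (2 * alpha))

def landweber(n):
    return 1.0 - (1.0 - sigma ** (-2.0)) ** n  # well defined since sigma_k >= 1 here
```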

Ellipsoid of coefficients

Assume that $f$ belongs to a functional class corresponding to an ellipsoid $\Theta$ in the space of coefficients $\{\theta_k\}$:
$$\Theta = \Theta(a, L) = \Big\{\theta : \sum_{k=1}^\infty a_k^2 \theta_k^2 \le L\Big\},$$
where $a = \{a_k\}$ with $a_k > 0$, $a_k \to \infty$, and $L > 0$. For large values of $k$ the coefficients $\theta_k$ must then decrease with $k$ and be small. Assumptions on the coefficients $\theta_k$ are usually related to properties (smoothness) of $f$.

Sobolev classes

Introduce the Sobolev classes
$$W(\alpha, L) = \Big\{ f = \sum_{k=1}^\infty \theta_k \varphi_k : \theta \in \Theta(\alpha, L) \Big\},$$
where $\Theta(\alpha, L) = \Theta(a, L)$ with polynomial weights $a = \{a_k\}$ such that $a_1 = 0$ and
$$a_k = \begin{cases} (k-1)^\alpha & \text{for } k \text{ odd}, \\ k^\alpha & \text{for } k \text{ even}, \end{cases} \qquad k = 2, 3, \dots,$$
where $\alpha > 0$ and $L > 0$. We also have
$$W(\alpha, L) = \Big\{ f \text{ periodic} : \int_0^1 \big(f^{(\alpha)}(t)\big)^2\,dt \le \pi^{2\alpha} L \Big\}.$$

Rates of convergence

The function $f$ has Fourier coefficients in some ellipsoid, and the problem is mildly ill-posed, severely ill-posed, or even direct. The minimax rates appear in the following table:

Problem | Rate over Sobolev classes
Direct problem | $\varepsilon^{4\alpha/(2\alpha+1)}$
Mildly ill-posed | $\varepsilon^{4\alpha/(2\alpha+2\beta+1)}$
Severely ill-posed | $(\log(1/\varepsilon))^{-2\alpha}$
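In the mildly ill-posed case, the rate can be recovered from the bias-variance trade-off of a projection estimator; this standard computation is sketched here as a reasoning aid. For $\theta \in \Theta(a, L)$ with $a_k \asymp k^\alpha$ and $\sigma_k \asymp k^\beta$,
$$\sup_{\theta \in \Theta} R(\theta, \lambda(N)) \;\lesssim\; L N^{-2\alpha} + \varepsilon^2 \sum_{k \le N} \sigma_k^2 \;\asymp\; L N^{-2\alpha} + \varepsilon^2 N^{2\beta+1}.$$
Balancing the two terms, $N^{-2\alpha} \asymp \varepsilon^2 N^{2\beta+1}$, gives $N_* \asymp \varepsilon^{-2/(2\alpha+2\beta+1)}$, hence the rate $N_*^{-2\alpha} = \varepsilon^{4\alpha/(2\alpha+2\beta+1)}$ of the table; setting $\beta = 0$ recovers the direct rate.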

Comments

Rates usually depend on the smoothness $\alpha$ of the function $f$ and on the degree of ill-posedness $\beta$. When $\beta$ increases, the rates become slower. In the direct model we find the standard rates of nonparametric estimation: for example, $\varepsilon^{4\alpha/(2\alpha+1)} = n^{-2\alpha/(2\alpha+1)}$ with Sobolev classes, using $\varepsilon \sim 1/\sqrt{n}$.

Comments

To attain the optimal rate with a projection estimator, choose $N$ corresponding to the optimal trade-off between bias and variance. This choice of $N$ is optimal in the minimax sense; however, it depends on the smoothness $\alpha$ and on the degree of ill-posedness $\beta$. Even if the operator $A$ (and its degree $\beta$) is known, there is no real meaning in considering the smoothness of $f$ as known. This leads to the notions of adaptation and oracle inequalities, i.e. how to choose the bandwidth $N$ without prior assumptions on $f$.

Oracle

Consider now a linked, but different, point of view. Assume that a class of estimators is fixed, i.e. that the class of possible weights $\lambda \in \Lambda$ is given (projection, Tikhonov, ...). Define the oracle $\lambda^0$ by
$$R(\theta, \lambda^0) = \inf_{\lambda \in \Lambda} R(\theta, \lambda).$$
The oracle corresponds to the best possible choice in $\Lambda$, i.e. the one which minimizes the risk. However, this is not an estimator: since the risk depends on the unknown $\theta$, the oracle depends on it as well. An oracle is the best in the family, but it knows the true $\theta$.
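In a simulation, where $\theta$ is known, the oracle over a finite family is simply the minimizer of the true risk. Continuing the toy example (with the risk helper and the projection weights defined above; the grid of cut-offs is arbitrary):

```python
# oracle over a finite family of projection weights
Ns = [2 ** j for j in range(10)]                      # candidate cut-offs 1, 2, 4, ..., 512
risks = [risk(theta, projection(N), sigma, eps) for N in Ns]
N_oracle = Ns[int(np.argmin(risks))]
print(N_oracle, min(risks))                           # best cut-off and its (unattainable) risk
```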

Unbiased risk estimation

A very natural idea in statistics is to estimate this unknown risk using the available data, and then to minimize this estimate of the risk. A classical approach to this minimization problem is based on the principle of unbiased risk estimation (URE) (Stein (1981)). The method goes back to the Akaike Information Criterion (AIC) of Akaike (1973) and to Mallows' $C_p$ (1973). Originally, URE appeared in the context of regression estimation; nowadays it is a basic adaptation tool for many statistical models. The same idea also appears in all the cross-validation techniques.

URE in inverse problems

For inverse problems, this method was studied in Cavalier, Golubev, Picard and Tsybakov (2002), where exact oracle inequalities were obtained. In this setting, the functional
$$U(X, \lambda) = \sum_{k=1}^\infty (1 - \lambda_k)^2 (X_k^2 - \varepsilon^2 \sigma_k^2) + \varepsilon^2 \sum_{k=1}^\infty \sigma_k^2 \lambda_k^2$$
is an unbiased estimator of $R(\theta, \lambda)$:
$$R(\theta, \lambda) = E_\theta\, U(X, \lambda), \quad \forall \lambda.$$

Data-driven choice

Unbiased risk estimation suggests minimizing over $\lambda \in \Lambda$ the functional $U(X, \lambda)$ in place of $R(\theta, \lambda)$. This leads to the following data-driven choice of $\lambda$:
$$\tilde\lambda = \arg\min_{\lambda \in \Lambda} U(X, \lambda).$$
Define then the estimator $\tilde\theta$ by $\tilde\theta_k = \tilde\lambda_k X_k$.
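A minimal implementation of the URE selection rule over the same finite projection family, continuing the running toy example:

```python
def ure(X, lam, sigma, eps):
    """Unbiased risk estimate U(X, lam)."""
    return (np.sum((1.0 - lam) ** 2 * (X ** 2 - eps ** 2 * sigma ** 2))
            + eps ** 2 * np.sum(sigma ** 2 * lam ** 2))

scores = [ure(X, projection(N), sigma, eps) for N in Ns]    # X from the inversion sketch
N_hat = Ns[int(np.argmin(scores))]
theta_hat = projection(N_hat) * X                           # tilde-theta_k = tilde-lambda_k X_k
print(N_hat, N_oracle)                                      # URE choice vs. the oracle choice
```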

Assumptions

Denote
$$S = \left( \frac{\max_{\lambda \in \Lambda} \sum_{k=1}^\infty \sigma_k^4 \lambda_k^2}{\min_{\lambda \in \Lambda} \sum_{k=1}^\infty \sigma_k^4 \lambda_k^2} \right)^{1/2}.$$
Let the following assumptions hold. For any $\lambda \in \Lambda$,
$$0 < \sum_{k=1}^\infty \sigma_k^2 \lambda_k^2 < \infty, \qquad \max_{\lambda \in \Lambda} \sup_k \lambda_k \le 1.$$
There exists a constant $C_1 > 0$ such that, uniformly in $\lambda \in \Lambda$,
$$\sum_{k=1}^\infty \sigma_k^4 \lambda_k^2 \le C_1 \sum_{k=1}^\infty \sigma_k^4 \lambda_k^4.$$

Oracle inequality for URE

Theorem. Suppose $\sigma_k \asymp k^\beta$, $\beta \ge 0$. Assume that $\Lambda$ is finite, with cardinality $D$, and belongs to the family of projection, Tikhonov, or Pinsker estimators. There exist constants $\gamma, C > 0$ such that, for all $\theta \in \ell^2$ and for $B$ large enough,
$$E_\theta \|\tilde\theta - \theta\|^2 \le (1 + \gamma B^{-1}) \min_{\lambda \in \Lambda} R(\theta, \lambda) + B\,C\,\varepsilon^2 (\log(DS))^{2\beta+1}.$$
The data-driven choice by URE thus mimics the oracle.

Simulations

Discrete model of an inverse problem:
$$Y(i) = g \star f\Big(\frac{i}{m}\Big) + \varepsilon\sqrt{m}\,\xi(i), \quad i = 1, \dots, m,$$
where
$$f(t) = 0.5\,n(t, 0.4, 0.12) + 0.5\,n(t, 0.7, 0.08)$$
and
$$g(t) = \exp(-10\,|t - 0.5|), \qquad \beta \approx 2.$$
Here $m = 1000$ and $\varepsilon^{-2} = \mathrm{Signal}/\mathrm{Noise} = 100$. The estimator is a truncated Fourier series, with the cut-off $N$ chosen so that $\varepsilon^2 \sum_{k \le N} \sigma_k^2 \approx 1/\sqrt{\log(1/\varepsilon)}$.
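A rough reconstruction of this simulation, not the authors' code: $n(t, \mu, s)$ is read as a Gaussian density, and since the slide's noise normalization is partly garbled, the noise level below is a plausible stand-in.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 1000
t = np.arange(m) / m

def gauss(t, mu, s):
    # Gaussian density: my reading of the slide's n(t, mu, s)
    return np.exp(-0.5 * ((t - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

f = 0.5 * gauss(t, 0.4, 0.12) + 0.5 * gauss(t, 0.7, 0.08)
g = np.exp(-10 * np.abs(t - 0.5))                            # convolution kernel from the slide

conv = np.fft.irfft(np.fft.rfft(g) * np.fft.rfft(f), m) / m  # discretized (g * f)(i/m)
Y = conv + 0.05 * rng.standard_normal(m)                     # noisy observation (stand-in noise level)

# truncated Fourier series estimator: invert only the first N frequencies
N = 15                                                       # arbitrary cut-off for the sketch
G, Yf = np.fft.rfft(g) / m, np.fft.rfft(Y)
keep = np.arange(G.size) < N
f_hat = np.fft.irfft(np.where(keep, Yf / np.where(keep, G, 1.0), 0.0), m)
```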

[Figure: true function $f$ and its estimator $\hat f$ (estimation of $f$).]

[Figure: quadratic risk versus signal/noise ratio, for the projection oracle and the estimator $\hat f$.]

Comments

The simulations correspond more or less to the theory. However, there is a limitation on the size of the family, and the method is not always stable enough: there is a need for penalties stronger than the URE penalty (or AIC). A different method, the Risk Hull Method, is defined in Cavalier and Golubev (2006).


More information

A Data-Driven Block Thresholding Approach To Wavelet Estimation

A Data-Driven Block Thresholding Approach To Wavelet Estimation A Data-Driven Block Thresholding Approach To Wavelet Estimation T. Tony Cai 1 and Harrison H. Zhou University of Pennsylvania and Yale University Abstract A data-driven block thresholding procedure for

More information

Divide and Conquer Kernel Ridge Regression. A Distributed Algorithm with Minimax Optimal Rates

Divide and Conquer Kernel Ridge Regression. A Distributed Algorithm with Minimax Optimal Rates : A Distributed Algorithm with Minimax Optimal Rates Yuchen Zhang, John C. Duchi, Martin Wainwright (UC Berkeley;http://arxiv.org/pdf/1305.509; Apr 9, 014) Gatsby Unit, Tea Talk June 10, 014 Outline Motivation.

More information

Deconvolution. Parameter Estimation in Linear Inverse Problems

Deconvolution. Parameter Estimation in Linear Inverse Problems Image Parameter Estimation in Linear Inverse Problems Chair for Computer Aided Medical Procedures & Augmented Reality Department of Computer Science, TUM November 10, 2006 Contents A naive approach......with

More information

Lecture 9 February 2, 2016

Lecture 9 February 2, 2016 MATH 262/CME 372: Applied Fourier Analysis and Winter 26 Elements of Modern Signal Processing Lecture 9 February 2, 26 Prof. Emmanuel Candes Scribe: Carlos A. Sing-Long, Edited by E. Bates Outline Agenda:

More information

Simultaneous White Noise Models and Optimal Recovery of Functional Data. Mark Koudstaal

Simultaneous White Noise Models and Optimal Recovery of Functional Data. Mark Koudstaal Simultaneous White Noise Models and Optimal Recovery of Functional Data by Mark Koudstaal A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department

More information

SGN Advanced Signal Processing: Lecture 8 Parameter estimation for AR and MA models. Model order selection

SGN Advanced Signal Processing: Lecture 8 Parameter estimation for AR and MA models. Model order selection SG 21006 Advanced Signal Processing: Lecture 8 Parameter estimation for AR and MA models. Model order selection Ioan Tabus Department of Signal Processing Tampere University of Technology Finland 1 / 28

More information

Minimax theory for a class of non-linear statistical inverse problems

Minimax theory for a class of non-linear statistical inverse problems Minimax theory for a class of non-linear statistical inverse problems Kolyan Ray and Johannes Schmidt-Hieber Leiden University Abstract We study a class of statistical inverse problems with non-linear

More information

Compressibility of Infinite Sequences and its Interplay with Compressed Sensing Recovery

Compressibility of Infinite Sequences and its Interplay with Compressed Sensing Recovery Compressibility of Infinite Sequences and its Interplay with Compressed Sensing Recovery Jorge F. Silva and Eduardo Pavez Department of Electrical Engineering Information and Decision Systems Group Universidad

More information

High-dimensional regression with unknown variance

High-dimensional regression with unknown variance High-dimensional regression with unknown variance Christophe Giraud Ecole Polytechnique march 2012 Setting Gaussian regression with unknown variance: Y i = f i + ε i with ε i i.i.d. N (0, σ 2 ) f = (f

More information

Asymptotic Equivalence and Adaptive Estimation for Robust Nonparametric Regression

Asymptotic Equivalence and Adaptive Estimation for Robust Nonparametric Regression Asymptotic Equivalence and Adaptive Estimation for Robust Nonparametric Regression T. Tony Cai 1 and Harrison H. Zhou 2 University of Pennsylvania and Yale University Abstract Asymptotic equivalence theory

More information

Statistical Convergence of Kernel CCA

Statistical Convergence of Kernel CCA Statistical Convergence of Kernel CCA Kenji Fukumizu Institute of Statistical Mathematics Tokyo 106-8569 Japan fukumizu@ism.ac.jp Francis R. Bach Centre de Morphologie Mathematique Ecole des Mines de Paris,

More information

Stochastic optimization in Hilbert spaces

Stochastic optimization in Hilbert spaces Stochastic optimization in Hilbert spaces Aymeric Dieuleveut Aymeric Dieuleveut Stochastic optimization Hilbert spaces 1 / 48 Outline Learning vs Statistics Aymeric Dieuleveut Stochastic optimization Hilbert

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

1 Math 241A-B Homework Problem List for F2015 and W2016

1 Math 241A-B Homework Problem List for F2015 and W2016 1 Math 241A-B Homework Problem List for F2015 W2016 1.1 Homework 1. Due Wednesday, October 7, 2015 Notation 1.1 Let U be any set, g be a positive function on U, Y be a normed space. For any f : U Y let

More information

Integral approximation by kernel smoothing

Integral approximation by kernel smoothing Integral approximation by kernel smoothing François Portier Université catholique de Louvain - ISBA August, 29 2014 In collaboration with Bernard Delyon Topic of the talk: Given ϕ : R d R, estimation of

More information

3 Compact Operators, Generalized Inverse, Best- Approximate Solution

3 Compact Operators, Generalized Inverse, Best- Approximate Solution 3 Compact Operators, Generalized Inverse, Best- Approximate Solution As we have already heard in the lecture a mathematical problem is well - posed in the sense of Hadamard if the following properties

More information