Noise Reduction for Enhanced Component Identification in Multi-Dimensional Biomolecular NMR Studies
Nicoleta Serban¹

¹ The author is grateful to Gordon Rule for allowing her to use the data from the NMR experiment and for his mentorship in NMR research. She is also thankful to Kobi Abayomi, Ray Carroll and Brani Vidakovic for their useful insights about the research and the presentation of this paper.

The objective of the research presented in this paper is to shed light on the benefits of a multi-dimensional wavelet-based methodology applied to NMR biomolecular data analysis. Specifically, the emphasis is on noise reduction for enhanced component identification in multi-dimensional mixture regression. The contributions of this research are twofold. First, the wavelet-based noise reduction method applies to multi-dimensional data, whereas most of the existing work focuses on one- or two-dimensional data only. The proposed wavelet-based methodology is founded on a rigorous analysis of the dependence between wavelet coefficients, an important aspect of multi-dimensional wavelet de-noising. The wavelet de-noising rule is based on Stein's unbiased risk estimator (SURE), where the smoothness thresholds vary with the resolution level and orientation of the wavelet transform and are selected by controlling the False Discovery Rate of the significant wavelet coefficients. Second, this paper highlights the application of the wavelet methodology to multi-dimensional NMR data analysis for protein structure determination. The noise reduction method is general and applicable to multi-dimensional data arising in many other research fields, prominently in the biological sciences. Our empirical investigation shows that reducing the noise using the method in this paper results in more detectable true components and fewer false positives without altering the shape of the significant components.

1 Introduction

The objective of the research presented in this paper is to introduce a wavelet-based noise reduction method for enhanced component identification in multi-dimensional
mixture regression described by the model

Z_{i_1,\ldots,i_d} = \sum_{l=1}^{L} s(x_{i_1}, \ldots, x_{i_d}; \theta_l) + \sigma \epsilon_{i_1,\ldots,i_d}, \quad i_1 = 1, \ldots, M_1, \; \ldots, \; i_d = 1, \ldots, M_d \qquad (1)

where s(x_{i_1}, \ldots, x_{i_d}; \theta_l), with \theta_l = (A_l, w_l, \tau_l), is a regression component identifiable by its set of parameters \theta_l and observed over a set of equally spaced d-dimensional grid points (x_{i_1}, \ldots, x_{i_d}). We assume that the regression function s is bounded above zero, continuous and unimodal. Examples of such functions are the Gaussian and the Lorentzian, commonly used in modeling NMR data. The number of components L is large and unknown. One challenging statistical problem relevant to the multi-dimensional mixture regression model is component identification, i.e. estimation of L. Because the regression components are contaminated by noise and many components may be observed at the noise level, a preliminary step is to reduce the noise to enhance the estimation accuracy of the number of components L. Typically, data generated by the multi-dimensional mixture regression model (1) feature sharp changes, spatial inhomogeneity, signal sparsity and locally correlated noise. A common method for noise reduction, which overcomes these difficulties, is to filter the data using a multiscale bandpass filter or wavelet transform and to reduce the noise in the wavelet domain using a spatially adaptive method. Two common wavelet-based noise reduction methods are hard and soft thresholding, introduced by Donoho (1995) and Donoho and Johnstone (1995). In the past 15 years, a series of other noise reduction techniques have been explored. Jansen (2001) reviews the existing noise reduction methods, complemented by methodological and theoretical results. Antoniadis et al. (2001) present a comparison study of a series of de-noising methods applied to one-dimensional signals. Recent research in wavelet-based noise reduction explores wavelet coefficient shrinkage methods that take into account the intra-scale and inter-scale dependence of the wavelet coefficients.
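For concreteness, model (1) can be simulated directly. The sketch below (Python with NumPy) generates a d-dimensional mixture of unimodal components plus noise; the Gaussian component shape, the unit grid, and the particular parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def simulate_mixture(grid_shape, components, sigma, seed=0):
    """Simulate model (1): a sum of unimodal (here Gaussian) components
    observed on an equally spaced d-dimensional grid, plus Gaussian noise.
    `components` is a list of (amplitude A_l, center w_l, width tau_l)."""
    rng = np.random.default_rng(seed)
    axes = [np.linspace(0.0, 1.0, n) for n in grid_shape]
    mesh = np.meshgrid(*axes, indexing="ij")
    signal = np.zeros(grid_shape)
    for A, center, tau in components:
        # squared, width-scaled distance to the component center
        dist2 = sum(((x - c) / t) ** 2 for x, c, t in zip(mesh, center, tau))
        signal += A * np.exp(-0.5 * dist2)   # Gaussian component shape
    return signal + sigma * rng.standard_normal(grid_shape), signal

# One high-amplitude and one low-amplitude component on a 16x16x16 grid.
comps = [(50.0, (0.3, 0.3, 0.3), (0.1, 0.1, 0.1)),
         (12.0, (0.7, 0.6, 0.5), (0.08, 0.1, 0.12))]
noisy, clean = simulate_mixture((16, 16, 16), comps, sigma=5.0)
```

With a noise level comparable to the second component's amplitude, that component is barely distinguishable from the background, which is exactly the situation the de-noising step targets.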
Some of the early works on wavelet coefficient shrinkage using inter-scale dependence are by Hall et al. (1997) and Cai (1999), who introduce the concept of non-overlapping block thresholding. Other relevant research is by Shapiro (1993), Jansen (2001), Cai and Silverman (2001), Abramovich et al. (2002), Pizurica et al. (2002), Shen et al. (2002), Portilla et al. (2003), Autin (2008), Chaux et al. (2008), Cai and Zhou (2009) and the references therein. Although there are many competitive approaches to wavelet-based noise reduction, they have not yet been fully explored for more than two dimensions. Multi-dimensional wavelet noise reduction requires specific considerations, as the structure of the coefficients is highly complex: the intrinsic dependence between coefficients extends to both intra- and inter-resolution levels and orientations. Moreover, the signal may be highly sparse and the signal-to-noise structure may vary with the dimensionality. The methodology presented in this paper contributes to the research on wavelet-based noise reduction with a procedure that focuses on d-dimensional data with d > 2. The noise reduction method introduced in this paper complements the work on block thresholding by Cai (1999) and the methodological and theoretical considerations introduced by Johnstone and Silverman (1997), Cai and Silverman (2001) and Shen et al. (2002). In most of the work so far on block thresholding, the selection of the block composition is arbitrary to some extent. One exception is Cai and Zhou (2009), who introduce a non-overlapping block thresholding method allowing for varying, data-driven block sizes across resolution levels. This work considers intra-scale dependencies only and applies to one-dimensional data; for d-dimensional data, the problem of selecting the block size is equivalent to a d-dimensional optimization problem, since the block size will vary across dimensions. Although many research papers highlight the importance of incorporating intra- and inter-scale dependencies in estimating and de-noising wavelet coefficients, none presents theoretically founded explanations for selecting one block of coefficients over another. The wavelet-based methodology in this paper is founded on a rigorous analysis of the dependence between wavelet coefficients, an important aspect of multi-dimensional wavelet de-noising.
The emphasis of the multi-dimensional noise reduction method introduced in this paper is on enhanced component identification for the regression model in (1): more detectable components for a given number of data samples and fewer false-positive components. With this objective in mind, the overlapping-block shrinkage rule investigated in this paper is based on Stein's unbiased risk estimator (SURE), where the smoothness thresholds vary with the resolution level and orientation of the wavelet transform to optimally adapt to spatial inhomogeneities in multi-dimensional data generated by the regression model in (1). The smoothness thresholds are data-driven and selected by controlling the False Discovery Rate (FDR) of the significant wavelet coefficients, which, in turn, implies controlling the FDR of the regression components, the primary objective of this research. In contrast, in most of the existing block-thresholding methods (Cai, 1999; Cai and Silverman, 2001; Portilla et al., 2003; Autin, 2008; and Chaux et al., 2008), the smoothness thresholds are fixed across resolutions and orientations.

The statistical application investigated in this paper is pertinent to the study of three-dimensional protein structure determination using Nuclear Magnetic Resonance (NMR). In NMR data analysis for biomolecular studies, one primary objective is to estimate parameters (e.g. chemical shifts) of the atomic nuclei of a protein under study when the protein is magnetized using a strong magnetic field. Under protein magnetization, targeted atomic nuclei undergo energy transfers; each energy transfer induces a signal which is mathematically described by a decaying sinusoid. Therefore, the NMR signal generated by a d-dimensional NMR experiment is a sum of decaying sinusoids, commonly observed over equally spaced time points, plus error:

S(t_1, t_2, \ldots, t_d) = \sum_{l=1}^{L} A_l e^{i\phi_l} \prod_{s=1}^{d} e^{-t_s/\tau_{sl}} e^{i t_s w_{sl}} + \epsilon_{t_1,\ldots,t_d} \qquad (2)

where each sinusoid is generated by an energy transfer among d atomic nuclei in a d-dimensional NMR experiment (Hoch and Stern, 1996). The model parameters of interest are the resonance frequencies w_l = (w_{1l}, \ldots, w_{dl}) (translated into chemical shifts) and the signal amplitudes A_l (translated into structural distances of the atomic nuclei involved in the transfer of energy in specific NMR experiments). Here L is the number of observed energy transfers, which is large and unknown. The protein structure is resolved by accurately estimating the resonance frequencies and the signal amplitudes from data generated by NMR experiments. The traditional methodology in biomolecular NMR data analysis involves Fourier transformation (FT) of the NMR signal data, complemented by other pre-processing steps (Hoch and Stern, 1996).
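For intuition, a one-dimensional analogue of model (2) and its Fourier transformation can be sketched as follows (Python with NumPy). The sampling interval, decay constants, amplitudes and frequencies below are illustrative assumptions; the point is only that each decaying sinusoid in the time domain becomes a peak in the frequency domain.

```python
import numpy as np

def fid_1d(t, transfers):
    """One-dimensional analogue of model (2): each energy transfer l
    contributes A_l * exp(i*phi_l) * exp(-t/tau_l) * exp(i*t*w_l)."""
    s = np.zeros_like(t, dtype=complex)
    for A, phi, tau, w in transfers:
        s += A * np.exp(1j * phi) * np.exp(-t / tau) * np.exp(1j * t * w)
    return s

t = np.arange(512) * 0.01                  # equally spaced time points (s)
# Two transfers: resonances at 20 Hz and 35 Hz, equal decay times.
transfers = [(1.0, 0.0, 0.5, 2 * np.pi * 20.0),
             (0.5, 0.0, 0.5, 2 * np.pi * 35.0)]
spectrum = np.fft.fft(fid_1d(t, transfers))
freqs = np.fft.fftfreq(t.size, d=0.01)     # frequency axis in Hz
peak = freqs[np.argmax(np.abs(spectrum))]  # location of the tallest peak
```

The tallest spectral peak lands at the frequency of the highest-amplitude sinusoid, which after FT becomes one of the mixture components of model (1).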
After Fourier transformation, the resulting model is a d-dimensional mixture regression model as described by equation (1). In this model, the parameters of the l-th regression component are the location parameters w_l = (w_{1l}, \ldots, w_{dl}), which are the signal frequencies, the width parameters \tau_l = (\tau_{1l}, \ldots, \tau_{dl}), and the amplitude parameter A_l. In this paper, the regression components in model (1) are dubbed spectral components and their parameters are dubbed spectral parameters. Because of the one-to-one mapping between energy transfers and spectral components, the problem of identifying the parameters of the atomic nuclei undergoing energy transfers translates into accurately identifying and estimating the
spectral parameters. Spectral components and their parameters must be identified accurately for accurate prediction of the protein structure. In multi-dimensional NMR data, many of the spectral components will have low amplitude, and therefore they will be difficult to distinguish from the noise in the data. Detecting spectral components with amplitude only slightly above the noise level, without erroneously including noise components (false positives), is crucial for a robust and reliable component identification algorithm, which will lead to stable protein prediction (Herrmann et al., 2002). In certain cases, the absence of a small number of essential spectral components can lead to a significant deviation of the structure (Güntert, 2003). In NMR biomolecular studies, it is common practice to manually remove the noise or false positives and/or manually identify low-amplitude components. In this paper, the approach for overcoming this difficulty is to partially reduce the noise in the NMR data using a wavelet-based methodology. The NMR literature has already acknowledged the potential of wavelet-based noise reduction but has not fully explored the benefits of (wavelet-based) noise reduction for multi-dimensional NMR data (Trbovic et al., 2005; Dancea and Günther, 2005). Generally, the de-noising methodology applies to other research applications generating multi-dimensional data. One such example is breast computed tomography (CT), where the mixture components correspond to lesion or tumor masses. In this application, it is important to identify the mixture components or lesions to characterize their size and their distribution, providing information about the survival rate as well as whether the tumors are benign or malignant (Ning et al., 2004). One difficulty in this application is identification of small tumors against a noisy background.
To overcome this difficulty, one approach is to reduce the noise using a spatially adaptive method, similar to the NMR component identification problem. Zhong et al. (2004) applied wavelet-based de-noising to two-dimensional CT data. However, their method does not allow for spatial adaptivity to the signal inhomogeneity and sparsity that characterize CT data. Because three-dimensional CT data are increasingly used, multi-dimensional noise reduction methods will be key to enhanced detection of tumor masses using CT technology. In this paper, we first introduce the noise reduction method for multi-dimensional data in Section 2, then apply the proposed methods to three- and four-dimensional synthetic examples in Section 3 and to three-dimensional NMR data from two different experiments in Section 4. We conclude with a discussion and further considerations
on the methodology presented in this paper.

2 Noise Reduction in Multi-dimensional Data

In the underlying model described in (1), Z_{i_1,\ldots,i_d} are observed intensities in a d-dimensional space and the errors \epsilon_{i_1,\ldots,i_d} are assumed to be additive, normally distributed but locally correlated. These are common assumptions in NMR data analysis (Hoch and Stern, 1996; Grage and Akke, 2003). The noise reduction method is a three-step procedure:

1. Decomposition of the multi-dimensional data using a wavelet basis;

2. De-noising the coefficients in the wavelet domain; and

3. Reconstruction based on the wavelet coefficients de-noised at step (2).

In this section, we first describe the noise reduction method, including a discussion of the intra-resolution and inter-resolution dependence between coefficients and the definition of the block of influence for a wavelet coefficient using these dependence relationships. Using a block-based statistic called the cumulative influence statistic, we further define the shrinkage, or de-noising, rule for the wavelet coefficients and propose a method for deriving the shrinkage level. In the final subsection, we introduce a wavelet-based method for component identification.

2.1 Method Description

In this study, we apply a separable or tensor-product orthogonal wavelet transform (Section 7 in Mallat, 1998 and Section 5 in Vidakovic, 1999), which maps the observed intensities to the wavelet-domain coefficients

\{Z_{i_1 \ldots i_d}\} \longrightarrow \{\bar\alpha_{i_1 \ldots i_d}\} \text{ (coarse coefficients)}, \quad \{\bar\beta^{j,m}_{i_1 \ldots i_d}\} \text{ (detail coefficients)}

where j = J_0, \ldots, J indexes the resolution level, m = 1, \ldots, 2^d - 1 indexes the orientation, and (i_1, \ldots, i_d), with i_1 = 1, \ldots, 2^j; \ldots; i_d = 1, \ldots, 2^j, are grid locations at resolution level j. A wavelet basis with a small number of vanishing moments is used to capture local regularities with a small Lipschitz exponent (see Mallat and Hwang, 1992 and the references therein).
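The three-step procedure can be sketched end to end with a single-level separable Haar transform (Python with NumPy). The Haar basis, the hard-thresholding stand-in for the paper's block shrinkage rule, and the MAD-based noise estimate are simplifying assumptions for illustration; the paper itself uses Daubechies-family wavelets and the block rule developed below.

```python
import numpy as np

def haar_step(x, axis):
    """One Haar analysis step along `axis`: pairwise averages (coarse)
    and differences (detail), orthonormally scaled."""
    a = x.take(np.arange(0, x.shape[axis], 2), axis=axis)
    b = x.take(np.arange(1, x.shape[axis], 2), axis=axis)
    return (a + b) / np.sqrt(2), (a - b) / np.sqrt(2)

def haar_inv(c, d, axis):
    """Invert one Haar step along `axis` by interleaving even/odd samples."""
    a, b = (c + d) / np.sqrt(2), (c - d) / np.sqrt(2)
    shape = list(c.shape)
    shape[axis] *= 2
    out = np.empty(shape)
    even = [slice(None)] * len(shape); even[axis] = slice(0, None, 2)
    odd = [slice(None)] * len(shape); odd[axis] = slice(1, None, 2)
    out[tuple(even)], out[tuple(odd)] = a, b
    return out

def dwt_nd(x):
    """Step 1: separable single-level transform. Subbands are keyed by an
    orientation tuple with one 0 (coarse) or 1 (detail) entry per axis."""
    bands = {(): x}
    for axis in range(x.ndim):
        nxt = {}
        for key, arr in bands.items():
            c, d = haar_step(arr, axis)
            nxt[key + (0,)], nxt[key + (1,)] = c, d
        bands = nxt
    return bands

def idwt_nd(bands):
    """Step 3: reconstruction, inverting the axis sweeps in reverse order."""
    ndim = len(next(iter(bands)))
    for axis in reversed(range(ndim)):
        merged = {}
        for key in {k[:-1] for k in bands}:
            merged[key] = haar_inv(bands[key + (0,)], bands[key + (1,)], axis)
        bands = merged
    return bands[()]

def denoise(noisy):
    """Steps 1-3: decompose, shrink the detail subbands, reconstruct.
    Hard thresholding at 3*sigma_hat per subband stands in for the block rule."""
    bands = dwt_nd(noisy)
    for key, arr in bands.items():
        if any(key):                                   # detail subbands only
            sigma = np.median(np.abs(arr)) / 0.6745    # MAD noise estimate
            bands[key] = np.where(np.abs(arr) > 3 * sigma, arr, 0.0)
    return idwt_nd(bands)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 8))
assert np.allclose(idwt_nd(dwt_nd(x)), x)   # perfect reconstruction check
smoothed = denoise(x)
```

For d = 3 this produces 2^3 - 1 = 7 detail orientations per level, which is exactly the indexing (j, m) used throughout this section.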
Daubechies and Symmlet (least asymmetric Daubechies) wavelets are the common choice for capturing the regularities described by spectral components in NMR data because they have a support of minimum size for a given number of vanishing moments (Daubechies, 1992). One key advantage of performing noise reduction in the wavelet domain rather than directly in the original domain is that the wavelet coefficients are approximately uncorrelated although the original data may be highly correlated. Johnstone and Silverman (1997) and Jansen (2001, Lemma 5.1) showed that for one-dimensional signals, if the noise in the data is stationary and correlated, then the variance of the wavelet coefficients will depend on the resolution level but otherwise the coefficients are approximately uncorrelated. These results can be extended to multi-dimensional data with one modification: the variance of the wavelet coefficients will depend on both the resolution level and the orientation. Similar to Johnstone and Silverman (1997), in this research the detail coefficients are normalized,

\bar\beta^{j,m}_{i_1 \ldots i_d} \longrightarrow \bar\beta^{j,m}_{i_1 \ldots i_d} / \hat\sigma_{j,m},

where \hat\sigma^2_{j,m} is the estimated variance within resolution j and orientation m; therefore, we assume that the variance of the normalized coefficients is one (V(\bar\beta^{j,m}) = 1). We estimate the resolution- and orientation-dependent variances using the median absolute deviation estimator.

Block-Level Shrinkage

A series of existing noise reduction methods use the idea of wavelet coefficient shrinkage using information from neighboring coefficients within the same resolution and across resolutions. For example, most block-thresholding methods use information between coefficients within the same resolution (Hall et al., 1997; Cai, 1999; Cai and Silverman, 2001; Cai and Zhou, 2009) and some across resolution levels (Chaux et al., 2008 and Portilla et al., 2003). Other noise reduction methods define inter-resolution dependence on the basis of a tree of wavelet coefficients (Jansen, 2001, Section 5.3.1; Autin, 2008). Most of the wavelet de-noising literature has focused on one- and two-dimensional applications only.
However, one- and two-dimensional models overlook the challenges arising in multi-dimensional data. The number of intra- and inter-resolution neighboring coefficients increases exponentially with the dimensionality, and the dependence relationships vary not only with the resolution level but also
with the orientation. In this paper, we investigate a series of significance and insignificance relationships between wavelet coefficients and define a block of influence for each coefficient based on these relationships, as described in this subsection.

Block of Influence. We define the block of influence of a wavelet coefficient based on significance and insignificance relationships with its neighboring coefficients. We say there is a significance relationship when the significance of a set of specific neighbors implies that a coefficient is also significant. We say there is an insignificance relationship when the insignificance of a coefficient implies the insignificance of specific neighboring coefficients. Using these relationships, we provide a theoretical basis for selecting the block of influence under the multi-dimensional mixture regression model. Assuming the model defined in (1), we show in this paper that there is a significance relationship between a coefficient and its immediate neighbors from the same resolution and the same orientation, stated in part (1) of Proposition 1. There is also a significance relationship between a coefficient and the coefficients corresponding to the same location and the same resolution but from different orientations. This relationship is stated in part (2) of Proposition 1. Finally, there is an insignificance relationship between a coefficient and the same-location coefficients from the same orientation but the neighboring descendant resolution, stated in part (3) of Proposition 1.

Proposition 1. Under a series of assumptions for the mixture regression model (1) described in the Appendix, the following significance and insignificance relationships hold:

1. \bar\beta^{j,m}_{i_1 \pm 1, \ldots, i_d} > 0, \; \ldots, \; \bar\beta^{j,m}_{i_1, \ldots, i_d \pm 1} > 0 \implies \bar\beta^{j,m}_{i_1, \ldots, i_d} > 0; \qquad (3)

2. For d \geq 3: \bar\beta^{j,m'}_{i_1, \ldots, i_d} > 0 \text{ for } m' = 1, \ldots, 2^d - 1, \; m' \neq m \implies \bar\beta^{j,m}_{i_1, \ldots, i_d} > 0; \qquad (4)

3. \bar\beta^{j,m}_{i_1, \ldots, i_d} = 0 \implies \bar\beta^{j+1,m}_{2i_1, \ldots, 2i_d} = 0, \; \bar\beta^{j+1,m}_{2i_1+1, \ldots, 2i_d} = 0, \; \ldots, \; \bar\beta^{j+1,m}_{2i_1+1, \ldots, 2i_d+1} = 0. \qquad (5)

The insignificance relationship in (5) holds even for higher descendant resolution levels, but in this research we will only include the coefficients from the descendant resolution j+1 in the block of influence. In Appendix A, we prove the significance and insignificance relationships described in this proposition under specific assumptions about the shape of the components s in the mixture regression model described in (1). One has to bear in mind that, in practice, a significance relationship is when the
source coefficients are significantly away from zero and an insignificance relationship is when the source coefficients are approximately zero due to the presence of noise.

Based on the significance and insignificance relationships described in Proposition 1, the block of influence is defined based on three sources: (a) the coefficient itself (\bar\beta^{j,m}_{i_1 \ldots i_d}); (b) the immediate neighboring coefficients from the same resolution level and the same orientation (\bar\beta^{j,m}_{i_1 \pm 1, \ldots, i_d}, \ldots, \bar\beta^{j,m}_{i_1, \ldots, i_d \pm 1}); and (c) the same-location coefficients from the same orientation but the neighboring descendant resolution (\bar\beta^{j+1,m}_{2i_1, \ldots, 2i_d}, \ldots, \bar\beta^{j+1,m}_{2i_1+1, \ldots, 2i_d+1}). The source of influence in (b) is a significance source (part 1 of Proposition 1) and the source of influence in (c) is an insignificance source (part 3 of Proposition 1). We use the two sources of energy in (b) and (c) to balance the significance and insignificance influence from the intra- and inter-scale neighboring coefficients. For multi-dimensional data with d \geq 3, we may replace the significance source (b) with the wavelet coefficients from the same resolution, the same location but different orientations (\bar\beta^{j,m'} with m' \neq m), since this is also a significance source (Proposition 1, part 2). Shen et al. (2002) proposed a similar block composition for two-dimensional data. The block defined in their approach includes all three sources of energy in Proposition 1. That is, it includes two significance sources and one insignificance source. With two sources of significance, the block-shrinkage rule will be biased towards significant coefficients. Moreover, for two-dimensional data, the significance relationship coming from coefficients from different orientations does not hold under the modeling assumptions stated in the Appendix.
We define the average influence measures for the three sources of a wavelet coefficient at resolution j, orientation m and grid location (i_1, \ldots, i_d) in the following equations:

(S^{j,m}_{i_1 \ldots i_d})_1 = (\bar\beta^{j,m}_{i_1, \ldots, i_d})^2

(S^{j,m}_{i_1 \ldots i_d})_2 = \frac{1}{2d} \left( (\bar\beta^{j,m}_{i_1-1, \ldots, i_d})^2 + (\bar\beta^{j,m}_{i_1+1, \ldots, i_d})^2 + \ldots + (\bar\beta^{j,m}_{i_1, \ldots, i_d+1})^2 \right)

(S^{j,m}_{i_1 \ldots i_d})_3 = \frac{1}{2^d} \left( (\bar\beta^{j+1,m}_{2i_1, \ldots, 2i_d})^2 + (\bar\beta^{j+1,m}_{2i_1+1, \ldots, 2i_d})^2 + \ldots + (\bar\beta^{j+1,m}_{2i_1+1, \ldots, 2i_d+1})^2 \right)

and the cumulative influence statistic as

S^{j,m}_{i_1 \ldots i_d} = (S^{j,m}_{i_1 \ldots i_d})_1 + (S^{j,m}_{i_1 \ldots i_d})_2 + (S^{j,m}_{i_1 \ldots i_d})_3. \qquad (6)

The cumulative energetic influence balances the average influence from both intra-
and inter-scale dependence sources, as supported by Proposition 1.

Shrinkage rule. We estimate a coefficient using a shrinkage rule based on the James-Stein estimator,

\hat\beta^{j,m}_{i_1 \ldots i_d} = \left( 1 - \frac{L^{j,m}}{S^{j,m}_{i_1 \ldots i_d}} \right)_+ \bar\beta^{j,m}_{i_1 \ldots i_d}, \qquad (7)

where L^{j,m} is a resolution- and orientation-dependent shrinkage level. The James-Stein estimator has been previously used to shrink/estimate the detail wavelet coefficients in soft thresholding (Donoho and Johnstone, 1995) and block thresholding for one-dimensional data (Cai, 1999) and for two-dimensional data (Chaux et al., 2008). In the method introduced in this paper, the cumulative influence statistic S^{j,m}_{i_1 \ldots i_d} accounts for three sources of energy influence, in contrast to soft thresholding, which uses source (a) (the coefficient itself), and Cai's block thresholding, which uses sources (a) and (b) (the intra-scale neighboring coefficients). Chaux et al. (2008) introduce a generalized non-overlapping thresholding method which allows for any non-overlapping block composition. In addition, the existing block-thresholding methods (Cai, 1999; Cai and Zhou, 2009; Chaux et al., 2008) shrink the coefficients in non-overlapping blocks. Portilla et al. (2003) point out that non-overlapping block thresholding leads to noticeable de-noising artifacts at the discontinuities introduced by the block boundaries. In our simulation study, we compare the non-overlapping block thresholding method of Cai and Zhou (2009), extended to multi-dimensional data, to the moving-block thresholding introduced in this paper. By individually shrinking the wavelet coefficients, we will show that the mean squared error improves without additionally altering the signal components.

Wavelet Coefficient Shrinkage Level

The shrinkage rule in equation (7) is a function of a shrinkage level L^{j,m}. We allow for resolution- and orientation-dependent levels to adapt to spatial inhomogeneity and signal sparsity.
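A minimal sketch of the cumulative influence statistic (6) and the James-Stein rule (7), with the shrinkage level L^{j,m} passed in as a plain number, follows (Python with NumPy). The toy subband arrays and the boundary handling (averaging only over neighbours that actually exist) are illustrative assumptions.

```python
import numpy as np

def cumulative_influence(beta_j, beta_j1, idx):
    """Statistic (6): own energy (a), average energy of the 2d same-scale
    neighbours (b), and average energy of the 2^d descendants (c)."""
    d = beta_j.ndim
    s1 = beta_j[idx] ** 2
    neigh = []
    for axis in range(d):                   # source (b): same scale/orientation
        for step in (-1, 1):
            j = list(idx)
            j[axis] += step
            if 0 <= j[axis] < beta_j.shape[axis]:
                neigh.append(beta_j[tuple(j)] ** 2)
    s2 = np.mean(neigh)
    # source (c): the 2^d children of idx at resolution j+1
    children = beta_j1[tuple(slice(2 * i, 2 * i + 2) for i in idx)]
    s3 = np.mean(children ** 2)
    return s1 + s2 + s3

def js_shrink(beta, S, level):
    """Rule (7): scale the coefficient by the positive part of (1 - L/S)."""
    return max(1.0 - level / S, 0.0) * beta

beta_j = np.arange(27, dtype=float).reshape(3, 3, 3)   # toy subband at level j
beta_j1 = np.ones((6, 6, 6))                           # toy descendant subband
S = cumulative_influence(beta_j, beta_j1, (1, 1, 1))
shrunk = js_shrink(beta_j[1, 1, 1], S, level=100.0)
```

Because S pools energy from the block of influence, an isolated noise spike (large coefficient, quiet neighbourhood) is shrunk aggressively, while a coefficient inside a genuine component survives.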
The shrinkage levels L^{j,m} are evaluated assuming independent multivariate estimation problems across resolution levels and orientations, similar to current wavelet-based noise reduction methods which allow for resolution-varying thresholding levels (see Jansen, 2001 for a review). In this context, L^{j,m} is a smoothing parameter for resolution level j and orientation m, controlling the trade-off between fitting and smoothing of the wavelet coefficients, where smoothing is measured using the L_2 norm under the shrinkage rule defined in (7). We will probably not discover many
components when the L^{j,m}'s are large. As we decrease L^{j,m}, we can potentially discover more components at the price of less noise smoothing. Consequently, obtaining L^{j,m} is important to maximize the number of true discoveries with the inclusion of only a small number of false positives and without distorting the shape of the components.

The shrinkage level L^{j,m} acts as a significance threshold for the cumulative influence statistics S^{j,m}_{i_1, \ldots, i_d} at resolution level j and orientation m. That is, when S^{j,m}_{i_1, \ldots, i_d} > L^{j,m}, the cumulative influence statistic is assigned to be significant. To evaluate L^{j,m}, we assume the statistics S^{j,m}_{i_1, \ldots, i_d} = X^{j,m}(s_{i_1 \ldots i_d}) are observed from a d-dimensional random field X^{j,m}(s); because the average influence measures are computed based on moving blocks, there is an underlying known spatial dependence between the S^{j,m}_{i_1, \ldots, i_d}. That is, we observe the random field X^{j,m}(s) over a regular grid of points within some closed set S^j, with mean E(X^{j,m}(s)) = \mu^{j,m}(s) and covariance surface C\{X^{j,m}(s), X^{j,m}(s')\} = C^j(s, s'). The covariance surface C^j(s, s') is known and second-order stationary. The covariance structure varies with the resolution level j and is given by

C(S^{j,m}_{i_1, \ldots, i_d}, S^{j,m}_{k_1, \ldots, k_d}) = \begin{cases} \sigma^2/d & \|i - k\|^2 = 1 \\ \sigma^2/(2d^2) & \|i - k\|^2 = 2 \\ \sigma^2/(4d^2) & \|i - k\|^2 = 4 \end{cases}

where \|i - k\|^2 = (i_1 - k_1)^2 + \ldots + (i_d - k_d)^2.

Methodological and theoretical statistical research for identifying global significance thresholds for random fields has been applied to imaging and astrophysics. A few representative references are Siegmund and Worsley (1995); Cao and Worsley (1999); Hopkins et al. (2001); and Pacifico et al. (2004).

Shrinkage level via hypothesis testing. Similar to Pacifico et al. (2004), we determine the shrinkage level for \bar\beta^{j,m}_{i_1 \ldots i_d}, or equivalently the significance threshold for S^{j,m}_{i_1 \ldots i_d}, using hypothesis testing. The hypothesis test is

H_0(X) : \mu^{j,m}(s) = \mu^{j,m}_0 \quad \text{vs} \quad H_A(X) : \mu^{j,m}(s) > \mu^{j,m}_0.

Denote by S^{j,m}_0 = \{s \in S^j : \mu^{j,m}(s) = \mu^{j,m}_0\} the null set.
The shrinkage level is obtained by controlling the false discovery rate

\Gamma^{j,m}(L) = \frac{\lambda(R^{j,m}_L \cap S^{j,m}_0)}{\lambda(R^{j,m}_L)}
where R^{j,m}_L = \{s \in S^j : X^{j,m}(s) > L\} (the alternative set) and \lambda is a probability measure. Consequently, select L^{j,m} such that

L^{j,m} = \operatorname{arginf}_L \{L : \Gamma^{j,m}(L) \leq \epsilon\}, \qquad (8)

where \epsilon is a tolerance level for the error rate. However, in the formulation above, we do not have the null set S^{j,m}_0, and therefore we derive the shrinkage level by controlling an estimated error rate \hat\Gamma^{j,m}(L), which is obtained by replacing the null set S^{j,m}_0 with an estimated superset.

Estimation of the shrinkage level. Following Pacifico et al. (2004), we estimate a superset of S^{j,m}_0, called U^{j,m}, by testing, for all subsets A \subseteq S^j in the sample space,

H_{0,j,m} : A \subseteq S^{j,m}_0 \quad \text{vs} \quad H_{A,j,m} : A \not\subseteq S^{j,m}_0.

Based on this testing procedure, the superset U^{j,m} is the union of all subsets A for which the null hypothesis H_{0,j,m} is not rejected at the significance level \alpha. Because the signal information at a resolution level j is partitioned across 2^d - 1 orientations, we need to correct for multiplicity in simultaneous inference within resolution level j. It follows that

Pr\{U^{j,m} \supseteq S^{j,m}_0, \; m = 1, \ldots, 2^d - 1\} \geq 1 - \alpha, \quad j = J_0, \ldots, J - 1. \qquad (9)

In this paper, we obtain the probability values by simulating from the null distribution, assuming that the wavelet coefficients are normally distributed with a known dependence structure. Using the estimated superset U^{j,m}, the estimated false discovery rate becomes

\hat\Gamma^{j,m}(L) = \frac{\lambda(R^{j,m}_L \cap U^{j,m})}{\lambda(R^{j,m}_L)}.

We therefore obtain the shrinkage level L^{j,m} by controlling \hat\Gamma^{j,m}(L). That is,

L^{j,m} = \operatorname{arginf}_L \{L : \hat\Gamma^{j,m}(L) \leq \epsilon\}.

The condition in (9) and the inequality \hat\Gamma^{j,m}(L) \geq \Gamma^{j,m}(L) imply that

Pr\{\Gamma^{j,m}(L^{j,m}) \leq \epsilon, \; m = 1, \ldots, 2^d - 1\} \geq 1 - \alpha, \quad j = J_0, \ldots, J - 1. \qquad (10)
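The estimated-FDR criterion above reduces to a simple search over candidate levels. The sketch below (Python with NumPy) represents the estimated null superset U^{j,m} as a boolean mask over grid locations and searches the observed values of the statistic; both of these are illustrative simplifications of the paper's random-field formulation.

```python
import numpy as np

def shrinkage_level(S, null_mask, eps):
    """Return the smallest candidate level L with estimated FDR <= eps,
    where the estimated FDR at L is the fraction of exceedances {S > L}
    that fall inside the null superset (given here by `null_mask`)."""
    S = np.asarray(S, dtype=float)
    null_mask = np.asarray(null_mask, dtype=bool)
    for L in np.sort(np.unique(S)):         # candidate levels, ascending
        exceed = S > L
        n_exceed = exceed.sum()
        if n_exceed == 0:
            return L                        # nothing exceeds: trivially controlled
        if exceed[null_mask].sum() / n_exceed <= eps:
            return L
    return S.max()

# Toy example: three statistics from the null superset, three from signal.
S = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
null_mask = np.array([True, True, True, False, False, False])
L = shrinkage_level(S, null_mask, eps=0.1)
```

Raising eps lowers the selected level, admitting more coefficients (and more false positives), which mirrors the trade-off discussed for L^{j,m} above.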
In all our examples in the next sections, we used \alpha = 0.05 and \epsilon = 0.1. That is, we allow for 10% false-positive coefficients at a confidence level of 0.95. The number of detectable components is more sensitive to the confidence level than to the tolerance level \epsilon.

Interpretation of the Error Rate. The hypothesis testing procedure discussed above is used to identify an optimal shrinkage level L^{j,m} for the wavelet coefficients at resolution j and orientation m. We interpret the error rate criterion in (10) as the proportion of false discoveries, where a discovery is a significant coefficient defined by the alternative hypothesis H_A(X); this proportion is smaller than \epsilon with probability 1 - \alpha within a fixed resolution level j. The choice of \epsilon depends on the problem at hand. For example, when \epsilon is close to zero, we allow for a small proportion of false positives but we may also fail to identify components close to the noise level.

3 Simulation Studies

In this simulation study, we compare the multi-dimensional de-noising method introduced in this paper to three common thresholding methods: hard thresholding (Donoho and Johnstone, 1994), soft thresholding (Donoho and Johnstone, 1995) and non-overlapping block thresholding (Cai and Zhou, 2009). For multi-dimensional wavelet de-noising, varying the block size as suggested by Cai and Zhou (2009) requires solving a d-dimensional optimization problem. Because of this difficulty, in this simulation study, the block size is fixed and approximately equal to the log of the number of grid points (log(M_k)) for each dimension k = 1, \ldots, d, as suggested by Cai (1999) and Cai and Silverman (2001) for one-dimensional data. For all three comparison methods, the shrinkage levels vary with both resolution and orientation and are selected to minimize the James-Stein risk estimator.
We compare these four methods not only by evaluating the mean squared errors but also by investigating how well they perform in terms of component identification enhancement.

Simulation Settings. To evaluate these methods in the context of our application, we simulate data following the general model for multi-dimensional NMR data described in (1). We simulate data in three and four dimensions following this model, where the function s is assumed to be a Lorentzian, the shape function commonly assumed in biomolecular NMR data analysis (Hoch and Stern, 1996). Therefore, in
our simulation study, we assume that the mixture regression function in (1) is

f(x_1, \ldots, x_d) = \sum_{l=1}^{L} \frac{A_l / \prod_{s=1}^{d} \tau_{sl}}{\prod_{s=1}^{d} \left( (x_s - \omega_{sl})^2/\tau_{sl}^2 + 1 \right)}. \qquad (11)

The parameters of the simulation model are as follows. The amplitudes A_l, l = 1, \ldots, L vary in the interval [10, 100] and the noise standard error varies: \sigma = 10, 15 and 20. In this example, the number of Lorentzian components is L = 500 on a grid of points for d = 3. We simulate the error term from a multivariate normal with local dependency provided by an autoregressive process of lag four. For each noise level \sigma, we compare the four methods based on one simulated data set.

Separation of the insignificant and significant wavelet coefficients. In order to motivate the use of neighboring wavelet coefficients in separating significant coefficients from insignificant ones, as well as in their estimation, we investigate the separation between the density functions of the insignificant (corresponding to signal-free locations on the boundaries) and significant wavelet coefficients with respect to two statistics: the coefficient magnitudes and the cumulative energetic influence defined in (6). Figure 1 shows the log-scale density functions for one orientation only and for the wavelet coefficients at the finest resolution. For the magnitude statistic, the density functions overlap considerably, whereas for the cumulative energetic influence statistic there is a clear separation: the wider the separation, the more effective the shrinkage rule is. It is important to note that the density separation for the energetic influence statistic improves for lower noise levels and at higher dimensionality, as more information is incorporated in the cumulative energetic influence statistic. Therefore, the significance test discussed in Section 2.1 will improve in power and the error rate will be less conservative for high dimensionality.

Mean squared error (MSE).
We first compare the MSE for the method introduced in this paper and for the three comparative thresholding methods. In this simulation study, moving-block thresholding with ɛ and α varying in [0.01, 0.1] outperforms all three comparative methods in terms of MSE. The results reported in Table 1 are averages over 10 repeated simulations. We have also computed the mean squared error for other block compositions. The block replacing the significance source of influence with the coefficients from different orientations (Proposition 1, part 2)
performs similarly to the block defined in this paper. On the other hand, the block including both sources of significance yielded a higher MSE. The reduction in MSE of the proposed block-thresholding method over the best performing comparison method (non-overlapping block thresholding) diminishes at lower signal levels. Therefore, at very low signal-to-noise ratios, the two block-thresholding methods will perform similarly.

Figure 1: Density functions for the energetic influence statistic (left plots) and the magnitude statistic (right plots), separated for significant and insignificant coefficients. Upper plots are for three-dimensional synthetic data and lower plots for four-dimensional synthetic data.

Table 1: Mean squared errors for the noised data (first column) and for de-noised data using hard-thresholding, soft-thresholding, non-overlapping block thresholding and the moving-block method introduced in this paper (ɛ = .01 and α = .05), for σ = 10, 15 and 20.
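As a rough illustration of this simulation setup, the sketch below evaluates the Lorentzian mixture of Eq. (11) on a small three-dimensional grid, adds Gaussian noise (the paper's noise is AR(4)-correlated, which we simplify to i.i.d. here), and computes the MSE of naive hard and soft thresholding applied directly to the noisy intensities rather than to wavelet coefficients. All names, grid sizes and parameter values are illustrative, not taken from the paper.

```python
import math
import random

def lorentzian_mixture(x, centers, widths, amps):
    """Evaluate the mixture of Eq. (11): sum over components l of
    A_l / [prod_s tau_sl * prod_s ((x_s - omega_sl)^2 / tau_sl^2 + 1)]."""
    total = 0.0
    for omega, tau, A in zip(centers, widths, amps):
        denom = 1.0
        for t in tau:
            denom *= t                                  # prod_s tau_sl
        for xs, w, t in zip(x, omega, tau):
            denom *= (xs - w) ** 2 / t ** 2 + 1.0       # Lorentzian factor per axis
        total += A / denom
    return total

def hard_threshold(v, t):
    """Keep values with |v| > t, zero the rest."""
    return v if abs(v) > t else 0.0

def soft_threshold(v, t):
    """Shrink values toward zero by t, preserving sign."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Illustrative setup: one Lorentzian component on an 8 x 8 x 8 grid (d = 3).
centers, widths, amps = [(4, 4, 4)], [(1.5, 1.5, 1.5)], [50.0]
pts = [(i, j, k) for i in range(8) for j in range(8) for k in range(8)]
truth = [lorentzian_mixture(p, centers, widths, amps) for p in pts]
rng = random.Random(0)
noisy = [f + rng.gauss(0.0, 2.0) for f in truth]

t = 4.0  # illustrative threshold
mse_hard = mse([hard_threshold(v, t) for v in noisy], truth)
mse_soft = mse([soft_threshold(v, t) for v in noisy], truth)
```

In the paper the thresholding is of course applied in the wavelet domain with level- and orientation-dependent thresholds; the sketch only fixes the shape of the two shrinkage rules being compared.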
Number of Components. To highlight the benefits of noise reduction for enhanced estimation of the number of spectral components L, we analyze the performance of a commonly employed component identification method, which detects local maxima above a fixed threshold. A local maximum corresponds to a location (x_{i_1}, \ldots, x_{i_d}) with intensity value Z_{i_1, \ldots, i_d} larger than the immediate neighboring intensities. Hence the local maxima identified with this method are initial estimates for the spectral components. This component identification method has been implemented in most of the existing commercial and non-commercial software packages for NMR data analysis (Gronwald and Kalbitzer, 2004; Güntert, 2003). We apply this component identification method complemented by a test for component identifiability introduced in Serban (2007). The noise level for the simulated data is σ = 15. Since for simulated data the locations of the spectral components (w_l, l = 1, \ldots, L) are known, we evaluate the performance of the component identification method using the false discovery rate (FDR) and the false negative rate (FNR). The FDR at a threshold T is computed as the number of false positives (local maxima which do not correspond to locations of true spectral components) divided by the total number of local maxima discovered up to the threshold T. The FNR at a threshold T is computed as the number of false negatives (undetected true spectral components) divided by the sum of false negatives and true positives up to the threshold T. In Figure 2, the FDR (left plot) and FNR (right plot) are compared for simulated data without de-noising (black solid line) and with de-noising (colored solid lines). On one hand, the FDR computed for noisy data is large whereas the FNR is zero. This is the extreme case, since a large number of false positives are introduced in order to detect all the spectral components.
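A minimal sketch of this local-maxima detector and of the FDR/FNR computation follows; the helper names are ours, and the paper's implementation additionally applies the identifiability test of Serban (2007), which we omit.

```python
def local_maxima(z, T):
    """Grid locations whose intensity exceeds T and all 26 immediate
    neighbors; z maps (i, j, k) -> intensity."""
    offsets = [(di, dj, dk)
               for di in (-1, 0, 1) for dj in (-1, 0, 1) for dk in (-1, 0, 1)
               if (di, dj, dk) != (0, 0, 0)]
    peaks = []
    for (i, j, k), v in z.items():
        if v > T and all(v > z.get((i + di, j + dj, k + dk), float("-inf"))
                         for di, dj, dk in offsets):
            peaks.append((i, j, k))
    return peaks

def fdr_fnr(detected, true_locs):
    """FDR: false peaks over all detected peaks.
    FNR: missed true peaks over all true peaks."""
    detected, true_locs = set(detected), set(true_locs)
    fp = len(detected - true_locs)   # detected but not a true component
    fn = len(true_locs - detected)   # true component not detected
    tp = len(detected & true_locs)
    fdr = fp / len(detected) if detected else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    return fdr, fnr
```

In practice detected peaks are matched to true component locations within a frequency tolerance rather than by exact equality; exact matching keeps the sketch short.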
On the other hand, the four de-noising methods perform differently with respect to FDR and FNR. For example, the soft-thresholding method has a high FNR over all thresholds and hard thresholding has a high FDR. The two block-thresholding methods perform similarly in terms of FDR, but the block-thresholding method in this paper has a lower FNR at higher thresholds, implying less component shape alteration or smoothing. This finding is supported by a lower estimation error for the amplitude parameters when the block-thresholding method introduced in this paper is employed, as presented in the next result. Parameter Estimation. One primary objective of the analysis of NMR biomolecular data is to obtain estimates of the frequency parameters (w_l, l =
1, \ldots, L) and the amplitude parameters (A_l, l = 1, \ldots, L). Since de-noising methods not only reduce the noise in the data but also smooth out signal components, the parameter estimates, more specifically the amplitudes, may be altered. To evaluate whether the amplitude estimates are altered after de-noising, we applied an initial estimation method to de-noised data. We compare the amplitude estimates for the two block de-noising methods (non-overlapping block thresholding with SURE shrinkage levels and moving-block thresholding). The amplitude error estimates are presented in Figure 3. The smaller error rates for the moving-block thresholding method imply less altered amplitude estimates than for non-overlapping block thresholding.

Figure 2: False Discovery Rate (left) and False Negative Rate (right) resulting from component identification applied to noisy data and to de-noised data using four different methods: hard-thresholding (blue), soft-thresholding (red), non-overlapping block thresholding (green) and moving-block thresholding (purple).

Component Identification in the Wavelet Domain. Lastly, we evaluate the Lipschitz exponent estimator introduced by Pizurica et al. (2002), which measures the local signal regularity characterized by the decay of the wavelet transform amplitude across resolution levels. The Lipschitz exponent estimator can be used to identify regression components for the mixture regression model in (1). An exponent larger than 1 corresponds to a significant regression component whereas an exponent close to zero corresponds to a noise component. To investigate the benefits of de-noising in component identification, we plot the Lipschitz exponent estimates for insignificant and significant wavelet coefficients.
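The decay-rate idea behind such an estimator can be sketched as a least-squares slope of log2 coefficient magnitude against resolution level; this is only the decay part, and it omits the dimension-dependent normalization used by Pizurica et al. (2002), so treat it as a simplified stand-in rather than their estimator.

```python
import math

def decay_slope(mags):
    """Least-squares slope of log2|w_j| against resolution level j = 0..J-1.
    Magnitudes that grow geometrically across levels (large positive slope)
    indicate a regular, signal-like local feature; flat or erratic decay
    indicates noise."""
    xs = range(len(mags))
    ys = [math.log2(m) for m in mags]
    n = len(mags)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den
```

For example, magnitudes doubling at each level give slope 1, consistent with the rule of thumb that exponents above 1 flag significant components.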
Furthermore, we compare the density functions of the estimated exponents for the de-noised and noised wavelet coefficients, for both three-dimensional and four-dimensional simulated data. The
separation improves significantly for the de-noised coefficients, as shown in Figure 4. This supports the use of wavelet-based noise reduction as a preliminary step in component identification.

Figure 3: Amplitude error estimates for simulated data: blue point-down triangles are error estimates after de-noising using the block-thresholding method in this paper and black point-up triangles are error estimates after de-noising using non-overlapping block thresholding with SURE shrinkage.

4 3D NMR Frequency Data

The experimental data explored in this study are for a doubly-labeled sample of a 130-residue RNA binding protein - rho130 - using standard triple resonance experiments on a 1 mM protein sample at a proton frequency of 600 MHz, as introduced in Briercheck et al. (1998). The data were processed with FELIX (Accelrys Software Inc.) using apodization and linear prediction methods that are typical for these types of experiments. The two NMR experiments which generated the data analyzed in this paper are HNCOCA and HNCA. HNCOCA is a three-dimensional experiment in which each spectral component arises due to correlations between the amide nitrogen and amide proton of a specific residue and the alpha carbon of the preceding residue in the protein sequence. Therefore, the true number of spectral components will be slightly larger than the number of protein residues (130 residues for the protein under study). HNCA is also a three-dimensional experiment in which spectral components are paired
with similar amide nitrogen and amide proton frequencies. In HNCA, a pair of spectral components arises due to correlations between the amide nitrogen, the amide proton and the alpha carbon nuclei of the preceding residue (as in HNCOCA) and of the intra-residue. Therefore, for this experiment, the true number of spectral components will be slightly larger than twice the number of protein residues, and half of the spectral components will match the spectral components in HNCOCA. We graphically describe the nuclei correlations in the two experiments in Figure 5.

Figure 4: Density functions for the Lipschitz exponent estimates of the noised wavelet coefficients (left plots) and of the de-noised wavelet coefficients (right plots), separated for significant and insignificant coefficients. Upper plots are for three-dimensional synthetic data and lower plots for four-dimensional synthetic data.

Number of Spectral Components. For the HNCA and HNCOCA datasets, we evaluate whether wavelet-based de-noising enhances estimation of the number of components and whether the method for wavelet coefficient shrinkage introduced in this paper outperforms other existing methods. For this, we apply the local maxima identification method discussed in Section 3 and the method described in this paper. We identify local maxima above a fixed threshold - T = 8000 for HNCOCA data and T = for HNCA data. These thresholds are approximately equal to the noise level estimated as \hat{\sigma}\sqrt{2\log(M_1 M_2 M_3)}, where \hat{\sigma} is the mean absolute variance estimator. Because the HNCA and HNCOCA NMR data are for the same protein, we can
evaluate which spectral components are true positives and which are false positives or false negatives by comparing the locations of the identified local maxima in the two datasets. Half of the local maxima identified for HNCA data should have locations similar to those of the local maxima identified for HNCOCA data. Moreover, the local maxima identified for HNCA data should pair with respect to the amide nitrogen (first dimension) and amide proton (third dimension) frequencies. We also know the number of protein residues (130), and therefore we know that for the HNCOCA and HNCA NMR data we need to detect slightly more than 130 and 260 spectral components, respectively.

Figure 5: Description of the nuclei correlations in the HNCA and HNCOCA NMR experiments.

In Table 2, we present the results of this comparison. Specifically, we report the number of local maxima identified for noisy and de-noised HNCA and HNCOCA data (first and second rows). We also show the number of paired local maxima identified for HNCA data (third row) and the number of HNCA pairs matching HNCOCA local maxima (fourth row). Not all spectral components in HNCA data are paired because of missing spectral components and because some components are close to the noise level. Lastly, we present the total number of HNCA local maxima matching HNCOCA local maxima, which is an estimate of the detectable HNCA components (fifth row). The last row is an evaluation of the false discovery rate, calculated from the proportion of detectable HNCA-HNCOCA matches to the total number of local maxima. The first observation based on the results reported in Table 2 is that the number of false local maxima is extremely large for noisy HNCA and HNCOCA data. More than half of the local maxima are false positives at the specified threshold. One alternative is to use a higher threshold, but if we increase the threshold, a large number of true
positives will not be detectable. Therefore, for these two NMR experiments, wavelet-based de-noising greatly enhances component identification.

Table 2: The number of local maxima identified for HNCA and HNCOCA data, the number of pairs for HNCA and the number of matches between HNCA and HNCOCA local maxima. Columns: Noisy Data, HARD, SOFT, Non-overlap BLOCK, Moving BLOCK. Rows: number of local maxima for HNCA data; number of local maxima for HNCOCA data; number of HNCA pairs; number of HNCOCA local maxima matching HNCA pairs; number of detectable HNCA components matching HNCOCA components; False Discovery Rate.

A second observation is that the four comparative de-noising methods perform rather differently. The soft-thresholding method over-smooths the signal components, resulting in a small number of local maxima for both HNCA and HNCOCA. On the other hand, hard thresholding under-denoises and/or adds spurious aberrations of Gibbs type, resulting in a larger number of false positives for both HNCA and HNCOCA data at the specified threshold. These observations are consistent with our FDR and FNR comparisons in the simulation study. Importantly, the block de-noising method in this paper allows for the largest number of HNCA-HNCOCA detectable components (fifth row of Table 2) at the lowest false discovery rate. Moreover, the number of HNCA pairs matching HNCOCA local maxima is also the largest for the block method introduced in this paper. Therefore, with moving-block de-noised data, there is a high level of matching between HNCA and HNCOCA components.

De-noising level. We compare the mean squared differences between observed and de-noised data within 64 non-overlapping spatially contiguous regions for the HNCA and HNCOCA data (see Figure 6). For a spatially adaptive de-noising method,
we would expect the mean squared residual to be approximately equal across all 64 regions. For hard- and SURE-soft thresholding, the mean squared residual varies greatly, with large spikes in regions where most of the signal components are located. These findings confirm our assessment of the detectable number of components. Although both hard and soft thresholding over-smooth the spectral components, only soft-thresholding results in a smaller number of local maxima; in contrast, hard thresholding leads to the largest number of detectable local maxima (above the specified threshold). One reason for this difference is that hard thresholding adds spurious aberrations of Gibbs type, resulting in a larger number of false positives for both HNCA and HNCOCA data at the specified threshold. For the block-shrinkage method in this paper, the mean squared residual varies less across the 64 regions than for all three comparative de-noising methods. Moreover, the mean squared differences are smaller for the block-thresholding method throughout all regions. These results indicate that the method introduced in this paper is more conservative in the sense that the signal-free areas will be less de-noised but the signal components will preserve their shape and amplitude. This is an important aspect of the noise reduction method, since the estimation of the spectral parameters will be more accurate.

Overall de-noising. Next we compare the HNCA (FT-transformed) data with its de-noised version using the moving-block thresholding method in Figure 7. The (FT) NMR HNCA data are very noisy with varying signal-to-noise ratio. The signal is mainly concentrated at lower values of the z axis. Ideally, we would smooth out the noise without altering the signal components. The block de-noising method clears the noise reasonably well.
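The per-region diagnostic described above (mean squared differences within 64 spatially contiguous regions) can be sketched as follows; the 64 regions correspond to a 4 x 4 x 4 partition of the grid, and the function names are illustrative.

```python
def region_msr(observed, denoised, n, blocks=4):
    """Mean squared residual (observed - denoised) within each of
    blocks^3 contiguous cubic regions of an n x n x n grid.
    observed and denoised map (i, j, k) -> intensity."""
    size = max(n // blocks, 1)
    sums, counts = {}, {}
    for (i, j, k), v in observed.items():
        # Map the grid point to its cubic region; clamp the edge so all
        # points fall inside the blocks^3 partition even if n % blocks != 0.
        key = (min(i // size, blocks - 1),
               min(j // size, blocks - 1),
               min(k // size, blocks - 1))
        r = v - denoised[(i, j, k)]
        sums[key] = sums.get(key, 0.0) + r * r
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}
```

For a spatially adaptive method the returned values should be roughly flat across regions; large spikes flag regions where signal components were smoothed away rather than noise removed.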
There is a trade-off between reducing the noise in signal-free areas and the estimation accuracy of the number of spectral components and of their parameters. With the block method presented in this paper, we can control this trade-off by varying the level of the false discovery rate (ɛ) and its corresponding probability (α). For these results, we used α = .05 and ɛ = .01. In our case study, the primary objective of the noise reduction is component identification, and therefore it is important to control the error level ɛ at a small value. The choice of α in the range [.05, .15] does not change the results significantly.
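The role of ɛ as an FDR level can be illustrated with the generic Benjamini-Hochberg step-up rule applied to per-coefficient p-values; the paper's actual rule couples FDR control with SURE shrinkage levels that vary with resolution and orientation, which this sketch does not reproduce.

```python
def bh_select(pvalues, eps):
    """Benjamini-Hochberg step-up at FDR level eps: return the indices
    of the coefficients declared significant."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up condition: p_(rank) <= eps * rank / m.
        if pvalues[idx] <= eps * rank / m:
            k = rank  # largest rank passing the condition
    return sorted(order[:k])
```

Lowering ɛ declares fewer coefficients significant, which de-noises signal-free areas more aggressively at the risk of smoothing weak components, matching the trade-off described above.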
Figure 6: Mean squared differences (MSD) between observed and de-noised data within 64 non-overlapping regions for the HNCA and HNCOCA frequency data.

Figure 7: HNCA NMR frequency data: observed (left) and de-noised using the moving-block method in this paper (right).
high-dimensional inference robust to the lack of model sparsity Jelena Bradic (joint with a PhD student Yinchu Zhu) www.jelenabradic.net Assistant Professor Department of Mathematics University of California,
More informationNoise & Data Reduction
Noise & Data Reduction Paired Sample t Test Data Transformation - Overview From Covariance Matrix to PCA and Dimension Reduction Fourier Analysis - Spectrum Dimension Reduction 1 Remember: Central Limit
More informationELEMENTS OF PROBABILITY THEORY
ELEMENTS OF PROBABILITY THEORY Elements of Probability Theory A collection of subsets of a set Ω is called a σ algebra if it contains Ω and is closed under the operations of taking complements and countable
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationUNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013
UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and
More informationRegression Clustering
Regression Clustering In regression clustering, we assume a model of the form y = f g (x, θ g ) + ɛ g for observations y and x in the g th group. Usually, of course, we assume linear models of the form
More informationStatistics 910, #5 1. Regression Methods
Statistics 910, #5 1 Overview Regression Methods 1. Idea: effects of dependence 2. Examples of estimation (in R) 3. Review of regression 4. Comparisons and relative efficiencies Idea Decomposition Well-known
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationThe lasso. Patrick Breheny. February 15. The lasso Convex optimization Soft thresholding
Patrick Breheny February 15 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/24 Introduction Last week, we introduced penalized regression and discussed ridge regression, in which the penalty
More informationRegression Shrinkage and Selection via the Lasso
Regression Shrinkage and Selection via the Lasso ROBERT TIBSHIRANI, 1996 Presenter: Guiyun Feng April 27 () 1 / 20 Motivation Estimation in Linear Models: y = β T x + ɛ. data (x i, y i ), i = 1, 2,...,
More informationEstimation of large dimensional sparse covariance matrices
Estimation of large dimensional sparse covariance matrices Department of Statistics UC, Berkeley May 5, 2009 Sample covariance matrix and its eigenvalues Data: n p matrix X n (independent identically distributed)
More informationDenoising via Recursive Wavelet Thresholding. Alyson Kerry Fletcher. A thesis submitted in partial satisfaction of the requirements for the degree of
Denoising via Recursive Wavelet Thresholding by Alyson Kerry Fletcher A thesis submitted in partial satisfaction of the requirements for the degree of Master of Science in Electrical Engineering in the
More informationA Data-Driven Block Thresholding Approach To Wavelet Estimation
A Data-Driven Block Thresholding Approach To Wavelet Estimation T. Tony Cai 1 and Harrison H. Zhou University of Pennsylvania and Yale University Abstract A data-driven block thresholding procedure for
More informationDetection of structural breaks in multivariate time series
Detection of structural breaks in multivariate time series Holger Dette, Ruhr-Universität Bochum Philip Preuß, Ruhr-Universität Bochum Ruprecht Puchstein, Ruhr-Universität Bochum January 14, 2014 Outline
More informationState-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53
State-space Model Eduardo Rossi University of Pavia November 2014 Rossi State-space Model Fin. Econometrics - 2014 1 / 53 Outline 1 Motivation 2 Introduction 3 The Kalman filter 4 Forecast errors 5 State
More informationLearning gradients: prescriptive models
Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan
More informationMinimum Hellinger Distance Estimation in a. Semiparametric Mixture Model
Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.
More informationBasic principles of multidimensional NMR in solution
Basic principles of multidimensional NMR in solution 19.03.2008 The program 2/93 General aspects Basic principles Parameters in NMR spectroscopy Multidimensional NMR-spectroscopy Protein structures NMR-spectra
More informationLecture Hilbert-Huang Transform. An examination of Fourier Analysis. Existing non-stationary data handling method
Lecture 12-13 Hilbert-Huang Transform Background: An examination of Fourier Analysis Existing non-stationary data handling method Instantaneous frequency Intrinsic mode functions(imf) Empirical mode decomposition(emd)
More information1 Least Squares Estimation - multiple regression.
Introduction to multiple regression. Fall 2010 1 Least Squares Estimation - multiple regression. Let y = {y 1,, y n } be a n 1 vector of dependent variable observations. Let β = {β 0, β 1 } be the 2 1
More informationSparsity in Underdetermined Systems
Sparsity in Underdetermined Systems Department of Statistics Stanford University August 19, 2005 Classical Linear Regression Problem X n y p n 1 > Given predictors and response, y Xβ ε = + ε N( 0, σ 2
More informationMachine Learning Linear Regression. Prof. Matteo Matteucci
Machine Learning Linear Regression Prof. Matteo Matteucci Outline 2 o Simple Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression o Multi Variate Regession Model Least Squares
More informationCurvelet imaging & processing: sparseness constrained least-squares migration
Curvelet imaging & processing: sparseness constrained least-squares migration Felix J. Herrmann and Peyman P. Moghaddam (EOS-UBC) felix@eos.ubc.ca & www.eos.ubc.ca/~felix thanks to: Gilles, Peyman and
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationMODWT Based Time Scale Decomposition Analysis. of BSE and NSE Indexes Financial Time Series
Int. Journal of Math. Analysis, Vol. 5, 211, no. 27, 1343-1352 MODWT Based Time Scale Decomposition Analysis of BSE and NSE Indexes Financial Time Series Anu Kumar 1* and Loesh K. Joshi 2 Department of
More informationSparse linear models
Sparse linear models Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda 2/22/2016 Introduction Linear transforms Frequency representation Short-time
More informationDenosing Using Wavelets and Projections onto the l 1 -Ball
1 Denosing Using Wavelets and Projections onto the l 1 -Ball October 6, 2014 A. Enis Cetin, M. Tofighi Dept. of Electrical and Electronic Engineering, Bilkent University, Ankara, Turkey cetin@bilkent.edu.tr,
More informationLow-rank Promoting Transformations and Tensor Interpolation - Applications to Seismic Data Denoising
Low-rank Promoting Transformations and Tensor Interpolation - Applications to Seismic Data Denoising Curt Da Silva and Felix J. Herrmann 2 Dept. of Mathematics 2 Dept. of Earth and Ocean Sciences, University
More informationAccelerated Block-Coordinate Relaxation for Regularized Optimization
Accelerated Block-Coordinate Relaxation for Regularized Optimization Stephen J. Wright Computer Sciences University of Wisconsin, Madison October 09, 2012 Problem descriptions Consider where f is smooth
More informationDetection theory. H 0 : x[n] = w[n]
Detection Theory Detection theory A the last topic of the course, we will briefly consider detection theory. The methods are based on estimation theory and attempt to answer questions such as Is a signal
More informationEfficient Algorithms for Pulse Parameter Estimation, Pulse Peak Localization And Pileup Reduction in Gamma Ray Spectroscopy M.W.Raad 1, L.
Efficient Algorithms for Pulse Parameter Estimation, Pulse Peak Localization And Pileup Reduction in Gamma Ray Spectroscopy M.W.Raad 1, L. Cheded 2 1 Computer Engineering Department, 2 Systems Engineering
More informationA Multiple Testing Approach to the Regularisation of Large Sample Correlation Matrices
A Multiple Testing Approach to the Regularisation of Large Sample Correlation Matrices Natalia Bailey 1 M. Hashem Pesaran 2 L. Vanessa Smith 3 1 Department of Econometrics & Business Statistics, Monash
More informationWooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics
Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).
More informationFDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES
FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES Sanat K. Sarkar a a Department of Statistics, Temple University, Speakman Hall (006-00), Philadelphia, PA 19122, USA Abstract The concept
More informationAbstract. 1 Introduction. Cointerpretation of Flow Rate-Pressure-Temperature Data from Permanent Downhole Gauges. Deconvolution. Breakpoint detection
Cointerpretation of Flow Rate-Pressure-Temperature Data from Permanent Downhole Gauges CS 229 Course Final Report Chuan Tian chuant@stanford.edu Yue Li yuel@stanford.edu Abstract This report documents
More informationLineShapeKin NMR Line Shape Analysis Software for Studies of Protein-Ligand Interaction Kinetics
LineShapeKin NMR Line Shape Analysis Software for Studies of Protein-Ligand Interaction Kinetics http://lineshapekin.net Spectral intensity Evgenii L. Kovrigin Department of Biochemistry, Medical College
More informationCross-Validation with Confidence
Cross-Validation with Confidence Jing Lei Department of Statistics, Carnegie Mellon University UMN Statistics Seminar, Mar 30, 2017 Overview Parameter est. Model selection Point est. MLE, M-est.,... Cross-validation
More informationCOMPLEX WAVELET TRANSFORM IN SIGNAL AND IMAGE ANALYSIS
COMPLEX WAVELET TRANSFORM IN SIGNAL AND IMAGE ANALYSIS MUSOKO VICTOR, PROCHÁZKA ALEŠ Institute of Chemical Technology, Department of Computing and Control Engineering Technická 905, 66 8 Prague 6, Cech
More informationPenalty Methods for Bivariate Smoothing and Chicago Land Values
Penalty Methods for Bivariate Smoothing and Chicago Land Values Roger Koenker University of Illinois, Urbana-Champaign Ivan Mizera University of Alberta, Edmonton Northwestern University: October 2001
More informationProf. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis
Introduction to Time Series Analysis 1 Contents: I. Basics of Time Series Analysis... 4 I.1 Stationarity... 5 I.2 Autocorrelation Function... 9 I.3 Partial Autocorrelation Function (PACF)... 14 I.4 Transformation
More information