arxiv: v1 [math.st] 24 Apr 2015
|
|
- Chad Haynes
- 5 years ago
- Views:
Transcription
1 Multiple Testing of Local Extrema for Detection of Change Points arxiv:.v [math.st] Apr Dan Cheng North Carolina State University April, Astract Armin Schwartzman North Carolina State University A new approach to change point detection ased on smoothing and multiple testing is presented for long data sequences modeled as piecewise constant functions plus stationary ergodic Gaussian noise. As an application of the STEM algorithm for peak detection developed in Schwartzman et al. () and Cheng & Schwartzman (), the method detects change points as significant local maxima and minima after smoothing and differentiating the oserved sequence. The algorithm, comined with the Benjamini-Hocherg procedure for thresholding p- values, provides asymptotic strong control of the False Discovery Rate (FDR) and power consistency, as the length of the sequence and the size of the jumps get large. Simulations show that FDR levels are maintained in non-asymptotic conditions and guide the choice of smoothing andwidth with respect to the desired location tolerance. The methods are illustrated in genomic array-cgh data. Key Words: False discovery rate; Gaussian; Kernel smoothing; Local maxima; Local minima; Location tolerance. Introduction Detecting change points in the mean of an oserved signal is a common statistical prolem with applications in many research areas such as climatology (Reeves et al., ), oceanography(killick et al., ), finance (Zeileis et al., ) and medical imaging (Nam et al., ). It often appears in the analysis of time series ut it has more recently een found in the analysis of genomic sequences, see Erdman & Emerson (); Lai et al. (); Muggeo & Adelfio (); Olshen et al. (); Tishirani & Wang (); Wang et al. () and the references therein. Given the large amounts of data present in modern applications, it is of interest to design a change point detection method that Partially supported y NIH grant R-CA.
2 can operate over long sequences where the numer and location of change points are unknown, and in such a way that the overall detection error rate is controlled. Many different approaches have een proposed to find and estimate change points, such as kernelased methods (Gijels, ), Bayesian methods (Barry & Hartigan, ; Erdman & Emerson, ), segmentation techniques (Olshen et al., ; Wang et al., ; Muggeo & Adelfio, ), nonparametric tests (Lanzante, ) andl -penalty methods (Eilers & de Menezes, ; Huang et al., ; Tishirani & Wang, ), including the PELT method (Jackson et al., ; Killick et al., ). However, the approach proposed here is unique in the following two ways. First, the noise is assumed to e a stationary Gaussian process, allowing the error terms to e correlated. This is an important departure from the standard assumption of white noise in most of the change-point literature. In fact, applied statisticians desiring to use change-point methods have sometimes aandoned this option in favor of other techniques simply ecause the white noise assumption does not hold (Hung et al., ). This paper shows that change-point methods can e devised for correlated noise, expanding the domain of their applicaility. Second, we use the theory of Gaussian processes to compute p-values for all candidate change points, so that significant change points can e selected at a desired significance level. For concreteness, we adopt the Benjamini-Hocherg multiple testing procedure, enaling control of the false discovery rate (FDR) of detected change points when the data sequence is long and the numer and location of change points are unknown. To our knowledge, this is the first article proposing a multiple testing method for controlling FDR of detected change points. In this paper, we consider a signal-plus-noise model where the true signal is a piecewise constant function and the change points are defined as the points of discontinuity. Inspired y the method for detecting peaks in Schwartzman et al. () and Cheng & Schwartzman (), we modify the STEM algorithm therein to detect change points. The central idea is the oservation that the true signal has zero derivative everywhere except at the change points, where the derivative is infinite. Thus, in the presence of noise and under temporal or spatial sampling, change points can e seen as positive or negative peaks in the derivative of the smoothed signal. Note that ecause of the time sampling, derivatives cannot e oserved directly and can only e estimated. The focus on the derivative of the smoothed signal effectively transforms the change point detection prolem into a peak detection prolem. As in the STEM algorithm, the resulting peak detection prolem is then solved y identifying local maxima and minima of the derivative as candidate peaks and applying a multiple testing procedure to the list of candidates. The modified STEM algorithm for change point detection consists of the following steps:. Differential kernel smoothing: to transform change points to local maxima or minima.. Candidate peaks: find local maxima and minima of the differentiated smoothed process.. P-values: computed at each local maximum and minimum under the the null hypothesis of no signal in a local neighorhood.
3 Figure : Following the notation in. and Example., the left panel is the oserved signal-plus-noise model y(t) containing ten true change points with varying a i and noise z(t) given y (.) withσ = andν =. The right panel illustrates the STEM algorithm. The lue curve isy γ(t), otained with a Gaussian smoothing kernel with standard deviation γ =. Local maxima (green solid dots) and local minima (red solid dots) are declared as significant (marked with solid triangles) at FDR level α =. if their heights are eyond the dotted line thresholds. The cyan and pink ars indicate the location tolerance intervals (v i,v i + ) with = for increasing and decreasing change points respectively. At this tolerance, there are nine true discoveries and one false discovery.. Multiple testing: apply a multiple testing procedure to the set of local maxima and minima; declare as detected change points those local maxima and minima whose p-values are significant. The algorithm is illustrated y a toy example in Figure. The modified STEM algorithm aove differs from the ones in Schwartzman et al. () and Cheng & Schwartzman () in that peaks are sought in the derivative of the smoothed signal rather than the smoothed signal itself, and that oth positive and negative peaks are considered. In addition, an important consideration for the proper definition of error in change point detection is that, as opposed to the peak detection prolems considered in Schwartzman et al. () and Cheng & Schwartzman () where signal peaks had compact support, a true single change point at t = v has Leesgue measure zero. Thus in the presence of noise, it can never e detected exactly at t = v. Therefore we introduce a location tolerancethat defines the precision within which a change point should e detected. Specifically, given, a detected change point is regarded as a true discovery if it falls in the interval (v, v + ). Conversely, if a significant change point is found more than a distance from any true change point, it is considered a false discovery. The quantity is not used in the STEM algorithm itself ut is needed for proper error definition. Under this convention, it is shown here that the modified STEM algorithm exhiits asymptotic FDR control and power consistency as the length of the sequence and the size of the jumps at the change points increase. These asymptotic conditions are similar to those considered in Schwartzman et al.
4 () and Cheng & Schwartzman () and, in fact, the proofs are easily extended from those in Cheng & Schwartzman (). Simulations for varying levels of smoothing andwidth γ, location tolerance and jump size a are used to study the ehavior of the algorithm under non-asymptotic conditions. The simulation results help guide the choice of smoothing andwidth with respect to the desired location tolerance. In general, power increases with andwidth to a limit dictated y the distance etween the change points, so admitting a higher tolerance generally allows a higher andwidth and higher power. The methods are illustrated in a genomic sequence of array-cgh data in a reast-cancer tissue sample (Loo et al., ; Hsu et al., ). The goal of the analysis is to find genomic segments with copy-numer alterations. These are found y detecting change points in the copy numer genomic sequence. The multiple testing scheme. The model We consider a continuous time model, although the algorithm is designed for data discretely sampled in time. Consider the signal-plus-noise model y(t) = µ(t)+z(t), t R, (.) where the signal µ(t) is a piecewise constant function of the form µ(t) = a j h j (t), j= a j R\{}, with h j (t) = ½(t > v j ) for v j R. We are interested in finding the change points v j. For the asymptotic analysis, we assume a = inf j a j > and v = inf j v j v j >, so that the change points do not ecome aritrarily small in size nor aritrarily close to each other. Let w γ (t) = w(t/γ)/γ, where γ > is the andwidth parameter and w(t) is a unimodal symmetric kernel with compact connected support [ c, c] and unit action. Convolving the process (.) with the kernel w γ (t) results in the smoothed random process y γ (t) = w γ (t) y(t) = w γ (t s)y(s)ds = µ γ (t)+z γ (t), (.) where the smoothed signal and smoothed noise are defined respectively as µ γ (t) = w γ (t) µ(t) = R a j h j,γ (t) and z γ (t) = w γ (t) z(t), (.) j=
5 and where the smoothed change point takes the form h j,γ (t) = w γ (t) h j (t). (.) The smoothed noise z γ (t) defined y (.) is assumed to e a zero-mean four-times differentiale stationary ergodic Gaussian process.. Change point detection as peak detection of the derivative Consider now the derivative of the smoothed oserved process (.) y γ (t) = w γ R (t) y(t) = w γ (t s)y(s)ds = µ γ (t)+z γ (t), (.) N where the derivatives of the smoothed signal and smoothed noise are respectively µ γ (t) = w γ (t) µ(t) = a j h j,γ (t) and z γ (t) = w γ (t) z(t). A key oservation from (.) is that h j,γ(t) = w γ(t s)h j (s)ds = w γ(s)h j (t s)ds R R = w γ (s)½(t s > v j)ds = w γ (t v j ). R Thus (.) represents a signal-plus-noise model where the smoothed signal j= (.) µ γ (t) = a j h j,γ (t) = a j w γ (t v j ) (.) j= is a sequence of unimodal peaks with the same shape as that of w γ and located at locations v j. The prolem of finding change points in y γ (t) is thus reduced to finding (positive or negative) peaks in y γ(t). For simplicity, we assume that the compact supports S j,γ of the smoothed peak shape h j,γ (t) = j= w γ (t v j ) do not overlap, although this is not crucial in practice.. The STEM algorithm for change point detection Suppose we oserve y(t) with J jumps defined y (.) in the line of length L centered at the origin, denoted y U(L) = ( L/, L/). The following is a version of the STEM algorithm of Schwartzman et al. () and Cheng & Schwartzman () for detecting change points. Algorithm. (STEM algorithm for change point detection). Differential kernel smoothing: Otain the process (.) y convolution of y(t) with the kernel derivative w γ (t).
6 . Candidate peaks: Find the set of local maxima and minima of y γ(t) in U(L), denoted y T γ = T + γ T γ, where { } T γ + = t U(L) : y γ(t) =, y γ () (t) <, { } T γ = t U(L) : y γ(t) =, y γ () (t) >.. P-values: For each t T γ +, compute the p-value p γ(t) for testing the (conditional) hypotheses H (t) : {µ (s) = for all s (t,t+)} vs. H A (t) : {µ (s) > for some s (t,t+)}; and for each t T γ, compute the p-value p γ(t) for testing the (conditional) hypotheses H (t) : {µ (s) = for all s (t,t+)} vs. H A (t) : {µ (s) < for some s (t,t+)}, where > is an appropriate location tolerance.. Multiple testing: Let m γ = #{t T γ } e the numer of tested hypotheses. Apply a multiple testing procedure on the set of m γ p-values {p γ (t), t T γ }, and declare significant all local extrema whose p-values are smaller than the significance threshold.. P-values Given the oserved heights y γ (t) at the local maxima or minimat T γ = T γ + T γ, p-values in step () of Algorithm. are computed as F γ (y γ(t)), t T γ +, p γ (t) = F γ ( y γ (t)), t T (.) γ, wheref γ (u) denotes the right tail proaility ofz γ (t) at the local maximumt T γ + the null model µ (s) =, s (t,t+), that is,, evaluated under F γ (u) = P ( z γ(t) > u t is a local maximum of z γ (t) ). (.) The second line in (.) is otained y noting that, y (.), P ( z γ(t) < u t is a local minimum of z γ (t) ) = P ( z γ (t) > u tis a local maximum of z γ (t)) = F γ ( u), since z γ (t) and z γ (t) have the same distriution.
7 The distriution (.) has a closed-form expression, which can e otained as in Schwartzman et al. () or Cramér & Leadetter (). More specifically, the distriution (.) is given y ( ) λ,γ πλ ( ) ( ),γ u λ,γ F γ (u) = Φ u + λ,γ σ γ φ σ γ Φ u σ γ, (.) where σ γ = Var(z γ(t)), λ,γ = Var(z γ(t)), λ,γ = Var(z γ () (t)), = σ γ λ,γ λ,γ, and φ(x), Φ(x) are the standard normal density and cumulative distriution function, respectively. The quantities σ γ, λ,γ and λ,γ depend on the kernel w γ (t) and the autocorrelation function of the original noise process z(t). Explicit expressions may e otained, for instance, for the Gaussian autocorrelation model in Example. elow, which we use later in the simulations.. Error definitions Assuming the model of., define the signal region S = J j= (v j,v j + ) and null region S = U(L)\S. For u >, let T γ (u) = T γ + (u) T γ (u), where { } T γ + (u) = t U(L) : y γ(t) > u, y γ(t) =, y γ () (t) <, } T γ {t (u) = U(L) : y γ (t) < u, y γ (t) =, y() γ (t) >, indicating that T γ + (u) and T γ (u) are respectively the set of local maxima of y γ (t) aove u and the set of local minima of y γ (t) elow u. The numer of totally and falsely detected change points at threshold u are defined respectively as R γ (u) = #{t T + γ (u)}+#{t T γ (u)}, V γ (u;) = #{t T + γ (u) S }+#{t T γ (u) S }. (.) Both are defined as zero if T γ (u) is empty. proportion of falsely detected jumps FDR γ (u;) = E The FDR at threshold u is defined as the expected { } Vγ (u;). (.) R γ (u) Note that whenγ and u are fixed,v γ (u;) and hence FDR γ (u;) are decreasing in. Following the notation in Cheng & Schwartzman (), define the smoothed signal region S,γ to e the support of µ γ(t) and smoothed null region S,γ = U(L) \ S,γ. We call the difference etween the expanded signal support due to smoothing and the true signal support the transition region T γ = S,γ \S = S \S,γ.. Power Denote y I + and I the collections of indices j corresponding to increasing and decreasing change points v j, respectively. We define the power of Algorithm. as the expected fraction of true discov-
8 ered change points Power γ (u;) = J [ ( J Power j,γ (u;) = E J j= + j I ½ j I + ½ ( T+ γ (u) (v j,v j +) ) ) )] ( T γ (u) (v j,v j +), where Power j,γ (u;) is the proaility of detecting jump v j within a distance, ) P( T+ γ (u) (v j,v j +), if j I +, Power j,γ (u;) = ) P( T γ (u) (v j,v j +), if j I. (.) (.) The indicator function in (.) ensures that only one significant local extremum is counted within a distance of a change point, so power is not inflated. Note that whenγ anduare fixed,power γ (u;) and Power j,γ (u;) are increasing in. Asymptotic error control and power consistency Suppose the BH procedure is applied in step of Algorithm. as follows. For a fixed α (, ), let k e the largest index for which theith smallest p-value is less than iα/ m γ. Then the null hypothesis H (t) at t T γ is rejected if p γ (t) < kα m γ y γ(t) > ũ BH = F y γ(t) < ũ BH = F γ ( ) kα m γ γ ( ) kα m γ if t T + γ, if t T γ, wherekα/ m γ is defined as if m γ =. Sinceũ BH is random, we define FDR in such BH procedure as { } Vγ (ũ BH ;) FDR BH,γ () = E, R γ (ũ BH ) where R γ ( ) and V γ ( ;) are defined in (.) and the expectation is taken over all possile realizations of the random threshold ũ BH. We will make use of the following conditions: (C) The assumptions of. hold. (C) L and a = inf j a j, such that (logl)/a, J/L = A + O(a + L / ) with A >. Let E[ m,γ (U())] and E[ m,γ (U(),u)] e the expected numer of local maxima and local maxima aove level u of z γ(t) on unit interval U() = ( /,/), respectively. In particular, we have the following explicit formula (Schwartzman et al., ) E[ m,γ (U())] = λ,γ. (.) π λ,γ (.)
9 Theorem. Let conditions (C) and (C) hold. (i) Suppose Algorithm. is applied with a fixed threshold u >. Then FDR γ (u;) E[ m,γ (U(),u)]( cγa ) E[ m,γ (U(),u)]( cγa )+A +O(a +L / ). (ii) Suppose Algorithm. is applied with the random threshold ũ BH (.). Then E[ m,γ (U())]( cγa ) FDR BH,γ () α +O(a +L / ). E[ m,γ (U())]( cγa )+A Proof Since w γ (t) has compact support [ cγ,cγ], y (.), the support S,γ of µ γ (t) in (.) is composed of the support segments [v j cγ,v j + cγ] of h j,γ (t). By condition (C), S,γ /L = cγa +O(a +L / ), which implies S,γ /L = cγa +O(a +L / ). Notice that, on the null region S,γ, the expected numer of local extrema, including oth local maxima and minima, equals S,γ E[ m,γ (U())]. On the other hand, similarly to the proof of Theorem in Cheng & Schwartzman (), the expected numer of local extrema on the signal region S,γ is asymptotically equivalent toj. This is ecause, for eachj I + and >, asa, asymptotically, there is no local maximum ofy γ(t) in(v j cγ,v j ) (v j +,v j +cγ), and there is only one local maximum ofy γ (t) in (v j,v j +). The result then follows from similar arguments for proving Theorem in Cheng & Schwartzman () with N =, A,γ = cγa, z γ (t) replaced y z γ(t) and E[ m,γ (U())] replaced y E[ m,γ (U())]. Lemma. Let conditions (C) and (C) hold. As a j, the power for peak j and fixed u (.) can e approximated y ( ) aj w γ () u Power j,γ (u;) = Φ (+O( a j )). (.) σ γ Proof By (.), h j,γ (v j) = w γ () is the maximum of h j,γ (t) over t R. The result then follows from similar arguments for proving Lemma in Cheng & Schwartzman () with z γ (t) replaced y z γ(t). By similar arguments in Cheng & Schwartzman () (see equation () therein), one can show that the random threshold ũ BH converges asymptotically to the deterministic threshold ( ) u BH = αa F γ, (.) A +E[ m,γ (U())]( cγa )( α)
10 wheree[ m,γ (U())] is given y (.). Sinceũ BH is random, similarly to the definition offdr BH,γ (), we define power in the BH procedure as [ ( Power BH,γ () = E J j I + ½ + j I ½ Theorem. Let conditions (C) and (C) hold. ( T+ γ (ũ BH ) (v j,v j +) ) ) )] ( T γ (ũ BH ) (v j,v j +). (i) Suppose Algorithm. is applied with a fixed threshold u >. Then Power γ (u;) = O(a ). (ii) Suppose Algorithm. is applied with the random threshold ũ BH (.). Then Power BH,γ () = O(a +L / ). Proof The desired results follow from similar arguments for showing Theorem in Cheng & Schwartzman (). Example. [Gaussian autocorrelation model] Let the noise z(t) in (.) e constructed as ( ) t s z(t) = σ ν φ db(s), σ,ν >, (.) ν R where φ is the standard Gaussian density, db(s) is Gaussian white noise and ν > (z(t) is regarded y convention as Gaussian white noise when ν = ). Convolving with a Gaussian kernel w γ (t) = (/γ)φ(t/γ) with γ > as in (.) produces a zero-mean infinitely differentiale stationary ergodic Gaussian fieldz γ (t) such that z γ (t) = w γ (t) z(t) = σ (t s) R N ξ ( ) t s φ db(s), ξ = γ ξ +ν, withσ γ = σ /( πξ ), λ,γ = σ /( πξ ) and λ,γ = σ /( πξ ). We have SNR j,γ = a [ ] jw γ () aj (γ +ν ) / σ γ = σπ /. (.) γ As a function of γ, the SNR has a local minimum at γ = ν and is strictly increasing for large γ. In particular, when ν =, it is strictly increasing in γ. Thus we generally expect the detection power to increase with γ for γ > ν. This will e confirmed in the simulations elow. Note that for ν >, the SNR is unounded asγ, however in practice γ cannot e too small: if the support of w γ ecomes smaller than the sampling interval, then the derivative µ γ cannot e estimated.
11 a = a = a = FDR Power Figure : Realized FDR and power of the BH procedure ford =. Simulation studies Simulations were used to evaluate the performance and limitations of the STEM algorithm for signals µ(t) = a t/d, where t =,...,L, L =, d {,} and signal strength a {,,}. Under this setting, the true change points arev j = jd for j =,...,L/d. The noise is generated as the Gaussian process constructed in (.) with σ = and ν =. The smoothing kernels are w γ (t) = (/γ)φ(t/γ)½(t [ γ,γ]) for varying γ. The BH procedure was applied at FDR level α =.. Results were averaged over, replications to simulate the expectations. Figure shows the realized FDR and detection power for a change point separation of d =. A special color map was used for FDR to emphasize the control of FDR (FDR values less than the nominal level. appear in dark lue). We see that for every fixed andwidth γ, increasing the location tolerance allows for detections to e counted as true farther away from the true change points, therey decreasing FDR and increasing power. On the other hand, as expected from Theorems. and., as the strength of the signal a increases, FDR is eventually controlled elow the nominal level and the power tends to for every comination of parameters γ and. For a fixed tolerance, increasing the andwidth γ increases FDR y moving some true change points eyond and producing artificial errors. This also results in some loss of power, which can e seen in Figure especially when is small and γ is large. In addition, for larger and smaller γ, the power is seen to first decrease quickly and then increase again as γ increases. This phenomenon coincides with the ehavior of the SNR (.) derived in Example., predicting the power to decrease for γ ν and increase for γ > ν. Figure illustrates this phenomenon more clearly for fixed =. The realized FDR curves have the same shape as the theoretical power curves, otained y plugging the asymptotic threshold u BH
12 . Power.. γ Figure : The solid curves and dotted curves (in oth cases, lue for a =, green for a = and red for a = ) are respectively theoretical powers and realized powers from Figure with =. The pink dotted vertical line is γ = ν as shown in Example.. (.) into the approximate power (.). Both curves get closer to as the signal strength a increases. The separation of d = in Figure is large enough that andwidths up to γ = do not produce any interference etween neighoring change points. To investigate the effect of neighoring interference, Figure shows the realized FDR and detection power for a change point separation of d =. It can e seen that increasing γ eyond will produce interference and contamination error, therey decreasing power. Still, for any fixed γ and, FDR will eventually e controlled and power will increase if the signal strength is large enough. Data example. Data description Array-ased comparative genomic hyridization (array-cgh) is a high-throughput high-resolution technique used to evaluate changes in the numer of copies of alleles at thousands of genomic loci simultaneously. The output is often called Copy Numer Variation (CNV) data. Changes in copy numer are represented y segments whose mean is displaced with respect to the ackground. To detect these changes, it is costumary to search for change points along the genome. In this paper, we apply our method to chromosome of tumor sample # from the dataset of Hsu et al. () and Loo et al. (). This sample is one of formalin-fixed reast cancer tumors in that dataset and it was chosen for its visual appeal in the illustration of our method. The data in chromosome of tumor sample # consists of average copy numer reads mapped onto unequally spaced locations along the chromosome. For simplicity, the data was analyzed ignoring the gaps in the genomic locations: Figure (top left) shows the data with spacings etween reads artificially set to. Note that ignoring the spacings does not affect the presence or asence of change points.
13 a = a = a = FDR Power Figure : Realized FDR and power of the BH procedure ford =.. Data analysis To analyze the data, the STEM algorithm was applied with a truncated Gaussian smoothing kernel w γ (t) = (/γ)φ(t/γ)½(t [ γ,γ]) with γ =. The andwidth was chosen not too large in order to avoid interference etween neighoring change points. Figure (top right) shows the estimated first derivative (.). Figure (ottom left) marks local maxima (green) and local minima (red). P-values corresponding to local maxima and minima were computed according to (.) using the distriution (.). The required parameters σ γ = Var(z γ (t)), λ,γ = Var(z γ (t)), λ,γ = Var(z γ () (t)) were estimated empirically from the estimated first, second and third derivatives over the oserved data sequence. However, the empirical variances were computed using truncated averages instead of regular averages in order to avoid ias from the extreme derivatives at the change points without assuming their presence or location in advance. The BH algorithm was applied to the p-values FDR level., yielding a p-value significance threshold of.. The corresponding asolute height threshold of. is marked as dashed lines in Figure (ottom left). The significant peaks are plotted on the original data in Figure (ottom right) with a location tolerance of = for visual reference. Discussion In this paper, we comined oth local maxima and minima as candidate peaks, and then applied a multiple testing procedure to find a uniform threshold (in asolute value) for detecting all change points. This approach is sensile when the distriutions (numer and height) of true increasing and
14 Figure : Data example. Top left: Oserved data. Top right: estimated first derivative. Bottom left: local maxima (upward triangles), local minima (downward triangles) of the estimated first derivative, and significance height threshold (lack dashed). Bottom right: The detected positive (green) and negative (red) change points. decreasing change points are aout the same. Alternatively, different thresholds for detecting increasing and decreasing change points could e found y applying separate multiple testing procedures to the sets of candidate local maxima and local minima. While we applied the BH algorithm to control FDR, in principle other multiple testing procedures may e used to control other error rates. A natural and important question is how to choose the smoothing andwidth γ. From Example. and Figures and, we see that either a very small γ or a relatively large γ is preferred in order to increase power, ut only to the extent that the smoothed signal supports h j,γ (t) have little overlap and that detected change points are not displaced y more than the desired tolerance (recall that the value of is not used in the STEM algorithm itself, ut it may e determined y the needs of the specific scientific application). Considering the Gaussian kernel to have an effective support of ±cγ, a good value of γ may e aout min(,d/c), where d is the separation etween change points. Since the location of the change points is unknown, a more precise optimization of γ may require an iterative procedure. Moreover, if some change points are close together and others are far apart, an adpative andwidth may e preferale. We leave these as prolems for future research.
15 References ADLER, R. J. & TAYLOR, J. E. (). Random Fields and Geometry. Springer, New York. BARRY, D. & HARTIGAN, J. A. (). A Bayesian analysis for change point prolems. J. Am. Statist. Assoc.,. BENJAMINI, Y. & HOCHBERG, Y. (). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B,. CHENG, D. & SCHWARTZMAN, A. (a). Distriution of the height of local maxima of Gaussian random fields. Extremes, to appear. arxiv:. CHENG, D. & SCHWARTZMAN, A. (). Multiple testing of local maxima for detection of peaks in random fields. arxiv:. CRAMÉR, H. & LEADBETTER, M. R. (). Stationary and Related Stochastic Processes. Wiley, New York. EILERS, P. H. C. & DE MENEZES, R. X. (). Quantile smoothing of array CGH data. Bioinformatics,. ERDMAN, C. & EMERSON J. W. (). A fast Bayesian change point analysis for the segmentation of microarray data. Bioinformatics,. GIJBELS, I. (). Inference for nonsmooth regression curves and surfaces using kernel-ased methods. a survey collected in Recent advances and trends in nonparametric statistics y Michael G. Akritas and Dimitris N. Politis. HUANG, T., WU, B., LIZARDI, P. & ZHAO, H. (). Detection of DNA copy numer alterations using penalized least squares regression. Bioinformatics,. HUNG, Y., WANG, Y., ZARNITSYNA, V., ZHU, C. & WU C. F. J. (). Hidden Markov models with applications in cell adhesion experiments. J. Am. Statist. Assoc.,. HSU, L., SELF, S. G., GROVE, D., RANDOLPH, T., WANG, K., DELROW, J. J., LOO, L. & PORTER, P. (). Denoising array-ased comparative genomic hyridization data using wavelets. Biostatistics,. JACKSON, B., SCARGLE, J. D., BARNES, D., ARABHI, S., ALT, A., GIOUMOUSIS, P., GWIN, E., SANG- TRAKULCHAROEN, P., TAN, L. & TSAI, T. T. (). An Algorithm for Optimal Partitioning of Data on an Interval. IEEE, Signal Processing Letters,. KILLICK, R., ECKLEY, I. A., EWANS, K. & JONATHAN, P. (). Detection of changes in variance of oceanographic time-series using changepoint analysis. Ocean Engineering,. KILLICK, R., FEARNHEAD, P. & ECKLEY, I. A. (). Optimal detection of changepoints with a linear computational cost. J. Am. Statist. Assoc.,. LAI, W. R., JOHNSON, M. D., KUCHERLAPATI, R. & PARK, P. J. (). Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data. Bioinformatics,.
16 LOO, L. W. M., GROVE, D. I., NEAL, C. L., COUSENS, L. A., SCHUBERT, E. L., WILLIAMS, E. M., HOLCOMB, I. N., DELROW, J. J., TRASK, B. J., HSU, L. & PORTER, P. L. (). Array-CGH analysis of genomic alterations in reast cancer sutypes. Cancer Research,. LANZANTE, J. R. (). Resistant, roust and non-parametric techniques for the analysis of climate data: Theory and example, including applications to historical radiosonde station data. International Journal of Climatology,. MUGGEO, V. M. R. & ADELFIO, G. (). Efficient change point detection for genomic sequences of continuous measurements. Bioinformatics,. NAM, C. F. H., ASTON, J. A. D. & JOHANSEN, A. M. (). Quantifying the uncertainty in change points. Journal of Time Series Analysis,. OLSHEN, A. B., VENKATRAMAN, E. S., LUCITO, R. & WIGLER, M. (). Circular inary segmentation for the analysis of array-ased DNA copy numer data. Biostatistics,. REEVES, J., CHEN, J., WANG, X. L., LUND, R. & LU, Q. Q. (). A Review and Comparison of Changepoint Detection Techniques for Climate Data. Journal of Applied Meteorology and Climatology,. SCHWARTZMAN, A., DOUGHERTY, R. F. & TAYLOR, J. E. (). False discovery rate analysis of rain diffusion direction maps. Ann. Appl. Statist.,. SCHWARTZMAN, A., GAVRILOV, Y. & ADLER, R. J. (). Multiple testing of local maxima for detection of peaks in D. Ann. Statist.,. TIBSHIRANI, R. & WANG, P. (). Spatial smoothing and hot spot detection for CGH data using the fused lasso. Biostatistics,. WANG, P., KIM, Y., POLLACK, J., NARASIMHAN, B. & TIBSHIRANI, R. (). A method for calling gains and losses in array CGH data. Biostatistics,. ZEILEIS, A., SHAH, A. & PATNAIK, I. (). Testing, Monitoring, and Dating Structural Changes in Exchange Rate Regimes. Computational Statistics & Data Analysis,.
Peak Detection for Images
Peak Detection for Images Armin Schwartzman Division of Biostatistics, UC San Diego June 016 Overview How can we improve detection power? Use a less conservative error criterion Take advantage of prior
More informationMultiple Change-Point Detection and Analysis of Chromosome Copy Number Variations
Multiple Change-Point Detection and Analysis of Chromosome Copy Number Variations Yale School of Public Health Joint work with Ning Hao, Yue S. Niu presented @Tsinghua University Outline 1 The Problem
More informationarxiv: v1 [stat.ml] 10 Apr 2017
Integral Transforms from Finite Data: An Application of Gaussian Process Regression to Fourier Analysis arxiv:1704.02828v1 [stat.ml] 10 Apr 2017 Luca Amrogioni * and Eric Maris Radoud University, Donders
More informationFinQuiz Notes
Reading 9 A time series is any series of data that varies over time e.g. the quarterly sales for a company during the past five years or daily returns of a security. When assumptions of the regression
More informationA Brief Introduction to the R package StepSignalMargiLike
A Brief Introduction to the R package StepSignalMargiLike Chu-Lan Kao, Chao Du Jan 2015 This article contains a short introduction to the R package StepSignalMargiLike for estimating change-points in stepwise
More informationarxiv: v1 [math.st] 31 Mar 2009
The Annals of Statistics 2009, Vol. 37, No. 2, 619 629 DOI: 10.1214/07-AOS586 c Institute of Mathematical Statistics, 2009 arxiv:0903.5373v1 [math.st] 31 Mar 2009 AN ADAPTIVE STEP-DOWN PROCEDURE WITH PROVEN
More informationTable of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. T=number of type 2 errors
The Multiple Testing Problem Multiple Testing Methods for the Analysis of Microarray Data 3/9/2009 Copyright 2009 Dan Nettleton Suppose one test of interest has been conducted for each of m genes in a
More informationA Brief Introduction to the Matlab package StepSignalMargiLike
A Brief Introduction to the Matlab package StepSignalMargiLike Chu-Lan Kao, Chao Du Jan 2015 This article contains a short introduction to the Matlab package for estimating change-points in stepwise signals.
More information1. Define the following terms (1 point each): alternative hypothesis
1 1. Define the following terms (1 point each): alternative hypothesis One of three hypotheses indicating that the parameter is not zero; one states the parameter is not equal to zero, one states the parameter
More informationSimple Examples. Let s look at a few simple examples of OI analysis.
Simple Examples Let s look at a few simple examples of OI analysis. Example 1: Consider a scalar prolem. We have one oservation y which is located at the analysis point. We also have a ackground estimate
More informationSpectrum Opportunity Detection with Weak and Correlated Signals
Specum Opportunity Detection with Weak and Correlated Signals Yao Xie Department of Elecical and Computer Engineering Duke University orth Carolina 775 Email: yaoxie@dukeedu David Siegmund Department of
More informationResearch Article Sample Size Calculation for Controlling False Discovery Proportion
Probability and Statistics Volume 2012, Article ID 817948, 13 pages doi:10.1155/2012/817948 Research Article Sample Size Calculation for Controlling False Discovery Proportion Shulian Shang, 1 Qianhe Zhou,
More informationFixed-b Inference for Testing Structural Change in a Time Series Regression
Fixed- Inference for esting Structural Change in a ime Series Regression Cheol-Keun Cho Michigan State University imothy J. Vogelsang Michigan State University August 29, 24 Astract his paper addresses
More informationChaos and Dynamical Systems
Chaos and Dynamical Systems y Megan Richards Astract: In this paper, we will discuss the notion of chaos. We will start y introducing certain mathematical concepts needed in the understanding of chaos,
More informationIssues on quantile autoregression
Issues on quantile autoregression Jianqing Fan and Yingying Fan We congratulate Koenker and Xiao on their interesting and important contribution to the quantile autoregression (QAR). The paper provides
More informationarxiv: v1 [math.st] 1 Dec 2014
HOW TO MONITOR AND MITIGATE STAIR-CASING IN L TREND FILTERING Cristian R. Rojas and Bo Wahlberg Department of Automatic Control and ACCESS Linnaeus Centre School of Electrical Engineering, KTH Royal Institute
More informationEstimation of a Two-component Mixture Model
Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint
More informationPost-Selection Inference
Classical Inference start end start Post-Selection Inference selected end model data inference data selection model data inference Post-Selection Inference Todd Kuffner Washington University in St. Louis
More informationA sequential multiple change-point detection procedure via VIF regression
Comput Stat DOI 10.1007/s00180-015-0587-5 ORIGINAL PAPER A sequential multiple change-point detection procedure via VIF regression iaoping Shi 1 iang-sheng Wang 1 Dongwei Wei 1 Yuehua Wu 1 Received: 6
More informationGeneralized Seasonal Tapered Block Bootstrap
Generalized Seasonal Tapered Block Bootstrap Anna E. Dudek 1, Efstathios Paparoditis 2, Dimitris N. Politis 3 Astract In this paper a new lock ootstrap method for periodic times series called Generalized
More informationDepth versus Breadth in Convolutional Polar Codes
Depth versus Breadth in Convolutional Polar Codes Maxime Tremlay, Benjamin Bourassa and David Poulin,2 Département de physique & Institut quantique, Université de Sherrooke, Sherrooke, Quéec, Canada JK
More informationTIGHT BOUNDS FOR THE FIRST ORDER MARCUM Q-FUNCTION
TIGHT BOUNDS FOR THE FIRST ORDER MARCUM Q-FUNCTION Jiangping Wang and Dapeng Wu Department of Electrical and Computer Engineering University of Florida, Gainesville, FL 3611 Correspondence author: Prof.
More informationStatistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology Volume 5, Issue 1 2006 Article 28 A Two-Step Multiple Comparison Procedure for a Large Number of Tests and Multiple Treatments Hongmei Jiang Rebecca
More informationPost-selection Inference for Changepoint Detection
Post-selection Inference for Changepoint Detection Sangwon Hyun (Justin) Dept. of Statistics Advisors: Max G Sell, Ryan Tibshirani Committee: Will Fithian (UC Berkeley), Alessandro Rinaldo, Kathryn Roeder,
More information#A50 INTEGERS 14 (2014) ON RATS SEQUENCES IN GENERAL BASES
#A50 INTEGERS 14 (014) ON RATS SEQUENCES IN GENERAL BASES Johann Thiel Dept. of Mathematics, New York City College of Technology, Brooklyn, New York jthiel@citytech.cuny.edu Received: 6/11/13, Revised:
More informationEVALUATIONS OF EXPECTED GENERALIZED ORDER STATISTICS IN VARIOUS SCALE UNITS
APPLICATIONES MATHEMATICAE 9,3 (), pp. 85 95 Erhard Cramer (Oldenurg) Udo Kamps (Oldenurg) Tomasz Rychlik (Toruń) EVALUATIONS OF EXPECTED GENERALIZED ORDER STATISTICS IN VARIOUS SCALE UNITS Astract. We
More informationA moment-based method for estimating the proportion of true null hypotheses and its application to microarray gene expression data
Biostatistics (2007), 8, 4, pp. 744 755 doi:10.1093/biostatistics/kxm002 Advance Access publication on January 22, 2007 A moment-based method for estimating the proportion of true null hypotheses and its
More informationEstimating a Finite Population Mean under Random Non-Response in Two Stage Cluster Sampling with Replacement
Open Journal of Statistics, 07, 7, 834-848 http://www.scirp.org/journal/ojs ISS Online: 6-798 ISS Print: 6-78X Estimating a Finite Population ean under Random on-response in Two Stage Cluster Sampling
More informationThe miss rate for the analysis of gene expression data
Biostatistics (2005), 6, 1,pp. 111 117 doi: 10.1093/biostatistics/kxh021 The miss rate for the analysis of gene expression data JONATHAN TAYLOR Department of Statistics, Stanford University, Stanford,
More informationFalse discovery rate and related concepts in multiple comparisons problems, with applications to microarray data
False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data Ståle Nygård Trial Lecture Dec 19, 2008 1 / 35 Lecture outline Motivation for not using
More informationSupplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017
Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION By Degui Li, Peter C. B. Phillips, and Jiti Gao September 017 COWLES FOUNDATION DISCUSSION PAPER NO.
More informationSpiking problem in monotone regression : penalized residual sum of squares
Spiking prolem in monotone regression : penalized residual sum of squares Jayanta Kumar Pal 12 SAMSI, NC 27606, U.S.A. Astract We consider the estimation of a monotone regression at its end-point, where
More informationExpansion formula using properties of dot product (analogous to FOIL in algebra): u v 2 u v u v u u 2u v v v u 2 2u v v 2
Least squares: Mathematical theory Below we provide the "vector space" formulation, and solution, of the least squares prolem. While not strictly necessary until we ring in the machinery of matrix algera,
More informationOptional Stopping Theorem Let X be a martingale and T be a stopping time such
Plan Counting, Renewal, and Point Processes 0. Finish FDR Example 1. The Basic Renewal Process 2. The Poisson Process Revisited 3. Variants and Extensions 4. Point Processes Reading: G&S: 7.1 7.3, 7.10
More informationSample Size Estimation for Studies of High-Dimensional Data
Sample Size Estimation for Studies of High-Dimensional Data James J. Chen, Ph.D. National Center for Toxicological Research Food and Drug Administration June 3, 2009 China Medical University Taichung,
More informationTime-Varying Spectrum Estimation of Uniformly Modulated Processes by Means of Surrogate Data and Empirical Mode Decomposition
Time-Varying Spectrum Estimation of Uniformly Modulated Processes y Means of Surrogate Data and Empirical Mode Decomposition Azadeh Moghtaderi, Patrick Flandrin, Pierre Borgnat To cite this version: Azadeh
More informationNonparametric Estimation of Smooth Conditional Distributions
Nonparametric Estimation of Smooth Conditional Distriutions Bruce E. Hansen University of Wisconsin www.ssc.wisc.edu/~hansen May 24 Astract This paper considers nonparametric estimation of smooth conditional
More informationA Sufficient Condition for Optimality of Digital versus Analog Relaying in a Sensor Network
A Sufficient Condition for Optimality of Digital versus Analog Relaying in a Sensor Network Chandrashekhar Thejaswi PS Douglas Cochran and Junshan Zhang Department of Electrical Engineering Arizona State
More informationLuis Manuel Santana Gallego 100 Investigation and simulation of the clock skew in modern integrated circuits. Clock Skew Model
Luis Manuel Santana Gallego 100 Appendix 3 Clock Skew Model Xiaohong Jiang and Susumu Horiguchi [JIA-01] 1. Introduction The evolution of VLSI chips toward larger die sizes and faster clock speeds makes
More informationA Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data
A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data Faming Liang, Chuanhai Liu, and Naisyin Wang Texas A&M University Multiple Hypothesis Testing Introduction
More informationHigh-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018
High-Throughput Sequencing Course Multiple Testing Biostatistics and Bioinformatics Summer 2018 Introduction You have previously considered the significance of a single gene Introduction You have previously
More informationHypes and Other Important Developments in Statistics
Hypes and Other Important Developments in Statistics Aad van der Vaart Vrije Universiteit Amsterdam May 2009 The Hype Sparsity For decades we taught students that to estimate p parameters one needs n p
More informationNew Procedures for False Discovery Control
New Procedures for False Discovery Control Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Elisha Merriam Department of Neuroscience University
More informationReal option valuation for reserve capacity
Real option valuation for reserve capacity MORIARTY, JM; Palczewski, J doi:10.1016/j.ejor.2016.07.003 For additional information aout this pulication click this link. http://qmro.qmul.ac.uk/xmlui/handle/123456789/13838
More informationMATH 225: Foundations of Higher Matheamatics. Dr. Morton. 3.4: Proof by Cases
MATH 225: Foundations of Higher Matheamatics Dr. Morton 3.4: Proof y Cases Chapter 3 handout page 12 prolem 21: Prove that for all real values of y, the following inequality holds: 7 2y + 2 2y 5 7. You
More informationDefect Detection using Nonparametric Regression
Defect Detection using Nonparametric Regression Siana Halim Industrial Engineering Department-Petra Christian University Siwalankerto 121-131 Surabaya- Indonesia halim@petra.ac.id Abstract: To compare
More informationarxiv: v2 [stat.me] 14 Jul 2016
Multiple Change-point Detection: a Selective Overview Yue S. Niu, Ning Hao, and Heping Zhang University of Arizona and Yale University arxiv:1512.04093v2 [stat.me] 14 Jul 2016 July 15, 2016 Abstract Very
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2004 Paper 147 Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data Sunduz
More informationModel-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate
Model-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate Lucas Janson, Stanford Department of Statistics WADAPT Workshop, NIPS, December 2016 Collaborators: Emmanuel
More informationQUALITY CONTROL OF WINDS FROM METEOSAT 8 AT METEO FRANCE : SOME RESULTS
QUALITY CONTROL OF WINDS FROM METEOSAT 8 AT METEO FRANCE : SOME RESULTS Christophe Payan Météo France, Centre National de Recherches Météorologiques, Toulouse, France Astract The quality of a 30-days sample
More informationComments on A Time Delay Controller for Systems with Uncertain Dynamics
Comments on A Time Delay Controller for Systems with Uncertain Dynamics Qing-Chang Zhong Dept. of Electrical & Electronic Engineering Imperial College of Science, Technology, and Medicine Exhiition Rd.,
More informationTools and topics for microarray analysis
Tools and topics for microarray analysis USSES Conference, Blowing Rock, North Carolina, June, 2005 Jason A. Osborne, osborne@stat.ncsu.edu Department of Statistics, North Carolina State University 1 Outline
More informationSVETLANA KATOK AND ILIE UGARCOVICI (Communicated by Jens Marklof)
JOURNAL OF MODERN DYNAMICS VOLUME 4, NO. 4, 010, 637 691 doi: 10.3934/jmd.010.4.637 STRUCTURE OF ATTRACTORS FOR (a, )-CONTINUED FRACTION TRANSFORMATIONS SVETLANA KATOK AND ILIE UGARCOVICI (Communicated
More informationComment about AR spectral estimation Usually an estimate is produced by computing the AR theoretical spectrum at (ˆφ, ˆσ 2 ). With our Monte Carlo
Comment aout AR spectral estimation Usually an estimate is produced y computing the AR theoretical spectrum at (ˆφ, ˆσ 2 ). With our Monte Carlo simulation approach, for every draw (φ,σ 2 ), we can compute
More informationHIGH-DIMENSIONAL GRAPHS AND VARIABLE SELECTION WITH THE LASSO
The Annals of Statistics 2006, Vol. 34, No. 3, 1436 1462 DOI: 10.1214/009053606000000281 Institute of Mathematical Statistics, 2006 HIGH-DIMENSIONAL GRAPHS AND VARIABLE SELECTION WITH THE LASSO BY NICOLAI
More informationMiscellanea False discovery rate for scanning statistics
Biometrika (2011), 98,4,pp. 979 985 C 2011 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asr057 Miscellanea False discovery rate for scanning statistics BY D. O. SIEGMUND, N. R. ZHANG Department
More informationModel Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao
Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley
More informationA New Smoothing Model for Analyzing Array CGH Data
A New Smoothing Model for Analyzing Array CGH Data Nha Nguyen, Heng Huang, Soontorn Oraintara and An Vo Department of Electrical Engineering, University of Texas at Arlington Email: nhn3175@exchange.uta.edu,
More informationBumpbars: Inference for region detection. Yuval Benjamini, Hebrew University
Bumpbars: Inference for region detection Yuval Benjamini, Hebrew University yuvalbenj@gmail.com WHOA-PSI-2017 Collaborators Jonathan Taylor Stanford Rafael Irizarry Dana Farber, Harvard Amit Meir U of
More informationConfidence Thresholds and False Discovery Control
Confidence Thresholds and False Discovery Control Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry Wasserman Department of Statistics
More informationNon-specific filtering and control of false positives
Non-specific filtering and control of false positives Richard Bourgon 16 June 2009 bourgon@ebi.ac.uk EBI is an outstation of the European Molecular Biology Laboratory Outline Multiple testing I: overview
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationarxiv: v1 [hep-ex] 20 Jan 2013
Heavy-Flavor Results from CMS arxiv:.69v [hep-ex] Jan University of Colorado on ehalf of the CMS Collaoration E-mail: keith.ulmer@colorado.edu Heavy-flavor physics offers the opportunity to make indirect
More informationDoing Cosmology with Balls and Envelopes
Doing Cosmology with Balls and Envelopes Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry Wasserman Department of Statistics Carnegie
More informationSTRONG NORMALITY AND GENERALIZED COPELAND ERDŐS NUMBERS
#A INTEGERS 6 (206) STRONG NORMALITY AND GENERALIZED COPELAND ERDŐS NUMBERS Elliot Catt School of Mathematical and Physical Sciences, The University of Newcastle, Callaghan, New South Wales, Australia
More informationChapter 3: Statistical methods for estimation and testing. Key reference: Statistical methods in bioinformatics by Ewens & Grant (2001).
Chapter 3: Statistical methods for estimation and testing Key reference: Statistical methods in bioinformatics by Ewens & Grant (2001). Chapter 3: Statistical methods for estimation and testing Key reference:
More informationThe Mean Version One way to write the One True Regression Line is: Equation 1 - The One True Line
Chapter 27: Inferences for Regression And so, there is one more thing which might vary one more thing aout which we might want to make some inference: the slope of the least squares regression line. The
More informationEstimation of the False Discovery Rate
Estimation of the False Discovery Rate Coffee Talk, Bioinformatics Research Center, Sept, 2005 Jason A. Osborne, osborne@stat.ncsu.edu Department of Statistics, North Carolina State University 1 Outline
More informationOptimal Bandwidth Selection in Heteroskedasticity-Autocorrelation Robust Testing
Optimal Bandwidth Selection in Heteroskedasticity-Autocorrelation Roust esting Yixiao Sun Department of Economics University of California, San Diego Peter C. B. Phillips Cowles Foundation, Yale University,
More informationTest for Discontinuities in Nonparametric Regression
Communications of the Korean Statistical Society Vol. 15, No. 5, 2008, pp. 709 717 Test for Discontinuities in Nonparametric Regression Dongryeon Park 1) Abstract The difference of two one-sided kernel
More informationA Large-Sample Approach to Controlling the False Discovery Rate
A Large-Sample Approach to Controlling the False Discovery Rate Christopher R. Genovese Department of Statistics Carnegie Mellon University Larry Wasserman Department of Statistics Carnegie Mellon University
More informationAsymptotic inference for a nonstationary double ar(1) model
Asymptotic inference for a nonstationary double ar() model By SHIQING LING and DONG LI Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong maling@ust.hk malidong@ust.hk
More informationStructuring Unreliable Radio Networks
Structuring Unreliale Radio Networks Keren Censor-Hillel Seth Gilert Faian Kuhn Nancy Lynch Calvin Newport March 29, 2011 Astract In this paper we study the prolem of uilding a connected dominating set
More informationStep-down FDR Procedures for Large Numbers of Hypotheses
Step-down FDR Procedures for Large Numbers of Hypotheses Paul N. Somerville University of Central Florida Abstract. Somerville (2004b) developed FDR step-down procedures which were particularly appropriate
More informationarxiv: v4 [math.st] 29 Aug 2017
A Critical Value Function Approach, with an Application to Persistent Time-Series Marcelo J. Moreira, and Rafael Mourão arxiv:66.3496v4 [math.st] 29 Aug 27 Escola de Pós-Graduação em Economia e Finanças
More informationSupervised Classification for Functional Data Using False Discovery Rate and Multivariate Functional Depth
Supervised Classification for Functional Data Using False Discovery Rate and Multivariate Functional Depth Chong Ma 1 David B. Hitchcock 2 1 PhD Candidate University of South Carolina 2 Associate Professor
More informationUnderstanding Regressions with Observations Collected at High Frequency over Long Span
Understanding Regressions with Observations Collected at High Frequency over Long Span Yoosoon Chang Department of Economics, Indiana University Joon Y. Park Department of Economics, Indiana University
More informationSummary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing
Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Statistics Journal Club, 36-825 Beau Dabbs and Philipp Burckhardt 9-19-2014 1 Paper
More informationModel Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model
Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population
More informationA spatio-temporal model for extreme precipitation simulated by a climate model
A spatio-temporal model for extreme precipitation simulated by a climate model Jonathan Jalbert Postdoctoral fellow at McGill University, Montréal Anne-Catherine Favre, Claude Bélisle and Jean-François
More informationSupplement to: Guidelines for constructing a confidence interval for the intra-class correlation coefficient (ICC)
Supplement to: Guidelines for constructing a confidence interval for the intra-class correlation coefficient (ICC) Authors: Alexei C. Ionan, Mei-Yin C. Polley, Lisa M. McShane, Kevin K. Doin Section Page
More informationDivide-and-Conquer. Reading: CLRS Sections 2.3, 4.1, 4.2, 4.3, 28.2, CSE 6331 Algorithms Steve Lai
Divide-and-Conquer Reading: CLRS Sections 2.3, 4.1, 4.2, 4.3, 28.2, 33.4. CSE 6331 Algorithms Steve Lai Divide and Conquer Given an instance x of a prolem, the method works as follows: divide-and-conquer
More informationCPM: A Covariance-preserving Projection Method
CPM: A Covariance-preserving Projection Method Jieping Ye Tao Xiong Ravi Janardan Astract Dimension reduction is critical in many areas of data mining and machine learning. In this paper, a Covariance-preserving
More informationTime Series and Forecasting Lecture 4 NonLinear Time Series
Time Series and Forecasting Lecture 4 NonLinear Time Series Bruce E. Hansen Summer School in Economics and Econometrics University of Crete July 23-27, 2012 Bruce Hansen (University of Wisconsin) Foundations
More informationModels for models. Douglas Nychka Geophysical Statistics Project National Center for Atmospheric Research
Models for models Douglas Nychka Geophysical Statistics Project National Center for Atmospheric Research Outline Statistical models and tools Spatial fields (Wavelets) Climate regimes (Regression and clustering)
More informationStatistical Inference
Statistical Inference Liu Yang Florida State University October 27, 2016 Liu Yang, Libo Wang (Florida State University) Statistical Inference October 27, 2016 1 / 27 Outline The Bayesian Lasso Trevor Park
More informationExample: Bipolar NRZ (non-return-to-zero) signaling
Baseand Data Transmission Data are sent without using a carrier signal Example: Bipolar NRZ (non-return-to-zero signaling is represented y is represented y T A -A T : it duration is represented y BT. Passand
More informationMultiple Testing. Hoang Tran. Department of Statistics, Florida State University
Multiple Testing Hoang Tran Department of Statistics, Florida State University Large-Scale Testing Examples: Microarray data: testing differences in gene expression between two traits/conditions Microbiome
More informationParametric Empirical Bayes Methods for Microarrays
Parametric Empirical Bayes Methods for Microarrays Ming Yuan, Deepayan Sarkar, Michael Newton and Christina Kendziorski April 30, 2018 Contents 1 Introduction 1 2 General Model Structure: Two Conditions
More informationA nonparametric test for seasonal unit roots
Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna To be presented in Innsbruck November 7, 2007 Abstract We consider a nonparametric test for the
More informationTesting near or at the Boundary of the Parameter Space (Job Market Paper)
Testing near or at the Boundary of the Parameter Space (Jo Market Paper) Philipp Ketz Brown University Novemer 7, 24 Statistical inference aout a scalar parameter is often performed using the two-sided
More informationMinimum Hellinger Distance Estimation in a. Semiparametric Mixture Model
Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.
More informationIN this paper, we consider the estimation of the frequency
Iterative Frequency Estimation y Interpolation on Fourier Coefficients Elias Aoutanios, MIEEE, Bernard Mulgrew, MIEEE Astract The estimation of the frequency of a complex exponential is a prolem that is
More informationFast estimation of posterior probabilities in change-point analysis through a constrained hidden Markov model
arxiv:1203.4394v2 [stat.ap] 22 Jan 2013 Fast estimation of posterior probabilities in change-point analysis through a constrained hidden Markov model The Minh Luong, Yves Rozenholc, and Gregory Nuel MAP5,
More informationProbabilistic Inference for Multiple Testing
This is the title page! This is the title page! Probabilistic Inference for Multiple Testing Chuanhai Liu and Jun Xie Department of Statistics, Purdue University, West Lafayette, IN 47907. E-mail: chuanhai,
More informationTesting equality of autocovariance operators for functional time series
Testing equality of autocovariance operators for functional time series Dimitrios PILAVAKIS, Efstathios PAPARODITIS and Theofanis SAPATIAS Department of Mathematics and Statistics, University of Cyprus,
More informationSanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract
Adaptive Controls of FWER and FDR Under Block Dependence arxiv:1611.03155v1 [stat.me] 10 Nov 2016 Wenge Guo Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102, U.S.A.
More informationarxiv: v1 [cs.fl] 24 Nov 2017
(Biased) Majority Rule Cellular Automata Bernd Gärtner and Ahad N. Zehmakan Department of Computer Science, ETH Zurich arxiv:1711.1090v1 [cs.fl] 4 Nov 017 Astract Consider a graph G = (V, E) and a random
More informationModule 9: Further Numbers and Equations. Numbers and Indices. The aim of this lesson is to enable you to: work with rational and irrational numbers
Module 9: Further Numers and Equations Lesson Aims The aim of this lesson is to enale you to: wor with rational and irrational numers wor with surds to rationalise the denominator when calculating interest,
More informationROBUST FREQUENCY DOMAIN ARMA MODELLING. Jonas Gillberg Fredrik Gustafsson Rik Pintelon
ROBUST FREQUENCY DOMAIN ARMA MODELLING Jonas Gillerg Fredrik Gustafsson Rik Pintelon Department of Electrical Engineering, Linköping University SE-581 83 Linköping, Sweden Email: gillerg@isy.liu.se, fredrik@isy.liu.se
More information