Package FDRSeg. September 20, 2017

Type Package
Title FDR-Control in Multiscale Change-Point Segmentation
Version 1.0-3
Date 2017-09-20
Author Housen Li [aut], Hannes Sieling [aut], Timo Aspelmeier [cre]
Maintainer Timo Aspelmeier <timo.aspelmeier@mathematik.uni-goettingen.de>
Description Estimate step functions via multiscale inference with controlled false discovery rate (FDR). For details see H. Li, A. Munk and H. Sieling (2016) <doi:10.1214/16-ejs1131>.
Depends R (>= 2.15.3)
License GPL-3
Imports Rcpp (>= 0.11.5), stepr (>= 1.0-1), stats
LinkingTo Rcpp
NeedsCompilation yes
Repository CRAN
Date/Publication 2017-09-20 08:34:11 UTC

R topics documented:

FDRSeg-package
computefdp
dfdrseg
evalstepfun
fdrseg
simulquantile
smuce
teethfun
v_measure
Index

FDRSeg-package          FDR-Control in Multiscale Change-Point Segmentation

Description

Estimate step functions via multiscale inference with controlled false discovery rate (FDR). For details see H. Li, A. Munk and H. Sieling (2016) <doi:10.1214/16-ejs1131>.

Details

Package:   FDRSeg
Type:      Package
Version:   1.0-3
Date:      2017-09-20
License:   The GNU General Public License

Index:

computefdp       Compute false discovery proportion (FDP)
dfdrseg          Piecewise constant regression with D-FDRSeg
evalstepfun      Evaluate step function
fdrseg           Piecewise constant regression with FDRSeg
simulquantile    Quantile simulations
smuce            Piecewise constant regression with SMUCE
teethfun         Teeth function
v_measure        Compute V-measure

Author(s)

Housen Li [aut], Hannes Sieling [aut], Timo Aspelmeier [cre]

Maintainer: Timo Aspelmeier <timo.aspelmeier@mathematik.uni-goettingen.de>

References

Frick, K., Munk, A., and Sieling, H. (2014). Multiscale Change-Point Inference. J. R. Statist. Soc. B, with discussion and rejoinder by the authors, 76:495-580.

Hotz, T., Schuette, O. M., Sieling, H., Polupanow, T., Diederichsen, U., Steinem, C., and Munk, A. (2013). Idealizing ion channel recordings by a jump segmentation multiresolution filter. IEEE Transactions on Nanobioscience, 12(4):376-86.

Li, H., Munk, A., and Sieling, H. (2015). FDR-control in multiscale change-point segmentation. arXiv:1412.5844.

See Also

smucer, jsmurf

Examples

    library(stepr)

    ## (I) Independent Gaussian Data
    # simulate data
    n <- 300  # number of observations
    K <- 20   # number of change-points
    u0 <- teethfun(n, K)
    set.seed(2)
    y <- rnorm(n, u0, 0.3)

    # plot data
    plot(y, pch = 20, col = "grey", ylab = "")
    lines(u0, type = "s")

    # estimate standard deviation
    sd <- sdrobnorm(y)

    # simulate quantiles
    alpha <- 0.1
    qs <- simulquantile(1 - alpha, n, type = "smuce")    # for SMUCE
    qfs <- simulquantile(1 - alpha, n, type = "fdrseg")  # for FDRSeg

    # compute estimates
    us <- smuce(y, qs, sd = sd)     # SMUCE
    ufs <- fdrseg(y, qfs, sd = sd)  # FDRSeg

    # plot results
    lines(evalstepfun(us), type = "s", col = "blue")
    lines(evalstepfun(ufs), type = "s", col = "red")
    legend("topleft", c("truth", "SMUCE", "FDRSeg"), lty = c(1, 1, 1),
           col = c("black", "blue", "red"))

    ## (II) Dependent Gaussian Data
    # simulate data (a continuous time Markov chain)
    ts <- 0.1        # sampling time
    SNR <- 3         # signal-to-noise ratio
    sampling <- 1e4  # sampling rate 10 kHz
    over <- 10       # tenfold oversampling
    cutoff <- 1e3    # 1 kHz 4-pole Bessel-filter, adjusted for oversampling
    simdf <- dfilter("bessel", list(pole=4, cutoff=cutoff/sampling/over))
    transrate <- 50
    rates <- rbind(c(0, transrate), c(transrate, 0))
    set.seed(123)
    sim <- contmc(ts*sampling, c(0,SNR), rates, sampling = sampling,
                  family = "gausskern", param = list(df=simdf, over=over, sd=1))
    y <- sim$data$y
    x <- sim$data$x

    # D-FDRSeg
    convkern <- dfilter("bessel", list(pole=4, cutoff=cutoff/sampling))$kern
    alpha <- 0.1
    r <- 10  # r could be much larger

    qdfs <- simulquantile(1 - alpha, ts*sampling, r, "dfdrseg", convkern)
    udfs <- dfdrseg(y, qdfs, convkern = convkern)

    # plot results
    plot(x, y, pch = 20, col = "grey", xlab = "", ylab = "",
         main = "Simulate Ion Channel Data")
    lines(sim$discr, col = "blue")
    lines(x, evalstepfun(udfs), col = "red")
    legend("topleft", c("truth", "D-FDRSeg"), lty = c(1, 1), col = c("blue", "red"))

computefdp              Compute false discovery proportion (FDP)

Description

Compute the false discovery proportion for estimated change-points; see (Li et al., 2015) for a detailed explanation.

Usage

    computefdp(u, ej)

Arguments

u     true signal; a numeric vector
ej    estimated change-points; a numeric vector

Value

A scalar taking values in [0, 1].

References

Li, H., Munk, A., and Sieling, H. (2015). FDR-control in multiscale change-point segmentation. arXiv:1412.5844.

See Also

fdrseg, v_measure

Examples

    # simulate data
    set.seed(2)
    u0 <- c(rep(1, 50), rep(5, 50))
    y <- rnorm(100, u0)

    # compute FDRSeg
    uh <- fdrseg(y)

    plot(y, pch = 20, col = "grey", xlab = "", ylab = "")
    lines(u0, type = "s", col = "blue")
    lines(evalstepfun(uh), type = "s", col = "red")
    legend("topleft", c("truth", "FDRSeg"), lty = c(1, 1), col = c("blue", "red"))

    # compute false discovery proportion
    fdp <- computefdp(u0, uh$left)
    cat("false discovery proportion is ", fdp, "\n")

dfdrseg                 Piecewise constant regression with D-FDRSeg

Description

Compute the D-FDRSeg estimator for one-dimensional data with dependent Gaussian noise, especially for ion channel recordings; see (Hotz et al., 2013; Li et al., 2015) for further details.

Usage

    dfdrseg(y, q, alpha = 0.1, r = round(50/min(alpha, 1-alpha)),
            convkern, sd = stepr::sdrobnorm(y, lag=length(convkern)+1))

Arguments

y           a numeric vector containing the noisy data
q           threshold value; a numeric vector of the same length as the data
alpha       significance level; if q is missing, q is chosen as the (1-alpha)-quantile of the null distribution of the multiscale statistic via Monte Carlo simulation, see (Li et al., 2015) for an explanation
r           number of Monte Carlo simulations
convkern    kernel of the low-pass filter, see (Li et al., 2015)
sd          standard deviation of the noise

Value

A list with components

value    function values on each segment of the estimator
left     indices of leftmost points within each segment of the estimator
n        number of samples

References

Hotz, T., Schuette, O. M., Sieling, H., Polupanow, T., Diederichsen, U., Steinem, C., and Munk, A. (2013). Idealizing ion channel recordings by a jump segmentation multiresolution filter. IEEE Transactions on Nanobioscience, 12(4):376-86.

Li, H., Munk, A., and Sieling, H. (2015). FDR-control in multiscale change-point segmentation. arXiv:1412.5844.

See Also

smuce, dfdrseg, jsmurf, simulquantile, sdrobnorm, contmc, dfilter, evalstepfun

Examples

    library(stepr)

    # simulate data (a continuous time Markov chain)
    ts <- 0.1        # sampling time
    SNR <- 3         # signal-to-noise ratio
    sampling <- 1e4  # sampling rate 10 kHz
    over <- 10       # tenfold oversampling
    cutoff <- 1e3    # 1 kHz 4-pole Bessel-filter, adjusted for oversampling
    simdf <- dfilter("bessel", list(pole=4, cutoff=cutoff/sampling/over))
    transrate <- 50
    rates <- rbind(c(0, transrate), c(transrate, 0))
    set.seed(123)
    sim <- contmc(ts*sampling, c(0,SNR), rates, sampling = sampling,
                  family = "gausskern", param = list(df=simdf, over=over, sd=1))
    y <- sim$data$y
    x <- sim$data$x

    # D-FDRSeg
    convkern <- dfilter("bessel", list(pole=4, cutoff=cutoff/sampling))$kern
    uh <- dfdrseg(y, convkern = convkern, r = 10)  # r could be much larger

    # plot results
    plot(x, y, pch = 20, col = "grey", xlab = "", ylab = "",
         main = "Simulate Ion Channel Data")
    lines(sim$discr, col = "blue")
    lines(x, evalstepfun(uh), col = "red")
    legend("topleft", c("truth", "D-FDRSeg"), lty = c(1, 1), col = c("blue", "red"))

    ## Not run:
    # alternatively simulate quantiles first
    alpha <- 0.1
    q <- simulquantile(1 - alpha, ts*sampling, type = "dfdrseg",
                       convkern = convkern)
    # then compute the estimate
    uh <- dfdrseg(y, q, convkern = convkern)
    ## End(Not run)

evalstepfun             Evaluate step function

Description

Transform the value returned by smuce, fdrseg, or dfdrseg into a numeric vector.
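The transformation can be pictured with a few lines of base R. This is only a sketch: eval_step_sketch is a name made up here, and it assumes that left holds 1-based start indices in increasing order beginning at 1, which matches the documented components (value, left, n) but is not taken from the package source.

```r
# Hypothetical stand-in for evalstepfun (not the package's implementation).
# Assumes stepf$left is increasing, 1-based, and starts at 1.
eval_step_sketch <- function(stepf) {
  # length of each segment: distance from one left end to the next
  seg_len <- diff(c(stepf$left, stepf$n + 1))
  # repeat each segment value over its segment
  rep(stepf$value, times = seg_len)
}

# a hand-made step function with one change-point at index 6
stepf <- list(value = c(1, 5), left = c(1, 6), n = 10)
u <- eval_step_sketch(stepf)
print(u)  # 1 1 1 1 1 5 5 5 5 5
```

Under the same assumption, stepf$left[-1] lists the indices at which the fitted step function jumps.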

Usage

    evalstepfun(stepf)

Arguments

stepf    a list returned by smuce, fdrseg, or dfdrseg with components
         value    function values on each segment of the estimator
         left     indices of leftmost points within each segment of the estimator
         n        number of samples

Value

A numeric vector giving the function values of stepf at the sampling locations.

See Also

smuce, fdrseg, dfdrseg

Examples

    # simulate data
    set.seed(2)
    u0 <- c(rep(1, 5), rep(5, 5))
    y <- rnorm(10, u0)

    # compute the SMUCE estimate
    uh <- smuce(y)

    # print results
    # step function returned by smuce
    print(uh)
    # vector returned by evalstepfun
    print(evalstepfun(uh))

fdrseg                  Piecewise constant regression with FDRSeg

Description

Compute the FDRSeg estimator for one-dimensional data with i.i.d. Gaussian noise.

Usage

    fdrseg(y, q, alpha = 0.1, r = round(50/min(alpha, 1-alpha)),
           sd = stepr::sdrobnorm(y))
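The default for r in the usage above depends on alpha. The lines below merely evaluate that default expression for a few significance levels; default_r is a helper name introduced here for illustration and does not exist in the package.

```r
# default_r is a hypothetical helper that copies the default expression for r
default_r <- function(alpha) round(50 / min(alpha, 1 - alpha))

# number of Monte Carlo simulations implied by a few significance levels
r_values <- sapply(c(0.05, 0.1, 0.5), default_r)
print(r_values)  # 1000 500 100
```

For alpha below 0.5, a smaller significance level therefore implies more Monte Carlo runs by default, which is worth keeping in mind when q is simulated on the fly.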

Arguments

y        a numeric vector containing the noisy data
q        threshold value; a numeric vector of the same length as the data
alpha    significance level; if q is missing, q is chosen as the (1-alpha)-quantile of the null distribution of the multiscale statistic via Monte Carlo simulation, see (Li et al., 2015) for an explanation
r        number of Monte Carlo simulations
sd       standard deviation of the noise

Value

A list with components

value    function values on each segment of the estimator
left     indices of leftmost points within each segment of the estimator
n        number of samples

References

Li, H., Munk, A., and Sieling, H. (2015). FDR-control in multiscale change-point segmentation. arXiv:1412.5844.

See Also

smuce, dfdrseg, simulquantile, sdrobnorm, evalstepfun, computefdp, v_measure

Examples

    # simulate data
    set.seed(123)
    u0 <- c(rep(1, 50), rep(5, 50))
    y <- rnorm(100, u0)

    # compute the estimate (q is automatically simulated)
    # it might take a while due to simulating quantiles and will
    # be faster for later calls on signals of the same length
    uh <- fdrseg(y)

    # plot result
    plot(y, pch = 20, col = "grey", ylab = "", main = expression(alpha*" = 0.1"))
    lines(u0, type = "s", col = "blue")
    lines(evalstepfun(uh), type = "s", col = "red")
    legend("topleft", c("truth", "FDRSeg"), lty = c(1, 1), col = c("blue", "red"))

    # other choice of alpha
    uh <- fdrseg(y, alpha = 0.05)

    # plot result
    plot(y, pch = 20, col = "grey", ylab = "", main = expression(alpha*" = 0.05"))

    lines(u0, type = "s", col = "blue")
    lines(evalstepfun(uh), type = "s", col = "red")
    legend("topleft", c("truth", "FDRSeg"), lty = c(1, 1), col = c("blue", "red"))

    ## Not run:
    # alternatively simulate quantiles first
    alpha <- 0.1
    q <- simulquantile(1 - alpha, 100, type = "fdrseg")
    # then compute the estimate
    uh <- fdrseg(y, q)
    ## End(Not run)

simulquantile           Quantile simulations

Description

Simulate the quantiles of the multiscale statistics for SMUCE, FDRSeg, and D-FDRSeg under the null hypothesis.

Usage

    simulquantile(alpha, n, r = round(50/min(alpha, 1-alpha)),
                  type = c("smuce","fdrseg","dfdrseg"), convkern, pos = .GlobalEnv)

Arguments

alpha       a scalar with values in [0, 1]; the alpha-quantile of the null distribution of the multiscale statistic for SMUCE, FDRSeg, or D-FDRSeg is computed via Monte Carlo simulation, see (Frick et al., 2014; Hotz et al., 2013; Li et al., 2015) for an explanation
n           number of observations
r           number of Monte Carlo simulations
type        "smuce"      simulate quantile for SMUCE
            "fdrseg"     simulate quantiles for FDRSeg
            "dfdrseg"    simulate quantiles for D-FDRSeg
convkern    convolution kernel, only needed when type is "dfdrseg"
pos         environment for saving the simulations for possible later usage

Value

A scalar value if type is chosen as "smuce"; a numeric vector of length n if type is chosen as "fdrseg" or "dfdrseg".
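To make the idea of a simulated quantile concrete, here is a small base-R sketch for the scalar "smuce" case only. It assumes the Gaussian multiscale statistic has the form given in Frick et al. (2014), that is, the maximum over all intervals of the absolute standardized partial sum minus the penalty sqrt(2 log(e n / length)), with unit noise variance. The names multiscale_stat and mc_quantile are invented here; the package's compiled code, its caching in pos, and the vector-valued quantiles for FDRSeg and D-FDRSeg are not reproduced.

```r
# Multiscale statistic of pure-noise data z (unit variance assumed).
# Hypothetical sketch; the form of the statistic is an assumption.
multiscale_stat <- function(z) {
  n <- length(z)
  cs <- c(0, cumsum(z))  # cs[k + 1] = z[1] + ... + z[k]
  best <- -Inf
  for (len in seq_len(n)) {
    starts <- seq_len(n - len + 1)
    sums <- cs[starts + len] - cs[starts]  # sums over all intervals of this length
    penalty <- sqrt(2 * log(exp(1) * n / len))
    best <- max(best, max(abs(sums)) / sqrt(len) - penalty)
  }
  best
}

# Empirical quantile of the statistic over r simulated noise vectors.
mc_quantile <- function(prob, n, r) {
  stats <- replicate(r, multiscale_stat(rnorm(n)))
  unname(quantile(stats, prob))
}

set.seed(1)
q_sketch <- mc_quantile(0.9, n = 50, r = 200)
print(q_sketch)
```

The loop costs on the order of n^2 operations per simulated vector, so this sketch is only practical for small n and r.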

References

Frick, K., Munk, A., and Sieling, H. (2014). Multiscale Change-Point Inference. J. R. Statist. Soc. B, with discussion and rejoinder by the authors, 76:495-580.

Hotz, T., Schuette, O. M., Sieling, H., Polupanow, T., Diederichsen, U., Steinem, C., and Munk, A. (2013). Idealizing ion channel recordings by a jump segmentation multiresolution filter. IEEE Transactions on Nanobioscience, 12(4):376-86.

Li, H., Munk, A., and Sieling, H. (2015). FDR-control in multiscale change-point segmentation. arXiv:1412.5844.

See Also

smuce, fdrseg, dfdrseg

Examples

    library(stepr)

    # simulate quantiles for independent Gaussian noise
    qs <- simulquantile(0.9, 100, type = "smuce")
    qfs <- simulquantile(0.9, 100, type = "fdrseg")

    # plot result
    yrng <- range(qs, qfs)
    plot(qfs, pch = 20, ylim = yrng, xlab = "n", ylab = "")
    abline(h = qs)

    # simulate quantiles for dependent Gaussian noise
    convkern <- dfilter("bessel")$kern  # create digital filters
    qdfs <- simulquantile(0.9, 100, type = "dfdrseg", convkern = convkern)
    plot(qdfs, pch = 20, xlab = "n", ylab = "")

smuce                   Piecewise constant regression with SMUCE

Description

Compute the SMUCE estimator for one-dimensional data with i.i.d. Gaussian noise.

Usage

    smuce(y, q, alpha = 0.1, r = round(50/min(alpha, 1-alpha)),
          sd = stepr::sdrobnorm(y))

Arguments

y        a numeric vector containing the noisy data
q        threshold value; a scalar number
alpha    significance level; if q is missing, q is chosen as the (1-alpha)-quantile of the null distribution of the multiscale statistic via Monte Carlo simulation, see (Frick et al., 2014) for an explanation

r        number of Monte Carlo simulations
sd       standard deviation of the noise

Value

A list with components

value    function values on each segment of the estimator
left     indices of leftmost points within each segment of the estimator
n        number of samples

Note

This is an efficient implementation of function smucer in R package stepr (CRAN) for data with i.i.d. Gaussian noise. The detailed algorithm is described in (Sieling, 2013).

References

Frick, K., Munk, A., and Sieling, H. (2014). Multiscale Change-Point Inference. J. R. Statist. Soc. B, with discussion and rejoinder by the authors, 76:495-580.

Sieling, H. (2013). Statistical Multiscale Segmentation: Inference, Algorithms and Applications. PhD thesis, University of Goettingen, Germany.

See Also

fdrseg, dfdrseg, simulquantile, sdrobnorm, evalstepfun, computefdp, v_measure

Examples

    # simulate data
    set.seed(2)
    u0 <- c(rep(1, 50), rep(5, 50))
    y <- rnorm(100, u0)

    # compute the estimate (q is automatically simulated)
    uh <- smuce(y)

    # plot result
    plot(y, pch = 20, col = "grey", ylab = "", main = expression(alpha*" = 0.1"))
    lines(u0, type = "s", col = "blue")
    lines(evalstepfun(uh), type = "s", col = "red")
    legend("topleft", c("truth", "SMUCE"), lty = c(1, 1), col = c("blue", "red"))

    # other choice of alpha
    uh <- smuce(y, alpha = 0.05)

    # plot result
    plot(y, pch = 20, col = "grey", ylab = "", main = expression(alpha*" = 0.05"))
    lines(u0, type = "s", col = "blue")
    lines(evalstepfun(uh), type = "s", col = "red")

    legend("topleft", c("truth", "SMUCE"), lty = c(1, 1), col = c("blue", "red"))

    ## Not run:
    # alternatively simulate quantiles first
    alpha <- 0.1
    q <- simulquantile(1 - alpha, 100, type = "smuce")
    # then compute the estimate
    uh <- smuce(y, q)
    ## End(Not run)

teethfun                Teeth function

Description

Create the teeth function with a specified length and number of change-points.

Usage

    teethfun(n, K, h = 3)

Arguments

n    length of the vector (values of the teeth function)
K    number of change-points
h    height of the jump

Value

A numeric vector giving the values of the teeth function.

References

Fryzlewicz, P. (2014). Wild binary segmentation for multiple change-point detection. Ann. Statist., 42(6):2243-2281.

Examples

    # create teeth function
    u <- teethfun(100, 6)

    # plot
    plot(u, type = "s")
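For readers without the package at hand, a teeth-like signal can be imitated in base R. The sketch below is an assumption about the general shape only (K + 1 segments of roughly equal length alternating between 0 and h); teeth_sketch is a made-up name, and the package's teethfun may place the jumps or choose the levels differently.

```r
# Hypothetical imitation of a teeth function; not the package's teethfun.
teeth_sketch <- function(n, K, h = 3) {
  stopifnot(K >= 0, K < n)
  # segment number (1 .. K + 1) of each position, segments roughly equal in length
  seg <- ceiling(seq_len(n) * (K + 1) / n)
  # even-numbered segments are raised to height h, odd-numbered ones stay at 0
  h * (seg %% 2 == 0)
}

u_sketch <- teeth_sketch(100, 6)
n_jumps <- sum(diff(u_sketch) != 0)
print(n_jumps)  # 6
```

Plotting u_sketch with plot(u_sketch, type = "s") gives the characteristic comb shape.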

v_measure               Compute V-measure

Description

Compute the V-measure, a segmentation evaluation measure based on two criteria for clustering usefulness: homogeneity and completeness.

Usage

    v_measure(sig, est, beta = 1)

Arguments

sig     true signal; a numeric vector
est     estimator; a numeric vector
beta    parameter in the definition of the V-measure, see (Rosenberg and Hirschberg, 2007) for details

Value

A scalar taking values in [0, 1], with a larger value indicating higher accuracy.

References

Rosenberg, A., and Hirschberg, J. (2007). V-measure: a conditional entropy-based external cluster evaluation measure. Proc. Conf. Empirical Methods Natural Lang. Process., (June):410-420.

See Also

computefdp, smuce, fdrseg, evalstepfun

Examples

    # simulate data
    u0 <- c(rep(1, 50), rep(5, 50))
    y <- rnorm(100, u0)

    # compute FDRSeg
    uh <- fdrseg(y)
    plot(y, pch = 20, col = "grey", xlab = "", ylab = "")
    lines(u0, type = "s", col = "blue")
    lines(evalstepfun(uh), type = "s", col = "red")
    legend("topleft", c("truth", "FDRSeg"), lty = c(1, 1), col = c("blue", "red"))

    # compute V-measure
    vm <- v_measure(u0, evalstepfun(uh))
    print(vm)
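The two criteria can be spelled out in base R following the entropy-based definition of Rosenberg and Hirschberg (2007): homogeneity is 1 - H(C|K)/H(C), completeness is 1 - H(K|C)/H(K), and the V-measure is their weighted harmonic mean (1 + beta) h c / (beta h + c). The sketch below works on two label vectors; v_measure_sketch and entropy are names invented here, and how the package derives labels from sig and est is not shown in this manual, so that step is left out.

```r
# Shannon entropy of a probability vector (zero cells are skipped)
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Hypothetical V-measure on label vectors; not the package's v_measure.
v_measure_sketch <- function(classes, clusters, beta = 1) {
  stopifnot(length(classes) == length(clusters))
  joint <- table(classes, clusters) / length(classes)  # joint distribution
  h_c <- entropy(rowSums(joint))                       # H(C)
  h_k <- entropy(colSums(joint))                       # H(K)
  h_joint <- entropy(as.vector(joint))                 # H(C, K)
  # conditional entropies via H(C|K) = H(C, K) - H(K), and likewise for H(K|C)
  homogeneity <- if (h_c == 0) 1 else 1 - (h_joint - h_k) / h_c
  completeness <- if (h_k == 0) 1 else 1 - (h_joint - h_c) / h_k
  if (beta * homogeneity + completeness == 0) return(0)
  (1 + beta) * homogeneity * completeness / (beta * homogeneity + completeness)
}

truth <- rep(1:2, each = 50)
v_same <- v_measure_sketch(truth, truth)          # perfect agreement
v_one <- v_measure_sketch(truth, rep(1, 100))     # everything in one cluster
print(c(v_same, v_one))
```

A perfect segmentation scores 1, while merging everything into a single cluster is complete but not homogeneous and scores 0.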

Index

Topic nonparametric: computefdp, dfdrseg, evalstepfun, fdrseg, FDRSeg-package, simulquantile, smuce, teethfun, v_measure

Topic package: FDRSeg-package

Entries: computefdp, contmc, dfdrseg, dfilter, evalstepfun, FDRSeg (FDRSeg-package), fdrseg, FDRSeg-package, jsmurf, sdrobnorm, simulquantile, smuce, smucer, teethfun, v_measure