
A spline-based capture-mark-recapture model applied to estimating the number of steelhead within the Bulkley River passing the Moricetown Canyon in

Carl James Schwarz (PStat (ASA, SSC))
Department of Statistics and Actuarial Science
Simon Fraser University
8888 University Drive
Burnaby, BC V5A 1S6

Simon J. Bonner
Department of Statistics
University of Kentucky

Summary

A key assumption of capture-mark-recapture studies is homogeneity of catchability, i.e. all fish have the same probability of capture. This assumption is often violated in studies of migrating fish populations where fish are marked and recaptured over several weeks, but each individual fish is only available for capture for a shorter period of time. The capture-mark-recapture studies of steelhead on the Bulkley River fall into this category: fish are marked, released, and recaptured over several months, but individual fish migrate from the release site at the beach past the canyon in two or three weeks. Heterogeneity in catchability may occur because of changes in effort or changes in conditions (e.g. flow) that affect catchability. Estimation methods that ignore heterogeneity in catchability, such as the pooled-Petersen estimator, may be biased (i.e. tend to over- or under-estimate the true population size), and their estimates of uncertainty (e.g. the standard error) are often understated.

Stratification is a device to account for heterogeneity in catchability. The study is temporally stratified (e.g. by weeks), and estimation methods allow for different catchability among weeks. The penalty for such methods is an increase in the number of parameters and a reduction in precision. These stratified methods often fail to work in practice because of sparse and low counts in the recapture matrix. They require ad hoc decisions on pooling strata and do not account for the pooling process in the final estimates of uncertainty.

Recently, Bonner and Schwarz (in press) introduced a new methodology based on splines to model the run distribution and Bayesian hierarchical models to share information among strata. This revised model explicitly recognizes and uses the temporal stratification to avoid ad hoc pooling. The model is self-calibrating in the sense that, for sparse data, the final model is comparable to a simple pooled-Petersen, but for rich datasets the underlying model can take quite complex forms.

We applied the spline-based model to 10 years of capture-mark-recapture data on steelhead on the Bulkley River, British Columbia. The results were compared to previous estimates based on complete pooling and likelihood-based methods. There was evidence of heterogeneity in catchability for all but three years. Despite this, all of the methods gave similar estimates and similar levels of uncertainty. Nevertheless, the spline-based model has several advantages over the other methods. It does not require any pooling decisions; it can deal with sparse data; it provides estimates of abundance for every stratum; it can deal with cases where information is missing from some strata; and it can provide estimates of derived parameters such as the time until a target number of fish has passed the recapture site.

A key assumption of the spline (and other estimators) is population closure, in particular that all fish move past the recapture site. In the case of mortality (or events that act like mortality, e.g. fallback) that affects both marked and unmarked fish equally and is homogeneous in time, estimates of total abundance refer instead to the number of fish at the release site.

Estimates of abundance of fish passing the recapture site must then be interpreted carefully, as they no longer have a direct interpretation. Unfortunately, there is no information in the current protocol to estimate this mortality.

The spline method is also easily extended to deal with fallback from the release site, where not all released and unmarked fish migrate past the recapture location. We used estimates of fallback from the telemetry study of Welch et al. (2009) to adjust the estimates of abundance. This adjustment provides estimates of fish passing the canyon, and the estimates of abundance for individual weeks become directly interpretable.

While the temporal distribution of movement between the release site and recapture site appears to be consistent over years, there is sufficient variability in the movement distribution across years and within years (among release groups) that using this prior information does not lead to improvements in precision at the effort levels used in this study. It may prove more useful in studies with very small sample sizes, but in that case there is a great danger that year-specific changes in the movement distribution will be masked by the information in the prior, leading to biases in estimates of abundance.

A simple empirical planning tool was developed to predict the level of uncertainty that could be expected from the spline model under various levels of effort. Reductions in release effort (e.g. releases every second week) have less of an impact on estimates of abundance than similar reductions in recapture effort.

The spline method can also be used for in-season estimates of the total run size, but the uncertainty is relatively poor until most of the run has been sampled. However, the time to reach a target (e.g. the time needed for 10,000 fish to arrive[1]) appears to have potential for serving as a proxy for the total run size relatively early in the season.

Finally, the current report treats each year independently of other years. It should be possible to use all years of data to fit a multi-year Jolly-Seber model to estimate yearly abundances, survival among years, and population growth and decline.

[1] The target of 10,000 fish is arbitrary but was suggested by R. Saimoto (pers. communication) as representing about 1/3 to 1/2 of the total run; it would provide an early indicator of the total run size.

1. Introduction

Capture-mark-recapture studies on summer-run steelhead on the Bulkley River in British Columbia have been conducted by Wet'suwet'en Fisheries since 1999. The detailed sampling protocol for these studies is presented in Saimoto et al. (2010). Briefly, fish are captured by seine downstream of the canyon at a beach/campground. These are marked with individually identifiable tags and (mostly) released. Fish are again captured by dip net in the canyon area upstream of the falls and examined. If the fish is marked, its tag number is read. If the fish is untagged, it is often, but not always, tagged and released. For each fish handled, the sex and fork length are recorded.

Estimates of abundance (the number of steelhead that successfully migrate upstream past the falls) have been obtained using a variety of methods. For example, the pooled-Petersen estimator simply pools all releases, all recaptures of marked fish, and all captures at the canyon, and uses these pooled values in the simplest capture-mark-recapture estimator (Seber, 1982). This estimator makes a crucial assumption of homogeneity of catchability (among others) and can be biased if the assumption of homogeneity is not valid. More importantly, in the presence of heterogeneity of catchability, estimates of precision from the pooled-Petersen estimator will tend to understate the actual uncertainty in the estimate (i.e. the results from the pooled-Petersen method will appear more precise than they should be) (Seber, 1982).

Catchability is unlikely to be homogeneous over the course of the study. The effort to capture fish varies over the study and river conditions fluctuate over time. In these cases, stratification of the study into intervals (e.g. weekly intervals) is often used under the assumption that conditions within a week are sufficiently similar so that the assumption of homogeneity within a week is reasonable. The estimator for stratified-Petersen studies was introduced by Darroch (1961), with further work by Seber (1982, Chapter 13), Plante et al. (1998), Schwarz and Taylor (1998), Arnason et al. (1996), and Bjorkstedt (2000).

While these methods are theoretically justified, several practical problems prevent their simple usage. When the data from the study are stratified, the resulting matrix of recoveries can be sparse with small counts. Consequently, the resulting estimates are often very unreliable and often cannot be computed because they rely upon the inversion of this sparse recapture matrix. As well, these methods do not take into account the temporal stratification of the study, where the abundance in one stratum is likely to be similar to that in adjacent strata and the movement pattern of a release group is likely to be similar to the movement pattern of adjacent release groups. Because of sparse data, extensive pooling of strata is often required. But there is no defensible method to decide which strata to pool, and the pooling decisions are not incorporated into the estimates of uncertainty.

Recently, Bonner and Schwarz (in press) and Schwarz et al. (2009) developed an alternate method that has many advantages over existing methods. It takes into account the temporal stratification and shares information among neighboring strata to help alleviate problems caused by small counts.

The key features of this method are the use of splines to model the general shape of the run and Bayesian hierarchical methods to share information among strata. The method is self-calibrating in the sense that if the data are sparse, the equivalent of simple-Petersen methods (where the catchability is assumed to be roughly the same over the study) is fit, but when the data are rich, more complex models are fit. Estimates of abundance are provided for each recapture stratum, so it is relatively simple to also estimate derived quantities such as the time at which 50% of the run has passed, or the time needed to reach a pre-specified target number of fish (e.g. at what point have 10,000 fish passed?). The features of the model also deal with problems (such as no sampling in some strata) in a straightforward fashion: the spline curve for the run is used to interpolate for the missing data. These last two features are difficult to obtain from the previous methods.

The purpose of this report is to revisit the data collected from 1999 to the present and, where possible, to fit the spline model to the data. The results will be compared to the existing estimates. Methods are developed to estimate the time until a targeted number of fish have passed the recapture site.

2. Notation

In the notation that follows, we use the generic term stratum to refer to a time period. If the data were stratified on a weekly basis, then a stratum would refer to a week. If the data were stratified on a daily basis, then a stratum would refer to a day. There are generally s strata for releases and t strata for recoveries and recaptures. All fish that are tagged and released are assumed to have an individually numbered tag so that the strata of release and recapture can be identified.

2.1 Statistics

$n_i$ - the number of fish marked and released at the beach in stratum i;
$m_{ij}$ - the number of these fish that are released at the beach in stratum i and recaptured in the canyon in stratum j; and
$u_j$ - the number of unmarked fish captured at the canyon in stratum j.

2.2 Parameters

The parameters of the study describe the quantities to be estimated based on the data collected and the model that describes how the data could come about.

$U_j$ - the number of unmarked fish passing the recapture (canyon) site in stratum j;
$U_{Tot} = \sum_j U_j$ - the total number of unmarked fish that pass the recapture (canyon) site over the entire period of the experiment;
$N_{Tot} = U_{Tot} + \sum_i n_i$ - the total number of fish (marked and unmarked) that pass the recapture (canyon) site over the entire period of the experiment;
$p_j$ - the recapture rate of marked and unmarked fish at the recapture (canyon) site in stratum j;
$\theta_{ij}$ - the probability that a marked fish released in stratum i will move and be available for recapture in stratum j.

Estimates of parameters are denoted by a circumflex above the symbol. For example, $\hat{N}_{Tot}$ is the estimate of the (unknown) total abundance.

3. Review of currently used estimators

3.1 The Simple Petersen Estimator

The simplest possible estimator is the (completely) pooled-Petersen estimator, where the total number of marked fish, the total number of recaptured fish, and the total number of unmarked fish are found over all strata, and the usual Lincoln-Petersen estimator for the number of unmarked fish that pass the recapture site is computed as (Seber, 1982):

$$\hat{U}_{Tot,PP} = \frac{\left(\sum_i n_i\right)\left(\sum_j u_j\right)}{\sum_i \sum_j m_{ij}}$$

[If the number of recaptures is small, the Chapman modification, where 1 is added to each term in the expression, is often used to reduce the bias in the estimator; however, this has little effect if the number of recaptures is moderately large (more than 20). A short numerical sketch is given after the assumption list below.]

The pooled-Petersen estimator will be appropriate under the following assumptions:

- the population is closed (i.e., no fish enter or leave the population between the release and recapture locations);
- marks are not lost between the points of release and recapture;
- all fish over the entire experiment have the same probability of being captured at the recapture site (homogeneity);
- whether or not any individual is captured at the recapture site is independent of the capture of all other individuals; and
- marked and unmarked fish are equally mixed throughout the study.
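As a concrete illustration (not part of the original report), here is a minimal R sketch of the pooled-Petersen estimator and its Chapman modification, using made-up totals:

```r
# Pooled-Petersen estimate of the unmarked population, plus the Chapman
# modification.  The totals below are illustrative only.
n_total <- 500   # total marked fish released (sum of the n_i)
u_total <- 1200  # total unmarked fish captured (sum of the u_j)
m_total <- 60    # total marked fish recaptured (sum of the m_ij)

# Simple pooled-Petersen estimate of U_Tot
U_pp <- n_total * u_total / m_total

# Chapman modification: add 1 to each term (reduces small-sample bias)
U_chapman <- (n_total + 1) * (u_total + 1) / (m_total + 1) - 1

# Estimate of N_Tot adds back the marked releases
c(U_pp = U_pp, U_chapman = U_chapman, N_pp = U_pp + n_total)
```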

Many of these assumptions are problematic for the steelhead capture-mark-recapture program.

It is unlikely that the probability of capture at the canyon is the same over the entire study. Heterogeneity can be caused by changes in effort (more or less fishing at the canyon in certain strata) or river conditions (e.g. flows may affect catchability). If the assumption is violated, the pooled-Petersen estimator will be biased (Seber, 2002, p. 85). Fortunately, unless the variation in capture probabilities is large, the bias is expected to be small (Seber, 2002, p. 95; Schwarz and Taylor, 1998). However, the reported standard error (and confidence interval) will be unrealistically small, i.e. the estimates will appear to be more certain than they really are, even with moderate amounts of heterogeneity.

The assumption of homogeneity of capture probability in all strata can be assessed using a traditional chi-square test for homogeneity of proportions. A 2 x s contingency table is created:

    Stratum   Recaptured   Not recaptured
    1         m_1          n_1 - m_1
    2         m_2          n_2 - m_2
    ...       ...          ...
    s         m_s          n_s - m_s

where $m_i = \sum_{j=1}^{t} m_{ij}$ is the total number of marks recovered from releases in stratum i over the entire experiment. The test statistic is computed as

$$X^2 = \sum_{i=1}^{s} \frac{\left(m_i - E[m_i]\right)^2}{E[m_i]} + \sum_{i=1}^{s} \frac{\left((n_i - m_i) - E[n_i - m_i]\right)^2}{E[n_i - m_i]}$$

where the expected number of marks returned from releases in stratum i is found as

$$E[m_i] = n_i \frac{\sum_{i=1}^{s} m_i}{\sum_{i=1}^{s} n_i}$$

(i.e. using the average recapture rate over all strata). The test statistic can be compared to a $\chi^2_{s-1}$ distribution to see if it is unusually large, which would indicate that pooling is not advisable. The usual cautions must be employed when using this test when some of the expected counts are small (i.e. less than 3), as this can inflate the test statistic.

The assumption of mixing of marked and unmarked fish (which implies that the proportion of marked fish would be the same at the canyon over the entire study) is also violated because of the number and timing of releases. For example, fish released early in the study likely do not mix with fish that arrive near the end of the study. Seber (1982, Chapter 13) shows that the assumption of mixing can be violated without biasing the estimate or the standard error if the marked fraction is constant over strata. This is also unlikely to be true. For example, if the same number of fish were tagged and released regardless of the pattern of the daily run, then the marked fraction will be higher when the daily run is small and lower when the daily run is large. As well, if marked fish are released on an irregular schedule (e.g. every couple of days), then the marked fraction will be highest at the time of release and will decline as no new marked fish are released and some fish make their way upstream.

Seber (1982, Chapter 13) presents a chi-square test for equal proportions of marked fish based on the contingency table:

    Stratum   Recaptured   Unmarked
    1         m_1          u_1
    2         m_2          u_2
    ...       ...          ...
    t         m_t          u_t

The test statistic can be compared to a $\chi^2_{t-1}$ distribution to see if it is unusually large, which would again indicate that pooling is not advisable. The usual cautions must be employed when using this test when some of the expected counts are small (i.e. less than 3), as this can inflate the test statistic.

Note that failure to reject either hypothesis may indicate (but is not definitive evidence) that the pooled-Petersen estimator is suitable for the analysis of a dataset.

The assumption of closure could be violated if not all fish pass the canyon after reaching the beach site. For example, suppose that 25% of fish do not make it upstream past the canyon during the study period. Then some of the marked fish will presumably also not move up past the canyon and will not be available for recapture. This type of non-closure will effectively reduce the number of marked fish available for capture and will lead to positive biases in estimates of abundance at the recapture site. Instantaneous marking mortality also violates the closure assumption and leads to similar biases. However, mortality that applies to both marked and unmarked fish equally is not problematic and still leads to unbiased estimates of the population size. It is not possible to examine the assumption of closure using the data from this capture-mark-recapture study. Welch et al. (2009) conducted a study using acoustically tagged fish and found that only about 61% of fish moved upstream after being released at the beach/campground site. This implies that the number of marked fish released must be reduced by this factor, which leads to a similar reduction in the estimate. The uncertainty (standard error) must account for both the reduction in the number of marked fish available for recapture and the uncertainty in the estimate of 61% (it was based on 66 fish); it is not presented here.

The assumption of no mark loss could be examined by double-tagging some fish (Seber and Felton, 1981), but with only singly-tagged fish it is impossible to estimate or correct for tag loss. Tag loss has the same effect as the non-closure discussed in the previous paragraph, and if estimates of tag loss are available, the estimates can be adjusted for tag loss in a similar fashion.
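Both contingency-table diagnostics described above can be computed directly in R; a minimal sketch with invented weekly counts (not data from this study):

```r
# Test 1: homogeneity of recapture rates across release strata.
# n = releases per release stratum; m = total recaptures from each stratum.
n <- c(120, 150, 90, 140)                     # illustrative counts only
m <- c(20, 35, 8, 30)
chisq.test(cbind(Recaptured = m, NotRecaptured = n - m))

# Test 2: equal marked fraction across recapture strata.
# m2 = marked recaptures per recapture stratum; u = unmarked captures.
m2 <- c(15, 30, 25, 23)
u  <- c(200, 310, 280, 240)
chisq.test(cbind(Recaptured = m2, Unmarked = u))
```

A small p-value from either test is evidence against pooling; with small expected counts the chi-square approximation itself becomes unreliable, as cautioned above.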

3.2 Stratified Estimators

If the assumption of homogeneity is not appropriate, the experiment can be stratified temporally. The study duration is divided into strata (e.g. weeks), and the number of fish released in each stratum, the number of recaptures from each release in each of the subsequent strata, and the number of unmarked fish captured in each stratum are determined.

There are two important cases. First, the movement of fish between the release and recapture sites may be very consistent among fish, so that all fish released together arrive at the recapture site together. For example, if the stratification were at the week level, all fish released in week 2 would move together and be subject to recapture in week 3.[2] No marked fish would be available at the canyon in weeks prior to or after week 3. In this diagonal stratification case, estimation is simple: the stratified-Petersen estimator is equivalent to computing the simple-Petersen estimator for each of the strata and then adding the results to estimate the overall run size.

However, in the Moricetown study, movement is not diagonal. Fish often take several weeks to move from the beach site past the canyon site, and recaptures can occur for several weeks after release.

Darroch (1961) provided the first rigorous treatment of the stratified-Petersen model and derived maximum likelihood estimators by conditioning on the numbers of individuals marked and released in each stratum at the first location. Similar methods were developed by Macdonald and Smith (1980), in which trapping occurs at only one location and marked fish are introduced by transporting them upstream to be released. Plante et al. (1998) developed unconditional likelihood methods by modeling the number of individuals at the first location, and further allowed the capture probabilities to depend on stratum-specific covariates such as the rate of flow. Estimators of abundance from the stratified-Petersen model have also been obtained by the method of moments (Chapman and Junge, 1956) and by least squares (Banneheka et al., 1997).

Although stratification reduces the bias of the abundance estimates, it also increases the number of parameters, which leads to a loss of precision. In the models above, marked individuals may pass the second location in any stratum, so it is necessary to model the movements of these fish between all possible pairs of strata. The number of parameters increases as the product of the number of strata at the two locations, and if few marked fish are recaptured then the movements are difficult to model and the resulting estimates of population size will be imprecise.

The number of parameters can be reduced by partially pooling the data, but there are several pitfalls. Proposed methods for pooling strata essentially entail testing for differences in the capture probabilities between neighboring strata and then combining these strata if the null hypothesis is not rejected (see Darroch, 1961; Schwarz and Taylor, 1998; Bjorkstedt, 2000). However, such tests will have low power because the number of fish marked in each stratum is often small.

[2] The fish could move together in lockstep to any week (including the week of release). The values of 2 and 3 were chosen only for illustration purposes.

As well, measures of uncertainty do not account for the pooling decisions and will underestimate the true uncertainty of the estimated population size (Steinhorst et al., 2004). Schwarz and Dempson (1994) reduced the number of parameters by using a parametric distribution (e.g. a log-normal distribution) for the travel times of the marked individuals between release and recapture strata.

There are no simple computational formulae for the stratified-Petersen methods above. Computer software has been developed (e.g. Plante et al., 1998; Arnason et al., 1996; Bjorkstedt, 2000). To deal with problems caused by sparsity and small counts in the recapture matrix, pooling of strata is required. At the moment, there is no objective way to decide which strata to pool, and estimates of precision do not account for the pooling process. Pooling also reduces the ability to produce estimates of the run size at the stratum level, and some assumption about how the run occurred during the pooled strata will be needed to disaggregate the total run for that stratum.

Another shortcoming of all of these methods is that they fail to account for the natural, temporal ordering of the data. The counts in each stratum are treated completely independently of counts in all other strata. This is often too general for temporally-stratified capture-mark-recapture data. While fluctuations in the counts from day to day will always occur, migrations tend to follow a fairly predictable pattern: few fish pass on the days early in the migration period, the numbers grow fairly steadily to one or two peaks in the middle of the migration, and then drop back down at the end of the period. The result is that the abundance of fish in one stratum is strongly associated with the abundance in the neighboring strata.

4. Bayesian spline model

The Bayesian model is fully described in Bonner (2008), Schwarz et al. (2009), and Bonner and Schwarz (in press). In particular, the description and use of the spline model in the case of diagonal movement is fully described by Schwarz et al. (2009) but is not relevant to the steelhead project.

This method has three important components. First, we need to model the movement and capture of marked fish between the stratum of release and the stratum of recapture. Rather than assuming a parametric form for the movement (e.g. that the time to move between the release (beach) and recapture (canyon) sites follows a log-normal distribution, as done in Schwarz and Dempson, 1994), we allow for a very general movement pattern where $\theta_{ij}$ is the probability that a marked fish released in stratum (week) i will be available for recapture in stratum (week) j, such that $\sum_{j=i}^{t} \theta_{ij} = 1$.

For example, in an eight-week study, fish may have movement probabilities after being released in week 1 of $\{\theta_{1j}\} = \{0.1, 0.3, 0.3, 0.2, 0.1, 0.0, 0.0, 0.0\}$, which implies that, on average, 10% of the fish released in week 1 at the beach arrive at the canyon in week 1; 30% arrive in each of weeks 2 and 3; 20% in week 4; and 10% in week 5. [Of course, these movement probabilities are unknown and must be estimated.]

Once marked fish arrive at the recapture (canyon) site in week j, they are recaptured with probability $p_j$. This, together with the movement distribution, implies that the distribution of recaptures of marked fish following their release can be described by a multinomial distribution whose cell probabilities are the product of the movement rate and the recapture rate:

$$\{m_{ij};\ j = i, i+1, \ldots, t\} \sim \text{Multinomial}\left(n_i;\ \{\theta_{ij} p_j;\ j = i, i+1, \ldots, t\}\right)$$

with the number of fish never seen after release (and its probability) obtained by subtraction. This means that a marked fish released in stratum (week) i has probability $\theta_{ij} p_j$ of being recaptured in stratum (week) j. The multinomial distribution is a simple extension of the binomial distribution: rather than allowing only two outcomes, it allows for more than two.

Unmarked fish are assumed to have the same probability of capture in each stratum as marked fish, so the number of unmarked fish ($u_j$) seen from the population of unmarked fish ($U_j$) is modeled using a binomial distribution:

$$u_j \sim \text{Binomial}(U_j, p_j)$$

This implies, for example, that if $p_j = 0.10$, then, on average, 10% of the unmarked fish present in stratum j will be captured ($E[u_j] = U_j p_j$), and 10% of the marked fish available in the stratum will also be recaptured.
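To make this data-generating model concrete, here is a small R simulation (one release group, eight weekly strata; the movement probabilities, capture rates, and abundances are invented for illustration):

```r
set.seed(1)
weeks <- 8
theta <- c(0.1, 0.3, 0.3, 0.2, 0.1, 0, 0, 0)  # movement distribution (sums to 1)
p     <- rep(0.15, weeks)                     # weekly recapture probabilities
n1    <- 200                                  # marked fish released in week 1

# Marked fish: multinomial with cell probabilities theta_1j * p_j,
# plus a "never seen again" cell obtained by subtraction
cell_prob <- theta * p
m <- rmultinom(1, size = n1, prob = c(cell_prob, 1 - sum(cell_prob)))

# Unmarked fish: binomial captures from the true weekly abundances U_j
U <- c(500, 1500, 3000, 2500, 1200, 600, 300, 100)  # invented run pattern
u <- rbinom(weeks, size = U, prob = p)

list(marked_recaptures = m[1:weeks], unmarked_captures = u)
```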

The likelihood function that describes this experiment is then the product of these two distributions over all releases and all recaptures. The likelihood function is the key that links the data (the observed statistics) with the unknown parameters. At this point, development of the model has been the same as in Darroch (1961). The likelihood function is very general, with many parameters, and will lead to estimates with poor precision unless large sample sizes are available. As noted earlier, pooling can be used to reduce the number of strata (e.g. pool into two-week intervals), but the pooling decisions are somewhat arbitrary and still don't account for the natural structure of the model. The Darroch (1961) and similar models also don't take into account the temporal structure of the problem and the similarities in movement and recapture rates for fish that are temporally close. To take account of the temporal structure, Bonner and Schwarz (in press) extend the development.

We model the pattern of the unmarked population arriving at the recapture strata (i.e. the shape of the run) using a spline. A spline is an alternative to simple parametric curves such as a quadratic or cubic curve, and has the advantage of being able to match a wide variety of shapes. The basic idea of a spline is to fit a series of low-degree polynomials (e.g. cubic polynomials) to small windows of the data while ensuring that successive curves join smoothly at the window boundaries. The boundaries of the windows are termed the knots of the spline. We model the expected logarithm of the daily run size, $E[\log(U_j)]$, as a smooth function of j (time) using the Bayesian penalized spline (P-spline) models of Lang and Brezger (2004).

To avoid overfitting the data (i.e. having a spline that is too wiggly), the complexity of the spline is controlled by two factors: the number and locations of the knot points, and the relationship among the regression coefficients, i.e. how smooth the transition between curves in different windows must be. The Bayesian P-spline algorithm addresses overfitting by spacing the knot points evenly across the data (at intervals of every 4 strata) and by specifying a prior distribution on the coefficients of the polynomials which favors small changes in the coefficients. In other words, a separate curve is fit in every window of 4 strata, but the curves are constrained to be smooth at the window boundaries (the knots) unless the data very strongly show that a very wiggly shape is needed. Note that while the spline curve is fit to the expected log(run size), the actual run size estimates are not constrained to lie on this smooth curve; the data themselves indicate how much variation is allowable around the smooth underlying curve. The spline curve serves as a background to guide the actual estimates of run size over time. As an analogy, the fitted spline curve is similar to a straight line fit through data: even though the line is smooth, the actual data points are not forced to lie on it.

The key advantage of this approach is the sharing of information among strata that are temporally close together. So even though the sample size for a particular stratum is small, localized pooling (of strata before and after) results in a larger sample size and more precise estimates. A nice consequence of this is that accounting for strata (weeks) where no sampling takes place is very straightforward: the spline makes an interpolation for the missing stratum based on the previous and succeeding strata.

The spline model also assumes that there is some structure to the stratum-specific capture rates. We don't assume that the capture rates are equal over all strata (as in the pooled-Petersen model) but allow the capture rates to vary around a common average. This is what is known as a hierarchical model: the individual capture probabilities over the recapture strata come from a common distribution. If the variation in the capture rates among strata is very small, then this model automatically acts like a pooled-Petersen estimator and treats all the capture rates as essentially the same. On the other hand, if the data indicate that the capture rates are quite variable across strata, the model automatically accommodates this. This structure is added to the model by assuming that the stratum capture rates come from a normal distribution (on the logit scale) with a common mean and standard deviation (both of which are estimated from the data).
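The two smoothing components can be sketched in a few lines of R. This is an illustrative stand-in only (smooth.spline plays the role of the Bayesian P-spline; in the actual model the smoothing and the hierarchical variance are estimated jointly by MCMC, and all numbers below are invented):

```r
set.seed(2)
weeks <- 1:12

# Hierarchical capture rates: logit(p_j) ~ Normal(mu, sigma)
mu <- -1.8; sigma <- 0.5
logit_p <- rnorm(length(weeks), mu, sigma)
p <- 1 / (1 + exp(-logit_p))        # anti-logit back to the probability scale

# A smooth underlying run curve on the log scale, with scatter around it
log_U_smooth <- 8 - 0.15 * (weeks - 6)^2       # invented smooth run shape
U <- rpois(length(weeks), exp(log_U_smooth))   # weekly abundances vary about it

# A penalized smoother of the log counts stands in for the P-spline fit
fit <- smooth.spline(weeks, log(U + 1))
plot(weeks, log(U + 1)); lines(predict(fit), col = "blue")
```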
The hierarchical model has the further benefit of interpolating for weeks when no marked fish are available for recapture (say, because of the timing of the releases and the movement pattern). In these cases, there is no direct information on the recapture rate, and the model imputes values for the capture rates that are consistent with those found in the other strata.

Finally, it is reasonable to assume that the movement patterns of successive release groups (the $\theta_{ij}$) should also be similar, and so another hierarchical model is imposed, as was done for the capture rates: the probability of moving from the release to the recapture strata in a given number of time units must be similar, but not identical, among all of the release groups.

Full details of the approach are found in Bonner and Schwarz (in press). The model is computationally complex, but can be fit using a combination of R (R Development Core Team, 2010) and WinBUGS/OpenBUGS (Thomas et al., 2006; Lunn et al., 2000) software. We have developed a software package (BTSPAS), available for free download from the Comprehensive R Archive Network (CRAN), that implements these methods. Complete computer code for this report is available at

Bonner (2008, Section 2.2.4) conducted an extensive simulation study to compare the Bayesian P-spline method with the stratified-Petersen estimator. Generally, the Bayesian P-spline method had negligible bias and its precision was at least as good as that of the stratified-Petersen estimator. When perfect data were available (i.e., many marked fish were released and recaptured in each stratum), the results from the Bayesian P-spline and the stratified-Petersen were very similar. When few marked fish were released or recaptured in each stratum, the performance of the Bayesian P-spline model depended on the amount of variation between the capture probabilities and the pattern of abundance over time. In the worst case, with large variations between the capture probabilities and abundances that followed no regular pattern, the two models continued to perform similarly. However, when the variation between the capture probabilities was smaller and the abundances closely followed a smooth curve, the Bayesian P-spline produced much more precise estimates of the total population size.

The Bayesian paradigm is a powerful way to attack complex models such as the one above. Numerical methods for Bayesian models must be used because the hand computations are too complex. The output from a Bayesian model is the posterior distribution for each parameter. This provides a range of values that are consistent with the data at hand. The most commonly reported summary statistics for the posterior distribution are the mean of the posterior (analogous to the estimate produced by maximum likelihood methods) and the standard deviation of the posterior distribution (analogous to the standard error of a maximum likelihood estimate). A pair of cutoffs in which 95% of the posterior distribution lies is known as a credible interval and is analogous to the confidence intervals found in maximum likelihood methods.
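These posterior summaries are computed directly from the MCMC draws; a minimal R sketch (the draws below are simulated stand-ins for the posterior sample of $U_{Tot}$ that a fitted model would produce):

```r
# Posterior summaries from MCMC draws of U_Tot (fake draws for illustration)
set.seed(3)
draws <- rlnorm(4000, meanlog = log(35000), sdlog = 0.05)

post_mean <- mean(draws)                       # analogue of the point estimate
post_sd   <- sd(draws)                         # analogue of the standard error
cred_int  <- quantile(draws, c(0.025, 0.975))  # 95% credible interval
round(c(mean = post_mean, sd = post_sd, cred_int))
```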

Welch et al. (2009) used acoustic telemetry to monitor the movement of fish after being released at the beach site. This study found that only 40 of 66 (61%) released fish actually migrated past the canyon; presumably, the remaining 39% of the fish remained below the canyon. In this case, the $U_j$ parameter refers only to those fish that move past the canyon. Estimates of $U_j$ are based on the recapture rate found from marked fish, which were all assumed to move upstream past the canyon. But if only a fraction of the marked fish actually move upstream, estimates from this (and any capture-mark-recapture model) will be biased upwards by a factor of 1/0.61 = 1.64. Effectively, the number of marked fish released that move upstream must be reduced. Unfortunately, there is little or no information in the capture-mark-recapture data to estimate the correction factor, and it must be given a priori. We have modified the spline methodology and computer programs to account for this fallback rate. Rather than using a known correction factor, we allowed for variability in the correction factor based on a binomial distribution using the 40/66 ratio presented by Welch et al. (2009).

Once the Bayesian model has been fit, it is easy to derive estimates of functions of the parameters. For example, it is of interest in this study to estimate the time needed to reach a target number of fish (e.g. 10,000) that have passed the canyon. For example, suppose that the estimated number of fish that pass the recapture site in each week is:

    Julian week   Abundance
    [the weekly values of this table were not recovered from the source]

The 10,000th fish will have passed the recapture station sometime between the end of week 35 and the end of week 36, and the time can be estimated by interpolation. The numerical Bayesian methods produce a large number of such sets of weekly abundances. Each set differs, but each is consistent with the data. For each set, the derived estimate of the time needed to reach the target can be obtained. The estimate is the average time to reach the target over the possible sets, and the measure of uncertainty is the standard deviation over the differing estimates.
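A minimal R sketch of this derived-parameter calculation (the number of posterior draws and the weekly abundances are invented for illustration):

```r
# Time (in weeks) until a cumulative target is passed, computed for each
# posterior draw and then summarized.  'draws' has one row per posterior
# draw and one column per Julian week.
set.seed(4)
julian_weeks <- 30:40
draws <- matrix(rpois(200 * 11, lambda = 3000), nrow = 200)

time_to_target <- function(abund, weeks, target = 10000) {
  cum <- cumsum(abund)
  k <- which(cum >= target)[1]           # first week the target is exceeded
  if (is.na(k)) return(NA)               # target never reached in this draw
  prev <- if (k == 1) 0 else cum[k - 1]
  # linear interpolation within the week in which the target is crossed
  weeks[k] - 1 + (target - prev) / abund[k]
}

times <- apply(draws, 1, time_to_target, weeks = julian_weeks)
c(mean = mean(times), sd = sd(times), quantile(times, c(0.025, 0.975)))
```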

4.1 Cautions

Despite its complexity, the spline model is not a panacea that solves all potential problems encountered in capture-mark-recapture studies. There are a number of caveats that apply to this (and potentially to other stratified) models.

No marked fish released in a stratum

It may occur that no marked fish are released in a particular stratum (i.e. $n_i = 0$ for some i) because no fish were available or because of logistical constraints. This is usually handled without problems as long as adjacent strata have fish released and the movement of those adjacent releases overlaps the strata in which fish could have been released. The estimates of the capture probabilities are based on the ratio of the observed number of marked fish captured relative to the projected number available, which depends on the migration pattern. However, a large number of strata with no fish released may lead to situations where no marked fish are available to be captured in a recapture stratum. In this case, the Bayesian spline model will interpolate capture probabilities based on the hierarchical model for the capture rates. Likelihood methods will fail, as there is no information on the capture rates available.

No sampling during a recapture stratum

If no sampling takes place at the recapture site during a stratum, the spline will impute a value for the run size in that stratum, given the shape of the spline and the variability of the individual run sizes about the spline, and will impute a value for the capture probability, given the range of capture probabilities in the other strata. The final estimate of $U_{Tot}$ will (automatically) incorporate the uncertainty of these imputed values. While it is possible to interpolate for several strata in a row, there is, of course, no information on the shape of the underlying spline during these missed strata, and so the results should be interpreted with care. In particular, while it is possible to extrapolate the spline both before and after the study period, this must be approached cautiously as there is no information on the shape of the spline outside the study boundaries. Likelihood methods will fail to provide estimates in this case.

Sampling not continuous or complete during a recapture stratum

It is implicitly assumed in the spline and likelihood models that sampling takes place during the entire recapture stratum. For example, if strata are weeks, then sampling takes place on all days of the week. This may or may not be problematic. If marked fish mix uniformly with unmarked fish during the stratum, then incomplete sampling during a recapture stratum simply leads to a reduction in the capture rate and no bias is introduced. However, incomplete mixing can lead to problems. Consider, for example, a case where fish generally take 7-8 days to move between the release and recapture sites and stratification is at the weekly level. Then if fish are released on Sundays, they will arrive at the recapture site on the Sunday-Monday of the following week. If sampling only occurs on Wednesdays and Thursdays at the recapture site, then there is no chance of sampling any marked fish. Estimates of recapture rates will be biased downwards and estimates of abundance biased upwards. This case is problematic for any stratified model. In cases like this, the stratification level must be reduced (say, to 2-day intervals). Then, as long as some overlap between sampling and the presence of marked fish occurs, it is possible to estimate the recapture rates. In general, to avoid this problem, releases should occur on a fairly regular basis with no large gaps between releases, and sufficiently far away from the recapture site so that movement is smeared over recapture strata.

Varying effort during a stratum

In some cases, the recapture effort varies within a stratum. For example, two fishers may be operating on Monday and then only one fisher on Tuesday, etc.

Unfortunately, this type of problem cannot be adequately dealt with by the spline (or any other method) that uses batch marks. The problem is that differing effort during a stratum (week) results in heterogeneity of catchability during the week; e.g. the catchability on days when two fishers are operating is likely greater than the catchability on days when only one fisher is operating. If a batch mark is used, the data are pooled over the stratum (week) and it is impossible to separate out catches according to the effort in operation. As in the case of the pooled-Petersen estimator, this will likely result in estimates with low bias, but the precision of the estimates for these strata will be overestimated, i.e. the standard deviations of the estimated run sizes for these strata will be understated. While there is no way to assess the extent of the problem (other than via simulations), it is hoped that the stratification into weekly strata will resolve most of the underreporting of the precision by the pooled-Petersen estimator and that any remaining understatement is not material.

Interpreting the estimates of population abundance under non-closure (mortality or undocumented fallback)

A key assumption of the spline and other stratified models is population closure between the point of marking and recapture. What is the impact of non-closure on the estimates? Immigration (new entrants to the population) between the marking site and recapture site is not a problem in the Moricetown study and won't be discussed here. However, mortality, or events that look like mortality, can occur. In particular, what if some fish do not move past the canyon, i.e. fall back? This looks like, and is indistinguishable from, mortality between the release and recapture sites, so the general term mortality will be used in the discussion below.

In the pooled-Petersen estimator, mortality that occurs at equal rates for marked and unmarked fish still leads to a valid estimate of population size, but at the time (location) of marking, i.e. at the beach. Seber (1982, Chapter 11) discusses the impact on stratified experiments, and the same comments hold for the spline model. If mortality occurs at an equal rate for marked and unmarked fish, and moreover is equal across all release strata, then estimates of total abundance once again refer to the number of fish alive at the tagging site, i.e. at the beach. However, the estimates of stratum abundance at the recapture site are artificially inflated and refer to the number of fish originally alive at the release site that would have migrated past the recapture site. Estimates of the recapture rate under these circumstances also have no easy interpretation.

Unfortunately, there is no information within the data that allows for estimation of the mortality between the release and recapture sites. If information were available, it would be integrated into the model exactly in the same way as information on fallback, because fallback and mortality have identical effects.

5. Example: the 2010 data

The 2010 data will be used as an illustration of fitting the spline model. Detailed results from 2001 to 2010 are available at

In 2010, almost 3000 fish were tagged on the beach and over 400 were recovered at the canyon. A summary matrix stratified to the weekly level is found in Table 1. For example, 35 fish were marked and released in Julian week 30. Of these 35 fish, 5 were recaptured at the canyon in Julian week 31, 7 in Julian week 32, and 2 in Julian week 33. A total of 65 unmarked fish were also captured at the canyon in Julian week 30. From this table one can derive the total number of fish marked and released (2917), the total number of recaptures (434), and the total number of unmarked fish captured at the canyon (5514).

We will first assume that all marked fish swam upstream, i.e. all marked fish were available at some point in this period at the canyon for capture. The pooled-Petersen estimate of the total number of unmarked fish that passed the canyon over these Julian weeks is $\hat{U}_{Tot}$ = 36,994 (SE 1697). The total number of fish (both marked and unmarked) is found by adding in the number of fish marked and released (2917) to give $\hat{N}_{Tot}$ = 39,911 (SE 1697).

The pooled-Petersen estimator assumes homogeneity of catchability over all weeks in the study. Table 2 compares the recapture rates by stratum of release; the total recapture rate ranges from 6% to 40% over the course of the experiment. The $\chi^2$ test for homogeneity of total recapture rates had a p-value < .0001, indicating strong evidence against homogeneity of recapture rates, and so the pooled-Petersen may not be appropriate. Similarly, Table 2 also compares the marked fraction by stratum of recapture. The marked fraction ranged from 0% to 17% over the course of the experiment. There was strong evidence against the hypothesis that the marked fraction was equal among strata.

The spline model was fit using BTSPAS, and the estimated abundances are summarized in Table 3 and Figure 1. The interpolation spline nicely fits the general pattern seen in the empirical estimates. However, because of sparsity, the variation of the empirical estimates about the underlying spline may be an artifact of sampling variation. The credible intervals for the individual abundances are relatively tight, as is the interval for the total abundance. Again, refer to the caution expressed earlier about the interpretation of these estimates in the case of fallback or mortality between the release and recapture sites.
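The pooled totals above can be checked directly in R. A short sketch follows; the particular combination of Chapman-type formulas below is an assumption chosen because it reproduces the report's figures, since the report does not state exactly which point-estimate and variance forms were used:

```r
# 2010 pooled totals derived from Table 1
n1 <- 2917          # marked fish released at the beach
m  <- 434           # marked fish recaptured at the canyon
u  <- 5514          # unmarked fish captured at the canyon
n2 <- m + u         # total catch at the canyon

# Chapman modification of the pooled-Petersen estimator of U_Tot
U_hat <- (n1 + 1) * (u + 1) / (m + 1) - 1   # approx. 36,994
N_hat <- U_hat + n1                         # approx. 39,911

# Chapman-type variance, treating the whole canyon catch as the second sample
var_hat <- (n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m) / ((m + 1)^2 * (m + 2))
round(c(U_hat = U_hat, N_hat = N_hat, SE = sqrt(var_hat)))  # SE approx. 1697
```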

The posterior distribution for the total abundance is presented in Figure 2. The posterior distribution appears to be fairly symmetric about the mode, but there is a slight skew to the right. This implies that the 95% credible intervals are not quite symmetric about the mean of the posterior distribution, as seen in Table 3.

The pooled-Petersen estimate is a bit larger than the estimate from the spline model. This is related to the heterogeneity that appears to be present in the catchability by strata. A plot of the estimated stratum-specific recapture rates is found in Figure 3. The logit scale is a transformation of a probability from the [0, 1] range to an unlimited range and is found as

$$\text{logit}(p) = \ln\left(\frac{p}{1-p}\right).$$

For example, if $p = 0.05$, then $\text{logit}(0.05) = \ln\left(\frac{0.05}{0.95}\right) = -2.94$. The inverse transformation, from the logit scale to the probability scale, is

$$\text{antilogit}(\eta) = \frac{1}{1+e^{-\eta}}.$$

For example, $\text{antilogit}(-2.94) = \frac{1}{1+e^{2.94}} = \frac{1}{20} = 0.05$. Figure 3 shows that the capture rates at the canyon averaged around $\text{logit}(p) = -1.8$ ($p = 0.14$) but ranged between $\text{logit}(p) = -2.6$ ($p = 0.07$) and $\text{logit}(p) = -0.6$ ($p = 0.35$). Capture rates were higher around Julian week 32 and lower around Julian week 38. There could be many reasons for the variation in capture rates: the number of people assigned to catch fish could vary, the number of days fished in a week may not be equal, or flows may affect catchability and vary over the course of the study. If values of a covariate (such as flow) are thought to be important, it is possible to further model the probability of capture as a function of these covariates, but this was not done in this report.

By looking at the cumulative totals of abundance presented in Table 3, the target of 10,000 fish was reached somewhere between Julian weeks 33 and 34. The estimate of the time to reach the target of 10,000 fish is Julian week 33.3 (SD 0.2), with a 95% credible interval of Julian weeks (32.9, 33.7).

Other plots produced by the BTSPAS program are useful for determining the goodness of fit of the model, the convergence of the program to a stable result, etc. These are not discussed in this report, but a discussion of how to interpret these other plots is available in Schwarz et al. (2009).

Welch et al. (2009) concluded that a significant fraction of (sonic-)tagged steelhead may not move upstream following release, with only 40/66 = 61% of tagged fish moving upstream. This information was incorporated into a modified version of BTSPAS. The revised estimates of abundance are presented in Table 4; the estimates of total abundance and of the weekly abundances have now been adjusted for this fallback rate.
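The fallback adjustment can be sketched in R as follows. This is a minimal stand-in for the modification built into the fitting software: the Beta posterior for the upstream-migration probability is an assumption (what a flat prior combined with the binomial 40/66 telemetry result would give), and the unadjusted abundance draws are simulated:

```r
# Propagate uncertainty in the proportion of marked fish that move upstream.
# Welch et al. (2009): 40 of 66 sonic-tagged fish migrated past the canyon.
set.seed(5)
n_draws <- 4000
phi <- rbeta(n_draws, 40 + 1, 66 - 40 + 1)   # posterior of upstream probability

# Fake posterior draws of the unadjusted total abundance
U_unadj <- rlnorm(n_draws, meanlog = log(35000), sdlog = 0.05)

# Fewer effective marks shrinks the estimate by the factor phi (about 0.61);
# the spread of phi adds extra uncertainty to the adjusted estimates
U_adj <- U_unadj * phi
round(c(mean = mean(U_adj), sd = sd(U_adj), quantile(U_adj, c(0.025, 0.975))))
```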

The estimated abundances in Table 4 have been reduced (as expected) by a factor of about 66/40 = 1.65 compared to the comparable estimates in Table 3 because of the reduced number of marks available. The absolute and relative standard deviations (rsd = sd/estimate) have increased because of the uncertainty in the estimate of fallback, which is based on a sample of only 66 (sonic-)tagged fish. In general, the relative precision (se/estimate) of the Petersen, stratified-Petersen, and spline estimators depends essentially only on the number of marks that are recaptured, regardless of how many were released. Accounting for fallback does not affect the number of marks recaptured, which are used to estimate the movement rates and the (fallback-ignoring) recapture rates. The latter are then adjusted by the fallback rate estimate, and this adjustment adds the extra uncertainty to the estimates.

If one compares the plots of the estimated abundance of unmarked fish (Figure 4 vs. Figure 1), the shape of the curve is similar, but the curve is shifted downward by log(66/40) = 0.50 and the credible intervals are wider. Conversely, the estimates of catchability at the canyon increase after accounting for fallback (Figure 5 vs. Figure 3) but again have the same basic shape over time. If fallback reduces the number of marks that migrate past the canyon but the number of marks recovered remains the same, then, all else being equal, the recapture rate must have been greater. This, in turn, depresses the expansion factor used to convert the captures of unmarked fish into an estimate of abundance for the unmarked fish.

Lastly, because the estimates of abundance are reduced in all strata, the time needed to reach the target of 10,000 fish must increase. Under this revised model, the estimated time at which 10,000 unmarked fish pass the canyon is Julian week 34.7 (SD 0.5; 95% credible interval of ... to 35.7).

6. Analysis of studies

6.1 Data extraction

An Excel workbook containing the records of fish handled at the beach and canyon sites was provided as part of this project. For each year in which the raw data were available, the following steps were performed to extract and format the data (a sketch of the deduplication step follows the list):

- Data were read and corrections were applied for such problems as misspellings of tag colors (Oramge for Orange), dates in the wrong format (d/m/year rather than m/d/year), dates with the wrong year specified (e.g. year recorded as 2003 in the 2002 data), etc.
- Data were sorted by date of action on a fish, and multiple recaptures of a fish at the same site were replaced by a single action at the earliest date. For example, if a fish was released at the beach site on 3 August and recaptured at the beach site on 10 August, it was classified as being released on 3 August and the second capture at the beach on 10 August was ignored.
- Recaptures or releases at sites other than the beach and canyon (e.g. recaptures further upstream or downstream) were ignored.
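A minimal R sketch of the deduplication step (the column names and toy records are hypothetical, not taken from the project workbook):

```r
# Keep only the earliest action per fish per site; later recaptures of the
# same fish at the same site are dropped.
records <- data.frame(
  tag  = c("O-101", "O-101", "O-102"),
  site = c("beach", "beach", "canyon"),
  date = as.Date(c("2010-08-03", "2010-08-10", "2010-08-12"))
)

records <- records[order(records$tag, records$site, records$date), ]
cleaned <- records[!duplicated(records[, c("tag", "site")]), ]
cleaned  # one row per fish-site combination, at the earliest date
```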


Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression: Biost 518 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture utline Choice of Model Alternative Models Effect of data driven selection of

More information

Chapter Three. Hypothesis Testing

Chapter Three. Hypothesis Testing 3.1 Introduction The final phase of analyzing data is to make a decision concerning a set of choices or options. Should I invest in stocks or bonds? Should a new product be marketed? Are my products being

More information

Introduction to capture-markrecapture

Introduction to capture-markrecapture E-iNET Workshop, University of Kent, December 2014 Introduction to capture-markrecapture models Rachel McCrea Overview Introduction Lincoln-Petersen estimate Maximum-likelihood theory* Capture-mark-recapture

More information

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i, A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type

More information

Approach to Field Research Data Generation and Field Logistics Part 1. Road Map 8/26/2016

Approach to Field Research Data Generation and Field Logistics Part 1. Road Map 8/26/2016 Approach to Field Research Data Generation and Field Logistics Part 1 Lecture 3 AEC 460 Road Map How we do ecology Part 1 Recap Types of data Sampling abundance and density methods Part 2 Sampling design

More information

arxiv: v1 [physics.data-an] 2 Mar 2011

arxiv: v1 [physics.data-an] 2 Mar 2011 Incorporating Nuisance Parameters in Likelihoods for Multisource Spectra J. S. Conway University of California, Davis, USA arxiv:1103.0354v1 [physics.data-an] Mar 011 1 Overview Abstract We describe here

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

Chapter 19: Logistic regression

Chapter 19: Logistic regression Chapter 19: Logistic regression Self-test answers SELF-TEST Rerun this analysis using a stepwise method (Forward: LR) entry method of analysis. The main analysis To open the main Logistic Regression dialog

More information

Regression-Discontinuity Analysis

Regression-Discontinuity Analysis Page 1 of 11 Home» Analysis» Inferential Statistics» Regression-Discontinuity Analysis Analysis Requirements The basic RD Design is a two-group pretestposttest model as indicated in the design notation.

More information

Log-linear multidimensional Rasch model for capture-recapture

Log-linear multidimensional Rasch model for capture-recapture Log-linear multidimensional Rasch model for capture-recapture Elvira Pelle, University of Milano-Bicocca, e.pelle@campus.unimib.it David J. Hessen, Utrecht University, D.J.Hessen@uu.nl Peter G.M. Van der

More information

Fitting a Straight Line to Data

Fitting a Straight Line to Data Fitting a Straight Line to Data Thanks for your patience. Finally we ll take a shot at real data! The data set in question is baryonic Tully-Fisher data from http://astroweb.cwru.edu/sparc/btfr Lelli2016a.mrt,

More information

Statistical Analysis of List Experiments

Statistical Analysis of List Experiments Statistical Analysis of List Experiments Graeme Blair Kosuke Imai Princeton University December 17, 2010 Blair and Imai (Princeton) List Experiments Political Methodology Seminar 1 / 32 Motivation Surveys

More information

Stopover Models. Rachel McCrea. BES/DICE Workshop, Canterbury Collaborative work with

Stopover Models. Rachel McCrea. BES/DICE Workshop, Canterbury Collaborative work with BES/DICE Workshop, Canterbury 2014 Stopover Models Rachel McCrea Collaborative work with Hannah Worthington, Ruth King, Eleni Matechou Richard Griffiths and Thomas Bregnballe Overview Capture-recapture

More information

Four aspects of a sampling strategy necessary to make accurate and precise inferences about populations are:

Four aspects of a sampling strategy necessary to make accurate and precise inferences about populations are: Why Sample? Often researchers are interested in answering questions about a particular population. They might be interested in the density, species richness, or specific life history parameters such as

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

3 Joint Distributions 71

3 Joint Distributions 71 2.2.3 The Normal Distribution 54 2.2.4 The Beta Density 58 2.3 Functions of a Random Variable 58 2.4 Concluding Remarks 64 2.5 Problems 64 3 Joint Distributions 71 3.1 Introduction 71 3.2 Discrete Random

More information

A class of latent marginal models for capture-recapture data with continuous covariates

A class of latent marginal models for capture-recapture data with continuous covariates A class of latent marginal models for capture-recapture data with continuous covariates F Bartolucci A Forcina Università di Urbino Università di Perugia FrancescoBartolucci@uniurbit forcina@statunipgit

More information

Subject CT4 Models. October 2015 Examination INDICATIVE SOLUTION

Subject CT4 Models. October 2015 Examination INDICATIVE SOLUTION Institute of Actuaries of India Subject CT4 Models October 2015 Examination INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the aim of helping candidates.

More information

Integrating mark-resight, count, and photograph data to more effectively monitor non-breeding American oystercatcher populations

Integrating mark-resight, count, and photograph data to more effectively monitor non-breeding American oystercatcher populations Integrating mark-resight, count, and photograph data to more effectively monitor non-breeding American oystercatcher populations Gibson, Daniel, Thomas V. Riecke, Tim Keyes, Chris Depkin, Jim Fraser, and

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

AUTOMATED TEMPLATE MATCHING METHOD FOR NMIS AT THE Y-12 NATIONAL SECURITY COMPLEX

AUTOMATED TEMPLATE MATCHING METHOD FOR NMIS AT THE Y-12 NATIONAL SECURITY COMPLEX AUTOMATED TEMPLATE MATCHING METHOD FOR NMIS AT THE Y-1 NATIONAL SECURITY COMPLEX J. A. Mullens, J. K. Mattingly, L. G. Chiang, R. B. Oberer, J. T. Mihalczo ABSTRACT This paper describes a template matching

More information

Types of spatial data. The Nature of Geographic Data. Types of spatial data. Spatial Autocorrelation. Continuous spatial data: geostatistics

Types of spatial data. The Nature of Geographic Data. Types of spatial data. Spatial Autocorrelation. Continuous spatial data: geostatistics The Nature of Geographic Data Types of spatial data Continuous spatial data: geostatistics Samples may be taken at intervals, but the spatial process is continuous e.g. soil quality Discrete data Irregular:

More information

Closed population capture-recapture models

Closed population capture-recapture models CHAPTER 14 Closed population capture-recapture models Paul Lukacs, University of Montana A fair argument could be made that the marking of individuals in a wild population was originally motivated by the

More information

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006 Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312 This draft, April 20, 2006 1 1 A Brief Review of Hypothesis Testing and Its Uses values and pure significance tests (R.A. Fisher)

More information

MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION

MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION THOMAS MAILUND Machine learning means different things to different people, and there is no general agreed upon core set of algorithms that must be

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model EPSY 905: Multivariate Analysis Lecture 1 20 January 2016 EPSY 905: Lecture 1 -

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future

More information

A8824: Statistics Notes David Weinberg, Astronomy 8824: Statistics Notes 6 Estimating Errors From Data

A8824: Statistics Notes David Weinberg, Astronomy 8824: Statistics Notes 6 Estimating Errors From Data Where does the error bar go? Astronomy 8824: Statistics otes 6 Estimating Errors From Data Suppose you measure the average depression of flux in a quasar caused by absorption from the Lyman-alpha forest.

More information

Swarthmore Honors Exam 2012: Statistics

Swarthmore Honors Exam 2012: Statistics Swarthmore Honors Exam 2012: Statistics 1 Swarthmore Honors Exam 2012: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having six questions. You may

More information

Guideline on adjustment for baseline covariates in clinical trials

Guideline on adjustment for baseline covariates in clinical trials 26 February 2015 EMA/CHMP/295050/2013 Committee for Medicinal Products for Human Use (CHMP) Guideline on adjustment for baseline covariates in clinical trials Draft Agreed by Biostatistics Working Party

More information

Continuous Covariates in Mark-Recapture-Recovery Analysis: A Comparison. of Methods

Continuous Covariates in Mark-Recapture-Recovery Analysis: A Comparison. of Methods Biometrics 000, 000 000 DOI: 000 000 0000 Continuous Covariates in Mark-Recapture-Recovery Analysis: A Comparison of Methods Simon J. Bonner 1, Byron J. T. Morgan 2, and Ruth King 3 1 Department of Statistics,

More information

Biometrics Unit and Surveys. North Metro Area Office C West Broadway Forest Lake, Minnesota (651)

Biometrics Unit and Surveys. North Metro Area Office C West Broadway Forest Lake, Minnesota (651) Biometrics Unit and Surveys North Metro Area Office 5463 - C West Broadway Forest Lake, Minnesota 55025 (651) 296-5200 QUANTIFYING THE EFFECT OF HABITAT AVAILABILITY ON SPECIES DISTRIBUTIONS 1 Geert Aarts

More information

Categorical Data Analysis Chapter 3

Categorical Data Analysis Chapter 3 Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameteres Consider the odds ratio in the 2x2 table,

More information

Nonresponse weighting adjustment using estimated response probability

Nonresponse weighting adjustment using estimated response probability Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy

More information

Confidence Intervals. Confidence interval for sample mean. Confidence interval for sample mean. Confidence interval for sample mean

Confidence Intervals. Confidence interval for sample mean. Confidence interval for sample mean. Confidence interval for sample mean Confidence Intervals Confidence interval for sample mean The CLT tells us: as the sample size n increases, the sample mean is approximately Normal with mean and standard deviation Thus, we have a standard

More information

Introduction to hypothesis testing

Introduction to hypothesis testing Introduction to hypothesis testing Review: Logic of Hypothesis Tests Usually, we test (attempt to falsify) a null hypothesis (H 0 ): includes all possibilities except prediction in hypothesis (H A ) If

More information

interval forecasting

interval forecasting Interval Forecasting Based on Chapter 7 of the Time Series Forecasting by Chatfield Econometric Forecasting, January 2008 Outline 1 2 3 4 5 Terminology Interval Forecasts Density Forecast Fan Chart Most

More information

Building a Prognostic Biomarker

Building a Prognostic Biomarker Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,

More information

2017 Technical Revision to the Marine Survival Forecast of the OCN Coho Work Group Harvest Matrix Erik Suring Oregon Department of Fish and Wildlife

2017 Technical Revision to the Marine Survival Forecast of the OCN Coho Work Group Harvest Matrix Erik Suring Oregon Department of Fish and Wildlife 2017 Technical Revision to the Marine Survival Forecast of the OCN Coho Work Group Harvest Matrix Erik Suring Oregon Department of Fish and Wildlife Agenda Item D.2 Attachment 1 November 2017 Introduction

More information

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis CIMAT Taller de Modelos de Capture y Recaptura 2010 Known Fate urvival Analysis B D BALANCE MODEL implest population model N = λ t+ 1 N t Deeper understanding of dynamics can be gained by identifying variation

More information

Sampling. Where we re heading: Last time. What is the sample? Next week: Lecture Monday. **Lab Tuesday leaving at 11:00 instead of 1:00** Tomorrow:

Sampling. Where we re heading: Last time. What is the sample? Next week: Lecture Monday. **Lab Tuesday leaving at 11:00 instead of 1:00** Tomorrow: Sampling Questions Define: Sampling, statistical inference, statistical vs. biological population, accuracy, precision, bias, random sampling Why do people use sampling techniques in monitoring? How do

More information

2 Prediction and Analysis of Variance

2 Prediction and Analysis of Variance 2 Prediction and Analysis of Variance Reading: Chapters and 2 of Kennedy A Guide to Econometrics Achen, Christopher H. Interpreting and Using Regression (London: Sage, 982). Chapter 4 of Andy Field, Discovering

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 12: Frequentist properties of estimators (v4) Ramesh Johari ramesh.johari@stanford.edu 1 / 39 Frequentist inference 2 / 39 Thinking like a frequentist Suppose that for some

More information

Investigation of Possible Biases in Tau Neutrino Mass Limits

Investigation of Possible Biases in Tau Neutrino Mass Limits Investigation of Possible Biases in Tau Neutrino Mass Limits Kyle Armour Departments of Physics and Mathematics, University of California, San Diego, La Jolla, CA 92093 (Dated: August 8, 2003) We study

More information

Part 7: Glossary Overview

Part 7: Glossary Overview Part 7: Glossary Overview In this Part This Part covers the following topic Topic See Page 7-1-1 Introduction This section provides an alphabetical list of all the terms used in a STEPS surveillance with

More information

Liang Li, PhD. MD Anderson

Liang Li, PhD. MD Anderson Liang Li, PhD Biostatistics @ MD Anderson Behavioral Science Workshop, October 13, 2014 The Multiphase Optimization Strategy (MOST) An increasingly popular research strategy to develop behavioral interventions

More information

Gapfilling of EC fluxes

Gapfilling of EC fluxes Gapfilling of EC fluxes Pasi Kolari Department of Forest Sciences / Department of Physics University of Helsinki EddyUH training course Helsinki 23.1.2013 Contents Basic concepts of gapfilling Example

More information

Occupancy models. Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology

Occupancy models. Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology Occupancy models Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology Advances in Species distribution modelling in ecological studies and conservation Pavia and Gran

More information

Let the x-axis have the following intervals:

Let the x-axis have the following intervals: 1 & 2. For the following sets of data calculate the mean and standard deviation. Then graph the data as a frequency histogram on the corresponding set of axes. Set 1: Length of bass caught in Conesus Lake

More information

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study TECHNICAL REPORT # 59 MAY 2013 Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study Sergey Tarima, Peng He, Tao Wang, Aniko Szabo Division of Biostatistics,

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

Chap 4. Software Reliability

Chap 4. Software Reliability Chap 4. Software Reliability 4.2 Reliability Growth 1. Introduction 2. Reliability Growth Models 3. The Basic Execution Model 4. Calendar Time Computation 5. Reliability Demonstration Testing 1. Introduction

More information

Bayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London

Bayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London Bayesian methods for missing data: part 1 Key Concepts Nicky Best and Alexina Mason Imperial College London BAYES 2013, May 21-23, Erasmus University Rotterdam Missing Data: Part 1 BAYES2013 1 / 68 Outline

More information

Joint live encounter & dead recovery data

Joint live encounter & dead recovery data Joint live encounter & dead recovery data CHAPTER 8 The first chapters in this book focussed on typical open population mark-recapture models, where the probability of an individual being encountered (dead

More information

Experimental Design and Data Analysis for Biologists

Experimental Design and Data Analysis for Biologists Experimental Design and Data Analysis for Biologists Gerry P. Quinn Monash University Michael J. Keough University of Melbourne CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv I I Introduction 1 1.1

More information

Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information:

Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information: Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information: aanum@ug.edu.gh College of Education School of Continuing and Distance Education 2014/2015 2016/2017 Session Overview In this Session

More information

CS168: The Modern Algorithmic Toolbox Lecture #6: Regularization

CS168: The Modern Algorithmic Toolbox Lecture #6: Regularization CS168: The Modern Algorithmic Toolbox Lecture #6: Regularization Tim Roughgarden & Gregory Valiant April 18, 2018 1 The Context and Intuition behind Regularization Given a dataset, and some class of models

More information

Multistate models and recurrent event models

Multistate models and recurrent event models Multistate models Multistate models and recurrent event models Patrick Breheny December 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Multistate models In this final lecture,

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Two-sample Categorical data: Testing

Two-sample Categorical data: Testing Two-sample Categorical data: Testing Patrick Breheny October 29 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/22 Lister s experiment Introduction In the 1860s, Joseph Lister conducted a landmark

More information

ACE 564 Spring Lecture 8. Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information. by Professor Scott H.

ACE 564 Spring Lecture 8. Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information. by Professor Scott H. ACE 564 Spring 2006 Lecture 8 Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information by Professor Scott H. Irwin Readings: Griffiths, Hill and Judge. "Collinear Economic Variables,

More information

Numerical Methods of Approximation

Numerical Methods of Approximation Contents 31 Numerical Methods of Approximation 31.1 Polynomial Approximations 2 31.2 Numerical Integration 28 31.3 Numerical Differentiation 58 31.4 Nonlinear Equations 67 Learning outcomes In this Workbook

More information

Computing Likelihood Functions for High-Energy Physics Experiments when Distributions are Defined by Simulators with Nuisance Parameters

Computing Likelihood Functions for High-Energy Physics Experiments when Distributions are Defined by Simulators with Nuisance Parameters Computing Likelihood Functions for High-Energy Physics Experiments when Distributions are Defined by Simulators with Nuisance Parameters Radford M. Neal Dept. of Statistics, University of Toronto Abstract

More information

Consider Table 1 (Note connection to start-stop process).

Consider Table 1 (Note connection to start-stop process). Discrete-Time Data and Models Discretized duration data are still duration data! Consider Table 1 (Note connection to start-stop process). Table 1: Example of Discrete-Time Event History Data Case Event

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Due Thursday, September 19, in class What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

Probability and Statistics

Probability and Statistics Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT

More information

Modules 1-2 are background; they are the same for regression analysis and time series.

Modules 1-2 are background; they are the same for regression analysis and time series. Regression Analysis, Module 1: Regression models (The attached PDF file has better formatting.) Required reading: Chapter 1, pages 3 13 (until appendix 1.1). Updated: May 23, 2005 Modules 1-2 are background;

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides Chapter 7 Inference for Distributions Introduction to the Practice of STATISTICS SEVENTH EDITION Moore / McCabe / Craig Lecture Presentation Slides Chapter 7 Inference for Distributions 7.1 Inference for

More information

Chapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables.

Chapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables. Chapter 10 Multinomial Experiments and Contingency Tables 1 Chapter 10 Multinomial Experiments and Contingency Tables 10-1 1 Overview 10-2 2 Multinomial Experiments: of-fitfit 10-3 3 Contingency Tables:

More information

review session gov 2000 gov 2000 () review session 1 / 38

review session gov 2000 gov 2000 () review session 1 / 38 review session gov 2000 gov 2000 () review session 1 / 38 Overview Random Variables and Probability Univariate Statistics Bivariate Statistics Multivariate Statistics Causal Inference gov 2000 () review

More information

Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach"

Kneib, Fahrmeir: Supplement to Structured additive regression for categorical space-time data: A mixed model approach Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach" Sonderforschungsbereich 386, Paper 43 (25) Online unter: http://epub.ub.uni-muenchen.de/

More information

Detecting general patterns in fish movement from the analysis of fish tagging data

Detecting general patterns in fish movement from the analysis of fish tagging data 18 th World IMACS / MODSIM Congress, Cairns, Australia 13-17 July 2009 http://mssanz.org.au/modsim09 Detecting general patterns in fish movement from the analysis of fish tagging data Daphney Dagneaux,

More information

Time Series Analysis. Smoothing Time Series. 2) assessment of/accounting for seasonality. 3) assessment of/exploiting "serial correlation"

Time Series Analysis. Smoothing Time Series. 2) assessment of/accounting for seasonality. 3) assessment of/exploiting serial correlation Time Series Analysis 2) assessment of/accounting for seasonality This (not surprisingly) concerns the analysis of data collected over time... weekly values, monthly values, quarterly values, yearly values,

More information