arxiv: v3 [physics.data-an] 23 May 2011

Size: px
Start display at page:

Download "arxiv: v3 [physics.data-an] 23 May 2011"

Transcription

1 Date: October, 8 arxiv:.7v [hysics.data-an] May -values for Model Evaluation F. Beaujean, A. Caldwell, D. Kollár, K. Kröninger Max-Planck-Institut für Physik, München, Germany CERN, Geneva, Switzerland Georg-August-Universität, Göttingen, Germany Abstract Deciding whether a model rovides a good descrition of data is often based on a goodness-of-fit criterion summarized by a -value. Although there is considerable confusion concerning the meaning of -values, leading to their misuse, they are nevertheless of ractical imortance in common data analysis tasks. We motivate their alication using a Bayesian argumentation. We then describe commonly and less commonly known discreancy variables and how they are used to define -values. The distribution of these are then extracted for examles modeled on tyical data analysis tasks, and comments on their usefulness for determining goodness-of-fit are given.

2 Introduction Progress in science is the result of an interlay between model building and the testing of models with exerimental data. In this aer, we discuss model evaluation and focus rimarily on situations where a statement is desired on the validity of a model without exlicit reference to other models. We introduce different discreancy variables for this urose and define -values based on these. We then study the usefulness of the -values for assing judgments on models with a few simle examles reflecting commonly encountered analysis tasks. In the ideal case, it is ossible to calculate the degree-of-belief in a model based on the data. This otion is only available when a comlete set of models and their rior robabilities can be defined. However, the conditions necessary for this ideal case are usually not met in ractice. We nevertheless often want to make some statement concerning the validity of the model(s). We then are left with using robabilities of data outcomes assuming the model to try to make some judgments. These robabilities can be determined deductively since the model is assumed, and therefore frequencies of ossible outcomes can be roduced within the context of the model. These can then be used to roduce frequency distributions of discreancy variables, and -values (one-sided robabilities for the discreancy variables) can be calculated using the distributions and the observed values. The use of -values has been widely discussed in the literature [] and many authors have commented that -values are frequently misused in claiming suort for models []. We give a Bayesian argumentation for the use of -values to make judgments on model validity, and it is in this Bayesian sense that we will use -values. We start with a review of model testing in a Bayesian aroach when an exhaustive set of models is available and a coherent robability analysis is ossible. We then move to situations where this is not the case, and review some ossible choices of discreancy variables for Goodness-of-Fit (GoF) tests. Next, we define the -values for the discreancy variables and evaluate their usefulness for several examle data sets. Note that we omit a discussion of Bayes factors since these require the definition of at least two models for evaluation. Full set of models available. Formulation Assume we have a comlete set of models available for describing the data, such that we are sure the data can be described by one of the models available. The models can be used to calculate direct robabilities ; i.e., relative frequencies of ossible outcomes of the results were one to reroduce the exeriment many times under identical conditions. The robability of a model, M, is denoted by P (M), with P (M), () A discreancy variable [] is an extension of classical test statistics to allow ossible deendence on unknown (nuisance) arameters. We use robability to also include robability density.

3 while the robability densities of the model arameters, θ, are tyically continuous functions. In the Bayesian aroach, the quantities P (M) and P ( θ M) are treated as robabilities, although they are not frequency distributions and are more accurately described as degrees-ofbelief (DoB). This DoB is udated by comaring data with the redictions of the models. P (M = A) = reresents comlete certainty that A is the model which describes the data, and P (A) = reresents comletely certainty that A is not the correct model. The rocedure for udating our DoB using exerimental data is P i+ ( θ, M D) P ( x = D θ, M)P i ( θ, M), () where the index on P reresents a state-of-knowledge. The osterior robability density function, P i+, is usually written simly as P, and the rior is written as P. The osterior describes the state of knowledge after the exeriment is analyzed. The quantity P ( x = D θ, M) reresents a robability of getting the data D given the model and arameter values, and can usually be defined in a number of ways (see for examle Section.). Normalizing, and using P ( D) = M P ( x = D θ, M)P ( θ, M)d θ yields P ( θ, M D) = P ( D θ, M)P ( θ, M) P ( D) This is the classic equation due to Bayes and Lalace [].. () Models can be comared and the DoB in a model can be obtained using P (M D) = P ( θ, M D)d θ. () This evaluation requires the secification of a full set of models and a definition of the rior beliefs such that P (M) =. (5) M It is very sensitive to the definitions of the riors when the data are not very selective.. Examle with full set of models An examle where this aroach was used is given in [5], and is reviewed here. The analysis is to be erformed on an observed energy sectrum, for which we can form a backgroundonly hyothesis, or a background+signal hyothesis. An examle is the search for neutrinoless double beta decay. Note that there is in rincile no mathematical distinction between model and arameters. In ractice, we distinguish them because models are fixed constructs for which we evaluate the degree-of-belief that the model is correct, whereas arameters can take on a range of values and the analysis is used to extract a degree-of-belief for a articular value.

4 In the following, H denotes the hyothesis that the observed sectrum is due to background only; the negation, interreted here as the hyothesis that the signal rocess contributes to the sectrum, is labeled H. The osterior robabilities for H and H can be calculated using P (H sectrum) = P (sectrum H) P (H) P (sectrum) () and P (H sectrum) = P (sectrum H) P (H) P (sectrum). (7) The robability P (sectrum) is P (sectrum) = P (sectrum H) P (H) + P (sectrum H) P (H). (8) The robabilities for a given sectrum can be calculated based on assumtions on the signal strength and background shae as described in [5]. Evidence for a signal or a discovery is then decided based on the resulting value for P (H sectrum). In this analysis, it was assumed that the observed sectrum must come from either the background model or a combination of the background and double beta decay signal. The robability of each case is evaluated and conclusions are drawn from these robabilities. Incomlete set of models In most cases, we analyze data without having an exhaustive set of models available, but nevertheless want to reach conclusions on how well the models account for the data. This information can be used, for examle, to guide the search for new models. In the examle given above, it is ossible that there are unknown sources of background for which redictions are not ossible before the exeriment is erformed. The quantities P (sectrum H) and P (sectrum H) could be individually examined and, if both are on the small side of the exected distribution, doubts concerning the comleteness of our set of models could arise.. General aroach to GoF tests For a given model, we can define one or more discreancy variable(s) and calculate the exected frequency distribution of this discreancy variable. If the discreancy variable is well chosen, then the distribution for a good model should look significantly different than for a bad model. Finding the discreancy variable in the region oulated by incorrect models then gives us cause to think our model is not adequate. Since the shae of the background sectrum is assumed to be known the case of unknown background sources contributing to the measured sectrum is ignored. However, the overall level of background is allowed to vary.

5 . Definition of a -value A -value is the robability that, in a future exeriment, the discreancy variable will have a larger value (indicating greater deviation of the data from the model) than the value observed, assuming that the model is correct and all exerimental effects are erfectly known. In other words, not only is the model the correct one to describe the hysical situation, but correct distribution functions are used to reresent data fluctuations away from the true values. We will focus on GoF tests for the underlying model, but it should be clear that incorrect formulations of the data fluctuations will bias the -value distributions to lower (if the data fluctuations are underestimated) or higher (if the data fluctuations are overestimated) values. In general, any discreancy variable which can be calculated for the observations can be used to define a -value. We use R( x θ, M) and R( D θ, M) to denote discreancy variables evaluated with a ossible set of observations x for given model and arameter values, and for the observed data, x = D, resectively. To simlify the notation, we will occasionally dro the arguments on R and use R D to denote the value of the discreancy variable found from the data set at hand. R can be interreted as a random variable (e.g., ossible χ values for a given model), whereas R D has a fixed value (e.g., the observed χ derived from the data set at hand). Assuming that smaller values of R imly better agreement between the data and model redictions, the definition of (for continuous distributions of R) is written as: = P (R θ, M)dR. (9) R>R D The quantity is the tail-area robability to have found a result with R( x) > R D, assuming that the model M and the arameters θ are valid. If the modeling is correct (including that of the data fluctuations), will have a flat robability distribution between [, ]. For discrete distributions of R, the integral is relaced by a sum, the -value distribution is no longer continuous, and the cumulative distribution for will be ste-like. If the existing data are used to modify the arameter values, the extracted -value will be biased to higher values. The amount of bias will deend on many asects, including the number of data oints, the number of arameters, and the riors. We can remove the bias for the number of fitted arameters in χ fits by evaluating the robability of R = χ for N n degrees-offreedom, P (χ N n), where N is the number of data oints and n is the number of arameters fitted [], when the data fluctuations are Gaussian and indeendent of the arameters, the function to be comared to the data deends linearly on the arameters, and the arameters are chosen such that χ is at its global minimum. In general, the bias introduced by the number of fitted arameters becomes small if N n. -values cannot be turned into robabilistic statements about the model being correct without riors, and statements of suort for a model directly from the -value behave incoherently []. Furthermore, aroximations used for the distributions of the discreancy variables,

6 biases introduced when model arameters are fitted and difficulties in extracting reliable information from numerical algorithms used to evaluate the discreancy variable (see section..) further comlicate their use. -values should therefore be handled with care. Nevertheless, we discuss the use of -values to make judgments about the models at hand. The judgment will be based on a sequence of considerations of the tye: the -value distribution for a good model is exected to be (reasonably) flat between [, ]; the -values for bad models usually have sharly falling distributions starting at = ; small -values are worrisome; if we know that other models can be reasonably constructed which would have higher -values, then a small -value for the model under consideration indicates that we may have icked a oor model; if the -value is not too small, then our model is adequate to describe the existing data.. Bayesian argumentation We contend that the use of -values for evaluation of models as just described is essentially Bayesian in character. Following the arguments given above, assume that the -value robability density for a good model, M, is flat, P ( M ) =, and that for oor models, M i (i =... k), can be reresented by P ( M i ) c i e c i where c i so that the distribution is strongly eaked at and aroximately normalized to. The DoB assigned to model M after finding a -value is then If we take all models to have similar rior DoBs, then P (M ) = P ( M )P (M ) k i= P ( M i)p (M i ). () P (M ) P ( M ) k i= P ( M i). In the limit, we have while for c i i > P (M ) + k i= c i P (M ). Although this formulation in rincile allows for a ranking of models, the vague nature of this rocedure indicates that any model which can be constructed to yield a reasonable - value should be retained. A further consideration is that the correct distributions for the data fluctuations are often not known (due to the vague nature of systematic uncertainties) and best guesses are used. This will generally also lead to non-flat -value distributions for good models. Scientific rejudices (Occam s razor, elegance or esthetics, etc.) will influence the decision and act as a guide in selecting the best model in cases where several good models are available. 5

7 . Discreancy variables considered in this aer.. χ test for data with Gaussian uncertainties For uncorrelated data assumed to follow Gaussian robability distributions relative to the model redictions, the χ value is a natural discreancy variable for a GoF test: ( N y i f(x i ) θ, M) R G = χ = () i where the N data oints are given by {(x i, y i )}, and the rediction of the model for y i is f(x i θ, M). The modeling of the data fluctuations uses fixed standard deviations σ i. If the arameters of the model are fitted to the data by minimizing χ and there are n arameters we relace θ in the formula above with θ. The χ robability distribution is evaluated for N n degrees-of-freedom. In the secial case where f is linear in the arameters, this rocedure again yields a flat -value distribution between [, ]. σ i.. Runs test The standard χ test does not take into account clustering of data below or above exectations. To detect clusters the ordered set of N observations {(x i, y i )} is artitioned into subsets containing the success and failure runs (defined as sequences of consecutive y i above or below the exectation from the model, f(x i θ, M), resectively). Several discreancy variables based on success runs can be found in the literature [7] but these do not take into account the size of the deviation, y i f(x i θ, M). Recently, a discreancy variable for runs incororating this extra information was roosed for ordered data with Gaussian fluctuations [8]. Let A j denote the subset of the observations of the j th success run. The weight of the j th success run is then taken to be ( y i f(x i θ, ) M) χ run,j = j +N j i=j where the sum over i covers the {(x i, y i )} A j and N j is the length of the run. The discreancy variable is the largest weight of any run σ i R sr = max χ run,j. j The exlicit distribution of R sr, used to define the -value, = P (R sr > R D sr N), is given in [8] for the case when ( θ, M) are fully secified (no fitting). A similar discreancy variable can be defined for failure measurements, R fr. To illustrate the definition we resent a simle examle. Suose N = 5 observations at x ositions (,,,, 5) with standardized residuals (y i f(x i θ, M))/σ i given by (.,.,.8,.,.). Then there are two success runs A = {(,.)}, A = {(,.), (5,.)} and we find R sr =. +. =. due to the second run. Similarly, for the single failure run, R fr =.5.

8 .. χ tests for Poisson distributed data In the case of binned, Poisson distributed data, the quantity R P = χ P = N b i= (m i λ i ) λ i () can be used as a discreancy variable where N b is the number of bins, the m i are the numbers of measured events and λ i = λ i ( θ, M) are the model exectations for event bin i; for an exectation λ i the exected variance is λ i. This Pearson χ statistic [9] was originally roosed for multinomial data but has found wide use in analyzing Poisson distributed data. Rather than using the robability distribution of this discreancy variable directly, P (χ N b ) is often used as an aroximation for P (R P ) 5. The distribution of R P has been shown to asymtotically reach a χ distribution for multinomially distributed data, giving some justification for this rocedure. Practically, this is the case for data with a large number of entries m i in all bins. When arameters are first estimated from a fit to the data, the arameter values which minimize R P are used to calculate the λ i, and the -value is evaluated using P (χ N b n). It is often seen that the exected weight, λ i, is relaced with the observed weight, m i, in the denominator (Neyman χ []). The discreancy variable is then R N = χ N = N b i= (m i λ i ) m i. () Again, rather than using the robability distribution of this discreancy variable directly, it is assumed that R N has a distribution which aroximates a χ distribution with N b degrees of freedom and P (χ N b ) is used as an aroximation for P (R N ). In cases where m i =, ractitioners of this aroach set m i = to avoid divergence. Sometimes bins with m i = are ignored, which can lead to very misleading results since finding m i = is valuable information. When arameters are first estimated from a fit to the data, the arameter values which minimize R N are used, and the -value is evaluated using P (χ N b n). Note that in both of these cases, we do not exect flat -value distributions since only aroximations are used for P (R P/N ). The deviations from flatness are exected to be greatest when small event numbers are resent in the data sets... Likelihood ratio test for Poisson distributed data Another otion for a Poisson model is based on the (log of) the likelihood ratio [] (sometimes referred to as the Cash statistic []) R C = log P ( x λ i = m i ) P ( x λ i = λ i ( θ)) = N b i= [ λ i m i + m i log m ] i λ i. () 5 The use of this aroximation robably dates back to a time when comlicated numerical calculations were not ossible. 7

9 In bins where m i =, the last term is set to. Again, rather than using the robability distribution of this discreancy variable directly, since R C has a distribution which aroximates a χ distribution with N b degrees of freedom for large λ i, P (χ N b ) is used as an aroximation for P (R C ). The validity of this aroximation is critically discussed in []. When arameters are first estimated from a fit to the data, the arameter values which maximize the likelihood are used, and the -value is evaluated using P (χ N b n). Note that the distribution of R C asymtotically converges faster to the χ distribution than the distribution of R P (see [])...5 Probability of the data test Any robability of the data can be chosen as the discreancy variable: R L = P ( x θ, M). In this case, larger values of R L imly better agreement with the data. The robability P ( x θ, M) can be used to extract the robability for R L as P (R L )dr L = P ( x θ, M)d x. R L <P ( x θ,m)<r L +dr L An examle of how this is done numerically for Poisson distributed data and using P ( m θ, M) = N b i e λ i λ m i i m i! is given in the Aendix. Once we have P (R L θ, M) we can then evaluate = P (R L )dr L where R D L R L <R D L is the value observed with the data set at hand. In a model with Gaussian uncertainties where we use a roduct of Gaussian densities, P ( x θ, M) N P (x i µ i, σ i ) i where x i N (µ i, σ i ), then taking R L = P ( x θ, M) is equivalent to the usual χ test. If the model arameters are first fitted using the data, we roose the following correction for the number of fitted arameters: calculate the -value taking RL D (no fitted arameters); = P ( x = D θ, M), but assuming a simle hyothesis calculate the χ value which corresonds to this -value using the inverse χ distribution corresonding to N degrees of freedom; 8

10 recalculate the -value using the χ value and N n degrees of freedom. This rocedure is valid for the case where we have Gaussian uncertainties, but is ad-hoc for other cases. We test its usefulness below. Care must be taken in using R L as a discreancy variable, articularly when it is written as the roduct of individual robability densities. The overall shae of the distribution is otentially not tested, and large -values can be roduced with incorrect model choices. An examle of this is given below in Section.... Partial/Prior/Posterior-redictive -value Rather than using the arameters at the mode of the osterior, it is also ossible to define a -value by averaging over the arameter values according to a robability distribution [5]. For the osterior-redictive case [], assuming we are using a robability of the data as discreancy variable, [ = R L <R D L P (R L θ, M)dR L ] P ( θ D, M)d θ. (5) While the artial osterior-redictive -value [5] has the desirable roerty of a flat distribution on [, ], at least in the large samle limit as N, it is not known in general how to comute it for realistic roblems. Furthermore, the numerical effort required to evaluate the double integral in (5) quickly becomes rohibitive. Therefore, these -value definitions are not considered here desite their aeal from a Bayesian ersective...7 Johnson test Johnson [7] roosed a modification of the χ definition to take into account the osterior robability density for the arameters of a model. Rather than evaluating the robability of the data at fixed values of the arameters, the arameters are given values according to their robability density after evaluating the data. Johnson s statistic, R J, is exected to behave asymtotically as a χ distribution with N b degrees of freedom, where N b is a number of bins to be defined, regardless of the number of arameters. Imagine the data is given by a vector of values x and these values are exected to deviate from the model rediction according to individual robabilities P (x j θ, M). The Johnson rescrition is to define N b bins, where bin i contains a robability P i. Intervals x i,j are defined for each data oint j via P i ( θ) = P (x j θ, M)dx j. x i,j The intervals x i,j cover the full range of ossible values for each x j and are ordered so that they include increasing values of x. The definition of the intervals deends on the values of the arameters θ and vary for each data oint j. The arameter values are to be samled from 9

11 the full osterior robability P ( θ D, M). For each θ, a discreancy variable R J is calculated according to ( m i NP i ( θ) ) R J = N b i= NP i ( θ) where m i is the actual number of data oints falling within the intervals x i,j and N is the total number of data oints in the data set. In [7] it is shown that asymtotically R J is distributed as a χ variable with N b degrees of freedom, regardless of the number of fitted arameters n. Hence the -value is calculated using P (χ N b ). For data where the x follow continuous robability densities the bins are usually chosen to have equal robabilities. The number of bins to be chosen is given by a rule of thumb due to Mann/Wald [8] with a modification for small number of bins such that there are at least three: N b = e. log(n) +. This means by default we have N b (N = 5) = 5, N b (N = ) = 8, N b (N =, ) = 7. In case the values of x follow a discrete distribution it is usually not ossible to have equal robabilities for all P i. In this case, a randomization rocedure is used to allocate data oints to bins. () Test of -value definitions In the following, we test the usefulness of the different -values given above by looking at their distributions for secific examles motivated from common situations faced in exerimental hysics. We first consider a data set which consists of a background known to be smoothly rising and, in addition to the background, a ossible signal. This could corresond for examle to an enhancement in a mass sectrum from the resence of a new resonance. The width of the resonance is not known, so that a wide range of widths must be allowed for. Also, the shae of the background is not well known. We do not have an exhaustive set of models to comare and want to look at GoF s for models individually to make decisions. In this examle, we will first assume that distributions of the data relative to exectations are modeled with Gaussian distributions. We then consider the same roblem with small event numbers, so that Poisson statistics are aroriate, and again test our different -value definitions. These examles were also discussed in [9]. Finally, we consider the case of testing an exonential decay law on a samle of measured decay times.

12 . Data with Gaussian uncertainties.. Definition of the data A tyical data set is shown in Fig.. It is generated from the function f(x i ) = A + B x i + C x i + D σ (x i µ) π e σ, (7) with arameter values (A =, B =.5, C =., D = 5, σ =.5, µ = 5.). The y i are generated from f(x i ) as y i = f(x i ) + z i where z i is samled according to N (, ). y 5 5 Data Model I Model II Model III Model IV x Figure : Examle data set for the case N = 5 with Gaussian fluctuations. The fits of the four models are suerosed on the data. The x domain is [, ] and two cases were considered: N = 5 data oints evenly samled in the range, and N = data oints evenly samled in the range. The exerimental resolution (Gaussian with width ) is assumed known and correct. Ensembles consisting of, data sets with N = 5 or N = data oints each were generated and four different models were fitted to the data. Table summarizes the arameters available in each model and the range over which they are allowed to vary. In all models, flat riors were assumed for all arameters for ease of comarison between results. The models were fitted one at a time. Two different fitting aroaches were used: ) The fitting was erformed using the gradient-based fitter MIGRAD from the MINUIT ackage [] accessed from within the Bayesian Analysis Toolkit (BAT) [9]. The starting values for the arameters were chosen at the center of the allowed ranges given in Table.

13 Small range Large range Model Par Min Max Min Max I. A I + B I x i + C I x i A I 5-5 B I. -5 C I II. A II + D II σ II π e (x i µ II ) σ II A II -5 B II -5 µ II 8 5 σ II. III. A III + B III x i + D III σ III π e (x i µ III ) σ III A III -5 B III -5 D III µ III 8 5 σ III. IV. A IV + B IV x i + C IV x i + D IV σ IV π e (x i µ IV ) σ IV A IV -5 B IV -5 C IV.5-5 D IV µ IV 8 5 σ IV. Table : Summary of the models fitted to the data, along with the ranges allowed for the arameters.

14 ) The Markov Chain Monte Carlo (MCMC) imlemented in BAT was run with its default settings first, and then MIGRAD was used to find the mode of the osterior using the arameters at the mode found by the MCMC as starting oint. Since it is known that the distributions of the data are Gaussian and flat riors were used for the arameters, the maximum of the osterior robability corresonds to the minimum of χ. The -values were therefore extracted for each model fit using the χ robability distribution as discussed in Sec.... Model I Model II 8 MIGRAD MCMC Model III.5 Model IV Figure : -value distributions based on R G. The arameters of the models discussed in Table are fitted to N = 5 data oints and allowing arameter values in the small range. The -values corresond to N n degrees of freedom, where n is the number of fitted arameters... Comarison of -values The -value distributions for the four models are given in Fig. for the case N = 5 and using the small fit range from Table. Two different histograms are shown for each model,

15 corresonding to the two fitting techniques described above. The distributions for models I and II are eaked at small values of for MIGRAD only and for MCMC+MIGRAD, and the models would be disfavored in most exeriments. For models III and IV, there is a significant difference in the -value distribution found from fitting using only MIGRAD, or MCMC+MIGRAD, with the -value distribution in the latter case being much flatter. An investigation into the source of this behavior showed that the -value distribution from MIGRAD is very sensitive to the range allowed for the arameters. This can be seen in Fig. where the same fits were erformed but now using the large ranges for the arameters (see table). In the case where larger arameter ranges are allowed, MIGRAD will converge to arameter values which are not at the global mode more often, and the choice of the starting oint and the (initial) ste size for the fit are crucial. Even in this rather simle fitting roblem, it is seen that the use of the MCMC can make a significant difference in the fitting results Model I MIGRAD MCMC Model III Model II Model IV....8 Figure : -value distributions based on R G. The arameters of the models discussed in Table are fitted to N = 5 data oints and allowing arameter values in the large range. The -values corresond to N n degrees of freedom, where n is the number of fitted arameters. The results from the MCMC+MIGRAD fits also deended on the fit range, although to a lesser extent. Figure shows the -value distributions from χ fits to N = 5 data oints for Model IV for the two arameter ranges. There are small but nevertheless significant differences

16 in the distribution, indicating that the arameter range is also an imortant factor in the -value distribution. Note that we are not execting flat -value distributions since the correction for the number of fitted arameters is only valid for models linearly deendent on the arameters. This is not the case here since we have a Gaussian term in the model (see 7)... Model IV.8... MCMC - small range MCMC - large range....8 Figure : -value distributions based on R G for model IV. The arameters of the model are fitted to N = 5 data oints and allowing arameter values in the two ranges range. The -values corresond to N n degrees of freedom, where n is the number of fitted arameters. Focusing on the results from the MCMC+MIGRAD fits, models III and IV give flat - value distributions and would have a large DoB in a majority of exeriments. In general, these distributions satisfy our exectations and these -values can be used to udate our DoBs in the models as described in Section.. The -value distributions for the four models and using the Johnson discreancy variable are given in Fig. 5. Note that in the results shown for this descreancy variable, we have used θ rather than samling θ according to the osterior. We verified that this did not significantly change the -value distribution of this discreancy variable. The distributions show siky behavior, which becomes much smoother as the number of data oints increases. However, there is still very little discriminating ower between the models using this discreancy variable. This was generally the case for other examles considered, and we therefore do not show any further results from the Johnson discreancy variable in this aer. For all further results resented in this aer, we have erformed MIGRAD only fits but with starting arameter values located near the global mode (whose location can be estimated since the underlying model is known). Only the smaller fit ranges were used. In this way, we aroximate the results which would be achieved with otimal fitting. This allows us to generate and analyze larger data sets, and thus have more sensitivity to the shae of the -value distributions. The -value distributions for the four models and using the runs discreancy variable are given in Fig.. In this figure, there are two -value distributions in each lot since we consider both cases where we have runs of data oints below the exectations, and runs of data oints above the exectations. For the correct model, the joint distribution of (R sr ) and (R fr ) should 5

17 .5 Model I.5 Model II MIGRAD MCMC Model III.5 Model IV Figure 5: -value distributions based on R J. The arameters of the models discussed in Table are fitted to N = 5 data oints and allowing arameter values in the small range. The -values corresond to N b degrees of freedom, where N b = 5 is the number of bins.

18 be symmetric. The -values should generally not be too small. The distribution of (R sr ) vs. (R fr ) is shown in Fig. 7 for the examle under consideration. The biasing of the -values resulting from the fitting of arameters is aarent. In using R fr and R sr, rather large -values should therefore be exected for valid models. 7 Model I 7 Model II 5 Success Failure Model III.5 Model IV Figure : -value distributions based on R sr and R fr. The arameters of the models discussed in Table are fitted to N = 5 data oints and allowing arameter values in the small range. The results for the ensembles with N = data oints are shown in Figs. 8 and 9. The models I and II now usually result in very small -values using both the standard χ and runs discreancy variables. For the χ fits, the -value distribution for model IV is again rather flat, as exected, but it would generally be difficult to conclude that model III is not adequate. Although this -value distribution is falling, as oosed to the case where we only had N = 5 data oints, there is still a large robability to get a sizable -value. The runs statistic has a sharly eaked distribution near = (for success runs) for models I,II, and should therefore have better discriminating ower than the standard χ test. In other words, it should more often lead to the desired result that the incorrect models are discarded. 7

19 sr Model IV Model I Figure 7: Joint distribution of the -values for success and failure runs. The results are for N = 5 data oints. Marker sizes in each bin are chosen relative to the bin with the highest robability of the individual model. Bins with robability less than.5 have been excluded from the lot for the urose of clarity. fr For model III, the success runs variable behaves similarly to the χ test, but the failure runs variable is rather flat. For the correct underlying model, model IV, we see that both success and failure runs variables have similar -value distributions. 8

20 8 8 Model I MIGRAD Model II Model III.5 Model IV Figure 8: -value distributions based on R G. The arameters of the models discussed in Table are fitted to N = data oints and allowing arameter values in the small range. The -values corresond to N n degrees of freedom, where n is the number of fitted arameters. 9

21 8 8 Model I Success Failure Model II Model III.5 Model IV Figure 9: -value distributions based on R sr and R fr. The arameters of the models discussed in Table are fitted to N = data oints and allowing arameter values in the small range.

22 . Examle with Poisson uncertainties.. Definition of the data We again use the function f(x) (see (7)), but now generate data sets with fluctuations from a Poisson distribution. The function is used to give the exected number of events in a bin covering an x range, and the observed number of events follows a Poisson distribution with this mean. For N b = 5 bins, e.g., the third bin is centered at x =. and extends over [.,.]. The exected number of events in this bin is defined as λ =.. N b f(x)dx... Comarison of -value distributions For the Poisson case we consider the four discreancy variables described in sections to comare the -value distributions for the four models (I-IV). We also show the -value distribution for the true model for comarison, where the -value is calculated by evaluating R = R( x θ true, M). Different fits were erformed for each of the different discreancy variable definitions. For the χ discreancy variables, χ minimization was erformed using either the exected or observed number of events as weight. For the likelihood ratio test, a maximum likelihood fit was erformed. The likelihood was defined as P ( m θ, M) = N b i= e λ i λ m i i m i! (8) where m i is the observed number of events in a bin and λ i = λ i ( θ). Since we use flat riors for the arameters, the same results were used for the case where the discreancy variable is the robability of the data. A tyical likelihood fit result is shown in Fig. together with the data, while the -value distributions are given in Figs. -. For all definitions of -values considered here, models I,II generally lead to low DoBs, while models III and IV generally have high DoBs. However, there are differences in the behavior of the -values which we now discuss in more detail. The lower left anel in Figs. - show the -value distribution taking the true model and the true arameters, and can be used as a gauge of the biases introduced by the -value definition. There is a bias for all discreancy variables shown in Fig. because aroximations are used for the robability distribution of the discreancy variable (using P (χ N) rather than P (R P N b ) or P (R N N b ). This bias is usually small, excet in the case where the Neyman χ discreancy variable is used. In this case, the robability of a very small -value is much too high even with the true arameters. The -value distribution using the robability of the data, on the other hand, is flat when the true arameters are used as seen in Fig., so that this choice would be otimal in cases where no arameters are fit.

23 number of events 5 5 Data Model I Model II Model III Model IV x Figure : Examle fits to one data set for the four models defined in Table for N b = 5 and the small arameter range. The lines shows the fit functions evaluated using the best-fit arameter values... Pearson χ and Neyman χ Comarison of the behavior of the Pearson χ to the Neyman χ, Fig., clearly shows that using the exected number of events as weight is suerior. The sike at in the -value distribution when using the observed number of events indicates that this quantity does not behave as exected for a χ distribution when dealing with small numbers of events, and will lead to undesirable conclusions more often than anticiated. The behavior of the Pearson χ using the exected number of events for the different models is quite satisfactory, and this quantity makes a good discreancy variable even in this examle with small numbers of events in the bins. Both of these -value distributions become flat in the case of large number of events in all bins... Likelihood Ratio While models I,II are strongly eaked at small -values, for this definition of a -value models III,IV also have a falling -value distribution. This is somewhat worrisome. In assigning a DoB to models based on this -value, this bias towards smaller -values should be considered, otherwise good models will be assigned too low a DoB. The behavior of the -value extracted from Pearson s χ is referable to the likelihood ratio, as it will lead to value judgments more in line with exectations...5 Probability of the Data Using this definition, the -value distribution is flat for the true model, since this is the only case where the correct robability distribution of the discreancy variable is emloyed. For the

24 8 8 Model I Pearson Neyman Likelihood ratio Model II Model III 5 Model IV True model....8 Figure : -value distributions based on R P, R N and R C. The arameters of the models discussed in Table are fitted to N b = 5 bins and allowing arameter values in the small range. The -values corresond to N n degrees of freedom, where n is the number of fitted arameters.

25 fitted models, two -value distributions are shown - one uncorrected for the number of fitted arameters, and one corrected for the fitted arameters as described in Section..5. Using the uncorrected -values, both models III and IV show -value distributions eaking at =, so that both models would tyically be ket. As exected, the correction for the number of fitted arameters ushes to lower values, and roduces a nearly flat distribution for model IV. This ad-hoc correction works well for this examle. In general, this -value definition has similar roerties to the Pearson χ statistic, but with the advantage of a flat -value distribution when the true model was used and no arameters were fit.. Exonential Decay When dealing with event samles with continuous robability distributions for the measurement variables, it is common ractice when determining the arameters of a model to use a roduct of event robabilities (unbinned likelihood): P ( x θ, N M) = P (x i θ, M). i= If the model chosen is correct, then this definition for the robability of the data (or likelihood) can be used successfully to determine aroriate ranges for the arameters of the model. However, P ( x θ, M) defined in this way has no sensitivity to the overall shae of the distribution and can lead to unexected results if this quantity is used in a GoF test of the model in question. We use a common examle, exonential decay, to illustrate this oint (see also discussions in [] and []). Our model is that the data follows an exonential decay law. We measure a set of event times, t, and analyze these data to extract the lifetime arameter τ. We define two different robabilities of the data D = {t i } (i =... N) I Unbinned likelihood II Binned likelihood ( ) P D τ = N i= τ e t i/τ, P ( D τ) = N b i= λ m i ti + t i m i! e λ i, λ i (τ) = dt N t i τ e t/τ. In the first case, the robability density is a roduct of the densities for the individual events, while in the second case the events are counted in time intervals and Poisson robabilities are calculated in each bin. The exected number of events is normalized to the total number of observed events. We consider time intervals with a width t = unit and time measurements ranging from t = to t =. The overall robability is the roduct of the bin robabilities. For each of these two cases, we consider the -value determined from the distribution of R = P ( D τ). In order to make the oint about the imortance of the choice for the discreancy variable for GoF tests, we generated data which do not follow an exonential distribution. The data is generated according to a linearly rising function, and a tyical data set is shown in Fig..

26 8 8 Model I N DoF N - n DoF Model II Model III.5 Model IV True model Figure : -value distributions based on R L. The arameters of the models discussed in Table are fitted to N b = 5 bins and allowing arameter values in the small range. The -values corresond to N b and N b n degrees of freedom, where n is the number of fitted arameters. 5

27 number of decays Unbinned data Binned data Unbinned ML fit t Figure : Tyical data set used for the fitting of the exonential model. The individual event times are shown, as well as the binned contents as defined in the text. The best fit exonential from the unbinned likelihood is also shown on the lot, normalized to the total number of fitted events... Product of exonentials If the data are fitted using a flat rior robability for the lifetime arameter, then we can solve for the mode of the osterior robability of τ analytically, and get the well-known result τ = N ti. Defining ξ t i, so τ = ξ/n, we can also solve analytically for the -value: = t i >ξ dt dt... (τ ) N e t i /τ and the result is = P (N, N) with the regularized incomlete Gamma function P (s, x) = γ (s, x) Γ (s) = x ts e t dt t s e t dt. Surrisingly, as N increases is aroximately constant with a value.5. Regardless of the actual data, is never small and deends only on N. The best fit exonential is comared to the data in Fig., and yields a large -value although the data is not exonentially distributed. This

28 -value definition is clearly useless, and the unbinned likelihood is seen to not be aroriate as a GoF statistic. The definition of χ in the Gaussian data uncertainties case also results from a roduct of robability densities, so the question as to what is different in this case needs clarification. The oint is that in the case of fitting a curve to a data set with Gaussian uncertainties, the data are in the form of airs, {(x i, y i )}, where the measured value of y is assigned to a articular value of x, and the discreancy between y and f(x) is what is tested. In other words, the value of the function is tested at different x so the shae is imortant. In the case considered here, the data are in the form {x i }, and there is no measurement of the value of the function f(x) at a articular x. The orderings of the x i is irrelevant, and there is no sensitivity to the shae of the function... Product of Poisson robabilities In this case, the -value for the model is determined from the roduct of Poisson robabilities using event counting in bins as described above. Since the exected number of events now includes an integration of the robability density and cover a wide range of exectations, it is sensitive to the distribution of the events and gives a valuable -value for GoF. The -value using R P for the data shown in Fig. is = within the recision of the calculation. The exonential model assumtion is now easily ruled out. In comarison to the unbinned likelihood case, the data are now in the form m, where the m i now refer to the number of events in a well-defined bin i which defines an x range. In other words, we have now a measurement of the height of the function f(x) at articular values of x, and are therefore sensitive to the shae of the function. As should be clear from this examle, the choice of discreancy variable can be very imortant in GoF decisions: Maximum Likelihood estimation with unbinned data will always give an otimal arameter estimate in terms of bias and variance, but it will give no information about the correctness of the model. 5 Discussion We have investigated a number of ossible discreancy variables and -value definitions for Goodness-of-Fit tests, with the understanding that these -values are to be used to make judgments on the accetability of models. The sense in which the -values are used follows Bayesian logic as described in Section.. For this urose, the -value distributions should be reasonably uniform for models which are considered good and they should eak at small values for oor models. Using these requirements to guide us, we classify our results in Table for the examle data sets and models described in the text. Based on our exerience with these examles, we also summarize our suggestions for the use of the different discreancy variables and -value definitions considered here. Most common -values use aroximations for the distribution of the discreancy variable. This leads to non-flat distributions of the -value even when no arameters are fitted, and these 7

29 Discreancy variable -value definition Data tye Comment RG P (χ > R G D N n) D = {(x i, yi)} Flat if n = or model linear + Rsr/fr P (Rsr/fr > R sr/fr D N) D = {(x i, yi)} Generally not flat for n > + RP P (χ > R P D N b n) D = m Reasonably flat + RN P (χ > R N D N b n) D = m Peaked at for small numbers of events RC P (χ > R C D N b n) D = m Distribution for R C is less flat than that for RP + RL P (RL < R L D θ, M) D = {(x i, yi)} Same as RG + RL P (RL < R L D θ, M) D = m Almost flat with correction + RL P (RL < R L D θ, M) D = {x i} No sensitivity to shae RJ P (χ > R J D N b ) D = {(x i, yi)} Siky distribution RJ P (χ > R J D N b ) D = m Siky distribution Table : Summary of the discreancy variables considered in this aer and their erformance for different data tyes. Here, n is the number of fit arameters, N is the number of events and Nb denotes the number of bins. The comment concerns the shae of the -value distribution for good models. The last column gives our recommendation for use as a discreancy variable for model judgments based on the given -value definition. 8

30 deviations can be severe in some cases. We discussed the use of the robability of the data itself as a discreancy variable, and showed that it is flat in the case of no fitted arameters and Poisson distributed data. This was due to the use of the correct robability distribution for the discreancy variable. The algorithm described in the Aendix can be used to generate the correct -value distribution for any discreancy variable for Poisson distributed data. For comosite models, where arameters of the model are fit to the data before a -value is calculated, we find that -value distributions can deend strongly on the technical aroach used to fit the data. Using gradient-based aroaches to find minima or maxima requires considerable attention from the user and generally fine tuning of fit ranges and starting values. The finetuning will certainly affect the -value distributions, making their use more difficult. Setting range limits effectively means defining a rior, and imacts -value distributions. First fitting the data with a Markov Chain Monte Carlo before using MIGRAD gave considerably better results, in the sense that the -value distributions extracted in this way followed exectations. We therefore recommend using a MCMC to ma out the arameter sace as a start to the fitting rocedure in situations where giving good starting values for a gradient-based fit is difficult. 9

31 A Fast -value evaluation in MCMC for Poisson distributed data The eak robability for a Poisson distribution P (m λ) = e λ λ m occurs for m = λ. If the robability distribution of a data set is modeled as a roduct of Poisson terms, then the highest robability is given for {m i } = { λ i }. We use this to define the starting oint for a Markov Chain, and move the bin contents u or down (chosen randomly) at each iteration. For each attemted change in the bin content, we aly the usual Metroolis test []. The robability P ( m λ) is easily udated at each change. E.g., if the result in bin i increases from m i to m i, then the robability changes by P P m i!!λm m i m! i m i i. A large number of exeriments can be quickly simulated and the -value extracted. References [] A. Gelman, X. L. Meng, and H. Stern, Posterior redictive assessment of model fitness via realized discreancies, Stat. Sinica, (99) 7. [] See, e.g., M. J. Bayarri and J. O. Berger, P values for Comosite Null Models, Jour. Am. Stat. Assoc., 95 () 7; M. J. Bayarri and J. O. Berger, The interlay of Bayesian and Frequentist Analysis, Statistic. Sci., 9 () 58; I. J. Good, The Bayes/non-Bayes Comromise: a Brief Review, Jour. Am. Stat. Assoc., 87 (99) 597; J. de la Horra and M. T. Rodriguez-Bernal, Posterior redictive -values: what they are and what they are not, Sociedad de Estadstica e Investigacin Oerativa Test () 75; J. I. Marden, Hyothesis Testing: From Values to Bayes Factors, Jour. Am. Stat. Assoc., 95 () ; T. Sellke, M.J. Bayarri and J.O. Berger, Calibration of Values for Testing Precise Null Hyotheses, The American Statistician, 55 (). [] M. J. Schervish, P Values: What They Are and What They Are Not, The American Statistician, 5 (99). [] See, e.g., E. T. Jaynes, Probability Theory - the logic of science, G. L. Bretthorst (ed.) Cambridge University Press (); A. Gelman, J. B. Carlin, H. S. Stern, D. B. Rubin Bayesian Data Analysis, Second Edition (Texts in Statistical Science) Chaman & Hall, London () ; P. Gregory, Bayesian Logical Data Analysis for the Physics Sciences, Cambridge University Press, (5); G. D Agostini, Bayesian Reasoning in Data Analysis - A Critical Introduction, World Scientific,. [5] A. Caldwell and K. Kröninger, Signal discovery in sarse sectra: A Bayesian analysis, Phys. Rev. D 7 () 9.

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

4. Score normalization technical details We now discuss the technical details of the score normalization method.

4. Score normalization technical details We now discuss the technical details of the score normalization method. SMT SCORING SYSTEM This document describes the scoring system for the Stanford Math Tournament We begin by giving an overview of the changes to scoring and a non-technical descrition of the scoring rules

More information

STA 250: Statistics. Notes 7. Bayesian Approach to Statistics. Book chapters: 7.2

STA 250: Statistics. Notes 7. Bayesian Approach to Statistics. Book chapters: 7.2 STA 25: Statistics Notes 7. Bayesian Aroach to Statistics Book chaters: 7.2 1 From calibrating a rocedure to quantifying uncertainty We saw that the central idea of classical testing is to rovide a rigorous

More information

On split sample and randomized confidence intervals for binomial proportions

On split sample and randomized confidence intervals for binomial proportions On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have

More information

Notes on Instrumental Variables Methods

Notes on Instrumental Variables Methods Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of

More information

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III AI*IA 23 Fusion of Multile Pattern Classifiers PART III AI*IA 23 Tutorial on Fusion of Multile Pattern Classifiers by F. Roli 49 Methods for fusing multile classifiers Methods for fusing multile classifiers

More information

Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process

Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process P. Mantalos a1, K. Mattheou b, A. Karagrigoriou b a.deartment of Statistics University of Lund

More information

Bayesian inference & Markov chain Monte Carlo. Note 1: Many slides for this lecture were kindly provided by Paul Lewis and Mark Holder

Bayesian inference & Markov chain Monte Carlo. Note 1: Many slides for this lecture were kindly provided by Paul Lewis and Mark Holder Bayesian inference & Markov chain Monte Carlo Note 1: Many slides for this lecture were kindly rovided by Paul Lewis and Mark Holder Note 2: Paul Lewis has written nice software for demonstrating Markov

More information

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI **

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI ** Iranian Journal of Science & Technology, Transaction A, Vol 3, No A3 Printed in The Islamic Reublic of Iran, 26 Shiraz University Research Note REGRESSION ANALYSIS IN MARKOV HAIN * A Y ALAMUTI AND M R

More information

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit Chater 5 Statistical Inference 69 CHAPTER 5 STATISTICAL INFERENCE.0 Hyothesis Testing.0 Decision Errors 3.0 How a Hyothesis is Tested 4.0 Test for Goodness of Fit 5.0 Inferences about Two Means It ain't

More information

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management

More information

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK Comuter Modelling and ew Technologies, 5, Vol.9, o., 3-39 Transort and Telecommunication Institute, Lomonosov, LV-9, Riga, Latvia MATHEMATICAL MODELLIG OF THE WIRELESS COMMUICATIO ETWORK M. KOPEETSK Deartment

More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell October 25, 2009 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO) Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment

More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell February 10, 2010 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

Bayesian Spatially Varying Coefficient Models in the Presence of Collinearity

Bayesian Spatially Varying Coefficient Models in the Presence of Collinearity Bayesian Satially Varying Coefficient Models in the Presence of Collinearity David C. Wheeler 1, Catherine A. Calder 1 he Ohio State University 1 Abstract he belief that relationshis between exlanatory

More information

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split A Bound on the Error of Cross Validation Using the Aroximation and Estimation Rates, with Consequences for the Training-Test Slit Michael Kearns AT&T Bell Laboratories Murray Hill, NJ 7974 mkearns@research.att.com

More information

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test)

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test) Chater 225 Tests for Two Proortions in a Stratified Design (Cochran/Mantel-Haenszel Test) Introduction In a stratified design, the subects are selected from two or more strata which are formed from imortant

More information

Estimation of the large covariance matrix with two-step monotone missing data

Estimation of the large covariance matrix with two-step monotone missing data Estimation of the large covariance matrix with two-ste monotone missing data Masashi Hyodo, Nobumichi Shutoh 2, Takashi Seo, and Tatjana Pavlenko 3 Deartment of Mathematical Information Science, Tokyo

More information

Plotting the Wilson distribution

Plotting the Wilson distribution , Survey of English Usage, University College London Setember 018 1 1. Introduction We have discussed the Wilson score interval at length elsewhere (Wallis 013a, b). Given an observed Binomial roortion

More information

SAS for Bayesian Mediation Analysis

SAS for Bayesian Mediation Analysis Paer 1569-2014 SAS for Bayesian Mediation Analysis Miočević Milica, Arizona State University; David P. MacKinnon, Arizona State University ABSTRACT Recent statistical mediation analysis research focuses

More information

Introduction to Probability and Statistics

Introduction to Probability and Statistics Introduction to Probability and Statistics Chater 8 Ammar M. Sarhan, asarhan@mathstat.dal.ca Deartment of Mathematics and Statistics, Dalhousie University Fall Semester 28 Chater 8 Tests of Hyotheses Based

More information

Learning Sequence Motif Models Using Gibbs Sampling

Learning Sequence Motif Models Using Gibbs Sampling Learning Sequence Motif Models Using Gibbs Samling BMI/CS 776 www.biostat.wisc.edu/bmi776/ Sring 2018 Anthony Gitter gitter@biostat.wisc.edu These slides excluding third-arty material are licensed under

More information

One-way ANOVA Inference for one-way ANOVA

One-way ANOVA Inference for one-way ANOVA One-way ANOVA Inference for one-way ANOVA IPS Chater 12.1 2009 W.H. Freeman and Comany Objectives (IPS Chater 12.1) Inference for one-way ANOVA Comaring means The two-samle t statistic An overview of ANOVA

More information

General Linear Model Introduction, Classes of Linear models and Estimation

General Linear Model Introduction, Classes of Linear models and Estimation Stat 740 General Linear Model Introduction, Classes of Linear models and Estimation An aim of scientific enquiry: To describe or to discover relationshis among events (variables) in the controlled (laboratory)

More information

Estimating function analysis for a class of Tweedie regression models

Estimating function analysis for a class of Tweedie regression models Title Estimating function analysis for a class of Tweedie regression models Author Wagner Hugo Bonat Deartamento de Estatística - DEST, Laboratório de Estatística e Geoinformação - LEG, Universidade Federal

More information

Statistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform

Statistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform Statistics II Logistic Regression Çağrı Çöltekin Exam date & time: June 21, 10:00 13:00 (The same day/time lanned at the beginning of the semester) University of Groningen, Det of Information Science May

More information

Determining Momentum and Energy Corrections for g1c Using Kinematic Fitting

Determining Momentum and Energy Corrections for g1c Using Kinematic Fitting CLAS-NOTE 4-17 Determining Momentum and Energy Corrections for g1c Using Kinematic Fitting Mike Williams, Doug Alegate and Curtis A. Meyer Carnegie Mellon University June 7, 24 Abstract We have used the

More information

Spectral Analysis by Stationary Time Series Modeling

Spectral Analysis by Stationary Time Series Modeling Chater 6 Sectral Analysis by Stationary Time Series Modeling Choosing a arametric model among all the existing models is by itself a difficult roblem. Generally, this is a riori information about the signal

More information

Paper C Exact Volume Balance Versus Exact Mass Balance in Compositional Reservoir Simulation

Paper C Exact Volume Balance Versus Exact Mass Balance in Compositional Reservoir Simulation Paer C Exact Volume Balance Versus Exact Mass Balance in Comositional Reservoir Simulation Submitted to Comutational Geosciences, December 2005. Exact Volume Balance Versus Exact Mass Balance in Comositional

More information

Chapter 13 Variable Selection and Model Building

Chapter 13 Variable Selection and Model Building Chater 3 Variable Selection and Model Building The comlete regsion analysis deends on the exlanatory variables ent in the model. It is understood in the regsion analysis that only correct and imortant

More information

Statics and dynamics: some elementary concepts

Statics and dynamics: some elementary concepts 1 Statics and dynamics: some elementary concets Dynamics is the study of the movement through time of variables such as heartbeat, temerature, secies oulation, voltage, roduction, emloyment, rices and

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

Distributed Rule-Based Inference in the Presence of Redundant Information

Distributed Rule-Based Inference in the Presence of Redundant Information istribution Statement : roved for ublic release; distribution is unlimited. istributed Rule-ased Inference in the Presence of Redundant Information June 8, 004 William J. Farrell III Lockheed Martin dvanced

More information

The Poisson Regression Model

The Poisson Regression Model The Poisson Regression Model The Poisson regression model aims at modeling a counting variable Y, counting the number of times that a certain event occurs during a given time eriod. We observe a samle

More information

Asymptotically Optimal Simulation Allocation under Dependent Sampling

Asymptotically Optimal Simulation Allocation under Dependent Sampling Asymtotically Otimal Simulation Allocation under Deendent Samling Xiaoing Xiong The Robert H. Smith School of Business, University of Maryland, College Park, MD 20742-1815, USA, xiaoingx@yahoo.com Sandee

More information

Maximum Entropy and the Stress Distribution in Soft Disk Packings Above Jamming

Maximum Entropy and the Stress Distribution in Soft Disk Packings Above Jamming Maximum Entroy and the Stress Distribution in Soft Disk Packings Above Jamming Yegang Wu and S. Teitel Deartment of Physics and Astronomy, University of ochester, ochester, New York 467, USA (Dated: August

More information

Chapter 3. GMM: Selected Topics

Chapter 3. GMM: Selected Topics Chater 3. GMM: Selected oics Contents Otimal Instruments. he issue of interest..............................2 Otimal Instruments under the i:i:d: assumtion..............2. he basic result............................2.2

More information

DETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS

DETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS Proceedings of DETC 03 ASME 003 Design Engineering Technical Conferences and Comuters and Information in Engineering Conference Chicago, Illinois USA, Setember -6, 003 DETC003/DAC-48760 AN EFFICIENT ALGORITHM

More information

Bayesian Networks Practice

Bayesian Networks Practice Bayesian Networks Practice Part 2 2016-03-17 Byoung-Hee Kim, Seong-Ho Son Biointelligence Lab, CSE, Seoul National University Agenda Probabilistic Inference in Bayesian networks Probability basics D-searation

More information

2. Sample representativeness. That means some type of probability/random sampling.

2. Sample representativeness. That means some type of probability/random sampling. 1 Neuendorf Cluster Analysis Assumes: 1. Actually, any level of measurement (nominal, ordinal, interval/ratio) is accetable for certain tyes of clustering. The tyical methods, though, require metric (I/R)

More information

A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST BASED ON THE WEIBULL DISTRIBUTION

A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST BASED ON THE WEIBULL DISTRIBUTION O P E R A T I O N S R E S E A R C H A N D D E C I S I O N S No. 27 DOI:.5277/ord73 Nasrullah KHAN Muhammad ASLAM 2 Kyung-Jun KIM 3 Chi-Hyuck JUN 4 A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST

More information

Hotelling s Two- Sample T 2

Hotelling s Two- Sample T 2 Chater 600 Hotelling s Two- Samle T Introduction This module calculates ower for the Hotelling s two-grou, T-squared (T) test statistic. Hotelling s T is an extension of the univariate two-samle t-test

More information

Uncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning

Uncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning TNN-2009-P-1186.R2 1 Uncorrelated Multilinear Princial Comonent Analysis for Unsuervised Multilinear Subsace Learning Haiing Lu, K. N. Plataniotis and A. N. Venetsanooulos The Edward S. Rogers Sr. Deartment

More information

Towards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK

Towards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK Towards understanding the Lorenz curve using the Uniform distribution Chris J. Stehens Newcastle City Council, Newcastle uon Tyne, UK (For the Gini-Lorenz Conference, University of Siena, Italy, May 2005)

More information

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules. Introduction: The is widely used in industry to monitor the number of fraction nonconforming units. A nonconforming unit is

More information

The Binomial Approach for Probability of Detection

The Binomial Approach for Probability of Detection Vol. No. (Mar 5) - The e-journal of Nondestructive Testing - ISSN 45-494 www.ndt.net/?id=7498 The Binomial Aroach for of Detection Carlos Correia Gruo Endalloy C.A. - Caracas - Venezuela www.endalloy.net

More information

arxiv:cond-mat/ v2 25 Sep 2002

arxiv:cond-mat/ v2 25 Sep 2002 Energy fluctuations at the multicritical oint in two-dimensional sin glasses arxiv:cond-mat/0207694 v2 25 Se 2002 1. Introduction Hidetoshi Nishimori, Cyril Falvo and Yukiyasu Ozeki Deartment of Physics,

More information

8 STOCHASTIC PROCESSES

8 STOCHASTIC PROCESSES 8 STOCHASTIC PROCESSES The word stochastic is derived from the Greek στoχαστικoς, meaning to aim at a target. Stochastic rocesses involve state which changes in a random way. A Markov rocess is a articular

More information

Feedback-error control

Feedback-error control Chater 4 Feedback-error control 4.1 Introduction This chater exlains the feedback-error (FBE) control scheme originally described by Kawato [, 87, 8]. FBE is a widely used neural network based controller

More information

arxiv: v2 [stat.me] 3 Nov 2014

arxiv: v2 [stat.me] 3 Nov 2014 onarametric Stein-tye Shrinkage Covariance Matrix Estimators in High-Dimensional Settings Anestis Touloumis Cancer Research UK Cambridge Institute University of Cambridge Cambridge CB2 0RE, U.K. Anestis.Touloumis@cruk.cam.ac.uk

More information

Lower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data

Lower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data Quality Technology & Quantitative Management Vol. 1, No.,. 51-65, 15 QTQM IAQM 15 Lower onfidence Bound for Process-Yield Index with Autocorrelated Process Data Fu-Kwun Wang * and Yeneneh Tamirat Deartment

More information

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Journal of Modern Alied Statistical Methods Volume Issue Article 7 --03 A Comarison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Ghadban Khalaf King Khalid University, Saudi

More information

Outline. Markov Chains and Markov Models. Outline. Markov Chains. Markov Chains Definitions Huizhen Yu

Outline. Markov Chains and Markov Models. Outline. Markov Chains. Markov Chains Definitions Huizhen Yu and Markov Models Huizhen Yu janey.yu@cs.helsinki.fi Det. Comuter Science, Univ. of Helsinki Some Proerties of Probabilistic Models, Sring, 200 Huizhen Yu (U.H.) and Markov Models Jan. 2 / 32 Huizhen Yu

More information

Invariant yield calculation

Invariant yield calculation Chater 6 Invariant yield calculation he invariant yield of the neutral ions and η mesons er one minimum bias collision as a function of the transverse momentum is given by E d3 N d 3 = d 3 N d dydφ = d

More information

Supplementary Materials for Robust Estimation of the False Discovery Rate

Supplementary Materials for Robust Estimation of the False Discovery Rate Sulementary Materials for Robust Estimation of the False Discovery Rate Stan Pounds and Cheng Cheng This sulemental contains roofs regarding theoretical roerties of the roosed method (Section S1), rovides

More information

Preconditioning techniques for Newton s method for the incompressible Navier Stokes equations

Preconditioning techniques for Newton s method for the incompressible Navier Stokes equations Preconditioning techniques for Newton s method for the incomressible Navier Stokes equations H. C. ELMAN 1, D. LOGHIN 2 and A. J. WATHEN 3 1 Deartment of Comuter Science, University of Maryland, College

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models

Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Ketan N. Patel, Igor L. Markov and John P. Hayes University of Michigan, Ann Arbor 48109-2122 {knatel,imarkov,jhayes}@eecs.umich.edu

More information

7.2 Inference for comparing means of two populations where the samples are independent

7.2 Inference for comparing means of two populations where the samples are independent Objectives 7.2 Inference for comaring means of two oulations where the samles are indeendent Two-samle t significance test (we give three examles) Two-samle t confidence interval htt://onlinestatbook.com/2/tests_of_means/difference_means.ht

More information

Morten Frydenberg Section for Biostatistics Version :Friday, 05 September 2014

Morten Frydenberg Section for Biostatistics Version :Friday, 05 September 2014 Morten Frydenberg Section for Biostatistics Version :Friday, 05 Setember 204 All models are aroximations! The best model does not exist! Comlicated models needs a lot of data. lower your ambitions or get

More information

A New Asymmetric Interaction Ridge (AIR) Regression Method

A New Asymmetric Interaction Ridge (AIR) Regression Method A New Asymmetric Interaction Ridge (AIR) Regression Method by Kristofer Månsson, Ghazi Shukur, and Pär Sölander The Swedish Retail Institute, HUI Research, Stockholm, Sweden. Deartment of Economics and

More information

Unsupervised Hyperspectral Image Analysis Using Independent Component Analysis (ICA)

Unsupervised Hyperspectral Image Analysis Using Independent Component Analysis (ICA) Unsuervised Hyersectral Image Analysis Using Indeendent Comonent Analysis (ICA) Shao-Shan Chiang Chein-I Chang Irving W. Ginsberg Remote Sensing Signal and Image Processing Laboratory Deartment of Comuter

More information

Hidden Predictors: A Factor Analysis Primer

Hidden Predictors: A Factor Analysis Primer Hidden Predictors: A Factor Analysis Primer Ryan C Sanchez Western Washington University Factor Analysis is a owerful statistical method in the modern research sychologist s toolbag When used roerly, factor

More information

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP Submitted to the Annals of Statistics arxiv: arxiv:1706.07237 CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP By Johannes Tewes, Dimitris N. Politis and Daniel J. Nordman Ruhr-Universität

More information

Single and double coincidence nucleon spectra in the weak decay of Λ hypernuclei

Single and double coincidence nucleon spectra in the weak decay of Λ hypernuclei Single and double coincidence nucleon sectra in the weak decay of hyernuclei E. Bauer 1, G. Garbarino 2, A. Parreño 3 and A. Ramos 3 1 Deartamento de Física, Universidad Nacional de La Plata, C. C. 67

More information

A Qualitative Event-based Approach to Multiple Fault Diagnosis in Continuous Systems using Structural Model Decomposition

A Qualitative Event-based Approach to Multiple Fault Diagnosis in Continuous Systems using Structural Model Decomposition A Qualitative Event-based Aroach to Multile Fault Diagnosis in Continuous Systems using Structural Model Decomosition Matthew J. Daigle a,,, Anibal Bregon b,, Xenofon Koutsoukos c, Gautam Biswas c, Belarmino

More information

Developing A Deterioration Probabilistic Model for Rail Wear

Developing A Deterioration Probabilistic Model for Rail Wear International Journal of Traffic and Transortation Engineering 2012, 1(2): 13-18 DOI: 10.5923/j.ijtte.20120102.02 Develoing A Deterioration Probabilistic Model for Rail Wear Jabbar-Ali Zakeri *, Shahrbanoo

More information

Machine Learning: Homework 4

Machine Learning: Homework 4 10-601 Machine Learning: Homework 4 Due 5.m. Monday, February 16, 2015 Instructions Late homework olicy: Homework is worth full credit if submitted before the due date, half credit during the next 48 hours,

More information

ESTIMATION OF THE RECIPROCAL OF THE MEAN OF THE INVERSE GAUSSIAN DISTRIBUTION WITH PRIOR INFORMATION

ESTIMATION OF THE RECIPROCAL OF THE MEAN OF THE INVERSE GAUSSIAN DISTRIBUTION WITH PRIOR INFORMATION STATISTICA, anno LXVIII, n., 008 ESTIMATION OF THE RECIPROCAL OF THE MEAN OF THE INVERSE GAUSSIAN DISTRIBUTION WITH PRIOR INFORMATION 1. INTRODUCTION The Inverse Gaussian distribution was first introduced

More information

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek Use of Transformations and the Reeated Statement in PROC GLM in SAS Ed Stanek Introduction We describe how the Reeated Statement in PROC GLM in SAS transforms the data to rovide tests of hyotheses of interest.

More information

Approximating min-max k-clustering

Approximating min-max k-clustering Aroximating min-max k-clustering Asaf Levin July 24, 2007 Abstract We consider the roblems of set artitioning into k clusters with minimum total cost and minimum of the maximum cost of a cluster. The cost

More information

Yixi Shi. Jose Blanchet. IEOR Department Columbia University New York, NY 10027, USA. IEOR Department Columbia University New York, NY 10027, USA

Yixi Shi. Jose Blanchet. IEOR Department Columbia University New York, NY 10027, USA. IEOR Department Columbia University New York, NY 10027, USA Proceedings of the 2011 Winter Simulation Conference S. Jain, R. R. Creasey, J. Himmelsach, K. P. White, and M. Fu, eds. EFFICIENT RARE EVENT SIMULATION FOR HEAVY-TAILED SYSTEMS VIA CROSS ENTROPY Jose

More information

Adaptive estimation with change detection for streaming data

Adaptive estimation with change detection for streaming data Adative estimation with change detection for streaming data A thesis resented for the degree of Doctor of Philosohy of the University of London and the Diloma of Imerial College by Dean Adam Bodenham Deartment

More information

Bayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis

Bayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis HIPAD LAB: HIGH PERFORMANCE SYSTEMS LABORATORY DEPARTMENT OF CIVIL AND ENVIRONMENTAL ENGINEERING AND EARTH SCIENCES Bayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis Why use metamodeling

More information

Voting with Behavioral Heterogeneity

Voting with Behavioral Heterogeneity Voting with Behavioral Heterogeneity Youzong Xu Setember 22, 2016 Abstract This aer studies collective decisions made by behaviorally heterogeneous voters with asymmetric information. Here behavioral heterogeneity

More information

Characteristics of Beam-Based Flexure Modules

Characteristics of Beam-Based Flexure Modules Shorya Awtar e-mail: shorya@mit.edu Alexander H. Slocum e-mail: slocum@mit.edu Precision Engineering Research Grou, Massachusetts Institute of Technology, Cambridge, MA 039 Edi Sevincer Omega Advanced

More information

A SIMPLE PLASTICITY MODEL FOR PREDICTING TRANSVERSE COMPOSITE RESPONSE AND FAILURE

A SIMPLE PLASTICITY MODEL FOR PREDICTING TRANSVERSE COMPOSITE RESPONSE AND FAILURE THE 19 TH INTERNATIONAL CONFERENCE ON COMPOSITE MATERIALS A SIMPLE PLASTICITY MODEL FOR PREDICTING TRANSVERSE COMPOSITE RESPONSE AND FAILURE K.W. Gan*, M.R. Wisnom, S.R. Hallett, G. Allegri Advanced Comosites

More information

Published: 14 October 2013

Published: 14 October 2013 Electronic Journal of Alied Statistical Analysis EJASA, Electron. J. A. Stat. Anal. htt://siba-ese.unisalento.it/index.h/ejasa/index e-issn: 27-5948 DOI: 1.1285/i275948v6n213 Estimation of Parameters of

More information

Linear diophantine equations for discrete tomography

Linear diophantine equations for discrete tomography Journal of X-Ray Science and Technology 10 001 59 66 59 IOS Press Linear diohantine euations for discrete tomograhy Yangbo Ye a,gewang b and Jiehua Zhu a a Deartment of Mathematics, The University of Iowa,

More information

Empirical Bayesian EM-based Motion Segmentation

Empirical Bayesian EM-based Motion Segmentation Emirical Bayesian EM-based Motion Segmentation Nuno Vasconcelos Andrew Liman MIT Media Laboratory 0 Ames St, E5-0M, Cambridge, MA 09 fnuno,lig@media.mit.edu Abstract A recent trend in motion-based segmentation

More information

Variable Selection and Model Building

Variable Selection and Model Building LINEAR REGRESSION ANALYSIS MODULE XIII Lecture - 38 Variable Selection and Model Building Dr. Shalabh Deartment of Mathematics and Statistics Indian Institute of Technology Kanur Evaluation of subset regression

More information

Sampling. Inferential statistics draws probabilistic conclusions about populations on the basis of sample statistics

Sampling. Inferential statistics draws probabilistic conclusions about populations on the basis of sample statistics Samling Inferential statistics draws robabilistic conclusions about oulations on the basis of samle statistics Probability models assume that every observation in the oulation is equally likely to be observed

More information

Analysis of some entrance probabilities for killed birth-death processes

Analysis of some entrance probabilities for killed birth-death processes Analysis of some entrance robabilities for killed birth-death rocesses Master s Thesis O.J.G. van der Velde Suervisor: Dr. F.M. Sieksma July 5, 207 Mathematical Institute, Leiden University Contents Introduction

More information

State Estimation with ARMarkov Models

State Estimation with ARMarkov Models Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,

More information

8.7 Associated and Non-associated Flow Rules

8.7 Associated and Non-associated Flow Rules 8.7 Associated and Non-associated Flow Rules Recall the Levy-Mises flow rule, Eqn. 8.4., d ds (8.7.) The lastic multilier can be determined from the hardening rule. Given the hardening rule one can more

More information

Covariance Matrix Estimation for Reinforcement Learning

Covariance Matrix Estimation for Reinforcement Learning Covariance Matrix Estimation for Reinforcement Learning Tomer Lancewicki Deartment of Electrical Engineering and Comuter Science University of Tennessee Knoxville, TN 37996 tlancewi@utk.edu Itamar Arel

More information

START Selected Topics in Assurance

START Selected Topics in Assurance START Selected Toics in Assurance Related Technologies Table of Contents Introduction Statistical Models for Simle Systems (U/Down) and Interretation Markov Models for Simle Systems (U/Down) and Interretation

More information

Universal Finite Memory Coding of Binary Sequences

Universal Finite Memory Coding of Binary Sequences Deartment of Electrical Engineering Systems Universal Finite Memory Coding of Binary Sequences Thesis submitted towards the degree of Master of Science in Electrical and Electronic Engineering in Tel-Aviv

More information

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V.

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deriving ndicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deutsch Centre for Comutational Geostatistics Deartment of Civil &

More information

Modeling and Estimation of Full-Chip Leakage Current Considering Within-Die Correlation

Modeling and Estimation of Full-Chip Leakage Current Considering Within-Die Correlation 6.3 Modeling and Estimation of Full-Chi Leaage Current Considering Within-Die Correlation Khaled R. eloue, Navid Azizi, Farid N. Najm Deartment of ECE, University of Toronto,Toronto, Ontario, Canada {haled,nazizi,najm}@eecg.utoronto.ca

More information

Chapter 7 Sampling and Sampling Distributions. Introduction. Selecting a Sample. Introduction. Sampling from a Finite Population

Chapter 7 Sampling and Sampling Distributions. Introduction. Selecting a Sample. Introduction. Sampling from a Finite Population Chater 7 and s Selecting a Samle Point Estimation Introduction to s of Proerties of Point Estimators Other Methods Introduction An element is the entity on which data are collected. A oulation is a collection

More information

VIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES

VIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES Journal of Sound and Vibration (998) 22(5), 78 85 VIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES Acoustics and Dynamics Laboratory, Deartment of Mechanical Engineering, The

More information

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)]

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)] LECTURE 7 NOTES 1. Convergence of random variables. Before delving into the large samle roerties of the MLE, we review some concets from large samle theory. 1. Convergence in robability: x n x if, for

More information

MATH 2710: NOTES FOR ANALYSIS

MATH 2710: NOTES FOR ANALYSIS MATH 270: NOTES FOR ANALYSIS The main ideas we will learn from analysis center around the idea of a limit. Limits occurs in several settings. We will start with finite limits of sequences, then cover infinite

More information

How to Estimate Expected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty

How to Estimate Expected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty How to Estimate Exected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty Christian Servin Information Technology Deartment El Paso Community College El Paso, TX 7995, USA cservin@gmail.com

More information

On parameter estimation in deformable models

On parameter estimation in deformable models Downloaded from orbitdtudk on: Dec 7, 07 On arameter estimation in deformable models Fisker, Rune; Carstensen, Jens Michael Published in: Proceedings of the 4th International Conference on Pattern Recognition

More information

CSE 599d - Quantum Computing When Quantum Computers Fall Apart

CSE 599d - Quantum Computing When Quantum Computers Fall Apart CSE 599d - Quantum Comuting When Quantum Comuters Fall Aart Dave Bacon Deartment of Comuter Science & Engineering, University of Washington In this lecture we are going to begin discussing what haens to

More information

Scaling Multiple Point Statistics for Non-Stationary Geostatistical Modeling

Scaling Multiple Point Statistics for Non-Stationary Geostatistical Modeling Scaling Multile Point Statistics or Non-Stationary Geostatistical Modeling Julián M. Ortiz, Steven Lyster and Clayton V. Deutsch Centre or Comutational Geostatistics Deartment o Civil & Environmental Engineering

More information

A PEAK FACTOR FOR PREDICTING NON-GAUSSIAN PEAK RESULTANT RESPONSE OF WIND-EXCITED TALL BUILDINGS

A PEAK FACTOR FOR PREDICTING NON-GAUSSIAN PEAK RESULTANT RESPONSE OF WIND-EXCITED TALL BUILDINGS The Seventh Asia-Pacific Conference on Wind Engineering, November 8-1, 009, Taiei, Taiwan A PEAK FACTOR FOR PREDICTING NON-GAUSSIAN PEAK RESULTANT RESPONSE OF WIND-EXCITED TALL BUILDINGS M.F. Huang 1,

More information