Plotting the Wilson distribution
Survey of English Usage, University College London
September

1. Introduction

We have discussed the Wilson score interval at length elsewhere (Wallis 2013a, b). Given an observed Binomial proportion p = f / n observations, and confidence level 1 − α, the interval represents the two-tailed range of values where P, the true proportion in the population, is likely to be found. Note that f and n are integers, so whereas P is a probability, p is a proper fraction (a rational number).

The interval provides a robust method (Newcombe 1998, Wallis 2013a) for directly estimating confidence intervals on these simple observations. It can take a correction for continuity in circumstances where it is desired to perform a more conservative test and err on the side of caution. We have also shown how it can be employed in logistic regression (Wallis 2015).

The point of this paper is to explore methods for computing Wilson distributions, i.e. the analogue of the Normal distribution for this interval. There are at least two good reasons why we might wish to do this.

The first is to gain insight into the performance of the generating function (formula), interval and distribution itself. Plotting an interval means selecting a single error level α, whereas visualising the distribution allows us to see how the function performs over the range of possible values for α, for different values of p and n.

A second good reason is to counteract the tendency, common in too many presentations of statistics, to present the Gaussian ('Normal') distribution as if it were some kind of universal law of data, a mistaken corollary of the Central Limit Theorem. This is particularly unwise in the case of observations of Binomial proportions, which are strictly bounded at 0 and 1. As we shall see, the Wilson distribution diverges from the Gaussian most dramatically as it tends towards the boundaries of the probabilistic range, i.e. where the interval approaches 0 or 1.
By contrast, the Normal distribution is unbounded, and continues to plus or minus infinity.

The Wilson score interval (Wilson 1927) may be computed with the following formula.

Wilson score interval
\[ (w^{-}, w^{+}) \equiv \frac{p + \dfrac{z_{\alpha/2}^{2}}{2n} \pm z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n} + \dfrac{z_{\alpha/2}^{2}}{4n^{2}}}}{1 + \dfrac{z_{\alpha/2}^{2}}{n}}. \tag{1} \]

Let us first consider cases where P is less than p. At the lower bound of this interval (P = w⁻) the upper bound for the Gaussian interval for P, E⁺, must be equal to p (Wallis 2013a). We can carry out a test for significant difference between p and P by either
a) calculating a Gaussian interval at P and testing if p is greater than the upper bound, or
b) calculating a Wilson interval at p and testing if P is less than the lower bound.

Note: this paper summarises performance obtained with a spreadsheet by the author. Experimenting with different values of p and n is recommended.
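The equivalence between the two tests can be checked numerically. The following is a minimal sketch, not the author's spreadsheet: the function names `z_crit`, `wilson` and `gaussian_upper` are illustrative, and the critical value is taken from Python's standard-library `statistics.NormalDist`. It computes Equation (1) and then confirms that the Gaussian upper bound centred at P = w⁻ lands back on p.

```python
from math import sqrt
from statistics import NormalDist

def z_crit(alpha):
    """Two-tailed critical value z_{alpha/2} of the standard Normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def wilson(p, n, alpha=0.05):
    """Wilson score interval (w-, w+) following Equation (1)."""
    z = z_crit(alpha)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (centre - spread) / denom, (centre + spread) / denom

def gaussian_upper(P, n, alpha=0.05):
    """Upper bound of the Gaussian interval centred on a population value P."""
    return P + z_crit(alpha) * sqrt(P * (1 - P) / n)

# Illustrative values only (p must be a fraction of n, here 3/10).
p, n = 0.3, 10
w_minus, w_plus = wilson(p, n)
print(w_minus, w_plus)
print(gaussian_upper(w_minus, n))  # reproduces p up to rounding error
```

Because w⁻ is by construction the value of P whose Gaussian upper bound equals p, tests (a) and (b) must agree.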
To consider cases where P is greater than p, we simply reverse this logic. We test if p is smaller than the lower bound of a Gaussian interval for P, or P is greater than the upper bound of the Wilson interval for p. The Gaussian version of the test is called the single proportion z test. It can also be calculated as a goodness of fit χ² test (Wallis 2013a, b).

2. Plotting the distribution

We can define the Wilson distribution as follows: the distribution of the predicted probability of the true value P, based on an observation p, where P has a known relationship to p, computed using the Wilson score interval.

More precisely, we might consider it as the sum of two distributions: the distribution of the Wilson score interval lower bound w⁻, based on an observation p, and the distribution of the Wilson score interval upper bound w⁺.

2.1 Obtaining values of w⁻

First, we calculate the lower bound w⁻ from Equation (1) above for a series of values of α. In practice, we obtain a reasonably accurate initial plot by computing z_{α/2}, and thus w⁻, for α ∈ A where A = {0.05, 0.10, …, 1}, i.e. at intervals of 0.05 but excluding zero.

\[ w^{-}(\alpha) = \frac{p + \dfrac{z_{\alpha/2}^{2}}{2n} - z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n} + \dfrac{z_{\alpha/2}^{2}}{4n^{2}}}}{1 + \dfrac{z_{\alpha/2}^{2}}{n}}. \tag{2} \]

Note that for α = 1, z_{1/2} = 0, and for α > 1, z_{α/2} = −z_{(1 − α/2)} and w⁻(α/2) = w⁺(1 − α/2). For α > 1, calculating w⁻ computes percentage points above the observation p (i.e. w⁺). So to compute w⁺ we can simply extend A beyond 1, to include {1, …, 1.95, …}.

By inspection we note that the limit region below 0.05 (and above 1.95) is likely to see the gradient change as α approaches zero. In other words, we cannot assume the line between these points is a straight line. Therefore we add points to A covering successive fractions {1/40, 1/80, …, 1/640}, and {2 − 1/40, 2 − 1/80, …, 2 − 1/640}.

Equation (2) obtains a position on a horizontal probability scale, w⁻, computed for a given cumulative probability α. In other words, for w⁻ < p, the formula tells us that there is a probability of α that the true value is below w⁻.
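As a sketch of this step, Equation (2) can be written as a single function of α over (0, 2), with z_{α/2} becoming negative once α exceeds 1. The helper name `w_lower` and the choice p = 0.5, n = 10 are illustrative. The intermediate fractions 1/160 and 1/320 are an assumption (successive halving), since the text only gives the first two and the last member of the series.

```python
from math import sqrt
from statistics import NormalDist

def w_lower(p, n, alpha):
    """Equation (2): Wilson bound as a function of alpha in (0, 2).

    For alpha < 1 this is the lower bound w-; for alpha > 1 the critical
    value is negative and the same expression yields points on w+.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - spread) / (1 + z * z / n)

# alpha at intervals of 0.05, excluding the end points 0 and 2
coarse = [k / 20 for k in range(1, 40)]
# extra points near both ends, assuming successive halving from 1/40 to 1/640
fine = [1 / d for d in (40, 80, 160, 320, 640)]
A = sorted(set(coarse + fine + [2 - a for a in fine]))

p, n = 0.5, 10  # illustrative observation
curve = [(alpha, w_lower(p, n, alpha)) for alpha in A]
for alpha, w in curve[:3]:
    print(alpha, w)
```

Evaluating `w_lower(p, n, 1)` returns p itself, which is where the lower and upper distributions meet.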
2.2 Employing a delta approximation

The next stage is to convert this cumulative probability into a column height. To do this we employ a delta approximation, a trick familiar to students of calculus.

The simplest method is to calculate Equation (2) for two points, α and α − δ. We approximate the region between the resulting values of w⁻ to a column of area δ and height h, whose width is w⁻(α) − w⁻(α − δ). We can then compute h = area / width, which we can plot over w⁻.

[Figure 1. Estimating the height of w⁻(α), h(α), using a one-sided delta approximation between w⁻(α − δ) and w⁻(α). As δ → 0, error e → 0.]

Copyright 2018. All rights reserved.
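A minimal sketch of the one-sided approximation follows. The names `w_lower` and `height_below` are illustrative, the default δ = 0.001 is an assumption rather than a value taken from the paper, and the exact-equality test for a zero-width column is replaced by a small tolerance `eps`, which is an adaptation for floating-point arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def w_lower(p, n, alpha):
    """Equation (2): Wilson bound for alpha in (0, 2)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - spread) / (1 + z * z / n)

def height_below(p, n, alpha, delta=0.001, eps=1e-12):
    """Column height h = area / width, with area delta taken below alpha."""
    width = w_lower(p, n, alpha) - w_lower(p, n, alpha - delta)
    if abs(width) < eps:   # degenerate column, e.g. when p = 0
        return 0.0
    return delta / width

# Shrinking delta changes the estimate only slightly for a smooth curve.
for delta in (0.01, 0.001):
    print(delta, height_below(0.5, 10, 0.5, delta))
```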
To plot w⁻ for areas below p, α < 1, we can use the following formula.

\[ h(\alpha) = \begin{cases} 0 & \text{if } w^{-}(\alpha) = w^{-}(\alpha - \delta), \\[4pt] \dfrac{\delta}{w^{-}(\alpha) - w^{-}(\alpha - \delta)} & \text{otherwise.} \end{cases} \tag{3} \]

The first test deals with cases where p = 0, which obtain a situation where all values of w⁻(α) = 0.

We can continue this approximation for α ≥ 1. But if we want symmetric results for p = 0.5, we can take a delta above α for all cases for α ≥ 1.

\[ h(\alpha) = \begin{cases} 0 & \text{if } w^{-}(\alpha) = w^{-}(\alpha + \delta), \\[4pt] \dfrac{\delta}{w^{-}(\alpha + \delta) - w^{-}(\alpha)} & \text{otherwise.} \end{cases} \tag{4} \]

Finally, we set h(1) = 0 when p = 0 or 1.

Equation (3) converges to the correct value as δ → 0. It follows that δ should be as small as possible. By experimentation, we find that if δ is below 001, in some versions of Excel results become unreliable. Approximations in the computation of z_{α/2} seem to be the culprit.

This leaves us with a small error in the calculation. We can see this error in that Equations (3) and (4) do not obtain exactly the same results. At the scale of the graph, this error is small, but perceivable. To minimise this error, we average heights estimated using delta approximations above and below α. This improves the estimate for any monotonic region (α − δ, α + δ), and does not substantially worsen it if α represents a peak value.

\[ h(\alpha) = \frac{1}{2}\left( \frac{\delta}{w^{-}(\alpha) - w^{-}(\alpha - \delta)} + \frac{\delta}{w^{-}(\alpha + \delta) - w^{-}(\alpha)} \right). \tag{5} \]

Although the distribution may be computed with a single formula over α ∈ (0, 2), recall that the Wilson distribution is really the sum of two distributions, each with a unit area of 1. The first of these areas is the distribution for the upper bound w⁺, the second the distribution for the lower bound w⁻. (This distinction will become important later on.)

To scale these distributions to the same scale as the equivalent Binomial or Normal distribution above and below, we can divide both upper and lower bound distributions by n.

3. Example plots

3.1 An initial example

To begin with, let us hold n = 10. This is a small sample size, but not so small as to present particular issues. First, we will consider p = 0.5.
We obtain an interval that appears at first sight to match the Normal distribution.

(A monotonic region is one where the gradient is always increasing or decreasing, i.e. everywhere except where the peak value is within the range.)
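The initial example can be reproduced along the following lines. This is a sketch under stated assumptions rather than the author's spreadsheet: `w_lower` and `height` are illustrative names, δ = 0.001 is an assumed step, and a tolerance `eps` stands in for the exact-equality tests of Equations (3) and (4). Heights follow the averaged form of Equation (5) and are then divided by n.

```python
from math import sqrt
from statistics import NormalDist

def w_lower(p, n, alpha):
    """Equation (2): Wilson bound for alpha in (0, 2)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - spread) / (1 + z * z / n)

def height(p, n, alpha, delta=0.001, eps=1e-12):
    """Equation (5): mean of the delta approximations below and above alpha."""
    w0 = w_lower(p, n, alpha - delta)
    w1 = w_lower(p, n, alpha)
    w2 = w_lower(p, n, alpha + delta)
    if w1 - w0 < eps or w2 - w1 < eps:   # zero-width column: height set to 0
        return 0.0
    return 0.5 * (delta / (w1 - w0) + delta / (w2 - w1))

p, n = 0.5, 10
alphas = [k / 20 for k in range(1, 40)]            # 0.05, 0.10, ..., 1.95
points = [(w_lower(p, n, a), height(p, n, a) / n)  # (x, scaled height)
          for a in alphas]
for x, h in points[:3]:
    print(x, h)
```

For p = 0.5 the heights at α and 2 − α coincide, which is the symmetry that motivates averaging the two one-sided estimates.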
For the purposes of comparison we have also plotted Normal distributions centred at P = w⁻(0.05) and w⁺(0.05), divided by n. These distributions therefore have the same area (ignoring boundary clipping) as each corresponding area of the Wilson distribution below and above p.

[Figure 2. Plot of Wilson distribution (centre), with tail areas highlighted for α = 0.05, plotted for p = 0.5, n = 10; with Normal distributions centred at w⁻ and w⁺. Labelled areas: distribution of w⁻, distribution of w⁺, W+, N+, Normal tail T, Wilson tail t.]

3.2 Properties of the Wilson distributions

In this figure, the area under the Wilson distribution for w⁺ (where P > p), W+, has the same area as the area under the complementary Normal distribution N+ (assuming that the Normal distribution is unclipped). In this case, area(W+) = area(N+) = 1/n. It also has the same area as the Wilson distribution for w⁻.

Provided that p ∈ (0, 1) (i.e. it is not at the extremes), the interval will be two-sided, area(W+) = area(W−), and have a total area of 2 × area(W+) = 2/n.

The tail areas of the Wilson distributions represent 0.05 of the area under the curve above and below p respectively, in the same way as the equivalent tail areas of the Normal distribution. The tail areas of both Normal distributions on either side of their centres are also 0.05 of the area under those curves above and below these centres.

For small n, the Normal distribution is visibly clipped by the probability range, but we can disregard the clipped section of these distributions for testing purposes, as our observation p is always on the inner side of these distributions.

The tail areas for the Normal are area(T) = area(N+) × α/2. Both tail areas for the Wilson interval, below w⁻(α) and above w⁺(α), are α = 0.05 of each separate distribution. Thus in Figure 2, area(t) = area(W−) × α (i.e. α/2 of the total area). This obtains a two-tailed test when p is not at the extremes, but a one-tailed test when p is at 0 or 1.
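The area relationships can be checked by numerical integration. The sketch below is an illustration, not part of the original method: it integrates the unscaled lower-bound distribution for p = 0.5, n = 10 with the trapezium rule from w⁻(0.05) up to p. The function names, the step δ = 0.001 and the 400-step grid are all assumptions. Since the whole lower distribution has unit area and the tail below w⁻(0.05) holds α = 0.05 of it, the integral should come out close to 0.95.

```python
from math import sqrt
from statistics import NormalDist

def w_lower(p, n, alpha):
    """Equation (2): Wilson bound for alpha in (0, 2)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - spread) / (1 + z * z / n)

def height(p, n, alpha, delta=0.001):
    """Equation (5), assuming 0 < p < 1 so that no column has zero width."""
    w0 = w_lower(p, n, alpha - delta)
    w1 = w_lower(p, n, alpha)
    w2 = w_lower(p, n, alpha + delta)
    return 0.5 * (delta / (w1 - w0) + delta / (w2 - w1))

p, n, alpha = 0.5, 10, 0.05
steps = 400
grid = [alpha + (1 - alpha) * i / steps for i in range(steps + 1)]
ws = [w_lower(p, n, a) for a in grid]
hs = [height(p, n, a) for a in grid]

# Trapezium rule over the horizontal (probability) axis.
body = sum(0.5 * (hs[i] + hs[i + 1]) * (ws[i + 1] - ws[i])
           for i in range(steps))
print(body, 1 - body)   # body of the distribution and the implied tail
print(body / n)         # same area after scaling by 1/n
```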
3.3 Varying p

As p tends to 0, we obtain increasingly skewed distributions (Figure 3). The interval cannot be easily approximated by a Normal interval, and the sum of the two distributions is decidedly not Gaussian ('Normal'). In Figure 3, note how the mean p is no longer the most likely value (mode).

In plotting this distribution pair, the area on either side of p is projected to be of equal size, i.e. it treats as a given that the true value P is equally likely to be above and below p. This is not necessarily true! Indeed we might multiply both distributions by the probability of the prior. But this fact should not cause us to change the plot.

Note how, thanks to the proximity to the boundary at zero, the interval for w⁻ becomes increasingly compressed between 0 and p, reflected by the increased height of the curve.

The tendency to express the distribution like an exponential decline on the least bounded side reaches its limit when p = 0 or 1. The 'squeezed' interval is uncomputable and simply disappears.

[Figure 3. Plots of Wilson distributions for decreasing values of p, with n = 10 and α = 0.05.]

3.4 Small n

What happens if we reduce n?

All else being equal, we should expect that the smaller the sample size, the larger the confidence interval. In the figures that follow we have plotted Wilson distributions for p = 0.5 and p = 0 for n = 2. Recall also that p must be a true fraction of n, so, for example, for n = 2, p = 0.2 would not be possible in practice.

The interval for α = 0.05 now spans most of the range between 0 and 1. The boundaries squeeze the interval close to 0 and 1. We obtain the 'wisdom-tooth' shape in Figure 4 and an undulating curve in Figure 5.

Note that the areas are larger because we are now scaling by 2/2 = 1 instead of 2/10 = 1/5.
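Both effects are easy to see in the interval itself. The sketch below uses an illustrative `wilson` helper implementing Equation (1), with example values p = 1/10 and n = 2 chosen here for demonstration. It compares the distance from p to each bound as p nears zero, lists the proportions that are actually attainable for a given n, and shows how wide the interval becomes for n = 2.

```python
from math import sqrt
from statistics import NormalDist

def wilson(p, n, alpha=0.05):
    """Wilson score interval (w-, w+) following Equation (1)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (centre - spread) / denom, (centre + spread) / denom

def attainable(n):
    """Proportions p = f / n that can actually be observed for sample size n."""
    return [f / n for f in range(n + 1)]

# Skew near the boundary: the lower side is squeezed between 0 and p.
p, n = 0.1, 10
lo, hi = wilson(p, n)
print("below p:", p - lo, "above p:", hi - p)

# Small n: only three proportions are possible, and the interval is wide.
print(attainable(2))
print(wilson(0.5, 2))
```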
[Figure 4. Plot of Wilson interval for p = 0.5 and n = 2. With such a large confidence interval, the boundaries at 0 and 1 cause the area to bulge on either side.]

[Figure 5. With p = 0 and n = 2, the gradient close to w⁺(0.05) is also affected by the boundary at 1, causing the gradient to undulate.]

4. Further perspectives on the distribution

4.1 Percentiles of the Wilson distributions

We can plot percentiles of the distributions, as in Figure 6. The set A includes ten-percentile points, and we have simply plotted dividing lines to partition the area at each point. Figure 6 contains two distributions, containing twenty areas in total, each equal in area.

[Figure 6. Ten-percentiles of the Wilson lower and upper distributions (distribution of w⁻ and distribution of w⁺, from the lower bound to the upper bound). Each area marked 10% is of equal area, i.e. 10% of the area above or below p. This is not always easy to see, particularly with respect to the tails.]
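Because α is a cumulative probability, the dividing lines are simply Equation (2) evaluated at α = 0.1, 0.2, …, 1.9. A brief sketch follows, where the helper name `w_lower` and the example values p = 0.3, n = 10 are assumptions for illustration (the values used in Figure 6 are not recoverable here).

```python
from math import sqrt
from statistics import NormalDist

def w_lower(p, n, alpha):
    """Equation (2): Wilson bound for alpha in (0, 2)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - spread) / (1 + z * z / n)

p, n = 0.3, 10  # illustrative values
# Nineteen dividing lines split the two distributions into twenty areas,
# ten below p (the w- distribution) and ten above it (the w+ distribution).
cuts = [w_lower(p, n, k / 10) for k in range(1, 20)]
for k, x in zip(range(1, 20), cuts):
    side = "w-" if k < 10 else ("p" if k == 10 else "w+")
    print(f"alpha = {k / 10:.1f}  {side:>2}  {x:.4f}")
```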
4.2 The logit Wilson distribution

We earlier noted Robert Newcombe's observation (Newcombe 1998) that, save when p = 0 or 1, Wilson's score interval is symmetric on a logit scale.

Our method for logistic line fitting (regression) uses an estimate of variance based on the Wilson interval expressed on an inverse logistic, or 'logit', scale (Wallis 2015). Regression over variance relies on an assumption that the model of variance employed is Normal. In other words, it assumes the logit of the Wilson distribution resembles a Normal distribution. We are now in a position to explore that assumption.

We calculate logit(w⁻) using Equations (2) and (6):

\[ \operatorname{logit}(p) \equiv \log(p) - \log(1 - p), \tag{6} \]

where 'log' is the natural logarithm. Figure 7 plots the resulting distribution obtained by delta approximation, and (for comparison purposes) a closely-matching Gaussian distribution.

[Figure 7. Logit Wilson distribution, i.e. the Wilson score interval on a logit scale, transformed into a distribution, plotted against logit(p) alongside a Gaussian. This closely resembles a Gaussian (Normal) distribution centred on logit(p).]

It turns out that, with the exception of when p is at boundaries 0 or 1 (which we exclude from fitting), the distribution closely matches a Normal distribution estimated by the following.

\[ \text{mean } \mu = \operatorname{logit}(p), \qquad \text{standard deviation } \sigma = \frac{\operatorname{logit}(p) - \operatorname{logit}(w^{-}(\alpha/2))}{z_{\alpha/2}}. \tag{7} \]

Figure 7 shows, by way of comparison, the Normal distribution estimated using a single value of α in this formula. This approximation improves with increasing centrality and increasing n.

The approximation is not perfect, but it is considerably less prone to error than approximating the Normal to the Wilson interval on the probability scale (also known as the Wald interval), or even the generally accepted approximation of the Normal to the Binomial distribution.
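The symmetry and the estimate in Equation (7) can be sketched as follows. The function names are illustrative, and the code reads w⁻ in Equation (7) as the lower bound of the interval computed with critical value z_{α/2}, which is an interpretation of the notation rather than something the text states outright.

```python
from math import log, sqrt
from statistics import NormalDist

def logit(x):
    """Equation (6), using the natural logarithm."""
    return log(x) - log(1 - x)

def wilson(p, n, alpha=0.05):
    """Wilson score interval (w-, w+) following Equation (1)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (centre - spread) / denom, (centre + spread) / denom

def logit_normal_params(p, n, alpha=0.05):
    """Equation (7): mean and standard deviation on the logit scale."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    w_minus, _ = wilson(p, n, alpha)
    return logit(p), (logit(p) - logit(w_minus)) / z

p, n = 0.3, 10  # illustrative values; p must not be 0 or 1
lo, hi = wilson(p, n)
print(logit(p) - logit(lo), logit(hi) - logit(p))  # equal distances
print(logit_normal_params(p, n))
```

The two printed distances agree because the product of the two roots of the Wilson quadratic gives logit(w⁻) + logit(w⁺) = 2 logit(p).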
4.3 Continuity-corrected Wilson distributions

As we noted, the approximation from the discrete Binomial distribution to the Normal introduces an error that is conventionally mitigated with a continuity correction originally due to Yates (1934). In the case of the Normal distribution around P, this widens the interval by adding 1/(2n) to the upper bound and subtracting this term from the lower bound.

Newcombe (1998) presents a formula for computing the equivalent Wilson score interval with continuity correction. The equation initially appears forbidding but it includes common terms that can be pre-calculated.

\[ w^{-} \equiv \max\!\left(0,\ \frac{2np + z_{\alpha/2}^{2} - \left\{ z_{\alpha/2}\sqrt{z_{\alpha/2}^{2} - \dfrac{1}{n} + 4np(1-p) + (4p - 2)} + 1 \right\}}{2\,(n + z_{\alpha/2}^{2})}\right), \text{ and} \]
\[ w^{+} \equiv \min\!\left(1,\ \frac{2np + z_{\alpha/2}^{2} + \left\{ z_{\alpha/2}\sqrt{z_{\alpha/2}^{2} - \dfrac{1}{n} + 4np(1-p) - (4p - 2)} + 1 \right\}}{2\,(n + z_{\alpha/2}^{2})}\right). \tag{8} \]

This is the continuity-corrected version of Equation (1).

Earlier we emphasised that the Wilson distribution was really two different distributions: one for w⁻ and one for w⁺. Thanks to the continuity correction, these two formulae do not obtain the same result for α = 1, unlike Equation (1), which converges to a midpoint. This means we calculate intervals and heights separately.

[Figure 8. Uncorrected Wilson distribution (solid line) with continuity-corrected distributions for upper and lower bounds (dashed), for p = 0.5, n = 10, α = 0.05, marking w⁻(0.05) and w⁺(0.05) for both.]

We can see the effect of the continuity correction on the intervals, rendering them more conservative (moving them further out from p), at the same time as causing the interval to be compressed even further within the probabilistic range [0, 1].
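Equation (8) might be sketched as below. The function name `wilson_cc` is illustrative. Two details are additions not found in Equation (8) as printed: the square-root argument is floored at zero so that very small z cannot produce a domain error, and the bounds are fixed at 0 when p = 0 and at 1 when p = 1, following the convention usually attached to Newcombe's formula.

```python
from math import sqrt
from statistics import NormalDist

def wilson_cc(p, n, alpha=0.05):
    """Continuity-corrected Wilson interval following Equation (8)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    denom = 2 * (n + z * z)
    base = 2 * n * p + z * z
    common = z * z - 1 / n + 4 * n * p * (1 - p)   # shared by both bounds
    # Assumption: floor the radicand at zero to avoid a math domain error.
    lo_root = sqrt(max(0.0, common + (4 * p - 2)))
    hi_root = sqrt(max(0.0, common - (4 * p - 2)))
    # Assumption: pin the bound at the boundary when p sits on it.
    lower = 0.0 if p == 0 else max(0.0, (base - (z * lo_root + 1)) / denom)
    upper = 1.0 if p == 1 else min(1.0, (base + (z * hi_root + 1)) / denom)
    return lower, upper

print(wilson_cc(0.5, 10))           # wider than the uncorrected interval
print(wilson_cc(0.5, 10, alpha=1))  # at alpha = 1 the bounds stay apart
```

At α = 1 (z = 0) the expression reduces to p − 1/(2n) and p + 1/(2n), which is why the two corrected distributions do not meet at p.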
5. Conclusions

The Wilson score interval is a member of a class of confidence intervals that correctly characterise expected variation about an observation p of a Binomial proportion, p ∈ [0, 1]. These intervals include the Clopper-Pearson interval, calculated by finding roots of the Binomial distribution for a given α, and the Wilson interval with continuity correction that we document here. All three behave similarly, with the Clopper-Pearson falling between the two Wilson interval distributions depicted in Figure 8. See Newcombe (1998) and Wallis (2013a) for a comparison of competing intervals.

Common to this class of intervals is the fact that they are affected by boundary conditions at 0 and 1. In discussing the logistic curve, Wallis (2010) pointed out that the inverse logistic or 'logit' function maps a probabilistic range to an unbounded Real dimension y by effectively 'folding space' as it approaches the boundary. Figure 9 shows the idea.

[Figure 9. Absolute logit cross-section folding an infinite plane into a probabilistic trench. After Wallis (2010).]

It is this folding of the interval into probability space that explains two aspects of the Wilson distribution we observe.

1. As p approaches 0 or 1, the distribution between the boundary and p becomes increasingly compressed and is pushed up, in some cases above the distribution at p. Meanwhile the interval on the 'open' side increasingly resembles a decay curve. This explains the shape of the distributions in Figure 3.

2. In Figures 4 and 5, we examined what happens to the distribution for small n. This appeared to generate what at first sight seems an even more baffling result, namely that for p = 0.5 and n = 2, the distribution had two peaks (it was 'bimodal'). A small n causes the distribution to spread over most of the probability range. The boundaries distort what would otherwise be a declining interval. We see a similar but less dramatic effect for p = 0.
The logit transformation of the same interval for p = 0.5 and n = 2 obtains a bell curve approximating to a Normal distribution about 0.

We showed that, provided that p was not at 0 or 1, not only is the logit Wilson interval symmetric, as Newcombe (1998) pointed out, it resembles a Normal distribution. With increased n, the approximation improves, and for n = 10 the approximation is very close indeed (see Figure 7). This distribution is centred at logit(p), with a standard deviation that may be obtained from the width of the Wilson interval on a logit scale. This observation is support for the generalised logistic regression method described in Wallis (2015).

Our final comment relates to a point we made by way of introduction. It is often important to plot distributions to help us conceptualise the performance of what otherwise may appear to be
dry algebraic functions. Statistical distributions are not experienced directly. They represent the aggregated sum of experiences, and statistical reasoning is necessarily an act of imagination.

The 'bell curve expectation' is the ideological predisposition to expect that variation around observations of any kind is Normal and symmetric. This expectation appears in the Wald interval or presentations of 'standard error' for observed proportions or probabilities.

As we have shown, the predicted distribution of future observations based on a single observation of a Binomial proportion cannot be Normal. Where the observation is supported by a large n and the distribution is tightly spread, and/or where the observation is close to 0.5, the distribution may be approximately Normal. But many types of data are highly skewed, and there are often good reasons why we might wish to work with small n.

In the 1990s, medical statisticians started paying attention to this question. Consider a clinical trial for a new heart drug for patients vulnerable to heart attacks. We have an expected rate of heart attacks for this group based on previous clinical data. We do not wish to recruit more subjects than necessary, so we must work with small n. The expected chance of a heart attack, P, over a short monitored period, t, is still small however, being close to zero.

A clinical trial manager must contend with two questions.

1. How many heart attack incidents would be significantly greater than would be expected by chance? In other words, does the lower bound of the observed rate of heart attacks in the subject group, p, at a given time t exceed P sufficiently to be incapable of being explained by chance? If so, the trial should stop immediately because the drug appears to be having a negative effect.

2. Following a trial period, is the drug working so well that further trials may be accelerated, more subjects recruited, etc.? To reach this conclusion we must examine the upper bound of our observed heart attack rate p at time t.
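The two stopping questions above amount to comparing the expected rate P with the Wilson bounds on the observed rate p. A minimal sketch follows. The function names and the trial figures (4 events among 20 subjects, an expected rate of 0.02) are entirely hypothetical and are chosen only to exercise the logic.

```python
from math import sqrt
from statistics import NormalDist

def wilson(p, n, alpha=0.05):
    """Wilson score interval (w-, w+) following Equation (1)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (centre - spread) / denom, (centre + spread) / denom

def compare_with_expected(f, n, P, alpha=0.05):
    """Classify an observed rate f / n against an expected rate P."""
    w_minus, w_plus = wilson(f / n, n, alpha)
    if w_minus > P:
        return "higher"   # question 1: observed rate significantly above P
    if w_plus < P:
        return "lower"    # question 2: observed rate significantly below P
    return "no significant difference"

# Hypothetical figures, for illustration only.
print(compare_with_expected(4, 20, 0.02))
print(compare_with_expected(0, 20, 0.02))
```

With a very small P and small n, the second condition is hard to meet, because the upper bound for an observation of zero events is still well above zero.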
Either way, we are concerned with probabilities that are likely to be close to, but not equal to, zero, by observing proportions of events found in small samples. We need an accurate method for identifying when either stopping condition is reached without extending t longer than necessary. This is what the Wilson class of intervals obtains.

References

Newcombe, R.G. 1998. Two-sided confidence intervals for the single proportion: comparison of seven methods. Statistics in Medicine 17.
Wallis, S.A. 2010. Competition between choices over time. London: Survey of English Usage. http://corplingstats.wordpress.com/2012/03/31/competition-between-choices-over-time
Wallis, S.A. 2013a. Binomial confidence intervals and contingency tests: mathematical fundamentals and the evaluation of alternative methods. Journal of Quantitative Linguistics 20:3.
Wallis, S.A. 2013b. z-squared: the origin and application of χ². Journal of Quantitative Linguistics 20:4.
Wallis, S.A. 2015. Logistic regression with Wilson intervals. London: Survey of English Usage. http://corplingstats.wordpress.com/2015/04/24/logistic-regression
Wilson, E.B. 1927. Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 22.
Yates, F. 1934. Contingency tables involving small numbers and the chi-square test. Journal of the Royal Statistical Society 1.
More informationarxiv: v3 [physics.data-an] 23 May 2011
Date: October, 8 arxiv:.7v [hysics.data-an] May -values for Model Evaluation F. Beaujean, A. Caldwell, D. Kollár, K. Kröninger Max-Planck-Institut für Physik, München, Germany CERN, Geneva, Switzerland
More informationPHYS 301 HOMEWORK #9-- SOLUTIONS
PHYS 0 HOMEWORK #9-- SOLUTIONS. We are asked to use Dirichlet' s theorem to determine the value of f (x) as defined below at x = 0, ± /, ± f(x) = 0, - < x
More informationInformation collection on a graph
Information collection on a grah Ilya O. Ryzhov Warren Powell October 25, 2009 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements
More informationOn Wrapping of Exponentiated Inverted Weibull Distribution
IJIRST International Journal for Innovative Research in Science & Technology Volume 3 Issue 11 Aril 217 ISSN (online): 2349-61 On Wraing of Exonentiated Inverted Weibull Distribution P.Srinivasa Subrahmanyam
More informationVIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES
Journal of Sound and Vibration (998) 22(5), 78 85 VIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES Acoustics and Dynamics Laboratory, Deartment of Mechanical Engineering, The
More informationAI*IA 2003 Fusion of Multiple Pattern Classifiers PART III
AI*IA 23 Fusion of Multile Pattern Classifiers PART III AI*IA 23 Tutorial on Fusion of Multile Pattern Classifiers by F. Roli 49 Methods for fusing multile classifiers Methods for fusing multile classifiers
More informationThe Noise Power Ratio - Theory and ADC Testing
The Noise Power Ratio - Theory and ADC Testing FH Irons, KJ Riley, and DM Hummels Abstract This aer develos theory behind the noise ower ratio (NPR) testing of ADCs. A mid-riser formulation is used for
More informationObjectives. Displaying data and distributions with graphs. Variables Types of variables (CIS p40-41) Distribution of a variable
Objectives Dislaying data and distributions with grahs Variables Tyes of variables (CIS 40-41) Distribution of a variable Bar grahs for categorical variables (CIS 42) Histograms for quantitative variables
More informationEvaluating Process Capability Indices for some Quality Characteristics of a Manufacturing Process
Journal of Statistical and Econometric Methods, vol., no.3, 013, 105-114 ISSN: 051-5057 (rint version), 051-5065(online) Scienress Ltd, 013 Evaluating Process aability Indices for some Quality haracteristics
More informationUniform Law on the Unit Sphere of a Banach Space
Uniform Law on the Unit Shere of a Banach Sace by Bernard Beauzamy Société de Calcul Mathématique SA Faubourg Saint Honoré 75008 Paris France Setember 008 Abstract We investigate the construction of a
More informationSupplementary Materials for Robust Estimation of the False Discovery Rate
Sulementary Materials for Robust Estimation of the False Discovery Rate Stan Pounds and Cheng Cheng This sulemental contains roofs regarding theoretical roerties of the roosed method (Section S1), rovides
More informationOne-way ANOVA Inference for one-way ANOVA
One-way ANOVA Inference for one-way ANOVA IPS Chater 12.1 2009 W.H. Freeman and Comany Objectives (IPS Chater 12.1) Inference for one-way ANOVA Comaring means The two-samle t statistic An overview of ANOVA
More informationObjectives. Estimating with confidence Confidence intervals.
Objectives Estimating with confidence Confidence intervals. Sections 6.1 and 7.1 in IPS. Page 174-180 OS3. Choosing the samle size t distributions. Further reading htt://onlinestatbook.com/2/estimation/t_distribution.html
More informationMonte Carlo Studies. Monte Carlo Studies. Sampling Distribution
Monte Carlo Studies Do not let yourself be intimidated by the material in this lecture This lecture involves more theory but is meant to imrove your understanding of: Samling distributions and tests of
More informationA MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST BASED ON THE WEIBULL DISTRIBUTION
O P E R A T I O N S R E S E A R C H A N D D E C I S I O N S No. 27 DOI:.5277/ord73 Nasrullah KHAN Muhammad ASLAM 2 Kyung-Jun KIM 3 Chi-Hyuck JUN 4 A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST
More informationMultiplicative group law on the folium of Descartes
Multilicative grou law on the folium of Descartes Steluţa Pricoie and Constantin Udrişte Abstract. The folium of Descartes is still studied and understood today. Not only did it rovide for the roof of
More informationBrownian Motion and Random Prime Factorization
Brownian Motion and Random Prime Factorization Kendrick Tang June 4, 202 Contents Introduction 2 2 Brownian Motion 2 2. Develoing Brownian Motion.................... 2 2.. Measure Saces and Borel Sigma-Algebras.........
More informationAn Improved Generalized Estimation Procedure of Current Population Mean in Two-Occasion Successive Sampling
Journal of Modern Alied Statistical Methods Volume 15 Issue Article 14 11-1-016 An Imroved Generalized Estimation Procedure of Current Poulation Mean in Two-Occasion Successive Samling G. N. Singh Indian
More informationObjectives. 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) CI)
Objectives 6.1, 7.1 Estimating with confidence (CIS: Chater 10) Statistical confidence (CIS gives a good exlanation of a 95% CI) Confidence intervals. Further reading htt://onlinestatbook.com/2/estimation/confidence.html
More informationElementary Analysis in Q p
Elementary Analysis in Q Hannah Hutter, May Szedlák, Phili Wirth November 17, 2011 This reort follows very closely the book of Svetlana Katok 1. 1 Sequences and Series In this section we will see some
More informationMULTIVARIATE STATISTICAL PROCESS OF HOTELLING S T CONTROL CHARTS PROCEDURES WITH INDUSTRIAL APPLICATION
Journal of Statistics: Advances in heory and Alications Volume 8, Number, 07, Pages -44 Available at htt://scientificadvances.co.in DOI: htt://dx.doi.org/0.864/jsata_700868 MULIVARIAE SAISICAL PROCESS
More informationCharacteristics of Beam-Based Flexure Modules
Shorya Awtar e-mail: shorya@mit.edu Alexander H. Slocum e-mail: slocum@mit.edu Precision Engineering Research Grou, Massachusetts Institute of Technology, Cambridge, MA 039 Edi Sevincer Omega Advanced
More informationOn the Toppling of a Sand Pile
Discrete Mathematics and Theoretical Comuter Science Proceedings AA (DM-CCG), 2001, 275 286 On the Toling of a Sand Pile Jean-Christohe Novelli 1 and Dominique Rossin 2 1 CNRS, LIFL, Bâtiment M3, Université
More informationMorten Frydenberg Section for Biostatistics Version :Friday, 05 September 2014
Morten Frydenberg Section for Biostatistics Version :Friday, 05 Setember 204 All models are aroximations! The best model does not exist! Comlicated models needs a lot of data. lower your ambitions or get
More informationApproximating min-max k-clustering
Aroximating min-max k-clustering Asaf Levin July 24, 2007 Abstract We consider the roblems of set artitioning into k clusters with minimum total cost and minimum of the maximum cost of a cluster. The cost
More informationDownloaded from jhs.mazums.ac.ir at 9: on Monday September 17th 2018 [ DOI: /acadpub.jhs ]
Iranian journal of health sciences 013; 1(): 56-60 htt://jhs.mazums.ac.ir Original Article Comaring Two Formulas of Samle Size Determination for Prevalence Studies Hamed Tabesh 1 *Azadeh Saki Fatemeh Pourmotahari
More informationThe one-sample t test for a population mean
Objectives Constructing and assessing hyotheses The t-statistic and the P-value Statistical significance The one-samle t test for a oulation mean One-sided versus two-sided tests Further reading: OS3,
More informationLower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data
Quality Technology & Quantitative Management Vol. 1, No.,. 51-65, 15 QTQM IAQM 15 Lower onfidence Bound for Process-Yield Index with Autocorrelated Process Data Fu-Kwun Wang * and Yeneneh Tamirat Deartment
More informationEstimating function analysis for a class of Tweedie regression models
Title Estimating function analysis for a class of Tweedie regression models Author Wagner Hugo Bonat Deartamento de Estatística - DEST, Laboratório de Estatística e Geoinformação - LEG, Universidade Federal
More information1 Random Experiments from Random Experiments
Random Exeriments from Random Exeriments. Bernoulli Trials The simlest tye of random exeriment is called a Bernoulli trial. A Bernoulli trial is a random exeriment that has only two ossible outcomes: success
More informationUncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning
TNN-2009-P-1186.R2 1 Uncorrelated Multilinear Princial Comonent Analysis for Unsuervised Multilinear Subsace Learning Haiing Lu, K. N. Plataniotis and A. N. Venetsanooulos The Edward S. Rogers Sr. Deartment
More informationAn Ant Colony Optimization Approach to the Probabilistic Traveling Salesman Problem
An Ant Colony Otimization Aroach to the Probabilistic Traveling Salesman Problem Leonora Bianchi 1, Luca Maria Gambardella 1, and Marco Dorigo 2 1 IDSIA, Strada Cantonale Galleria 2, CH-6928 Manno, Switzerland
More informationrate~ If no additional source of holes were present, the excess
DIFFUSION OF CARRIERS Diffusion currents are resent in semiconductor devices which generate a satially non-uniform distribution of carriers. The most imortant examles are the -n junction and the biolar
More informationIntrinsic Approximation on Cantor-like Sets, a Problem of Mahler
Intrinsic Aroximation on Cantor-like Sets, a Problem of Mahler Ryan Broderick, Lior Fishman, Asaf Reich and Barak Weiss July 200 Abstract In 984, Kurt Mahler osed the following fundamental question: How
More informationBiostat Methods STAT 5500/6500 Handout #12: Methods and Issues in (Binary Response) Logistic Regression
Biostat Methods STAT 5500/6500 Handout #12: Methods and Issues in (Binary Resonse) Logistic Regression Recall general χ 2 test setu: Y 0 1 Trt 0 a b Trt 1 c d I. Basic logistic regression Previously (Handout
More informationPulse Propagation in Optical Fibers using the Moment Method
Pulse Proagation in Otical Fibers using the Moment Method Bruno Miguel Viçoso Gonçalves das Mercês, Instituto Suerior Técnico Abstract The scoe of this aer is to use the semianalytic technique of the Moment
More informationNumerical Linear Algebra
Numerical Linear Algebra Numerous alications in statistics, articularly in the fitting of linear models. Notation and conventions: Elements of a matrix A are denoted by a ij, where i indexes the rows and
More informationAnswers Investigation 2
Answers Alications 1. a. Plan 1: y = x + 5; Plan 2: y = 1.5x + 2.5 b. Intersection oint (5, 10) is an exact solution to the system of equations. c. x + 5 = 1.5x + 2.5 leads to x = 5; (5) + 5 = 10 or 1.5(5)
More informationPrincipal Components Analysis and Unsupervised Hebbian Learning
Princial Comonents Analysis and Unsuervised Hebbian Learning Robert Jacobs Deartment of Brain & Cognitive Sciences University of Rochester Rochester, NY 1467, USA August 8, 008 Reference: Much of the material
More information1 Gambler s Ruin Problem
Coyright c 2017 by Karl Sigman 1 Gambler s Ruin Problem Let N 2 be an integer and let 1 i N 1. Consider a gambler who starts with an initial fortune of $i and then on each successive gamble either wins
More informationStatics and dynamics: some elementary concepts
1 Statics and dynamics: some elementary concets Dynamics is the study of the movement through time of variables such as heartbeat, temerature, secies oulation, voltage, roduction, emloyment, rices and
More informationarxiv: v1 [physics.data-an] 26 Oct 2012
Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch
More informationSets of Real Numbers
Chater 4 Sets of Real Numbers 4. The Integers Z and their Proerties In our revious discussions about sets and functions the set of integers Z served as a key examle. Its ubiquitousness comes from the fact
More informationEvaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models
Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Ketan N. Patel, Igor L. Markov and John P. Hayes University of Michigan, Ann Arbor 48109-2122 {knatel,imarkov,jhayes}@eecs.umich.edu
More informationA Simple Weight Decay Can Improve. Abstract. It has been observed in numerical simulations that a weight decay can improve
In Advances in Neural Information Processing Systems 4, J.E. Moody, S.J. Hanson and R.P. Limann, eds. Morgan Kaumann Publishers, San Mateo CA, 1995,. 950{957. A Simle Weight Decay Can Imrove Generalization
More informationEcological Resemblance. Ecological Resemblance. Modes of Analysis. - Outline - Welcome to Paradise
Ecological Resemblance - Outline - Ecological Resemblance Mode of analysis Analytical saces Association Coefficients Q-mode similarity coefficients Symmetrical binary coefficients Asymmetrical binary coefficients
More informationOutline. Markov Chains and Markov Models. Outline. Markov Chains. Markov Chains Definitions Huizhen Yu
and Markov Models Huizhen Yu janey.yu@cs.helsinki.fi Det. Comuter Science, Univ. of Helsinki Some Proerties of Probabilistic Models, Sring, 200 Huizhen Yu (U.H.) and Markov Models Jan. 2 / 32 Huizhen Yu
More informationBENDING INDUCED VERTICAL OSCILLATIONS DURING SEISMIC RESPONSE OF RC BRIDGE PIERS
BENDING INDUCED VERTICAL OSCILLATIONS DURING SEISMIC RESPONSE OF RC BRIDGE PIERS Giulio RANZO 1, Marco PETRANGELI And Paolo E PINTO 3 SUMMARY The aer resents a numerical investigation on the behaviour
More informationCombinatorics of topmost discs of multi-peg Tower of Hanoi problem
Combinatorics of tomost discs of multi-eg Tower of Hanoi roblem Sandi Klavžar Deartment of Mathematics, PEF, Unversity of Maribor Koroška cesta 160, 000 Maribor, Slovenia Uroš Milutinović Deartment of
More informationState Estimation with ARMarkov Models
Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,
More informationPreconditioning techniques for Newton s method for the incompressible Navier Stokes equations
Preconditioning techniques for Newton s method for the incomressible Navier Stokes equations H. C. ELMAN 1, D. LOGHIN 2 and A. J. WATHEN 3 1 Deartment of Comuter Science, University of Maryland, College
More informationAN EVALUATION OF A SIMPLE DYNAMICAL MODEL FOR IMPACTS BETWEEN RIGID OBJECTS
XIX IMEKO World Congress Fundamental and Alied Metrology Setember 6, 009, Lisbon, Portugal AN EVALUATION OF A SIMPLE DYNAMICAL MODEL FOR IMPACTS BETWEEN RIGID OBJECTS Erik Molino Minero Re, Mariano Lóez,
More informationSolution sheet ξi ξ < ξ i+1 0 otherwise ξ ξ i N i,p 1 (ξ) + where 0 0
Advanced Finite Elements MA5337 - WS7/8 Solution sheet This exercise sheets deals with B-slines and NURBS, which are the basis of isogeometric analysis as they will later relace the olynomial ansatz-functions
More informationConvex Optimization methods for Computing Channel Capacity
Convex Otimization methods for Comuting Channel Caacity Abhishek Sinha Laboratory for Information and Decision Systems (LIDS), MIT sinhaa@mit.edu May 15, 2014 We consider a classical comutational roblem
More informationAdaptive estimation with change detection for streaming data
Adative estimation with change detection for streaming data A thesis resented for the degree of Doctor of Philosohy of the University of London and the Diloma of Imerial College by Dean Adam Bodenham Deartment
More informationStatistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform
Statistics II Logistic Regression Çağrı Çöltekin Exam date & time: June 21, 10:00 13:00 (The same day/time lanned at the beginning of the semester) University of Groningen, Det of Information Science May
More informationSystem Reliability Estimation and Confidence Regions from Subsystem and Full System Tests
009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract
More informationECON 4130 Supplementary Exercises 1-4
HG Set. 0 ECON 430 Sulementary Exercises - 4 Exercise Quantiles (ercentiles). Let X be a continuous random variable (rv.) with df f( x ) and cdf F( x ). For 0< < we define -th quantile (or 00-th ercentile),
More informationHow to Estimate Expected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty
How to Estimate Exected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty Christian Servin Information Technology Deartment El Paso Community College El Paso, TX 7995, USA cservin@gmail.com
More information