Descriptive Statistics


Chapter 200

Introduction
This procedure summarizes variables both statistically and graphically. Information about the location (center), spread (variability), and distribution is provided. The procedure provides a large variety of statistical information about a single variable.

Kinds of Research Questions
The use of this module for a single variable is generally appropriate for one of four purposes: numerical summary, data screening, outlier identification (which sometimes is incorporated into data screening), and distributional shape. We will briefly discuss each of these now.

Numerical Descriptors
The numerical descriptors of a sample are called statistics. These statistics may be categorized as location, spread, shape indicators, percentiles, and interval estimates.

Location or Central Tendency
One of the first impressions that we like to get from a variable is its general location. You might think of this as the center of the variable on the number line. The average (mean) is a common measure of location. When investigating the center of a variable, the main descriptors are the mean, median, mode, and the trimmed mean. Other averages, such as the geometric and harmonic mean, have specialized uses. We will now briefly compare these measures.

If the data come from the normal distribution, the mean, median, mode, and the trimmed mean all estimate the same value, so they should be close to one another. If the mean and median are very different, most likely there are outliers in the data or the distribution is skewed. If this is the case, the median is probably a better measure of location. The mean is very sensitive to extreme values and can be seriously contaminated by just one observation.

A compromise between the mean and median is given by the trimmed mean (where a predetermined number of observations are trimmed from each end of the data distribution). This trimmed mean is more robust than the mean but more sensitive than the median. Comparison of the trimmed mean to the median should show the trimmed mean approaching the median as the degree of trimming increases. If the trimmed mean converges to the median for a small degree of trimming, say 5 or 10%, the number of outliers is relatively few.
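The comparison described above can be sketched in a few lines of Python. The sample values and the helper name `trimmed_mean` are made up for illustration, and the helper trims whole observations only, which is simpler than the fractional trimming used by the Trimmed Section of the report.

```python
import statistics

def trimmed_mean(values, proportion):
    """Drop int(proportion * n) observations from each end, then average the rest.

    Simplified whole-observation trimming; an illustrative assumption,
    not the procedure's own formula.
    """
    data = sorted(values)
    k = int(proportion * len(data))
    kept = data[k:len(data) - k] if k > 0 else data
    return statistics.mean(kept)

# Hypothetical sample: nine typical values and one gross outlier.
sample = [61, 62, 63, 64, 65, 66, 67, 68, 69, 500]

mean_value = statistics.mean(sample)      # pulled far upward by the outlier
median_value = statistics.median(sample)  # barely affected
trim10 = trimmed_mean(sample, 0.10)       # drops one value from each end

print(mean_value, median_value, trim10)
```

Here a single 10% trim is enough to bring the trimmed mean onto the median, which is the pattern the text associates with relatively few outliers.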

Variability, Dispersion, or Spread
After establishing the center of a variable's values, the next question is how closely the data fall about this center. The pattern of the values around the center is called the spread, dispersion, or variability. There are numerous measures of variability: range, variance, standard deviation, interquartile range, and so on. All of these measures of dispersion are affected by outliers to some degree, but some do much better than others.

The standard deviation is one of the most popular measures of dispersion. Unfortunately, it is greatly influenced by outlying observations and by the overall shape of the distribution. Because of this, various substitutes for it have been developed. It will be up to you to decide which is best in a given situation.

Shape
The shape of the distribution describes the pattern of the values along the number line. Are there a few unique values that occur over and over, or is there a continuum? Is the pattern symmetric or asymmetric? Are the data bell shaped? Do they seem to have a single center or are there several areas of clumping? These are all aspects of the shape of the distribution of the data.

Two of the most popular measures of shape are skewness and kurtosis. Skewness measures the direction and lack of symmetry. The more skewed a distribution is, the greater the need for using robust estimators, such as the median and the interquartile range. Positive skewness indicates a long-tailedness to the right while negative skewness indicates long-tailedness to the left. Kurtosis measures the heaviness of the tails. A kurtosis value less than three indicates lighter tails than a normal distribution. Kurtosis values greater than three indicate heavier tails than a normal distribution.

The measures of shape require more data to be accurate. For example, a reasonable estimate of the mean may require only ten observations in a random sample. The standard deviation will require at least thirty. A reasonably detailed estimate of the shape (especially if the tails are important) will require several hundred observations.

Percentiles
Percentiles are extremely useful for certain applications as well as for cases when the distribution is very skewed or contaminated by outliers. If the distribution of the variable is skewed, you might want to use the exact interval estimates for the percentiles.

Confidence Limits or Interval Estimates
An interval estimate of a statistic gives a range of its possible values. Confidence limits are a special type of interval estimate that have, under certain conditions, a level of confidence or probability attached to them.

If the assumption of normality is valid, the confidence intervals for the mean, variance, and standard deviation are valid. However, the standard error of each of these intervals depends on the sample standard deviation and the sample size. If the sample standard deviation is inaccurate, these other measures will be also. The bottom line is that outliers not only affect the standard deviation but also all confidence limits that use the sample standard deviation. It should be obvious then that the standard deviation is a critical measure of dispersion in parametric methods.

Data Screening
Data screening involves missing data, data validity, and outliers. If these issues are not dealt with prior to the use of descriptive statistics, errors in interpretations are very likely.

Missing Data
Whenever data are missing, questions need to be asked.
1. Is the missingness due to incomplete data collection? If so, try to complete the data collection.
2. Is the missingness due to nonresponse from a survey? If so, attempt to collect data from the nonresponders.
3. Are the missing data due to a censoring of data beyond or below certain values? If so, some different statistical tools will be needed.
4. Is the pattern of missingness random? If only a few data points are missing from a large data set and the pattern of missingness is random, there is little to be concerned with. However, if the data set is small or moderate in size, any degree of missingness could cause bias in interpretations.

Whenever missing values occur without answers to the above questions, there is little that can be done. If the distributional shape of the variable is known and there are missing data for certain percentiles, estimates could be made for the missing values. If there are other variables in the data set as well and the pattern of missingness is random, multiple regression and multivariate methods can be used to estimate the missing values.

Data Validity
Data validity needs to be confirmed prior to any statistical analysis, but it usually begins after a univariate descriptive analysis. Extremes or outliers for a variable could be due to a data entry error, to an incorrect or inappropriate specification of a missing code, to sampling from a population other than the intended one, or to a natural abnormality that exists in this variable from time to time. The first two cases of invalid data are easily corrected. The latter two require information about the distribution form and necessitate the use of regression or multivariate methods to re-estimate the values.

Outliers
Outliers in a univariate data set are defined as observations that appear to be inconsistent with the rest of the data. An outlier is an observation that sticks out at either end of the data set.

The visualization of univariate outliers can be done in three ways: with the stem-and-leaf plot, with the box plot, and with the normal probability plot. In each of these informal methods, the outlier is far removed from the rest of the data. A word of caution: the box plot and the normal probability plot evaluate the potentiality of an outlier assuming the data are normally distributed. If the variable is not normally distributed, these plots may indicate many outliers. You must be careful about checking what distributional assumptions are behind the outliers you may be looking for.

Outliers can completely distort descriptive statistics. For instance, if one suspects outliers, a comparison of the mean, median, mode, and trimmed mean should be made. If the outliers are only to one side of the mean, the median is a better measure of location. On the other hand, if the outliers are equally divergent on each side of the center, the mean and median will be close together, but the standard deviation will be inflated. The interquartile range is the only measure of variation not greatly affected by outliers. Outliers may also contaminate measures of skewness and kurtosis as well as confidence limits.

This discussion has focused on univariate outliers, in a simplistic way. If the data set has several variables, multiple regression and multivariate methods must be used to identify these outliers.

Normality
A primary use of descriptive statistics is to determine whether the data are normally distributed. If the variable is normally distributed, you can use parametric statistics that are based on this assumption. If the variable is not normally distributed, you might try a transformation on the variable (such as the natural log or square root) to make the data normal. If a transformation is not a viable alternative, nonparametric methods that do not require normality should be used.

NCSS provides seven tests to formally test for normality. If a variable fails a normality test, it is critical to look at the box plot and the normal probability plot to see if an outlier or a small subset of outliers has caused the nonnormality. A pragmatic approach is to omit the outliers and rerun the tests to see if the variable now passes the normality tests.

Always remember that a reasonably large sample size is necessary to detect nonnormality. Only extreme types of nonnormality can be detected with samples of fewer than fifty observations.

There is a common misconception that a histogram is always a valid graphical tool for assessing normality. Since there are many subjective choices that must be made in constructing a histogram, and since histograms generally need large sample sizes to display an accurate picture of normality, preference should be given to other graphical displays such as the box plot, the density trace, and the normal probability plot.

Data Structure
The data are contained in a single variable.

Height dataset (subset)
A single column named Height.

Procedure Options
This section describes the options available in this procedure. To find out more about using a procedure, turn to the Procedures chapter. Following is a list of the procedure's options.

Variables Tab
The options on this panel specify which variables to use.

Data Variables

Variable(s)
Specify a list of one or more variables upon which the univariate statistics are to be generated. You can double-click the field or single click the button on the right of the field to bring up the Variable Selection window.

Frequency Variable

Frequency Variable
This optional variable specifies the number of observations that each row represents. When omitted, each row represents a single observation. If your data is the result of a previous summarization, you may want certain rows to represent several observations. Note that negative values are treated as a zero weight and are omitted. This is one way of weighting your data.

Grouping Variables

Group (1-5) Variable
You can select up to five categorical variables. When one or more of these are specified, a separate set of reports is generated for each unique set of values for these variables.

Data Transformation Options

Exponent
Occasionally, you might want to obtain a statistical report on the square root or square of your variable. This option lets you specify an on-the-fly transformation of the variable. The form of this transformation is $X = Y^A$, where Y is the original value, A is the selected exponent, and X is the value that is summarized.

Additive Constant
Occasionally, you might want to obtain a statistical report on a transformed version of a variable. This option lets you specify an on-the-fly transformation of the variable. The form of this transformation is $X = Y + B$, where Y is the original value, B is the selected value, and X is the value that is summarized. Note that if you apply both the Exponent and the Additive Constant, the form of the transformation is $X = (Y + B)^A$.

Reports Tab
The options on this panel control the format of the report.

Select Reports

Summary Section ... Percentile Section
Each of these options indicates whether to display the indicated report.

Alpha Level
The value of alpha for the confidence limits and rejection decisions. Usually, this number is 0.1 or smaller. The default value of 0.05 results in 95% confidence limits.

Stem and Leaf

Stem Leaf
Specify whether to include the stem and leaf plot.
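As a small illustration of the Exponent and Additive Constant options described above, the combined transformation $X = (Y + B)^A$ might be written as follows. The function name `transform` and the sample values are hypothetical; this is only a sketch of the arithmetic, not of how the procedure applies it internally.

```python
def transform(y, exponent=1.0, additive_constant=0.0):
    """Return X = (Y + B)^A for a single value (B is added before exponentiation)."""
    return (y + additive_constant) ** exponent

# Hypothetical data values.
values = [4.0, 9.0, 16.0]

# Square root of each value: A = 0.5, B = 0.
roots = [transform(y, exponent=0.5) for y in values]

# Square of each value after adding one: A = 2, B = 1.
shifted_squares = [transform(y, exponent=2, additive_constant=1) for y in values]

print(roots, shifted_squares)
```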

Report Options

Precision
Specify the precision of numbers in the report. A single-precision number will show seven-place accuracy, while a double-precision number will show thirteen-place accuracy. Note that the reports were formatted for single precision. If you select double precision, some numbers may run into others. Also note that all calculations are performed in double precision regardless of which option you select here. This is for reporting purposes only.

Value Labels
This option applies to the Group Variable(s). It lets you select whether to display data values, value labels, or both. Use this option if you want the output to automatically attach labels to the values (like 1=Yes, 2=No, etc.). See the section on specifying Value Labels elsewhere in this manual.

Variable Names
This option lets you select whether to display only variable names, variable labels, or both.

Report Options - Decimal Places

Values, Means, Probabilities
Specify the number of decimal places when displaying this item. Select General to display all possible decimal places.

Report Options - Percentiles

Percentile Type
This selects from five methods used to calculate the p-th percentile, $Z_p$. The first option, AveXp(n+1), gives the common value of the median. In each definition below, $X_{[k]}$ is the k-th observation when the data are sorted from lowest to highest. These options are:

AveXp(n+1)
The 100p-th percentile is computed as
$Z_p = (1-g)\,X_{[k_1]} + g\,X_{[k_2]}$
where $k_1$ equals the integer part of $p(n+1)$, $k_2 = k_1 + 1$, and $g$ is the fractional part of $p(n+1)$.

AveXp(n)
The 100p-th percentile is computed as
$Z_p = (1-g)\,X_{[k_1]} + g\,X_{[k_2]}$
where $k_1$ equals the integer part of $np$, $k_2 = k_1 + 1$, and $g$ is the fractional part of $np$.

Closest to np
The 100p-th percentile is computed as
$Z_p = X_{[k_1]}$
where $k_1$ equals the integer that is closest to $np$.

EDF
The 100p-th percentile is computed as
$Z_p = X_{[k_1]}$
where $k_1$ equals the integer part of $np$ if $np$ is exactly an integer, or the integer part of $np + 1$ if $np$ is not exactly an integer. $X_{[k]}$ is the k-th observation when the data are sorted from lowest to highest. Note that EDF stands for empirical distribution function.

EDF w/Ave
The 100p-th percentile is computed as
$Z_p = (X_{[k_1]} + X_{[k_2]})/2$
where $k_1$ and $k_2$ are defined as follows: if $np$ is an integer, $k_1 = k_2 = np$; if $np$ is not exactly an integer, $k_1$ equals the integer part of $np$ and $k_2 = k_1 + 1$. $X_{[k]}$ is the k-th observation when the data are sorted from lowest to highest.

Smallest Percentile
By default, the smallest percentile displayed is the 1st percentile. This option lets you change this value to any value between 0 and 100. For example, you might enter 2.5 to see the 2.5th percentile.

Largest Percentile
By default, the largest percentile displayed is the 99th percentile. This option lets you change this value to any value between 0 and 100. For example, you might enter 97.5 to see the 97.5th percentile.

Plots Tab
These options specify the plots.

Select Plots

Histogram and Probability Plot
Specify whether to display the indicated plots. Click the plot format button to change the plot settings.
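To make the five Percentile Type rules easier to compare, here is a minimal Python sketch that follows the definitions as written. The function name, the method labels, the clamping of order-statistic indices to the range 1..n, and the round-half-up rule used for "Closest to np" are all assumptions added for illustration; the text does not specify how ties or out-of-range indices are handled.

```python
import math

def percentile(values, p, method="ave_xp_n_plus_1"):
    """Return the 100p-th percentile under one of the five rules described above."""
    x = sorted(values)
    n = len(x)

    def order_stat(k):
        # 1-based order statistic X[k]; clamping k to 1..n is an assumption.
        return x[min(max(k, 1), n) - 1]

    if method in ("ave_xp_n_plus_1", "ave_xp_n"):
        target = p * (n + 1) if method == "ave_xp_n_plus_1" else p * n
        k1 = math.floor(target)      # integer part
        g = target - k1              # fractional part
        return (1 - g) * order_stat(k1) + g * order_stat(k1 + 1)

    np_value = n * p
    whole = math.floor(np_value)
    is_integer = np_value == whole
    if method == "closest_to_np":
        # Ties are rounded up here (assumption).
        return order_stat(math.floor(np_value + 0.5))
    if method == "edf":
        return order_stat(whole if is_integer else whole + 1)
    if method == "edf_ave":
        k2 = whole if is_integer else whole + 1
        return (order_stat(whole) + order_stat(k2)) / 2
    raise ValueError("unknown method: " + method)

# Hypothetical data: ten evenly spaced values.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
median_n_plus_1 = percentile(data, 0.5, "ave_xp_n_plus_1")
q1_edf = percentile(data, 0.25, "edf")
q1_edf_ave = percentile(data, 0.25, "edf_ave")
print(median_n_plus_1, q1_edf, q1_edf_ave)
```

With an even number of observations, the AveXp(n+1) rule averages the two middle values, which is the usual textbook median.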

Example 1 - Running Descriptive Statistics
This section presents a detailed example of how to run a descriptive statistics report on the Height variable in the Height dataset. To run this example, take the following steps (note that step 1 is not necessary if the Height dataset is open).

You may follow along here by making the appropriate entries or load the completed template Example 1 by clicking on Open Example Template from the File menu of the Descriptive Statistics window.

1 Open the Height dataset.
From the File menu of the NCSS Data window, select Open Example Data.
Click on the file Height.NCSS.
Click Open.

2 Open the Descriptive Statistics window.
Using the Analysis menu or the Procedure Navigator, find and select the Descriptive Statistics procedure.
On the menus, select File, then New Template. This will fill the procedure with the default template.

3 Specify the Height variable.
On the Descriptive Statistics window, select the Variables tab. (This is the default.)
Double-click in the Variables text box. This will bring up the variable selection window.
Select Height from the list of variables and then click Ok. The word Height will appear in the Variables box. Remember that you could have entered a 1 here signifying the first (left-most) variable on the dataset.

4 Run the procedure.
From the Run menu, select Run Procedure. Alternatively, just click the green Run button.

The following reports and charts will be displayed in the Output window.

Descriptive Statistics Report
This report is rather large and complicated, so we will define each section separately. Usually, you will focus on only a few items from this report. Unfortunately, each user wants a different few items, so we had to include much more than any one user needs!

Several of the formulas involve both raw and central moments. The raw moments are defined as:
$m'_r = \frac{1}{n}\sum_{i=1}^{n} x_i^r$
The central moments are defined as:
$m_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^r$

Large sample estimates of the standard errors are provided for several statistics. These are based on the following formulas from Kendall and Stuart (1987):
$\mathrm{Var}(m_r) = \frac{m_{2r} - m_r^2 + r^2 m_2 m_{r-1}^2 - 2r\,m_{r-1} m_{r+1}}{n}$
$\mathrm{Var}(g(x)) = \left(\frac{dg}{dx}\right)^2 \mathrm{Var}(x)$

Summary Section

Summary Section of Height
Columns: Count, Mean, Standard Deviation, Standard Error, Minimum, Maximum, Range

Count
This is the number of nonmissing values. If no frequency variable was specified, this is the number of nonmissing rows.

Mean
This is the average of the data values. (See Means Section below.)

Standard Deviation
This is the standard deviation of the data values. (See Variation Section below.)

Standard Error
This is the standard error of the mean. (See Means Section below.)

Minimum
The smallest value in this variable.

Maximum
The largest value in this variable.

Range
The difference between the largest and smallest values for a variable. If the data for a given variable is normally distributed, a quick estimate of the standard deviation can be made by dividing the range by six.

Count Section

Counts Section of Height
Columns: Rows, Sum of Frequencies, Missing Values, Distinct Values, Sum, Total Sum Squares, Adjusted Sum Squares

Rows
This is the total number of rows available in this variable.

Sum of Frequencies
This is the number of nonmissing values. If no frequency variable was specified, this is the number of nonmissing rows.

Missing Values
The number of missing (empty) rows.

Distinct Values
This is the number of unique values in this variable. This value is useful for finding data entry errors and for determining if a variable is continuous or discrete.

Sum
This is the sum of the data values.

Total Sum Squares
This is the sum of the squared values of the variable. It is sometimes referred to as the unadjusted sum of squares. It is reported for its usefulness in calculating other statistics and is not interpreted directly.
$\text{sum squares} = \sum_{i=1}^{n} x_i^2$

Adjusted Sum Squares
This is the sum of the squared differences from the mean.
$\text{adjusted sum squares} = \sum_{i=1}^{n} (x_i - \bar{x})^2$

Means Section

Means Section of Height
Columns: Mean, Median, Geometric Mean, Harmonic Mean, Sum, Mode
Rows: Value, Std Error, 95% LCL, 95% UCL, T-Value, Prob Level, Count

The geometric mean confidence interval assumes that the ln(y) are normally distributed.
The harmonic mean confidence interval assumes that the 1/y are normally distributed.

Mean
This is the average of the data values.
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Std Error (Mean)
This is the standard error of the mean. This is the estimated standard deviation for the distribution of sample means for an infinite population.
$s_{\bar{x}} = \frac{s}{\sqrt{n}}$

95% LCL and 95% UCL of the Mean
These are the lower and upper values of a 100(1-α)% interval estimate for the mean based on a t distribution with n-1 degrees of freedom. This interval estimate assumes that the population standard deviation is not known and that the data for this variable are normally distributed.
$\bar{x} \pm t_{\alpha/2,\,n-1}\, s_{\bar{x}}$

T-Value (Mean)
This is the t-test value for testing that the population mean is equal to zero versus the alternative that it is not. The degrees of freedom for this t-test are n-1. The variable that is being tested must be approximately normally distributed for this test to be valid.
$t = \frac{\bar{x}}{s_{\bar{x}}}$

Prob Level (Mean)
This is the significance level of the above t-test, assuming a two-tailed test. Generally, this p-value is compared to the level of significance, .05 or .01, chosen by the researcher. If the p-value is less than the pre-determined level of significance, the hypothesis that the mean is zero is rejected.

Median
The value of the median. The median is the 50th percentile of the data set. It is the point that splits the data in half. The value of the percentile depends upon the percentile method that was selected.

95% LCL and 95% UCL of the Median
These are the values of an exact confidence interval for the median. These exact confidence intervals are discussed in the Percentile Section.

Geometric Mean
The geometric mean (GM) is an alternative type of mean that is used for business, economic, and biological applications. Only nonnegative values are used in the computation. If one of the values is zero, the geometric mean is defined to be zero. One example of when the GM is appropriate is when a variable is the product of many small effects combined by multiplication instead of addition.
$GM = \left(\prod_{i=1}^{n} x_i\right)^{1/n}$
An alternative form, showing the GM's relationship to the arithmetic mean, is:
$GM = \exp\left(\frac{1}{n}\sum_{i=1}^{n} \ln(x_i)\right)$

Count for Geometric Mean
The number of positive numbers used in computing the geometric mean.
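The two forms of the geometric mean given above can be checked against each other numerically. The following Python sketch uses hypothetical growth factors; the function names are illustrative, and the zero-handling in the log form simply follows the convention stated in the text.

```python
import math

def geometric_mean_product(values):
    """n-th root of the product of the values."""
    return math.prod(values) ** (1.0 / len(values))

def geometric_mean_logs(values):
    """exp of the arithmetic mean of the logs; zero input gives zero by convention."""
    if any(v == 0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical yearly growth factors that combine by multiplication.
growth_factors = [1.10, 0.95, 1.20, 1.05]

gm_product = geometric_mean_product(growth_factors)
gm_logs = geometric_mean_logs(growth_factors)
arithmetic = sum(growth_factors) / len(growth_factors)
print(gm_product, gm_logs, arithmetic)
```

The log form is usually the safer choice for long series, since the running product in the first form can overflow or underflow.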

Harmonic Mean
The harmonic mean is used to average rates. For example, suppose we want the average speed of a bus that travels a fixed distance every day at speeds $s_1$, $s_2$, and $s_3$. The average speed, found by dividing the total distance by the total time, is equal to the harmonic mean of the three speeds. The harmonic mean is appropriate when the distance is constant from trial to trial and the time required was variable. However, if the times were constant and the distances were variable, the arithmetic mean would have been appropriate. Only nonzero values may be used in its calculation.
$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$

Count for the Harmonic Mean
The number of nonzero numbers used in computing the harmonic mean.

Sum
This is the sum of the data values. The standard error and confidence limits are found by multiplying the corresponding values for the mean by the sample size, n.

Std Error of Sum
This is the standard deviation of the distribution of sums. With this standard error, confidence intervals and hypothesis testing can be done for the sum. The assumptions for the interval estimate of the mean must also hold here.
$s_{sum} = n\, s_{\bar{x}}$

Mode
This is the most frequently occurring value in the data.

Mode Count
This is a count of the most frequently occurring value, i.e., its frequency.

Variation Section

Variation Section of Height
Columns: Variance, Standard Deviation, Unbiased Std Dev, Std Error of Mean, Interquartile Range, Range
Rows: Value, Std Error, 95% LCL, 95% UCL

Variance
The sample variance, $s^2$, is a popular measure of dispersion. It is an average of the squared deviations from the mean.
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$
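The bus example under Harmonic Mean can be worked through numerically. In this sketch the distance and the three speeds are invented values chosen to give round numbers; the point is only that total distance divided by total time matches the harmonic mean, not the arithmetic mean.

```python
def harmonic_mean(values):
    """n divided by the sum of reciprocals; all values must be nonzero."""
    return len(values) / sum(1.0 / v for v in values)

# Hypothetical route: the same 60-unit distance driven on three days.
distance = 60.0
speeds = [20.0, 30.0, 60.0]

total_time = sum(distance / s for s in speeds)        # 3 + 2 + 1 hours
average_speed = len(speeds) * distance / total_time   # total distance / total time

hm = harmonic_mean(speeds)
am = sum(speeds) / len(speeds)
print(average_speed, hm, am)
```

The arithmetic mean of the speeds overstates the true average speed here because the slow trips take up more of the total time.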

Std Error of Variance
This is a large sample estimate of the standard error of $s^2$ for an infinite population.

LCL of the Variance
This is the lower value of a 100(1-α)% interval estimate for the variance based on the chi-squared distribution with n-1 degrees of freedom. This interval estimate assumes that the variable is normally distributed.
$LCL = \frac{s^2 (n-1)}{\chi^2_{\alpha/2,\,n-1}}$
Here the subscript on $\chi^2$ is read as an upper-tail probability, so that the LCL is smaller than the UCL.

UCL of the Variance
This is the upper value of a 100(1-α)% interval estimate for the variance based on the chi-squared distribution with n-1 degrees of freedom. This interval estimate assumes that the variable is normally distributed.
$UCL = \frac{s^2 (n-1)}{\chi^2_{1-\alpha/2,\,n-1}}$

Standard Deviation
The sample standard deviation, s, is a popular measure of dispersion. It measures, roughly, the typical distance between a single observation and the mean. The use of n-1 in the denominator instead of the more natural n is often of concern. It turns out that if n (instead of n-1) were used, a biased estimate of the population variance would result. The use of n-1 corrects for this bias in the variance. Unfortunately, s is inordinately influenced by outliers. For this reason, you must always check for outliers in your data before you use this statistic. Also, s itself remains a biased estimator of the population standard deviation. An unbiased estimate, calculated by adjusting s, is given under the heading Unbiased Std Dev.
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$
Another form of the above formula shows that the standard deviation depends only on the differences between each pair of observations. Notice that the sample mean does not enter into this second formulation.
$s = \sqrt{\frac{\sum_{i<j} (x_i - x_j)^2}{n(n-1)}}$
where the sum is over all pairs i, j with i < j.

Std Error of Standard Deviation
This is a large sample estimate of the standard error of s for an infinite population.

LCL of Standard Deviation
This is the lower value of a 100(1-α)% interval estimate for the standard deviation based on the chi-squared distribution with n-1 degrees of freedom. This interval estimate assumes that the variable is normally distributed.
$LCL = \sqrt{\frac{s^2 (n-1)}{\chi^2_{\alpha/2,\,n-1}}}$
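The claim that the standard deviation can be computed from pairwise differences alone is easy to verify. The sketch below evaluates both forms on a small made-up sample; the function names are illustrative.

```python
import math
from itertools import combinations

def std_dev(values):
    """Usual form: squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def std_dev_pairwise(values):
    """Pairwise form: squared differences over all pairs i < j, divided by n(n - 1)."""
    n = len(values)
    total = sum((a - b) ** 2 for a, b in combinations(values, 2))
    return math.sqrt(total / (n * (n - 1)))

# Hypothetical sample.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
s_usual = std_dev(sample)
s_pairs = std_dev_pairwise(sample)
print(s_usual, s_pairs)
```

The pairwise form visits every pair, so it takes on the order of n squared operations; it is shown here for insight rather than as a practical algorithm.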

UCL of Standard Deviation
This is the upper value of a 100(1-α)% interval estimate for the standard deviation based on the chi-squared distribution with n-1 degrees of freedom. This interval estimate assumes that the variable is normally distributed.
$UCL = \sqrt{\frac{s^2 (n-1)}{\chi^2_{1-\alpha/2,\,n-1}}}$

Unbiased Std Dev
This is an unbiased estimate of the standard deviation. If the data come from a normal distribution, the sample variance, $s^2$, is an unbiased estimate of the population variance. Unfortunately, the sample standard deviation, s, is a biased estimate of the population standard deviation. This bias is usually overlooked, but division of s by a correction factor, $c_4$, will correct for this bias. This is frequently done in quality control applications. The formula for $c_4$ is:
$c_4 = \sqrt{\frac{2}{n-1}}\;\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}$
where
$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt$

Std Error of Mean
This is an estimate of the standard error of the mean. This is an estimate of the precision of the sample mean. It, its standard error, and its confidence limits are calculated by dividing the corresponding Standard Deviation value by the square root of n.

Interquartile Range
This is the interquartile range (IQR). It is the difference between the third quartile and the first quartile (between the 75th percentile and the 25th percentile). This represents the range of the middle 50 percent of the distribution. It is a very robust (not affected by outliers) measure of dispersion. In fact, if the data are normally distributed, a robust estimate of the sample standard deviation is IQR/1.35. If a distribution is very concentrated around its mean, the IQR will be small. On the other hand, if the data are widely dispersed, the IQR will be much larger.

Range
The difference between the largest and smallest values for a variable. If the data for a given variable is normally distributed, a quick estimate of the standard deviation can be made by dividing the range by six.

Skewness and Kurtosis Section

Skewness and Kurtosis Section of Height
Columns: Skewness, Kurtosis, Fisher's g1, Fisher's g2, Coefficient of Variation, Coefficient of Dispersion
Rows: Value, Std Error
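The $c_4$ correction described under Unbiased Std Dev can be evaluated with the standard library. This sketch uses `math.lgamma` (the log of the gamma function) rather than `math.gamma` so that large sample sizes do not overflow; the function names are illustrative.

```python
import math

def c4(n):
    """Correction factor c4 = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2), for n >= 2."""
    log_ratio = math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

def unbiased_std_dev(s, n):
    """Divide the sample standard deviation s by c4 (assumes normal data)."""
    return s / c4(n)

for n in (2, 5, 10, 30, 100):
    print(n, round(c4(n), 4))
```

The factor is noticeably below one for very small samples and approaches one as n grows, which is why the bias in s is usually overlooked outside small-sample work such as quality control.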

Skewness
This statistic measures the direction and degree of asymmetry. A value of zero indicates a symmetrical distribution. A positive value indicates skewness (long-tailedness) to the right while a negative value indicates skewness to the left. Values between -3 and +3 are typical of samples from a normal distribution. For an alternative measure of skewness, see Fisher's g1, below.
$\sqrt{b_1} = \frac{m_3}{m_2^{3/2}}$

Std Error of Skewness
This is a large sample estimate of the standard error of skewness for an infinite population.

Kurtosis
This statistic measures the heaviness of the tails of a distribution. The usual reference point in kurtosis is the normal distribution. If this kurtosis statistic equals three and the skewness is zero, the data match the normal distribution in both of these shape measures. Unimodal distributions that have kurtosis greater than three have heavier or thicker tails than the normal. These same distributions also tend to have higher peaks in the center of the distribution (leptokurtic). Unimodal distributions whose tails are lighter than the normal distribution tend to have a kurtosis that is less than three. In this case, the peak of the distribution tends to be broader than the normal (platykurtic). Be forewarned that this statistic is an unreliable estimator of kurtosis for small sample sizes. For an alternative measure of kurtosis, see Fisher's g2, below.
$b_2 = \frac{m_4}{m_2^2}$

Std Error of Kurtosis
This is a large sample estimate of the standard error of kurtosis for an infinite population.

Fisher's g1
Fisher's g1 measure is an alternative measure of skewness.
$g_1 = \frac{\sqrt{n(n-1)}}{n-2}\,\sqrt{b_1}$

Fisher's g2
Fisher's g2 measure is an alternative measure of kurtosis.
$g_2 = \frac{(n+1)(n-1)}{(n-2)(n-3)}\left[b_2 - \frac{3(n-1)}{n+1}\right]$

Coefficient of Variation
The coefficient of variation is a relative measure of dispersion. It is most often used to compare the amount of variation in two samples. It can be used for the same data over two time periods or for the same time period but two different places. It is the standard deviation divided by the mean:
$cv = \frac{s}{\bar{x}}$

Std Error of Coefficient of Variation
This is a large sample estimate of the standard error of the estimated coefficient of variation.
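The four shape statistics above are all simple functions of the central moments, so they can be sketched together. The function names and sample data below are illustrative assumptions; the formulas are the ones given in this section and require at least four observations.

```python
import math

def central_moment(values, r):
    """r-th central moment with divisor n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

def shape_statistics(values):
    """Return (sqrt_b1, b2, g1, g2) for a sample with n >= 4."""
    n = len(values)
    m2 = central_moment(values, 2)
    m3 = central_moment(values, 3)
    m4 = central_moment(values, 4)
    sqrt_b1 = m3 / m2 ** 1.5                      # skewness
    b2 = m4 / m2 ** 2                             # kurtosis (normal reference = 3)
    g1 = math.sqrt(n * (n - 1)) / (n - 2) * sqrt_b1
    g2 = (n + 1) * (n - 1) / ((n - 2) * (n - 3)) * (b2 - 3 * (n - 1) / (n + 1))
    return sqrt_b1, b2, g1, g2

# Hypothetical samples: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

sym_stats = shape_statistics(symmetric)
skew_stats = shape_statistics(right_skewed)
print(sym_stats)
print(skew_stats)
```

Note that g2 is centered at zero for normal data, whereas the kurtosis statistic b2 is centered at three.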

Coefficient of Dispersion
The coefficient of dispersion is a robust, relative measure of dispersion. It is frequently used in real estate or tax assessment applications.
$COD = \frac{\frac{1}{n}\sum_{i=1}^{n} |x_i - \text{median}|}{\text{median}}$

Trimmed Section

Trimmed Section of Height
Columns: 5% Trimmed, 10% Trimmed, 15% Trimmed, 25% Trimmed, 35% Trimmed, 45% Trimmed
Rows: Trim-Mean, Trim-Std Dev, Count

%Trimmed
We call 100α the trimming percentage, the percent of data that is trimmed from each side of the sorted data. Thus, if α = 5%, for a sample size of 200, 10 observations are ignored from each side of the sorted array of data values. Note that our formulation allows fractional data values. Different trimming percentages are available, but 5% and 10% are the most common in practice.

Trim-Mean
These are the alpha-trimmed means discussed by Hoaglin (1983, page 311). These are useful for quickly assessing the impact of outliers. You would like to see stability in these trimmed means after a small degree of trimming. The formula for the trimmed mean for 100α% trimming is
$\bar{x}_{\alpha} = \frac{1}{n(1-2\alpha)}\left[(1-r)\left(X_{(g+1)} + X_{(n-g)}\right) + \sum_{i=g+2}^{n-g-1} X_{(i)}\right]$
where $g = [\alpha n]$ is the integer part of $\alpha n$ and $r = \alpha n - g$.

Trim-Std Dev
This is the standard deviation of the observations that remain after the trimming. It can be used to evaluate changes in the standard deviation for different degrees of trimming. The formula for the trimmed standard deviation for 100α% trimming is the standard formula for a weighted average using the weights given below.
$a_i = 0$ if $i \le g$ or $i \ge n-g+1$
$a_i = \frac{1-r}{n(1-2\alpha)}$ if $i = g+1$ or $i = n-g$
$a_i = \frac{1}{n(1-2\alpha)}$ if $g+2 \le i \le n-g-1$

Count
This is the number of observations remaining after the trimming operation. Note that this may be a fractional amount under alpha-trimming.
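The alpha-trimmed mean with fractional trimming can be sketched as below, writing α for the trimming proportion, g for the integer part of αn, and r for its fractional part. The function name is hypothetical, and the sketch assumes 0 ≤ α < 0.5 with at least two observations left after removing g from each end.

```python
import math

def alpha_trimmed_mean(values, alpha):
    """Alpha-trimmed mean with partial weight (1 - r) on the two boundary observations."""
    x = sorted(values)
    n = len(x)
    g = math.floor(alpha * n)       # whole observations dropped from each end
    r = alpha * n - g               # fractional part trimmed from the next one in
    if n - 2 * g < 2:
        raise ValueError("too much trimming for this sample size")
    # X_(g+1) and X_(n-g) in 1-based notation are x[g] and x[n-g-1] here.
    edges = (1 - r) * (x[g] + x[n - g - 1])
    middle = sum(x[g + 1:n - g - 1])            # X_(g+2) through X_(n-g-1)
    return (edges + middle) / (n * (1 - 2 * alpha))

# Hypothetical sample with one large value.
sample = [1, 2, 3, 4, 100]
plain_mean = sum(sample) / len(sample)
trimmed_20 = alpha_trimmed_mean(sample, 0.20)   # g = 1, r = 0
print(plain_mean, trimmed_20)
```

With α = 0.25 and ten observations, αn = 2.5, so two observations are dropped from each end and the third from each end receives half weight; the effective count is n(1 - 2α) = 5.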

Mean-Deviation Section

Mean-Deviation Section of Height
Columns: |X-Mean|, |X-Median|, (X-Mean)^2, (X-Mean)^3, (X-Mean)^4
Rows: Average, Std Error

Average of |X-Mean|
This is a measure of dispersion, called the mean deviation or the mean absolute deviation. It is not affected by outliers as much as the standard deviation, since the differences from the mean are not squared. If the distribution for the variable of interest is normal, the mean deviation is approximately equal to 0.8 standard deviations.
$MAD = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$

Std Error of |X-Mean|
This is an estimate of the standard error of the mean deviation.
$SE_{MAD} = \sqrt{\frac{2 s^2 (n-1)}{\pi n^2}\left[\frac{\pi}{2} + \sqrt{n(n-2)} - n + \arcsin\left(\frac{1}{n-1}\right)\right]}$

Average of |X-Median|
This is an alternate formulation of the mean deviation above that is more robust to outliers since the median is used as the center point of the distribution.
$MAD_{Robust} = \frac{1}{n}\sum_{i=1}^{n} |x_i - \text{median}|$

Average of (X-Mean)^2
This is the second moment about the mean, $m_2$.

Std Error of (X-Mean)^2
This is the estimated standard deviation of the second moment.

Average of (X-Mean)^3
This is the third moment about the mean, $m_3$.

Std Error of (X-Mean)^3
This is the estimated standard deviation of the third moment.

Average of (X-Mean)^4
This is the fourth moment about the mean, $m_4$.

Std Error of (X-Mean)^4
This is the estimated standard deviation of the fourth moment.
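The two mean-deviation measures differ only in the center they use, which a short sketch makes concrete. The sample is invented and includes one large value; the helper name is illustrative.

```python
import statistics

def mean_absolute_deviation(values, center):
    """Average absolute distance of the values from the given center."""
    return sum(abs(x - center) for x in values) / len(values)

# Hypothetical sample with one outlying value.
sample = [2, 4, 4, 5, 7, 9, 40]

mad_mean = mean_absolute_deviation(sample, statistics.mean(sample))
mad_median = mean_absolute_deviation(sample, statistics.median(sample))
std_dev = statistics.stdev(sample)
print(mad_mean, mad_median, std_dev)
```

The median-centered version can never exceed the mean-centered one, because the median is the center that minimizes the average absolute deviation.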

Quartile Section
This gives the value of the j-th percentile. Of course, the 25th percentile is called the first (lower) quartile, the 50th percentile is the median, and the 75th percentile is called the third (upper) quartile.

Quartile Section of Height
Columns: 10th Percentile, 25th Percentile, 50th Percentile, 75th Percentile, 90th Percentile
Rows: Value, 95% LCL, 95% UCL

Value
These are the values of the specified percentiles. Note that the definition of a percentile depends on the type of percentile that was specified.

95% LCL and 95% UCL
These give an exact, $100(1-\alpha)$% confidence interval for the population percentile. This confidence interval does not assume normality. Instead, it only assumes a random sample of n items from a continuous distribution. The interval is based on the equation:

$$1 - \alpha = I_p(r,\, n-r+1) - I_p(n-r+1,\, r)$$

Here $I_p(a,b)$ is the integral of the incomplete beta function:

$$I_q(n-r+1,\, r) = \sum_{k=0}^{r-1}\binom{n}{k}p^k(1-p)^{n-k}$$

where $q = 1-p$ and $I_p(a,b) = 1 - I_{1-p}(b,a)$.

Normality Test Section

Normality Test Section of Height
Columns: Test Name, Test Value, Prob Level, 10% Critical Value, 5% Critical Value, Decision (5%)
Shapiro-Wilk W: Can't reject normality
Anderson-Darling: Can't reject normality
Martinez-Iglewicz: Can't reject normality
Kolmogorov-Smirnov: Can't reject normality
D'Agostino Skewness: Can't reject normality
D'Agostino Kurtosis: Can't reject normality
D'Agostino Omnibus: Can't reject normality

Normality Tests
This section displays the results of seven tests of the hypothesis that the data come from the normal distribution. The Shapiro-Wilk and Anderson-Darling tests are usually considered the best. The Kolmogorov-Smirnov test is included because of its historical popularity, but is bettered in almost every way by the other tests. Unfortunately, these tests have small statistical power (probability of detecting nonnormal data) unless the sample sizes are large, say over 100. Hence, if the decision is to reject, you can be reasonably certain that the data are not normal. However, if the decision is to accept, the situation is not as clear.

If you have a sample size of 100 or more, you can reasonably assume that the actual distribution is closely approximated by the normal distribution. If your sample size is less than 100, all you know is that there was not enough evidence in your data to reject the normality assumption. In other words, the data might be nonnormal, you just could not prove it. In this case, you must rely on the graphics and past experience to justify the normality assumption.

Shapiro-Wilk W Test
This test for normality has been found to be the most powerful test in most situations. It is the ratio of two estimates of the variance of a normal distribution based on a random sample of n observations. The numerator is proportional to the square of the best linear estimator of the standard deviation. The denominator is the sum of squares of the observations about the sample mean. The test statistic W may be written as the square of the Pearson correlation coefficient between the ordered observations and a set of weights which are used to calculate the numerator. Since these weights are asymptotically proportional to the corresponding expected normal order statistics, W is roughly a measure of the straightness of the normal quantile-quantile plot. Hence, the closer W is to one, the more normal the sample is.

The test was developed by Shapiro and Wilk (1965) for samples up to 20. NCSS uses the approximations suggested by Royston (1992) and Royston (1995) which allow unlimited sample sizes. Note that Royston only checked the results for sample sizes up to 5000, but indicated that he saw no reason larger sample sizes should not work. The probability values for W are valid for samples greater than 3. W may not be as powerful as other tests when ties occur in your data. The test is not calculated when a frequency variable is specified.

Anderson-Darling Test
This test, developed by Anderson and Darling (1954), is the most popular normality test that is based on EDF statistics. In some situations, it has been found to be as powerful as the Shapiro-Wilk test. The test is not calculated when a frequency variable is specified.

Martinez-Iglewicz
This test for normality, developed by Martinez and Iglewicz (1981), is based on the median and a robust estimator of dispersion. They have shown that this test is very powerful for heavy-tailed symmetric distributions as well as a variety of other situations. A value of the test statistic that is close to one indicates that the distribution is normal. This test is recommended for exploratory data analysis by Hoaglin (1983). The formula for this test is:

$$I = \frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}{(n-1)\,s_{bi}^2}$$

where $s_{bi}^2$ is a biweight estimator of scale.

Martinez-Iglewicz (10% Critical and 5% Critical)
The 10% and 5% critical values are given here. If the value of the test statistic is greater than this value, reject normality at that level of significance.

Martinez-Iglewicz Decision (5%)
This reports the outcome of this test at the 5% significance level.

Kolmogorov-Smirnov
This test for normality is based on the maximum difference between the observed distribution and the expected cumulative-normal distribution. Since it uses the sample mean and standard deviation to calculate the expected normal distribution, the Lilliefors adjustment is used. The smaller the maximum difference, the more likely that the distribution is normal. This test has been shown to be less powerful than the other tests in most situations. It is included because of its historical popularity.

Kolmogorov-Smirnov (10% Critical and 5% Critical)
The 10% and 5% critical values are given here. These are the Lilliefors adjusted values as given by Dallal (1986). If the value of the test statistic is greater than the critical value, normality is rejected at that level of significance.

Kolmogorov-Smirnov Decision (5%)
This reports the outcome of this test at the 5% significance level.

D'Agostino Skewness
D'Agostino (1990) describes a normality test based on the skewness coefficient, $\sqrt{b_1}$. Recall that because the normal distribution is symmetrical, $\sqrt{b_1}$ is equal to zero for normal data. Hence, a test can be developed to determine if the value of $\sqrt{b_1}$ is significantly different from zero. If it is, the data are obviously nonnormal. The statistic, $z_s$, is, under the null hypothesis of normality, approximately normally distributed. The computation of this statistic, which is restricted to sample sizes n > 8, is

$$z_s = d\,\ln\left(\frac{T}{a} + \sqrt{\left(\frac{T}{a}\right)^2 + 1}\right)$$

where

$$\sqrt{b_1} = \frac{m_3}{m_2^{3/2}}$$

$$T = \sqrt{b_1}\,\sqrt{\frac{(n+1)(n+3)}{6(n-2)}}$$

$$C = \frac{3\left(n^2 + 27n - 70\right)(n+1)(n+3)}{(n-2)(n+5)(n+7)(n+9)}$$

$$W^2 = -1 + \sqrt{2(C-1)}$$

$$a = \sqrt{\frac{2}{W^2 - 1}}$$

$$d = \frac{1}{\sqrt{\ln(W)}}$$

Skewness Test (Prob Level)
This is the two-tail significance level for this test. Reject the null hypothesis of normality if this value is less than a pre-determined value, say 0.05.

Skewness Test Decision (5%)
This reports the outcome of this test at the 5% significance level.

D'Agostino Kurtosis
D'Agostino (1990) describes a normality test based on the kurtosis coefficient, $b_2$. Recall that for the normal distribution, the theoretical value of $b_2$ is 3. Hence, a test can be developed to determine if the value of $b_2$ is significantly different from 3. If it is, the data are obviously nonnormal. The statistic, $z_k$, is, under the null hypothesis of normality, approximately normally distributed for sample sizes n > 20. The calculation of this test proceeds as follows:

$$z_k = \frac{\left(1 - \dfrac{2}{9A}\right) - \left[\dfrac{1 - \dfrac{2}{A}}{1 + G\sqrt{\dfrac{2}{A-4}}}\right]^{1/3}}{\sqrt{\dfrac{2}{9A}}}$$

where

$$b_2 = \frac{m_4}{m_2^2}$$

$$G = \frac{b_2 - \dfrac{3(n-1)}{n+1}}{\sqrt{\dfrac{24n(n-2)(n-3)}{(n+1)^2(n+3)(n+5)}}}$$

$$E = \frac{6\left(n^2 - 5n + 2\right)}{(n+7)(n+9)}\sqrt{\frac{6(n+3)(n+5)}{n(n-2)(n-3)}}$$

$$A = 6 + \frac{8}{E}\left[\frac{2}{E} + \sqrt{1 + \frac{4}{E^2}}\right]$$

Prob Level of Kurtosis Test
This is the two-tail significance level for this test. Reject the null hypothesis of normality if this value is less than a pre-determined value, say 0.05.

Decision of Kurtosis Test
This reports the outcome of this test at the 5% significance level.

D'Agostino Omnibus
D'Agostino (1990) describes a normality test that combines the tests for skewness and kurtosis. The statistic, $K^2$, is approximately distributed as a chi-square with two degrees of freedom. After calculating $z_s$ and $z_k$, calculate $K^2$ as follows:

$$K^2 = z_s^2 + z_k^2$$

Prob Level D'Agostino Omnibus
This is the significance level for this test. Reject the null hypothesis of normality if this value is less than a pre-determined value, say 0.05.

Decision of D'Agostino Omnibus Test
This reports the outcome of this test at the 5% significance level.
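The three D'Agostino statistics can be computed directly from the sample moments. The sketch below follows the D'Agostino (1990) transformations as they are commonly stated; the function names are hypothetical, the moments use divisor n, and the probability levels assume the normal and chi-square (two degrees of freedom) approximations described above. It is an illustration rather than the NCSS implementation.

```python
import math

def _central_moment(data, k):
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n

def dagostino_tests(data):
    """Return z_s, z_k, K2 and approximate probability levels (intended for n > 20)."""
    n = len(data)
    m2, m3, m4 = (_central_moment(data, k) for k in (2, 3, 4))

    # Skewness test
    sqrt_b1 = m3 / m2 ** 1.5
    T = sqrt_b1 * math.sqrt((n + 1) * (n + 3) / (6 * (n - 2)))
    C = 3 * (n * n + 27 * n - 70) * (n + 1) * (n + 3) / (
        (n - 2) * (n + 5) * (n + 7) * (n + 9))
    W2 = -1 + math.sqrt(2 * (C - 1))
    a = math.sqrt(2 / (W2 - 1))
    d = 1 / math.sqrt(0.5 * math.log(W2))   # ln(W) = 0.5 * ln(W^2)
    z_s = d * math.asinh(T / a)             # asinh(u) = ln(u + sqrt(u^2 + 1))

    # Kurtosis test
    b2 = m4 / m2 ** 2
    G = (b2 - 3 * (n - 1) / (n + 1)) / math.sqrt(
        24 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5)))
    E = 6 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9)) * math.sqrt(
        6 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3)))
    A = 6 + 8 / E * (2 / E + math.sqrt(1 + 4 / E ** 2))
    inner = (1 - 2 / A) / (1 + G * math.sqrt(2 / (A - 4)))
    cube_root = math.copysign(abs(inner) ** (1 / 3), inner)  # real cube root
    z_k = ((1 - 2 / (9 * A)) - cube_root) / math.sqrt(2 / (9 * A))

    # Omnibus test
    K2 = z_s ** 2 + z_k ** 2
    return {
        "z_s": z_s,
        "z_k": z_k,
        "K2": K2,
        "p_skewness": math.erfc(abs(z_s) / math.sqrt(2)),  # two-tail normal
        "p_kurtosis": math.erfc(abs(z_k) / math.sqrt(2)),  # two-tail normal
        "p_omnibus": math.exp(-K2 / 2),                    # chi-square, 2 df
    }
```

For a perfectly symmetric sample the skewness statistic is zero, so the omnibus statistic reduces to the squared kurtosis statistic.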

Histogram Plot
The following plot shows a histogram of the data.

Histogram
The histogram is a traditional way of displaying the shape of a group of data. It is constructed from a frequency distribution, where choices on the number of bins and bin width have been made. These choices can drastically affect the shape of the histogram. The ideal shape to look for in the case of normality is a bell-shaped distribution.

Normal Probability Plot
This is a plot of the inverse of the standard normal cumulative versus the ordered observations. If the underlying distribution of the data is normal, the points will fall along a straight line. Deviations from this line correspond to various types of nonnormality. Stragglers at either end of the normal probability plot indicate outliers. Curvature at both ends of the plot indicates long or short distribution tails. Convex, or concave, curvature indicates a lack of symmetry. Gaps, plateaus, or segmentation in the plot indicate certain phenomena that need closer scrutiny.

Confidence bands serve as a visual reference for departures from normality. If any of the observations fall outside the confidence bands, the data are not normal. The numerical normality tests will usually confirm this fact statistically. If only one observation falls outside the confidence limits, it may be an outlier. Note that these confidence bands are based on large sample formulas. They may not be accurate for small samples (less than 30).
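The coordinates of a normal probability plot can be sketched as follows. The plotting positions (i - 0.375)/(n + 0.25) are an assumption made for this illustration; the text does not state which positions are used, and the function name `normal_probability_points` is hypothetical.

```python
from statistics import NormalDist

def normal_probability_points(data):
    """Pair each ordered observation with its expected standard normal quantile."""
    x = sorted(data)
    n = len(x)
    std_normal = NormalDist()   # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(x, start=1):
        position = (i - 0.375) / (n + 0.25)   # assumed plotting position
        points.append((std_normal.inv_cdf(position), value))
    return points
```

If the data are close to normal, plotting the second element of each pair against the first gives points that lie near a straight line.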

Percentile Section

Percentile Section of Height
Columns: Percentile, Value, 95% LCL, 95% UCL, Exact Conf. Level
Percentile Formula: Ave X(p[n+1])

This section gives a larger set of percentiles than was included in the Quartile Section. Use it when you need a less common percentile.

Percentile
This is the percentage amount that you want the percentile of.

Value
This gives the value of the p-th percentile. Note that the percentile method used is listed at the bottom of the report.

95% LCL and 95% UCL
These give an exact, $100(1-\alpha)$% confidence interval for the population percentile. This confidence interval does not assume normality. Instead, it only assumes a random sample of n items from a continuous distribution. The interval is based on the equation:

$$1 - \alpha = I_p(r,\, n-r+1) - I_p(n-r+1,\, r)$$

Here $I_p(a,b)$ is the integral of the incomplete beta function:

$$I_q(n-r+1,\, r) = \sum_{k=0}^{r-1}\binom{n}{k}p^k(1-p)^{n-k}$$

where $q = 1-p$ and $I_p(a,b) = 1 - I_{1-p}(b,a)$.

Exact Conf. Level
Because of the discrete nature of the confidence interval constructed above, NCSS finds an interval whose alpha level is less than the specified alpha level. This column gives the actual confidence coefficient of the interval.
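A distribution-free interval of this kind can be sketched with the binomial sum that the incomplete beta function represents. The rank-selection rule below (at most alpha/2 in each tail) is an assumption chosen for illustration and may differ from the rule NCSS applies; the function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(B <= k) for B ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(min(k, n) + 1))

def percentile_ci(data, p, alpha=0.05):
    """Order-statistic interval (X_(r), X_(s)) for the 100p-th percentile.

    Returns (lower, upper, exact_confidence_level).
    """
    x = sorted(data)
    n = len(x)
    # Largest r with P(B <= r - 1) <= alpha / 2.
    r = 0
    while r < n and binom_cdf(r, n, p) <= alpha / 2:
        r += 1
    # Smallest s with P(B >= s) <= alpha / 2.
    s = n + 1
    while s > 1 and 1 - binom_cdf(s - 2, n, p) <= alpha / 2:
        s -= 1
    if r == 0 or s == n + 1:
        raise ValueError("sample too small for this percentile and alpha")
    coverage = binom_cdf(s - 1, n, p) - binom_cdf(r - 1, n, p)
    return x[r - 1], x[s - 1], coverage
```

Because the ranks are integers, the achieved coverage is usually somewhat larger than the nominal level, which is the quantity the Exact Conf. Level column reports.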

Stem-and-Leaf Plot Section

Stem-and-Leaf Plot Section of Height
Columns: Depth, Stem, Leaves
Unit = 1   Example: 1 |2 Represents 12

The stem-leaf plot is a type of histogram which retains much of the identity of the original data. It is useful for finding data-entry errors as well as for studying the distribution of a variable.

Depth
This is the cumulative number of leaves, counting in from the nearest end.

Stem
The stem is the first digit of the actual number. For example, the stem of the number 53 is 5. This is modified appropriately if the batch contains numbers of different orders of magnitude. The largest order of magnitude is used in determining the stem. Depending upon the number of leaves, a stem may be divided into two or more sub-stems. A special set of symbols is then used to mark the stems. The star (*) represents numbers in the range of zero to four, while the period (.) represents numbers in the range of five to nine.

Leaf
The leaf is the second digit of the actual number. For example, the leaf of the number 53 is 3. This is modified appropriately if the batch contains numbers of different orders of magnitude. The largest order of magnitude is used in determining the leaf.

Unit
This line at the bottom indicates how the data were scaled to make the plot.
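A bare-bones version of the display can be sketched as follows. It is a simplified illustration for nonnegative values only: it uses one row per stem (no sub-stems), takes the leaf as the units digit after dividing by the unit, and computes depth as the smaller of the two cumulative counts. The function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values, unit=1):
    """Return rows of (depth, stem, leaves) for nonnegative values."""
    scaled = sorted(int(v // unit) for v in values)
    by_stem = defaultdict(list)
    for v in scaled:
        by_stem[v // 10].append(v % 10)   # stem = leading digits, leaf = last digit
    total = len(scaled)
    rows = []
    seen = 0
    for stem in range(min(by_stem), max(by_stem) + 1):
        leaves = by_stem.get(stem, [])
        seen += len(leaves)
        # Cumulative count from the nearer end of the batch.
        depth = min(seen, total - seen + len(leaves))
        rows.append((depth, stem, "".join(str(leaf) for leaf in leaves)))
    return rows
```

Printing each row as depth, stem, and leaves reproduces the three-column layout of the report.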


More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Chapter 6. Sampling and Estimation

Chapter 6. Sampling and Estimation Samplig ad Estimatio - 34 Chapter 6. Samplig ad Estimatio 6.. Itroductio Frequetly the egieer is uable to completely characterize the etire populatio. She/he must be satisfied with examiig some subset

More information

Probability and statistics: basic terms

Probability and statistics: basic terms Probability ad statistics: basic terms M. Veeraraghava August 203 A radom variable is a rule that assigs a umerical value to each possible outcome of a experimet. Outcomes of a experimet form the sample

More information

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y. Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed

More information

AP Statistics Review Ch. 8

AP Statistics Review Ch. 8 AP Statistics Review Ch. 8 Name 1. Each figure below displays the samplig distributio of a statistic used to estimate a parameter. The true value of the populatio parameter is marked o each samplig distributio.

More information

ANALYSIS OF EXPERIMENTAL ERRORS

ANALYSIS OF EXPERIMENTAL ERRORS ANALYSIS OF EXPERIMENTAL ERRORS All physical measuremets ecoutered i the verificatio of physics theories ad cocepts are subject to ucertaities that deped o the measurig istrumets used ad the coditios uder

More information

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight)

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight) Tests of Hypotheses Based o a Sigle Sample Devore Chapter Eight MATH-252-01: Probability ad Statistics II Sprig 2018 Cotets 1 Hypothesis Tests illustrated with z-tests 1 1.1 Overview of Hypothesis Testig..........

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notatio Math 113 - Itroductio to Applied Statistics Name : Use Word or WordPerfect to recreate the followig documets. Each article is worth 10 poits ad ca be prited ad give to the istructor

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Computing Confidence Intervals for Sample Data

Computing Confidence Intervals for Sample Data Computig Cofidece Itervals for Sample Data Topics Use of Statistics Sources of errors Accuracy, precisio, resolutio A mathematical model of errors Cofidece itervals For meas For variaces For proportios

More information

Activity 3: Length Measurements with the Four-Sided Meter Stick

Activity 3: Length Measurements with the Four-Sided Meter Stick Activity 3: Legth Measuremets with the Four-Sided Meter Stick OBJECTIVE: The purpose of this experimet is to study errors ad the propagatio of errors whe experimetal data derived usig a four-sided meter

More information

Error & Uncertainty. Error. More on errors. Uncertainty. Page # The error is the difference between a TRUE value, x, and a MEASURED value, x i :

Error & Uncertainty. Error. More on errors. Uncertainty. Page # The error is the difference between a TRUE value, x, and a MEASURED value, x i : Error Error & Ucertaity The error is the differece betwee a TRUE value,, ad a MEASURED value, i : E = i There is o error-free measuremet. The sigificace of a measuremet caot be judged uless the associate

More information

Goodness-Of-Fit For The Generalized Exponential Distribution. Abstract

Goodness-Of-Fit For The Generalized Exponential Distribution. Abstract Goodess-Of-Fit For The Geeralized Expoetial Distributio By Amal S. Hassa stitute of Statistical Studies & Research Cairo Uiversity Abstract Recetly a ew distributio called geeralized expoetial or expoetiated

More information

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more

More information

6 Sample Size Calculations

6 Sample Size Calculations 6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig

More information

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?

More information

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet

More information

Access to the published version may require journal subscription. Published with permission from: Elsevier.

Access to the published version may require journal subscription. Published with permission from: Elsevier. This is a author produced versio of a paper published i Statistics ad Probability Letters. This paper has bee peer-reviewed, it does ot iclude the joural pagiatio. Citatio for the published paper: Forkma,

More information

1 Review of Probability & Statistics

1 Review of Probability & Statistics 1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5

More information

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9 BIOS 4110: Itroductio to Biostatistics Brehey Lab #9 The Cetral Limit Theorem is very importat i the realm of statistics, ad today's lab will explore the applicatio of it i both categorical ad cotiuous

More information

Chapter 10: Power Series

Chapter 10: Power Series Chapter : Power Series 57 Chapter Overview: Power Series The reaso series are part of a Calculus course is that there are fuctios which caot be itegrated. All power series, though, ca be itegrated because

More information

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008 Chapter 6 Part 5 Cofidece Itervals t distributio chi square distributio October 23, 2008 The will be o help sessio o Moday, October 27. Goal: To clearly uderstad the lik betwee probability ad cofidece

More information

Measures of Variation

Measures of Variation Chapter : Measures of Variatio from Statistical Aalysis i the Behavioral Scieces by James Raymodo Secod Editio 97814669676 01 Copyright Property of Kedall Hut Publishig CHAPTER Measures of Variatio Key

More information

32 estimating the cumulative distribution function

32 estimating the cumulative distribution function 32 estimatig the cumulative distributio fuctio 4.6 types of cofidece itervals/bads Let F be a class of distributio fuctios F ad let θ be some quatity of iterest, such as the mea of F or the whole fuctio

More information

Lecture 24 Floods and flood frequency

Lecture 24 Floods and flood frequency Lecture 4 Floods ad flood frequecy Oe of the thigs we wat to kow most about rivers is what s the probability that a flood of size will happe this year? I 100 years? There are two ways to do this empirically,

More information

Power and Type II Error

Power and Type II Error Statistical Methods I (EXST 7005) Page 57 Power ad Type II Error Sice we do't actually kow the value of the true mea (or we would't be hypothesizig somethig else), we caot kow i practice the type II error

More information

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio

More information

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS Lecture 5: Parametric Hypothesis Testig: Comparig Meas GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review from last week What is a cofidece iterval? 2 Review from last week What is a cofidece

More information

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

WHAT IS THE PROBABILITY FUNCTION FOR LARGE TSUNAMI WAVES? ABSTRACT

WHAT IS THE PROBABILITY FUNCTION FOR LARGE TSUNAMI WAVES? ABSTRACT WHAT IS THE PROBABILITY FUNCTION FOR LARGE TSUNAMI WAVES? Harold G. Loomis Hoolulu, HI ABSTRACT Most coastal locatios have few if ay records of tsuami wave heights obtaied over various time periods. Still

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Statistical Fundamentals and Control Charts

Statistical Fundamentals and Control Charts Statistical Fudametals ad Cotrol Charts 1. Statistical Process Cotrol Basics Chace causes of variatio uavoidable causes of variatios Assigable causes of variatio large variatios related to machies, materials,

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics 8.2 Testig a Proportio Math 1 Itroductory Statistics Professor B. Abrego Lecture 15 Sectios 8.2 People ofte make decisios with data by comparig the results from a sample to some predetermied stadard. These

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

Chapter 8: Estimating with Confidence

Chapter 8: Estimating with Confidence Chapter 8: Estimatig with Cofidece Sectio 8.2 The Practice of Statistics, 4 th editio For AP* STARNES, YATES, MOORE Chapter 8 Estimatig with Cofidece 8.1 Cofidece Itervals: The Basics 8.2 8.3 Estimatig

More information

11 Correlation and Regression

11 Correlation and Regression 11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record

More information

Section 9.2. Tests About a Population Proportion 12/17/2014. Carrying Out a Significance Test H A N T. Parameters & Hypothesis

Section 9.2. Tests About a Population Proportion 12/17/2014. Carrying Out a Significance Test H A N T. Parameters & Hypothesis Sectio 9.2 Tests About a Populatio Proportio P H A N T O M S Parameters Hypothesis Assess Coditios Name the Test Test Statistic (Calculate) Obtai P value Make a decisio State coclusio Sectio 9.2 Tests

More information

BUSINESS STATISTICS (PART-9) AVERAGE OR MEASURES OF CENTRAL TENDENCY: THE GEOMETRIC AND HARMONIC MEANS

BUSINESS STATISTICS (PART-9) AVERAGE OR MEASURES OF CENTRAL TENDENCY: THE GEOMETRIC AND HARMONIC MEANS BUSINESS STATISTICS (PART-9) AVERAGE OR MEASURES OF CENTRAL TENDENCY: THE GEOMETRIC AND HARMONIC MEANS. INTRODUCTION We have so far discussed three measures of cetral tedecy, viz. The Arithmetic Mea, Media

More information

Circle the single best answer for each multiple choice question. Your choice should be made clearly.

Circle the single best answer for each multiple choice question. Your choice should be made clearly. TEST #1 STA 4853 March 6, 2017 Name: Please read the followig directios. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directios This exam is closed book ad closed otes. There are 32 multiple choice questios.

More information