DESCRIPTIVE STATISTICS
- Lorraine Atkins
- 5 years ago
REVIEW OF KEY CONCEPTS

SECTION 2.1 Measures of Location

2.1.1 Arithmetic Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

Consider the data in Table 2.1. They represent serum-cholesterol levels from a group of hospital workers who were regularly eating a standard U.S. diet and who agreed to change their diet to a vegetarian diet for a 6-week period. Cholesterol levels were measured before and after adopting the diet. The mean serum-cholesterol level before adopting the diet is computed as follows:

$$\bar{x} = \frac{\sum_{i=1}^{24} x_i}{24} = 187.8 \text{ mg/dl}$$

Advantages
1. It is representative of all the points.
2. If the underlying distribution is Gaussian (bell shaped), then it is the most efficient estimator of the middle of the distribution.
3. Many statistical tests are based on the arithmetic mean.

Disadvantages
1. It is sensitive to outliers, particularly in small samples; e.g., if one of the cholesterol values were 800 rather than 200, then the mean would be increased by 25 mg/dl.
2. It is inappropriate if the underlying distribution is far from being Gaussian; for example, serum triglycerides have a distribution that looks highly skewed (i.e., asymmetric).
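The outlier sensitivity noted above can be checked with a minimal Python sketch. The six values below are hypothetical (they are not the Table 2.1 data), and the helper name `mean_shift` is an assumption made for illustration only.

```python
from statistics import mean

# Hypothetical serum-cholesterol values in mg/dl (NOT the Table 2.1 data).
levels = [160, 175, 180, 190, 200, 210]


def mean_shift(values, old, new):
    """Return the change in the arithmetic mean when one `old` value becomes `new`."""
    replaced = list(values)
    replaced[replaced.index(old)] = new  # replace the first occurrence only
    return mean(replaced) - mean(values)


# Replacing 200 by 800 moves the mean by (800 - 200) / n.
delta = mean_shift(levels, 200, 800)
print(f"mean = {mean(levels):.1f} mg/dl, shift caused by the outlier = {delta:.1f} mg/dl")
```

With n = 24, as in the cholesterol example, the same replacement would shift the mean by 600/24 = 25 mg/dl; the smaller the sample, the larger the effect of a single outlier.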
STUDY GUIDE/FUNDAMENTALS OF BIOSTATISTICS

Table 2.1 Serum-cholesterol levels before and after adopting a vegetarian diet (mg/dl)
[Table columns: Subject, Before, After, Before − After; summary rows: Mean, sd]

2.1.2 Alternatives to the Arithmetic Mean-Median

One interesting property from the table is that the diet appears to work best in people with high baseline levels versus people with low baseline levels. How can we test if this is true? Divide the group in half, and look at cholesterol change in each half. To do this we must compute the median = 50% point in the distribution. Specifically, with the points counted from the smallest value upward,

$$\text{Median} = \begin{cases} \left(\dfrac{n+1}{2}\right)\text{th largest point} & \text{if } n \text{ is odd} \\[2ex] \text{average of } \left(\dfrac{n}{2}\right)\text{th} + \left(\dfrac{n}{2}+1\right)\text{th largest points} & \text{if } n \text{ is even} \end{cases}$$

For example,
if n = 7, then the median = 4th largest point
if n = 24, then the median = average of (12th + 13th) largest point

2.1.3 Stem-and-Leaf Plots

How can we easily compute the median? We would have to order the data to obtain the 12th and 13th largest points. An easier way is to compute a stem-and-leaf plot. Divide each data value into a leaf (the least-significant digit or digits) and a stem (the most-significant digit or digits) and collect all data points with the same stem on a single row. For example, the number 195 has a stem of 19 and a leaf of 5. A stem-and-leaf plot of the before measurements is given below.
CHAPTER 2/DESCRIPTIVE STATISTICS

[Stem-and-leaf plot of the before measurements; columns: Cumulative total, Stems, Leaves]

Median = 179 mg/dl

We have added a cumulative total column which gives the total number of points with a stem that is ≤ the stem in that row. It is easy to compute the median from the stem-and-leaf plot since the median = average of the 12th and 13th largest values = 179. Note that the leaves within a given row (stem) are not necessarily in order.

One use of stem-and-leaf plots is to provide a visual comparison of the values in different data sets. The stem-and-leaf plots of the change in cholesterol for the subgroups of people below and above the median are given as follows:

[Stem-and-leaf plots of the change scores for two subgroups: baseline ≤ 179 mg/dl and baseline ≥ 180 mg/dl]

The change scores in the subgroups look quite different; the subgroup with initial value above the median is showing more change. We will be able to test if the average change score is significantly different based on a t test (to be covered in Chapter 8 of the text).

2.1.4 Percentiles

We can also use stem-and-leaf plots to obtain percentiles of the distribution. To compute the pth percentile, if np/100 is an integer, then average the

$$\left(\frac{np}{100}\right)\text{th} + \left(\frac{np}{100} + 1\right)\text{th largest points}$$

Otherwise,

$$p\text{th percentile} = \left(\left[\frac{np}{100}\right] + 1\right)\text{th largest point, where } \left[\frac{np}{100}\right] = \text{largest integer} \le \frac{np}{100}.$$

For example, to compute the 10th percentile of the baseline cholesterol distribution, also known as the lower decile, we have n = 24, p = 10, np/100 = 2.4, [np/100] = 2, lower decile = 3rd largest point = 151 mg/dl.

To compute the 90th percentile (or upper decile), n = 24, p = 90, np/100 = 21.6, [np/100] = 21. Upper decile = 22nd largest point = 238 mg/dl.
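The stem-and-leaf construction and the percentile rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `stem_and_leaf` and `percentile` and the ten sample values are hypothetical, and the data are not the values from Table 2.1.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=10):
    """Return rows (stem, leaves, cumulative total), with stems in ascending order.

    The stem is value // leaf_unit and the leaf is value % leaf_unit.
    Leaves keep their input order, so they are not necessarily sorted.
    """
    rows = defaultdict(list)
    for v in values:
        rows[v // leaf_unit].append(v % leaf_unit)
    plot, cumulative = [], 0
    for stem in sorted(rows):
        cumulative += len(rows[stem])  # points with stem <= this stem
        plot.append((stem, rows[stem], cumulative))
    return plot


def percentile(values, p):
    """pth percentile by the rule in the text (points counted from the smallest)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    k, remainder = divmod(len(ordered) * p, 100)
    k = int(k)
    if remainder == 0:
        # np/100 is an integer: average the kth and (k+1)th points (1-based).
        return (ordered[k - 1] + ordered[k]) / 2
    # Otherwise take the ([np/100] + 1)th point, which is index k when 0-based.
    return ordered[k]


# Hypothetical cholesterol-like values in mg/dl.
sample = [152, 148, 171, 165, 190, 183, 177, 204, 199, 236]

for stem, leaves, total in stem_and_leaf(sample):
    print(f"{total:>3} | {stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")

print("median:", percentile(sample, 50))
print("lower decile:", percentile(sample, 10))
```

For a sample of 24 points, `percentile(data, 10)` picks the 3rd point and `percentile(data, 90)` the 22nd, matching the worked example in the text.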
Commonly used percentiles
10, 20, ..., 90% (deciles)
25, 50, 75% (quartiles)
20, 40, ..., 80% (quintiles)
33.3, 66.7% (tertiles)

Median

Advantages
1. Always guarantees that 50% of the data values are on either side of the median.
2. Insensitive to outliers (extreme values). If one of the cholesterol values increased from 200 to 800, the median would remain at 179 but the mean would increase by 25 mg/dl from its original 188 mg/dl.

Disadvantages
1. It is not as efficient an estimator of the middle as the mean if the distribution really is Gaussian, in that it is mostly sensitive to the middle of the distribution.
2. Most statistical procedures are based on the mean.

We can get an impression of how symmetric a distribution is by looking at the stem-and-leaf plot. If we look at the stem-and-leaf plot of the baseline values shown earlier, we see that the distribution is only slightly skewed, and the mean may be adequate.

2.1.5 Geometric Mean

One way to get around the disadvantages of the arithmetic mean is to transform the data onto a different scale to make the distribution more symmetric and compute the arithmetic mean on the new scale. The most popular such scale is the ln (natural log or $\log_e$) scale:

$$\ln(x_1), \ldots, \ln(x_n)$$

We can now take an average in the ln scale and denote it by $\overline{\ln x}$:

$$\overline{\ln x} = \frac{\ln(x_1) + \cdots + \ln(x_n)}{n}$$

The problem with this is that the average is in the ln scale rather than the original scale. Thus, we take the antilog of $\overline{\ln x}$ to obtain

$$GM = e^{\overline{\ln x}} = \text{geometric mean}$$

The ERG (electroretinogram) amplitude (µV) is a measure of electrical activity of the retina and is used to monitor retinal function in patients with retinitis pigmentosa, an often-blinding ocular condition. The following data were obtained from patients to monitor the course of the condition over a 1-year period.

[Table columns: Year 1 ERG amplitude (µV), Year 2 ERG amplitude (µV), Absolute change (µV)]
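The log, average, antilog recipe for the geometric mean can be sketched in Python as below. The amplitudes are hypothetical, strongly right-skewed values chosen for illustration; they are not the ERG data referred to above, and `geometric_mean` is simply a name chosen for this sketch.

```python
from math import exp, log


def geometric_mean(values):
    """Antilog of the mean of the natural logs; every value must be positive."""
    logs = [log(v) for v in values]
    return exp(sum(logs) / len(logs))


# Hypothetical right-skewed amplitudes in µV (NOT the ERG data from the text).
amplitudes = [1.0, 2.0, 4.0, 8.0, 64.0]

arithmetic = sum(amplitudes) / len(amplitudes)
geometric = geometric_mean(amplitudes)
print(f"arithmetic mean = {arithmetic:.2f} µV, geometric mean = {geometric:.2f} µV")
```

On skewed data such as these, the single large value pulls the arithmetic mean well above most of the observations, while the geometric mean stays closer to the bulk of the data.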
The distribution of values at each year is highly skewed, with change scores dominated by people with high year-1 ERG amplitudes. The distribution in the ln scale is much more symmetric. Let's compute the GM for year 1 and year 2.

$$\text{Year 1: } \overline{\ln x} = \frac{\ln(x_1) + \cdots + \ln(x_n)}{n}, \quad GM_1 = e^{\overline{\ln x}} \ \mu\text{V}$$

$$\text{Year 2: } \overline{\ln x} = \frac{\ln(x_1) + \cdots + \ln(x_n)}{n}, \quad GM_2 = e^{\overline{\ln x}} \ \mu\text{V}$$

We can quantify the % change by

$$\% \text{ decline} = 100\% \times \left(1 - \frac{GM_2}{GM_1}\right)$$

Thus, the ERG has declined, on average, by 3.% over 1 year.

Geometric Mean

Advantages
1. Useful for certain types of skewed distributions.
2. Standard statistical procedures can be used on the log scale.

Disadvantages
1. Not appropriate for symmetric data.
2. More sensitive to outliers than the median but less so than the mean.

SECTION 2.2 Measures of Spread

2.2.1 Range

The range = the interval from the smallest value to the largest value. This gives a quick feeling for the overall spread but is misleading because it is solely influenced by the most extreme values; e.g., cholesterol data initial readings; range = (137, 250).

2.2.2 Quasi-Range

A quasi-range is similar to the range but is derived after excluding a specified percentage of the sample at each end; e.g., the interval from the 10th percentile to the 90th percentile. For example, for the cholesterol data
10% point = 3rd largest from bottom = 151 mg/dl
90% point = 3rd largest from top = 238 mg/dl
quasi-range = (151, 238)
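As a rough illustration of the range and the 10% to 90% quasi-range, the sketch below reuses the percentile rule from Section 2.1.4. The data and the function names (`percentile`, `data_range`, `quasi_range`) are assumptions made for this example and are not taken from the study guide.

```python
def percentile(values, p):
    """pth percentile by the study-guide rule (points counted from the smallest)."""
    ordered = sorted(values)
    k, remainder = divmod(len(ordered) * p, 100)
    k = int(k)
    if remainder == 0:
        return (ordered[k - 1] + ordered[k]) / 2  # average of kth and (k+1)th points
    return ordered[k]  # the ([np/100] + 1)th point


def data_range(values):
    """Interval from the smallest value to the largest value."""
    return (min(values), max(values))


def quasi_range(values, tail_percent=10):
    """Interval from the tail_percent-th to the (100 - tail_percent)-th percentile."""
    return (percentile(values, tail_percent), percentile(values, 100 - tail_percent))


# Hypothetical cholesterol-like values in mg/dl (NOT the Table 2.1 data).
sample = [152, 148, 171, 165, 190, 183, 177, 204, 199, 236]

print("range:", data_range(sample))
print("quasi-range (10% to 90%):", quasi_range(sample))
```

The quasi-range ignores the most extreme points at each end, so a single very large or very small reading changes it far less than it changes the range.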
2.2.3 Standard Deviation, Variance

If the distribution is normal or near normal, then the standard deviation is more frequently used as a measure of spread.

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \text{sample variance}$$

$$s = \text{sample standard deviation} = \sqrt{\text{variance}}$$

Why s rather than $s^2$? We want an estimator of spread in the same units as $\bar{x}$; i.e., if units change by a factor of c, and the transformed data is referred to as y, then

$$y_i = c x_i, \quad \bar{y} = c\bar{x}, \quad s_y = c s_x \quad \text{but} \quad s_y^2 = c^2 s_x^2$$

Note that s changes by a factor of c (the same as $\bar{x}$), but $s^2$ changes by a factor of $c^2$. Thus, s and $\bar{x}$ can be directly related to each other while $s^2$ and $\bar{x}$ cannot.

How can we use $\bar{x}$ and s to get an impression of the spread of the distribution? If the distribution is normal, then
$\bar{x} \pm s$ comprises about 2/3 of the distribution
$\bar{x} \pm 2s$ (more precisely, 1.96s) comprises about 95% of the distribution
$\bar{x} \pm 2.5s$ (more precisely, 2.576s) comprises about 99% of the distribution

If the distribution is not normal or near normal, then the distribution is not well characterized by $\bar{x}$, s. It is better to use the percentiles in this case (e.g., the median could be used instead of the mean and the quasi-range instead of the standard deviation).

For example, for the cholesterol data, the variance and standard deviation of the before measurements are computed as follows:

$$s^2 = \frac{\sum_{i=1}^{24} (x_i - \bar{x})^2}{23}, \quad s = 33.2$$

Let's see how normal the distribution looks.

$\bar{x} \pm 1.96s = 187.8 \pm 1.96(33.2) = (122.8, 252.8)$ includes all points; it should include 95% (or 23 out of 24 points) under a normal distribution.

$\bar{x} \pm s = 187.8 \pm 33.2 = (154.6, 221.0)$ includes %. of points; it should be 2/3 under a normal distribution.

The normal distribution appears to provide a reasonable approximation.

Note that computer programs such as Excel can be used to compute many types of descriptive statistics. See the cd-rom for an example of using Excel to easily compute the mean and standard deviation.
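The scaling property of s and the informal normality check above can be tried out with Python's standard `statistics` module. This is only a sketch: the ten values are hypothetical (not the Table 2.1 data), and `fraction_within` is a helper name invented for this example.

```python
from statistics import mean, stdev, variance

# Hypothetical cholesterol-like values in mg/dl (NOT the Table 2.1 data).
x = [152, 148, 171, 165, 190, 183, 177, 204, 199, 236]

# Change of units by a factor c: y_i = c * x_i.
c = 10
y = [c * v for v in x]

# stdev and variance use the n - 1 denominator, like s and s^2 in the text.
print("s_x =", stdev(x), " s_y =", stdev(y))            # s_y = c * s_x
print("s_x^2 =", variance(x), " s_y^2 =", variance(y))  # s_y^2 = c^2 * s_x^2


def fraction_within(values, k):
    """Fraction of points inside the interval mean - k*s to mean + k*s."""
    m, s = mean(values), stdev(values)
    inside = sum(1 for v in values if m - k * s <= v <= m + k * s)
    return inside / len(values)


# Under a normal distribution these fractions should be near 2/3 and 95%.
print("within 1 s:", fraction_within(x, 1))
print("within 1.96 s:", fraction_within(x, 1.96))
```

With only ten points the observed fractions can only move in steps of 10%, so agreement with 2/3 and 95% is necessarily approximate.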
2.2.4 Coefficient of Variation (CV)

$$CV = 100\% \times \frac{s}{\bar{x}}$$

The CV is used if the variability is thought to be related to the mean. For the cholesterol data, with s = 33.2 and $\bar{x}$ = 187.8, CV = 100% × 33.2/187.8 ≈ 17.7%.

SECTION 2.3 Some Other Means for Describing Data

2.3.1 Frequency Distribution

This is a listing of each value and how frequently it occurs (or in addition, the % of scores associated with each value). This can be done either based on the original values, or in grouped form; e.g., if we group the cholesterol change scores by 10-mg increments, then we would have

[Table columns: change-score interval, Frequency, %]

This can be done either in numeric or graphic form. If in graphic form, it is often represented as a bar graph.

2.3.2 Box Plot

Another graphical technique for displaying data often used in computer packages is provided by a box plot. The box (rectangle) displays the upper and lower quartiles, the median, arithmetic mean, and outlying values (if any). It is a concise way to look at the symmetry and range of a distribution.

[Figure: box plot with labels Outlying value (O), Upper quartile, Arithmetic mean, Median, Lower quartile]
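The coefficient of variation and a grouped frequency distribution, as described in Sections 2.2.4 and 2.3.1, can be sketched in Python as follows. All data are hypothetical, and the function names `coefficient_of_variation` and `grouped_frequencies` are assumptions for illustration.

```python
from collections import Counter
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = 100% * s / mean; meaningful only for data with a positive mean."""
    return 100 * stdev(values) / mean(values)


def grouped_frequencies(values, width):
    """Return rows (lower, upper, frequency, percent) for intervals lower <= v < upper."""
    counts = Counter((v // width) * width for v in values)  # floor to the bin start
    n = len(values)
    return [(lo, lo + width, counts[lo], 100 * counts[lo] / n) for lo in sorted(counts)]


# Hypothetical baseline levels (mg/dl) and change scores (mg/dl).
levels = [152, 148, 171, 165, 190, 183, 177, 204, 199, 236]
changes = [-12, -3, 4, 8, 15, 15, 22, 31, 37, 48]

print(f"CV = {coefficient_of_variation(levels):.1f}%")
for lo, hi, freq, pct in grouped_frequencies(changes, 10):
    print(f">= {lo:>4}, < {hi:>4}   frequency {freq}   {pct:.0f}%")
```

Because s and the mean both scale by the same factor under a change of units, the CV is unit-free, which is the property used later in Problem 2.15.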
PROBLEMS

Suppose the origin for a data set is changed by adding a constant to each observation.
2.1 What is the effect on the median?
2.2 What is the effect on the mode?
2.3 What is the effect on the geometric mean?
2.4 What is the effect on the range?

Renal Disease
For a study of kidney disease, the following measurements were made on a sample of women working in several factories in Switzerland. They represent concentrations of bacteria in a standard-size urine specimen. High concentrations of these bacteria may indicate possible kidney pathology. The data are presented in Table 2.2.

Table 2.2 Concentration of bacteria in the urine in a sample of female factory workers in Switzerland
[Table columns: Concentration, Frequency]

2.5 Compute the arithmetic mean for this sample.
2.6 Compute the geometric mean for this sample.
2.7 Which do you think is a more appropriate measure of location?

Cardiovascular Disease
The mortality rates from heart disease (per 100,000 population) for each of the 50 states and the District of Columbia in 1973 are given in descending order in Table 2.3 [1]. Consider this data set as a sample of size 51 $(x_1, x_2, \ldots, x_{51})$. If $\sum_{i=1}^{51} x_i = 17{,}409$ and $\sum_{i=1}^{51} (x_i - \bar{x})^2$ is as given, then do the following:

Table 2.3 Mortality rates from heart disease (per 100,000 population) for the 50 states and the District of Columbia in 1973
West Virginia Wisconsin DC 37. Pennsylvania Vermont South Carolina Maine Nebraska Montana Missouri 4.9 Tennessee Maryland Illinois 40.8 New Hampshire Georgia Florida Indiana Virginia 3. 7 Rhode Island North Dakota California Kentucky Delaware Wyoming New York Mississippi Texas Iowa Louisiana Idaho 97.4 Arkansas Connecticut Colorado 74.6 New Jersey Oregon Arizona Massachusetts Washington Nevada Kansas Minnesota Utah 4. 5 Oklahoma Michigan New Mexico Ohio Alabama Hawaii South Dakota North Carolina Alaska 83.9
2.8 Compute the arithmetic mean of this sample.
2.9 Compute the median of this sample.
2.10 Compute the standard deviation of this sample.
2.11 The national mortality rate for heart disease in 1973 was per 100,000. Why does this figure not correspond to your answer for Problem 2.8?
2.12 Does the differential in raw rates between Florida (47.4) and Georgia (3.8) actually imply that the risk of dying from heart disease is greater in Florida than in Georgia? Why or why not?

Nutrition
Table 2.4 shows the distribution of dietary vitamin-A intake as reported by 14 students who filled out a dietary questionnaire in class. The total intake is a combination of intake from individual food items and from vitamin pills. The units are in IU/100 (International Units/100).

Table 2.4 Distribution of dietary vitamin-A intake as reported by 14 students
[Table columns: Student number, Intake (IU/100)]

2.13 Compute the mean and median from these data.
2.14 Compute the standard deviation and coefficient of variation from these data.
2.15 Suppose the data are expressed in IU rather than IU/100. What are the mean, standard deviation, and coefficient of variation in the new units?
2.16 Construct a stem-and-leaf plot of the data on some convenient scale.
2.17 Do you think the mean or median is a more appropriate measure of location for this data set?

SOLUTIONS

2.1 Each data value is changed from $x_i$ to $x_i + a$, for some constant a. The median also increases by a.
2.2 The mode increases by a.
2.3 The geometric mean is changed by an undetermined amount, because the geometric mean is given by $\text{antilog}\left[\sum_{i=1}^{n} \ln(x_i + a)/n\right]$ and there is no simple relationship between $\ln(x_i + a)$ and $\ln(x_i)$.
2.4 The range is not changed, since it is the distance between the largest and smallest values, and distances between points will not be changed by shifting the origin.
2.5 The arithmetic mean is given by the frequency-weighted sum of the concentrations in Table 2.2 divided by the 77 data points.
2.6 To compute the geometric mean, we first compute the mean log to the base 10, i.e., the frequency-weighted sum of the $\log_{10}$ concentrations divided by 77. The geometric mean is then given by 10 raised to this mean log.
2.7 The geometric mean is more appropriate because the distribution is in powers of 10 and is very skewed.
In the log scale, the distribution becomes less skewed, and the mean provides a more central measure of location. Notice that only 33 of the 77 data points are greater than the arithmetic mean, while 46 of the 77 points are greater than the geometric mean.
2.8 We have that $\bar{x} = 17{,}409/51 = 341.4$ per 100,000.
2.9 Since n = 51 is odd, the median is given by the $\left(\frac{51 + 1}{2}\right)$th or 26th largest value = mortality rate for Mississippi = 35.6 per 100,000.
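The answers to Problems 2.1 to 2.4 on shifting the origin can be checked numerically. The sketch below uses a small hypothetical data set and a hypothetical constant a = 3; `geometric_mean` and `spread` are names chosen for this illustration.

```python
from math import exp, log
from statistics import median, mode


def geometric_mean(values):
    """Antilog of the mean natural log (values must be positive)."""
    return exp(sum(log(v) for v in values) / len(values))


def spread(values):
    """Distance between the largest and smallest values."""
    return max(values) - min(values)


# Hypothetical data and shift constant.
x = [1, 2, 2, 4, 8]
a = 3
shifted = [v + a for v in x]

print("median:", median(x), "->", median(shifted))        # increases by a
print("mode:", mode(x), "->", mode(shifted))              # increases by a
print("range:", spread(x), "->", spread(shifted))         # unchanged
print("GM:", geometric_mean(x), "->", geometric_mean(shifted))  # no simple relation
```

In this example the geometric mean of the shifted data differs from the original geometric mean plus a, which illustrates why Solution 2.3 describes the change as undetermined.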
2.10 We have that

$$s^2 = \frac{\sum_{i=1}^{51} (x_i - \bar{x})^2}{50}$$

Thus, $s = \sqrt{s^2}$, expressed per 100,000.
2.11 The national mortality rate is a weighted average of the state-specific mortality rates, where the weights are the number of people in each state. The arithmetic mean in Problem 2.8 is an unweighted average of the state-specific mortality rates that weights the large and small states equally.
2.12 No. The demographic characteristics of the residents of Florida may be very different from those of Georgia, which would account for the difference in the rates. In particular, Florida has a large retiree population, which would lead to higher mortality rates. In order to make an accurate comparison between the states, we would, at a minimum, need to compare disease rates among specific age-sex-race groups in the two states.
2.13 $\bar{x} = \sum_{i=1}^{14} x_i / 14$. Median = average of the 7th and 8th largest values.
2.14 We have that

$$s^2 = \frac{\sum_{i=1}^{14} (x_i - \bar{x})^2}{13}, \quad s = \sqrt{s^2}, \quad CV = 100\% \times \frac{s}{\bar{x}} = 67.9\%$$

2.15 The mean and s in IU are each 100 times their values in IU/100; CV = 67.9% (unchanged).
2.16 We will round each number to the nearest integer in constructing the stem-and-leaf plot:
[Stem-and-leaf plot of the rounded vitamin-A intakes]
2.17 The median is more appropriate, because the distribution appears to be skewed to the right.

REFERENCE

[1] National Center for Health Statistics. (1975, February ), Monthly vital statistics report, summary report, final mortality statistics (1973), 3() (Suppl. ).
Chapter 2 Descriptive Statistics
Chapter 2 Descriptive Statistics Statistics Most commoly, statistics refers to umerical data. Statistics may also refer to the process of collectig, orgaizig, presetig, aalyzig ad iterpretig umerical data
More informationData Description. Measure of Central Tendency. Data Description. Chapter x i
Data Descriptio Describe Distributio with Numbers Example: Birth weights (i lb) of 5 babies bor from two groups of wome uder differet care programs. Group : 7, 6, 8, 7, 7 Group : 3, 4, 8, 9, Chapter 3
More informationMedian and IQR The median is the value which divides the ordered data values in half.
STA 666 Fall 2007 Web-based Course Notes 4: Describig Distributios Numerically Numerical summaries for quatitative variables media ad iterquartile rage (IQR) 5-umber summary mea ad stadard deviatio Media
More informationCHAPTER 2. Mean This is the usual arithmetic mean or average and is equal to the sum of the measurements divided by number of measurements.
CHAPTER 2 umerical Measures Graphical method may ot always be sufficiet for describig data. You ca use the data to calculate a set of umbers that will covey a good metal picture of the frequecy distributio.
More informationChapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers
Chapter 4 4-1 orth Seattle Commuity College BUS10 Busiess Statistics Chapter 4 Descriptive Statistics Summary Defiitios Cetral tedecy: The extet to which the data values group aroud a cetral value. Variatio:
More informationExample: Find the SD of the set {x j } = {2, 4, 5, 8, 5, 11, 7}.
1 (*) If a lot of the data is far from the mea, the may of the (x j x) 2 terms will be quite large, so the mea of these terms will be large ad the SD of the data will be large. (*) I particular, outliers
More informationMEASURES OF DISPERSION (VARIABILITY)
POLI 300 Hadout #7 N. R. Miller MEASURES OF DISPERSION (VARIABILITY) While measures of cetral tedecy idicate what value of a variable is (i oe sese or other, e.g., mode, media, mea), average or cetral
More informationElementary Statistics
Elemetary Statistics M. Ghamsary, Ph.D. Sprig 004 Chap 0 Descriptive Statistics Raw Data: Whe data are collected i origial form, they are called raw data. The followig are the scores o the first test of
More informationInferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process.
Iferetial Statistics ad Probability a Holistic Approach Iferece Process Chapter 8 Poit Estimatio ad Cofidece Itervals This Course Material by Maurice Geraghty is licesed uder a Creative Commos Attributio-ShareAlike
More informationStatistics 511 Additional Materials
Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability
More informationEconomics 250 Assignment 1 Suggested Answers. 1. We have the following data set on the lengths (in minutes) of a sample of long-distance phone calls
Ecoomics 250 Assigmet 1 Suggested Aswers 1. We have the followig data set o the legths (i miutes) of a sample of log-distace phoe calls 1 20 10 20 13 23 3 7 18 7 4 5 15 7 29 10 18 10 10 23 4 12 8 6 (1)
More information1 Lesson 6: Measure of Variation
1 Lesso 6: Measure of Variatio 1.1 The rage As we have see, there are several viable coteders for the best measure of the cetral tedecy of data. The mea, the mode ad the media each have certai advatages
More informationENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Solutions Descriptive Statistics. None at all!
ENGI 44 Probability ad Statistics Faculty of Egieerig ad Applied Sciece Problem Set Solutios Descriptive Statistics. If, i the set of values {,, 3, 4, 5, 6, 7 } a error causes the value 5 to be replaced
More informationACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Statistics
ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER 1 018/019 DR. ANTHONY BROWN 8. Statistics 8.1. Measures of Cetre: Mea, Media ad Mode. If we have a series of umbers the
More informationNumber of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day
LECTURE # 8 Mea Deviatio, Stadard Deviatio ad Variace & Coefficiet of variatio Mea Deviatio Stadard Deviatio ad Variace Coefficiet of variatio First, we will discuss it for the case of raw data, ad the
More informationLecture 1. Statistics: A science of information. Population: The population is the collection of all subjects we re interested in studying.
Lecture Mai Topics: Defiitios: Statistics, Populatio, Sample, Radom Sample, Statistical Iferece Type of Data Scales of Measuremet Describig Data with Numbers Describig Data Graphically. Defiitios. Example
More informationAnalysis of Experimental Data
Aalysis of Experimetal Data 6544597.0479 ± 0.000005 g Quatitative Ucertaity Accuracy vs. Precisio Whe we make a measuremet i the laboratory, we eed to kow how good it is. We wat our measuremets to be both
More informationENGI 4421 Confidence Intervals (Two Samples) Page 12-01
ENGI 44 Cofidece Itervals (Two Samples) Page -0 Two Sample Cofidece Iterval for a Differece i Populatio Meas [Navidi sectios 5.4-5.7; Devore chapter 9] From the cetral limit theorem, we kow that, for sufficietly
More informationParameter, Statistic and Random Samples
Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,
More informationProperties and Hypothesis Testing
Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.
More informationChapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008
Chapter 6 Part 5 Cofidece Itervals t distributio chi square distributio October 23, 2008 The will be o help sessio o Moday, October 27. Goal: To clearly uderstad the lik betwee probability ad cofidece
More information(# x) 2 n. (" x) 2 = 30 2 = 900. = sum. " x 2 = =174. " x. Chapter 12. Quick math overview. #(x " x ) 2 = # x 2 "
Chapter 12 Describig Distributios with Numbers Chapter 12 1 Quick math overview = sum These expressios are algebraically equivalet #(x " x ) 2 = # x 2 " (# x) 2 Examples x :{ 2,3,5,6,6,8 } " x = 2 + 3+
More information1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable
More informationANALYSIS OF EXPERIMENTAL ERRORS
ANALYSIS OF EXPERIMENTAL ERRORS All physical measuremets ecoutered i the verificatio of physics theories ad cocepts are subject to ucertaities that deped o the measurig istrumets used ad the coditios uder
More informationHUMBEHV 3HB3 Measures of Central Tendency & Variability Week 2
Describig Data Distributios HUMBEHV 3HB3 Measures of Cetral Tedecy & Variability Week 2 Prof. Patrick Beett Ofte we wish to summarize distributios of data, rather tha showig histograms Two basic descriptios
More informationSTAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6)
STAT 350 Hadout 9 Samplig Distributio, Cetral Limit Theorem (6.6) A radom sample is a sequece of radom variables X, X 2,, X that are idepedet ad idetically distributed. o This property is ofte abbreviated
More informationSpatial Analysis and Modeling (GIST 4302/5302) Guofeng Cao Department of Geosciences Texas Tech University
Spatial Aalysis ad Modelig (GIST 4302/5302) Guofeg Cao Departmet of Geoscieces Texas Tech Uiversity Outlie of This Week Last week, we leared: spatial poit patter aalysis (PPA) focus o locatio distributio
More informationRandom Variables, Sampling and Estimation
Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig
More informationPower and Type II Error
Statistical Methods I (EXST 7005) Page 57 Power ad Type II Error Sice we do't actually kow the value of the true mea (or we would't be hypothesizig somethig else), we caot kow i practice the type II error
More information7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals
7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses
More informationmultiplies all measures of center and the standard deviation and range by k, while the variance is multiplied by k 2.
Lesso 3- Lesso 3- Scale Chages of Data Vocabulary scale chage of a data set scale factor scale image BIG IDEA Multiplyig every umber i a data set by k multiplies all measures of ceter ad the stadard deviatio
More informationBinomial Distribution
0.0 0.5 1.0 1.5 2.0 2.5 3.0 0 1 2 3 4 5 6 7 0.0 0.5 1.0 1.5 2.0 2.5 3.0 Overview Example: coi tossed three times Defiitio Formula Recall that a r.v. is discrete if there are either a fiite umber of possible
More information1 Inferential Methods for Correlation and Regression Analysis
1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet
More informationSummarizing Data. Major Properties of Numerical Data
Summarizig Data Daiel A. Meascé, Ph.D. Dept of Computer Sciece George Maso Uiversity Major Properties of Numerical Data Cetral Tedecy: arithmetic mea, geometric mea, media, mode. Variability: rage, iterquartile
More information2: Describing Data with Numerical Measures
: Describig Data with Numerical Measures. a The dotplot show below plots the five measuremets alog the horizotal axis. Sice there are two s, the correspodig dots are placed oe above the other. The approximate
More informationMeasures of Spread: Variance and Standard Deviation
Lesso 1-6 Measures of Spread: Variace ad Stadard Deviatio BIG IDEA Variace ad stadard deviatio deped o the mea of a set of umbers. Calculatig these measures of spread depeds o whether the set is a sample
More informationComparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading
Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationComputing Confidence Intervals for Sample Data
Computig Cofidece Itervals for Sample Data Topics Use of Statistics Sources of errors Accuracy, precisio, resolutio A mathematical model of errors Cofidece itervals For meas For variaces For proportios
More informationOctober 25, 2018 BIM 105 Probability and Statistics for Biomedical Engineers 1
October 25, 2018 BIM 105 Probability ad Statistics for Biomedical Egieers 1 Populatio parameters ad Sample Statistics October 25, 2018 BIM 105 Probability ad Statistics for Biomedical Egieers 2 Ifereces
More informationContinuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised
Questio 1. (Topics 1-3) A populatio cosists of all the members of a group about which you wat to draw a coclusio (Greek letters (μ, σ, Ν) are used) A sample is the portio of the populatio selected for
More informationStat 421-SP2012 Interval Estimation Section
Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible
More informationAnna Janicka Mathematical Statistics 2018/2019 Lecture 1, Parts 1 & 2
Aa Jaicka Mathematical Statistics 18/19 Lecture 1, Parts 1 & 1. Descriptive Statistics By the term descriptive statistics we will mea the tools used for quatitative descriptio of the properties of a sample
More informationLecture 24 Floods and flood frequency
Lecture 4 Floods ad flood frequecy Oe of the thigs we wat to kow most about rivers is what s the probability that a flood of size will happe this year? I 100 years? There are two ways to do this empirically,
More informationFACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures
FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals
More informationProvläsningsexemplar / Preview TECHNICAL REPORT INTERNATIONAL SPECIAL COMMITTEE ON RADIO INTERFERENCE
TECHNICAL REPORT CISPR 16-4-3 2004 AMENDMENT 1 2006-10 INTERNATIONAL SPECIAL COMMITTEE ON RADIO INTERFERENCE Amedmet 1 Specificatio for radio disturbace ad immuity measurig apparatus ad methods Part 4-3:
More information4.1 Sigma Notation and Riemann Sums
0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas
More informationError & Uncertainty. Error. More on errors. Uncertainty. Page # The error is the difference between a TRUE value, x, and a MEASURED value, x i :
Error Error & Ucertaity The error is the differece betwee a TRUE value,, ad a MEASURED value, i : E = i There is o error-free measuremet. The sigificace of a measuremet caot be judged uless the associate
More information4.1 SIGMA NOTATION AND RIEMANN SUMS
.1 Sigma Notatio ad Riema Sums Cotemporary Calculus 1.1 SIGMA NOTATION AND RIEMANN SUMS Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each
More informationApril 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE
April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract These otes are adapted from whe I taught Math 526 ad meat to give a quick itroductio to cofidece
More informationStatistical Fundamentals and Control Charts
Statistical Fudametals ad Cotrol Charts 1. Statistical Process Cotrol Basics Chace causes of variatio uavoidable causes of variatios Assigable causes of variatio large variatios related to machies, materials,