Biostatistics: Introduction & Data Presentation (Short Version)


Andy Chang, Youngstown State University

Statistics in a Broader Sense
Statistics is a field of study concerned with
1) data collection [Producing Data];
2) organization, summarization, examination, and providing an overview of the general features of data [Exploring Data];
3) the drawing of inferences about a body of data (population) based on the properties of a part of the data (sample) observed [Statistical Inference].

In health and medical (or clinical) studies, researchers investigate a sample of subjects to understand the effectiveness of a treatment or an intervention on a target population. Therefore, statistical inference is very important in health and medical research.

Public health is fundamentally concerned with preventing disease, disability, and premature death in a human population or community. Therefore, statistical inference is very important in public health research.

Goal of Healthy People: Eliminate Health Disparities

Basic Terms in Statistics
Individuals (subjects, experimental units): the entities on which data are collected.
Variable: a characteristic of interest for the individual that takes on different values for different individuals.

Purpose of Statistics
- Examine a variable
- Examine the correlation between two or more variables

Variable Types
Quantitative variables (numeric) [height, number of subscriptions, ...]
- Continuous: a variable that has an uncountable number of possible values (measurements).
- Discrete: a variable that has a countable number of possible values (counts).
Qualitative (categorical) variables [hair color, gender, ...]

Measurement Scales
Nominal: consists of labels, names, or categories.
Ordinal: data for which the order or rank is meaningful.
Interval: numerical data for which arithmetic operations are meaningful.
Ratio: data for which the ratio of two values is meaningful.

Producing Data
Data in Public Health:
- Vital Statistics and the Census
- Public Health Surveillance
- Surveys
- Registries
- Epidemic Investigations
- Research
- Program Evaluations

Descriptive Statistics (Exploratory Data Analysis)
Data Presentation: grouping, tables, graphical summary
Numerical Summary: center, dispersion (spread)

Data Presentation
What type of statistical technique is appropriate for data presentation? For a categorical variable? For a quantitative variable?

Data Sheet (raw data)
[Table: columns ID, Height (in), Weight (lb), BirthMonth, Exp., Gender; first rows of the class data sheet]

Data Sheet (a complete list)
[Table: columns ID, Height, Weight, BirthMonth, Exp., Gender for 22 students; the numeric columns did not survive transcription]
Exp. / Gender in sheet order: H/F, H/F, T/M, H/F, T/F, H/M, H/F, H/M, T/M, H/M, T/M, T/F, H/M, T/F, H/M, H/F, T/F, T/M, H/M, T/M, H/M, H/M

Grouping and Displaying a Categorical Variable
Frequency Table and Charts (One Categorical Variable)

Class    Frequency   Relative Frequency
Female   9           9/22 = .409 = 40.9%
Male     13          13/22 = .591 = 59.1%
Total    22          100%

[Figure: chart of sex; Female 40.9%, Male 59.1%]
[Figure: Pareto chart of "How do you describe yourself"; categories as labeled on the chart: White, Asian, Black or African Ame, Hispanic or Latino, Multiple - Non-hispa, Multiple - Hispanic, Am. Indian or Alaska, N. Hawaii/Other Pac]
* In a Pareto chart the bars are arranged according to their frequencies.

Grouping and Displaying Quantitative Data
Frequency Distribution Table
Data: the weights (lb) from the data sheet.
Classes to tally: 100 - <120, 120 - <140, 140 - <160, 160 - <180, 180 - <200, 200 - <220, 220 - <240, 240 - <260, 260 - <280, 280 - <300.
("280 - <300" reads "280 to less than 300".)

Frequency Distribution Table (from data sheet)
Each class a - <b contains the values from a to less than b.

Class        Frequency   Relative Freq.   Cumulative R.F.
100 - <120   3           3/22 = .136      3/22
120 - <140   3           3/22 = .136      6/22
140 - <160   2           2/22 = .091      8/22
160 - <180   4           4/22 = .182      12/22
180 - <200   7           7/22 = .318      19/22
200 - <220   1           1/22 = .045      20/22
220 - <240   1           1/22 = .045      21/22
240 - <260   0           0/22 = .000      21/22
260 - <280   0           0/22 = .000      21/22
280 - <300   1           1/22 = .045      22/22
Total        22          1.000

Classes: categories for grouping data.
Frequency (class frequency): the number of data values in a class.
Relative frequency: the ratio of the frequency of a class to the total number of pieces of data.
Frequency distribution: a listing of classes and their frequencies.
Relative frequency distribution: a listing of classes and their relative frequencies.
Upper class limit: the largest value that can go in a class.
Lower class limit: the smallest value that can go in a class.
Class width: the difference between the lower class limit of the given class and the lower class limit of the next higher class.
Class midpoint (class mark): the midpoint of a class.

Guidelines for grouping data (for a quantitative variable):
- There should be between five and twenty classes.
- Each piece of data must belong to one, and only one, class (mutually exclusive).
- Whenever feasible, all classes should have the same width.

To build a frequency table:
- Find the range of the data: Range = largest value - smallest value.
- Use the range and try different class widths to determine how many classes you need to make the frequency table or histogram.
Student data example: Range = 285 - 106 = 179, and 179/20 ≈ 9. If using a class width of 20, there will be about 9 classes, which is good.

Frequency Distribution Table (from data sheet), with classes that exclude the lower limit and include the upper limit

Class        Frequency   Relative Freq.   Cumulative R.F.
100< - 120   3           3/22 = .136      3/22
120< - 140   3           3/22 = .136      6/22
140< - 160   3           3/22 = .136      9/22
160< - 180   5           5/22 = .227      14/22
180< - 200   5           5/22 = .227      19/22
200< - 220   2           2/22 = .091      21/22
220< - 240   0           0/22 = .000      21/22
240< - 260   0           0/22 = .000      21/22
260< - 280   0           0/22 = .000      21/22
280< - 300   1           1/22 = .045      22/22
Total        22          1.000
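The grouping procedure above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the ten weights are hypothetical (the original data values did not survive transcription); the class start of 100 and width of 20 follow the table.

```python
def frequency_table(data, start, width, n_classes):
    """Group data into equal-width classes [lower, upper).

    Returns one row per class:
    (lower, upper, frequency, relative frequency, cumulative relative frequency).
    """
    counts = [0] * n_classes
    for x in data:
        k = int((x - start) // width)  # index of the class containing x
        if not 0 <= k < n_classes:
            raise ValueError(f"{x} falls outside the classes")
        counts[k] += 1

    n = len(data)
    rows, cumulative = [], 0
    for k, freq in enumerate(counts):
        cumulative += freq
        lower = start + k * width
        rows.append((lower, lower + width, freq, freq / n, cumulative / n))
    return rows


# Hypothetical weights (lb), NOT the class data sheet.
weights = [106, 118, 135, 150, 162, 175, 181, 195, 210, 285]
table = frequency_table(weights, start=100, width=20, n_classes=10)
for lower, upper, freq, rel, cum in table:
    print(f"{lower} - <{upper}: {freq:2d}  {rel:.3f}  {cum:.3f}")
```

Each value lands in exactly one class, which is the "mutually exclusive" guideline; changing `width` is a quick way to try different numbers of classes.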

[Figure: histogram (SPSS)]
[Figure: frequency polygon (SPSS)]
[Figure: cumulative relative frequency histogram]
[Figure: cumulative relative frequency polygon (ogive)]

What to observe in histograms?
Outliers: observations that stand out from the rest for some reason.
Center: the middle of the data.
Spread: the range; the extent of the data; how far the values are from each other.
Shape: the distribution pattern [skewness, symmetry, uniform, Normal, ...].

Histogram & Density Curve
A density curve is a smooth curve that describes the distribution: a density function, f(x), is a mathematical model used to describe the variable.
Common shapes:
- Symmetric (bell) shape
- Skewed to the right, or positively skewed
- Skewed to the left, or negatively skewed
- Bimodal
- Uniform
[Figure: histograms with density curves illustrating each shape]

Stemplots (or Stem-and-leaf plots)
-- leading digits are called stems
-- final digits are called leaves

Example: number of hysterectomies performed by 15 male doctors:
27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28
[Figure: stemplot]
[Figure: ordered stemplot]

Example [back-to-back stemplot]: number of hysterectomies performed by 15 male doctors:
27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28
By female doctors, the numbers are:
5, 7, 10, 14, 18, 19, 25, 29, 31, 33
[Figure: back-to-back stemplot, Female on the left and Male on the right]

Box Plot
[Figure: box plot of HEIGHT]

Examine Bivariate Data (Bivariate Analysis)
Examine the relation between two variables.

Two Categorical Variables: Contingency Table

               Smoker      Non-Smoker   Row Total
Cancer         20 (20%)    5 (5%)       25 (25%)
No cancer      30 (30%)    45 (45%)     75 (75%)
Column Total   50 (50%)    50 (50%)     100 (100%)

Odds of a smoker having cancer: 20/30 = 6/9
Odds of a non-smoker having cancer: 5/45 = 1/9
Odds Ratio = (6/9)/(1/9) = 6

Two Categorical Variables: Cluster Bar Chart
[Figure: cluster bar chart]

Two Quantitative Variables
Data: average annual temperature and the mortality index for a type of breast cancer in women in a certain region of Europe.
[Figure: scatter plot of Mortality Index against Average Temperature]
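As a quick check of the odds-ratio arithmetic, here is a minimal Python sketch. The four counts (20, 30, 5, 45) are assumed to be the cell counts of the slide's table, chosen because they reproduce the stated odds of 6/9 and 1/9; the variable and function names are illustrative only.

```python
from fractions import Fraction

# Cell counts assumed from the 2x2 contingency table (see lead-in).
cancer = {"smoker": 20, "non_smoker": 5}
no_cancer = {"smoker": 30, "non_smoker": 45}


def odds(group):
    """Odds of cancer within a group: (number with cancer) / (number without)."""
    return Fraction(cancer[group], no_cancer[group])


smoker_odds = odds("smoker")          # 20/30, the same value as 6/9
non_smoker_odds = odds("non_smoker")  # 5/45, the same value as 1/9
odds_ratio = smoker_odds / non_smoker_odds

print(smoker_odds, non_smoker_odds, odds_ratio)
```

Using `Fraction` keeps the ratios exact, so the odds ratio comes out as a whole number rather than a rounded float.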

Response/Explanatory Variables
Response (dependent, outcome) variable: lung cancer, mortality index.
Explanatory (independent, predictor) variable: smoking, average temperature.

A Categorical and a Quantitative Variable
[Figure: side-by-side boxplot of HEIGHT by sex (Female, Male)]

Time Plot
[Figure: time plot of the Youngstown homicide rate (HRATE) by YEAR]

Misleading Charts
[Figure: two time plots of the national homicide rate by year, drawn with different scales]
[Figure: two bar charts of count by gender (Female, Male), drawn with different scales]

Incorrect and Misleading Chart
[Figure: example of an incorrect and misleading chart]

Types of Statistical Studies
Observational study: the conditions to which subjects are exposed are not controlled by the investigator. (No attempt is made to control or influence the variables of interest.)
Experimental (controlled) study: the conditions to which subjects are exposed are controlled by the investigator. (Treatments are used in order to observe the response; randomization, replication.)

Simpson's Paradox
Results from observing behavior and outcomes from the use of medicine for randomly selected patients. (Patients chose their medicine.)

Hypertension   Drug A   Drug B   Total
Yes            44       29       73
No             56       71       127
Total          100      100      200

Drug A: 44/100 = 44%    Drug B: 29/100 = 29%

The same patients, split by age (* older patients prefer Drug A):

Age        Hypertension   Drug A   Drug B
Below 65   Yes            5        17
           No             18       60
           Total          23       77
65+        Yes            39       12
           No             38       11
           Total          77       23

Below 65: Drug A: 5/23 = 22%     Drug B: 17/77 = 22%
65+:      Drug A: 39/77 = 51%    Drug B: 12/23 = 52%

Confounding Effect
[Diagram: Treatment -> Patient's Survival (cause?), with confounding variables Patient's Age & Health Condition related to both]
Variables, whether part of a study or not, are said to be confounded when their effects on the outcome cannot be distinguished from each other.
Age may affect the reaction to the drug and may also affect the decision about which drug to choose.
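The paradox is easiest to see by computing the rates per age group and then pooling. The sketch below assumes the counts of the slide's age-stratified table, with digits lost in transcription filled in so that they agree with the stated totals and percentages; the dictionary layout and names are illustrative.

```python
# (hypertension cases, patients) per drug within each age group.
# Counts are assumed/reconstructed, see the lead-in.
strata = {
    "below 65": {"Drug A": (5, 23), "Drug B": (17, 77)},
    "65 and over": {"Drug A": (39, 77), "Drug B": (12, 23)},
}

# Rate within each age group.
rates = {
    age: {drug: yes / total for drug, (yes, total) in drugs.items()}
    for age, drugs in strata.items()
}

# Pool the age groups to get the overall (unstratified) counts.
overall = {}
for drug in ("Drug A", "Drug B"):
    yes = sum(strata[age][drug][0] for age in strata)
    total = sum(strata[age][drug][1] for age in strata)
    overall[drug] = (yes, total)

for age, by_drug in rates.items():
    print(age, {drug: f"{r:.0%}" for drug, r in by_drug.items()})
print("overall", {drug: f"{y / t:.0%}" for drug, (y, t) in overall.items()})
```

Within each age group the two drugs show nearly equal rates, yet the pooled rates differ sharply because most Drug A patients come from the older, higher-risk group.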

Descriptive Statistics (Short Version)

Example: birth weights (in lb) of 5 babies born from two groups of women under different care programs.
Group 1: 7, 6, 8, 7, 7
Group 2: 3, 4, 8, 9, 11

Numerical Summary Measures (Describe a Distribution with Numbers)
- Measure of center
- Measure of variation
- Measure of position

Measure of Central Tendency
Mean: the average value of the data.
Example: birth weights (in lb) of 5 babies born from a group of women under a certain diet: 7, 6, 8, 7, 7.
If n observations are denoted by $x_1, x_2, \ldots, x_n$, their (sample) mean is
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Solution: mean = (7 + 6 + 8 + 7 + 7)/5 = 35/5 = 7 [near the center of the data set].

Median: the median of a data set is
- the data value exactly in the middle of its ordered list, if the number of pieces of data is odd;
- the mean of the two middle data values in its ordered list, if the number of pieces of data is even.
[The median is not influenced by outliers and is best for non-symmetric distributions.]

Example: number of hysterectomies performed by 15 doctors:
27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28
ordered list => 20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86
median = 34

Example: birth weights for 6 infants: 5, 7, 6, 8, 5, 9
ordered list => 5, 5, 6, 7, 8, 9
median = (6 + 7)/2 = 6.5

Mode: the mode of a data set is the observation that occurs most frequently.
Example 1: number of times 15 students visited the class website:
27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28
ordered list => 20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86
Mode = 25
Example 2: blood type of 15 students:
A, B, A, A, O, AB, A, A, B, B, O, O, A, A, A
Counts: A 8, B 3, O 3, AB 1
Mode = A

[Figure: a distribution skewed to the right; where do the mean, median, and mode fall?]

Measure of Dispersion (Variability)
Range = largest data value - smallest data value
Sample from group I (diet program I): 7, 6, 8, 7, 7 => mean = (7 + 6 + 8 + 7 + 7)/5 = 35/5 = 7
Sample from group II (diet program II): 3, 4, 8, 9, 11 => mean = (3 + 4 + 8 + 9 + 11)/5 = 35/5 = 7
Is there any difference between the two samples?
range of sample I = 8 - 6 = 2
range of sample II = 11 - 3 = 8
Does the mother's diet program affect the birth weights of babies?
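The measures of center and the range for these small examples can be checked with Python's standard `statistics` module. This is a minimal sketch; the value 11 in group II is assumed from the stated sum of 35, and `data_range` is a helper defined here for illustration.

```python
import statistics

group_1 = [7, 6, 8, 7, 7]    # diet program I
group_2 = [3, 4, 8, 9, 11]   # diet program II (last value assumed, see lead-in)
infants = [5, 7, 6, 8, 5, 9]
blood_types = "A B A A O AB A A B B O O A A A".split()


def data_range(data):
    """Range = largest data value - smallest data value."""
    return max(data) - min(data)


summary = {
    "mean_1": statistics.mean(group_1),
    "mean_2": statistics.mean(group_2),
    "median_infants": statistics.median(infants),  # even n: mean of two middle values
    "mode_blood": statistics.mode(blood_types),    # works for categorical data too
    "range_1": data_range(group_1),
    "range_2": data_range(group_2),
}
print(summary)
```

The two groups share the same mean but have very different ranges, which is the motivation for the measures of spread that follow.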

Variance and Standard Deviation
These measure the spread of the data around the center of the data.

Example: birth weights (in lb) of 5 babies born from a group of women under diet program II: 3, 4, 8, 9, 11; mean $\bar{x} = 7$.

Data value $x_i$   Deviation from mean $x_i - \bar{x}$   Squared deviation $(x_i - \bar{x})^2$
3                  3 - 7 = -4                            16
4                  4 - 7 = -3                            9
8                  8 - 7 = 1                             1
9                  9 - 7 = 2                             4
11                 11 - 7 = 4                            16
Total 35           0                                     46

Sample variance = 46/4 = 11.5 lb$^2$; sample standard deviation = $\sqrt{46/4}$ = 3.39 lb.

A shortcut formula:
$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n - 1} = \frac{291 - 35^2/5}{4} = 11.5$
(here $\sum x_i = 35$ and $\sum x_i^2 = 9 + 16 + 64 + 81 + 121 = 291$).

What is the standard deviation of the weights of babies from the sample of mothers who received diet program I?
Data: 7, 6, 8, 7, 7
$s^2 = (0 + 1 + 1 + 0 + 0)/(5 - 1) = 1/2$, so $s = \sqrt{1/2} \approx .71$

Does the mother's diet program affect the birth weights of babies?
Diet I: mean = 7, s = .71
Diet II: mean = 7, s = 3.39

If n observations are denoted by $x_1, x_2, \ldots, x_n$, then
Sample mean: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ (an unbiased estimator of the variance of an infinite population)
Sample standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Population Parameters
If the N observations $x_1, x_2, \ldots, x_N$ are all the observations in a finite population, their mean $\mu$, variance $\sigma^2$, and standard deviation $\sigma$ are
Population mean: $\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i$
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
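The definitional and shortcut formulas for the sample variance can be compared directly. The following is a minimal sketch with illustrative function names; the diet II value 11 is assumed from the stated total of 35.

```python
import math


def sample_variance(data):
    """Definitional form: sum of squared deviations from the mean over (n - 1)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def shortcut_variance(data):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) over (n - 1)."""
    n = len(data)
    return (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)


diet_1 = [7, 6, 8, 7, 7]
diet_2 = [3, 4, 8, 9, 11]  # last value assumed, see lead-in

for name, data in (("Diet I", diet_1), ("Diet II", diet_2)):
    var = sample_variance(data)
    print(name, "variance:", var, "std dev:", round(math.sqrt(var), 2))
```

Both forms give the same variance; the shortcut avoids computing each deviation, which mattered more for hand calculation than it does in code.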

About s (sample standard deviation):
- s measures the spread around the mean.
- The larger s is, the more spread out the data are.
- If s = 0, then all the observations must be equal.
- s is strongly influenced by outliers.

The Use of Mean and Standard Deviation
- Describe a distribution.
- Understand the center and the spread of the distribution.
[Table: bone density data (unit: mg/ml), mean $\bar{x}$ and standard deviation s for Female and Male]

Many distributions can be described by a mathematical function with specific parameters, such as the mean and standard deviation.
Example: Normal distribution (bell-shaped), with mean $\mu$ and standard deviation $\sigma$.

Empirical Rule
Properties of a symmetric and bell-shaped (Normal) distribution:
- The distribution is symmetric about its mean ($\mu$).
- 68% of the area is between $\mu - \sigma$ and $\mu + \sigma$.
- 95% of the area is between $\mu - 2\sigma$ and $\mu + 2\sigma$.
- 99.7% of the area is between $\mu - 3\sigma$ and $\mu + 3\sigma$.

Example: heart rates for a certain population under a certain condition follow a bell-shaped symmetric distribution with mean 70 and standard deviation 2. What percentage of people in this population will have heart rates between 66 and 74?
Answer: 95%, since 66 and 74 are $\mu - 2\sigma$ and $\mu + 2\sigma$.

Chebychev's Rule (Chebychev's Inequality)
At least $1 - 1/k^2$ of the data in a data set lie within k standard deviations of their mean (for k > 1).

Example: heart rates for asthmatic patients in a state of respiratory arrest have a given mean and a standard deviation of 35.5 beats per minute. What percentage of this patient population have heart rates within two standard deviations of the mean?
It will be at least 75%, because k = 2 and $1 - 1/2^2 = 3/4 = 75\%$. The interval is mean ± 2 x 35.5 = mean ± 71.

What about within three standard deviations?
With k = 3, at least $1 - 1/3^2 = 8/9 \approx 89\%$ of the heart rates lie within mean ± 3 x 35.5 = mean ± 106.5.

Measure of Position
Standard score, percentile, quartile.

Z-score (Standard Score)
If x is an observation from a distribution that has mean $\mu$ and standard deviation $\sigma$, the standardized value of x (the population z-score) is
$z = \frac{x - \mu}{\sigma} = \frac{x - \text{mean}}{\text{standard deviation}}$
For example, $\mu + 3\sigma$ has a z-score of 3, since it is 3 standard deviations above the mean.
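Both rules on this page reduce to one-line formulas, sketched below in Python. The function names are illustrative, and the mean of 140 used in the z-score call is a hypothetical value chosen only to exercise the formula.

```python
def chebychev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


bound_2 = chebychev_lower_bound(2)   # at least 3/4 of the data
bound_3 = chebychev_lower_bound(3)   # at least 8/9 of the data
z = z_score(140 + 3 * 35.5, mean=140, sd=35.5)  # hypothetical mean of 140

print(f"{bound_2:.0%} {bound_3:.0%} z = {z}")
```

Unlike the Empirical Rule, the Chebychev bound makes no assumption about the shape of the distribution, which is why it is weaker (75% rather than 95% for k = 2).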

Z-score examples
Population z-score: z-score = (x - mean)/(standard deviation).
Sample z-score: $z = \frac{x - \bar{x}}{s}$, where $\bar{x}$ is the mean and s is the standard deviation of a random sample.
[Worked examples: the numeric values of the two slide examples did not survive transcription]

Example: Bone Mineral Density
The WHO Working Group defines osteoporosis according to measurements of bone mineral density (BMD) using dual-energy X-ray absorptiometry (DEXA). Osteoporosis is defined as a bone density T score at or below 2.5 standard deviations below normal peak values for young adults.

DEXA BMD value (T score)                                  Definition
T score > -1.0 SD                                         Normal bone mineral density
T score between -1.0 and -2.5 SD                          Osteopaenia
T score < -2.5 SD                                         Osteoporosis
T score < -2.5 SD with 1 or more fragility fractures      Severe osteoporosis

These criteria were initially established for the assessment of osteoporosis in Caucasian women. BMD reports may include a Z score, which is the number of standard deviations by which the subject of interest differs from the mean for their age.

Quartiles (Measure of Position)
The first quartile, $Q_1$, or 25th percentile, is the median of the lower half of the list of ordered observations.
The third quartile, $Q_3$, or 75th percentile, is the median of the upper half of the list of ordered observations.

Example [odd number of data values] (n = 21): ordered heights from the data sheet.
$Q_1$ = 64.5, Median = 69, $Q_3$ = 72
Measure of spread: interquartile range (IQR) = $Q_3 - Q_1$
IQR = 72 - 64.5 = 7.5

Example [even number of data values] (n = 22): the same ordered heights with one low outlier included.
$Q_1$ = 64, Median = 68, $Q_3$ = 72
Measure of spread: interquartile range (IQR) = $Q_3 - Q_1$
IQR = 72 - 64 = 8

The five-number summary
1. Minimum value
2. $Q_1$
3. Median
4. $Q_3$
5. Maximum value

Example (data sheet without the outlier): $Q_1$ = 64.5, Median = 69, $Q_3$ = 72, Max = 75.
With the outlier in the data: $Q_1$ = 64, Median = 68, $Q_3$ = 72, IQR = 72 - 64 = 8.
[Figure: box plots of HEIGHT without and with the outlier]

Inner and outer fences for outliers
IQR = 72 - 64 = 8; $Q_1$ = 64; $Q_3$ = 72

The inner fences are located at a distance of 1.5 IQR below $Q_1$ (lower inner fence = $Q_1$ - 1.5 x IQR) and at a distance of 1.5 IQR above $Q_3$ (upper inner fence = $Q_3$ + 1.5 x IQR).
The outer fences are located at a distance of 3 IQR below $Q_1$ (lower outer fence = $Q_1$ - 3 x IQR) and at a distance of 3 IQR above $Q_3$ (upper outer fence = $Q_3$ + 3 x IQR).

For the height data:
lower inner fence = 64 - 1.5 x 8 = 52
upper inner fence = 72 + 1.5 x 8 = 84
lower outer fence = 64 - 3 x 8 = 40
upper outer fence = 72 + 3 x 8 = 96
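The quartile and fence rules above can be sketched as follows. The "median of each half" convention is the one described in these notes (other software may interpolate differently); the function names are illustrative, and the six-value list in the usage is the infant birth-weight example rather than the height data.

```python
from statistics import median


def quartiles(data):
    """Q1, median, Q3 using the median-of-halves rule.

    For an odd number of values the middle value is excluded from both halves.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return median(lower), median(xs), median(upper)


def fences(q1, q3):
    """Inner fences at 1.5 IQR and outer fences at 3 IQR beyond the quartiles."""
    iqr = q3 - q1
    return {
        "lower_outer": q1 - 3 * iqr,
        "lower_inner": q1 - 1.5 * iqr,
        "upper_inner": q3 + 1.5 * iqr,
        "upper_outer": q3 + 3 * iqr,
    }


q1, med, q3 = quartiles([5, 7, 6, 8, 5, 9])
height_fences = fences(64, 72)  # Q1 and Q3 of the height example
print(q1, med, q3, height_fences)
```

Values between an inner and an outer fence would be flagged as mild outliers, and values beyond an outer fence as extreme outliers.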

Fences for the height data ($Q_1$ = 64; $Q_3$ = 72; IQR = 72 - 64 = 8):
UOF (upper outer fence): 72 + 3 x 8 = 96
UIF (upper inner fence): 72 + 1.5 x 8 = 84
LIF (lower inner fence): 64 - 1.5 x 8 = 52
LOF (lower outer fence): 64 - 3 x 8 = 40
[Figure: box plot of HEIGHT with inner and outer fences marked]

Mild and Extreme Outliers
Data values falling between the inner and outer fences are considered mild outliers.
Data values falling outside the outer fences are considered extreme outliers.
When outliers exist, the whiskers extend to the smallest and largest data values within the inner fences.

Side-by-side Box Plot
[Figure: side-by-side box plot of HEIGHT by sex (Female, Male)]

Remarks:
- If the distribution of the data is symmetric, then the mean and median will be about the same.
- The five-number summary is best for non-symmetric data.
- The median, quartiles, and interquartile range are not influenced by outliers.
- The mean and standard deviation are most appropriate only if the data are symmetric, because both of these measures are easily influenced by outliers.

Boxplot exercise
For the following data:
- Find the five-number summary and IQR.
- Make a boxplot.
- Find the requested percentile.
[The exercise data and the percentile number did not survive transcription]

Probability (Short Version)

Probability and Counting Rules
A researcher claims that a given percentage of a large population have disease H. A random sample of people is taken from this population and examined. If a certain number of people in this random sample have the disease, what does it mean? How likely would this be if the researcher is right?

Sample Space and Probability
Random experiment (probability experiment): an experiment whose outcomes depend on chance.
Sample space (S): the collection of all possible outcomes in a random experiment.
Event (E): a collection of outcomes of interest in a random experiment.

Sample Space and Event
Sample space:
S = {Head, Tail}
S = {life span of a human} = $\{x \mid x \ge 0, x \in \mathbb{R}\}$
Event:
E = {Head}
E = {life span of a human is less than 3 years}

A Simple Example
What is the probability of getting a head on the toss of a single fair coin? Use a scale from 0 (no way) to 1 (sure thing).
So toss a coin twice. Do it! Did you get one head and one tail? What does it all mean?

Definition of Probability
A rough definition (frequentist definition): the probability of a certain outcome occurring in a random experiment is the proportion of times that this outcome would occur in a very long series of repetitions of the random experiment.
[Figure: Total Heads / Number of Tosses plotted against Number of Tosses]

Determining Probability
How to determine probability?
- Empirical probability
- Theoretical probability
- (Subjective approach)

Empirical Probability Assignment
Empirical study (we don't know whether it is a balanced coin):
[Table: Outcome (Head, Tail, Total) and Frequency; values did not survive transcription]

Empirical probability assignment:
$P(E) = \frac{\text{number of times event } E \text{ occurs}}{\text{number of times the experiment is repeated}} = \frac{m}{n}$
Probability of Head: P(Head) = (number of heads)/(number of tosses).

Empirical Probability Distribution
[Table: Outcome, Frequency, Probability for Head, Tail, Total; the probabilities sum to 1]

Theoretical Probability Assignment
Make a reasonable assumption. What is the probability distribution in tossing a coin? Assumption: we have a balanced coin!

Theoretical probability assignment:
$P(E) = \frac{\text{number of equally likely outcomes in event } E}{\text{size of the sample space}} = \frac{n(E)}{n(S)}$
Probability of Head: P(Head) = 1/2 = .5 = 50%

Theoretical Probability Distribution (Model)

Outcome   Probability
Head      .50
Tail      .50
Total     1.00

Relative Frequency and Probability Distributions
Number of times visited a doctor, from a random sample of 300 individuals from a community.
[Table: Class (0 to 5), Frequency, Relative Frequency, giving P(0) through P(5); recoverable entries: class 0 has frequency 54 and relative frequency .18, class 3 has frequency 42 and relative frequency .14, total 300 and 1.00]
[Figure: discrete distribution (relative frequency distribution)]

Relative Frequency and Probability
When selecting one individual at random from a population, the probability distribution and the relative frequency distribution are the same.

Probability for the Discrete Case
If an individual is randomly selected from this group of 300, what is the probability that this person visited a doctor 3 times?
P(3 times) = 42/300 = .14 or 14%

If an individual is randomly selected from this group of 300, what is the probability that this person visited a doctor 4 or 5 times?
P(4 or 5 times) = P(4) + P(5)

It would be an empirical probability distribution if the sample of 300 individuals were used for understanding a large population.

Properties of Probability
- Probability is always a value between 0 and 1.
- Total probability (all outcomes together) equals 1.
- The probability that either one of two disjoint events A or B occurs is the sum of their individual probabilities: P(A or B) = P(A) + P(B).

Complementation Rule
For any event E, P(E does not occur) = 1 - P(E).
Complement of E = E' (* some places use $E^c$ or $\bar{E}$).
[Figure: Venn diagram of E, with P(E), and its complement, with 1 - P(E)]

Example: an unbalanced coin has a probability of .7 of turning up Head each time it is tossed. What is the probability of not getting a Head on a random toss?
P(not getting Head) = 1 - .7 = .3

Example: if the chance that a randomly selected individual living in community A has disease H is .001, what is the probability that this person does not have disease H?
P(having disease H) = .001
P(not having disease H) = 1 - P(having disease H) = 1 - .001 = .999

Birthday Problem
In a group of n randomly selected people, what is the probability that at least two people have the same birth date? (Assume there are 365 days in a year.)
P(at least two people have the same birth date) is too hard to compute directly, so use the complement:
P(at least two people have the same birth date) = 1 - P(everybody has a different birth date) = $1 - \frac{365 \times 364 \times \cdots \times (365 - n + 1)}{365^n}$

Intersection and Union of Events
S = {1, 2, 3, 4, 5, 6}, A = {1, 2, 3}, B = {3, 6}
Intersection of events: $A \cap B$ <=> A and B. Example: $A \cap B$ = {3}
Union of events: $A \cup B$ <=> A or B. Example: $A \cup B$ = {1, 2, 3, 6}
[Figure: Venn diagram with elements listed]

Venn Diagram (with counts)
Given a total of 100 subjects: A = smokers, n(A) = 50; B = lung cancer, n(B) = 25; joint event n($A \cap B$) = 20.
The regions of the diagram contain 30 (A only), 20 (A and B), 5 (B only), and 45 (neither).
n($A \cup B$) = 30 + 20 + 5 = 55
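The complement approach to the birthday problem is easy to evaluate numerically. This is a minimal sketch; the function name is illustrative, and the group sizes 23 and 30 in the usage are example inputs chosen here.

```python
def prob_shared_birthday(n, days=365):
    """P(at least two of n people share a birth date), assuming `days`
    equally likely, independent birth dates."""
    p_all_different = 1.0
    for k in range(n):
        # The (k + 1)-th person must avoid the k dates already taken.
        p_all_different *= (days - k) / days
    return 1 - p_all_different


for group_size in (23, 30):
    print(group_size, round(prob_shared_birthday(group_size), 4))
```

The loop multiplies 365/365 x 364/365 x ... x (365 - n + 1)/365, which is the same product as in the formula, arranged so that no very large intermediate numbers appear.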

Venn Diagram (with relative frequencies)
Given a sample space with A = smokers, P(A) = .50, and B = lung cancer, P(B) = .25:
joint event P($A \cap B$) = .20; the regions have probabilities .30 (A only), .20 (A and B), .05 (B only), and .45 (neither).
P($A \cup B$) = .55

Contingency Table

                    Cancer, B   No Cancer, $B^c$   Total
Smoke, A            20          30                 50
Not Smoke, $A^c$    5           45                 50
Total               25          75                 100

Conditional Probability
The conditional probability that event A occurs given that event B has occurred (or given the condition B) is denoted P(A | B) and is, if P(B) is not zero,
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ or $P(A \mid B) = \frac{n(A \cap B)}{n(B)}$,
where n(E) = number of equally likely outcomes in E.

Using counts:

                 Smoke, S   Not Smoke, S'   Total
Cancer, C        20         5               25
No Cancer, C'    30         45              75
Total            50         50              100

P(C | S) = 20/50 = .4
P(C | S') = 5/50 = .1

Using probabilities:

                 Smoke, S       Not Smoke, S'   Total
Cancer, C        20 (.20)       5 (.05)         25, P(C) = .25
No Cancer, C'    30 (.30)       45 (.45)        75, P(C') = .75
Total            P(S) = .50     P(S') = .50     100 (1.00)

P(C | S) = .20/.50 = .4
P(C | S') = .05/.50 = .1
What is P(C | S) / P(C | S')? It is .4/.1 = 4 (the relative risk).

Independent Events
Events A and B are independent if
P(A | B) = P(A), or
P(B | A) = P(B), or
P(A and B) = P(A) P(B).
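A short Python sketch of the conditional probabilities, relative risk, and the independence check, using exact fractions. The counts are assumed to be those of the smoking and cancer table (20, 5, 30, 45, total 100); the names are illustrative.

```python
from fractions import Fraction

# Counts assumed from the smoking / cancer table (see lead-in).
counts = {
    ("smoke", "cancer"): 20,
    ("smoke", "no_cancer"): 30,
    ("not_smoke", "cancer"): 5,
    ("not_smoke", "no_cancer"): 45,
}
total = sum(counts.values())


def p_cancer_given(smoking):
    """P(C | smoking status) = n(C and status) / n(status)."""
    n_status = counts[(smoking, "cancer")] + counts[(smoking, "no_cancer")]
    return Fraction(counts[(smoking, "cancer")], n_status)


p_c_given_s = p_cancer_given("smoke")
p_c_given_not_s = p_cancer_given("not_smoke")
relative_risk = p_c_given_s / p_c_given_not_s

# Independence would require P(C | S) == P(C).
p_c = Fraction(counts[("smoke", "cancer")] + counts[("not_smoke", "cancer")], total)
independent = p_c_given_s == p_c

print(p_c_given_s, p_c_given_not_s, relative_risk, independent)
```

Because P(C | S) differs from P(C), smoking and cancer are not independent in this table.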

Example
If a balanced die is rolled twice, what is the probability of getting two 6's?
$6_1$ = the event of getting a 6 on the 1st trial
$6_2$ = the event of getting a 6 on the 2nd trial
P($6_1$) = 1/6, P($6_2$) = 1/6, and $6_1$ and $6_2$ are independent events, so
P($6_1$ and $6_2$) = P($6_1$) P($6_2$) = (1/6)(1/6) = 1/36

Independent Events
A given percentage of the people in a large population have disease H. If a random sample of two subjects is selected from this population, what is the probability that both subjects have disease H?
$H_i$: the event that the i-th randomly selected subject has disease H.
P($H_2$ | $H_1$) ≈ P($H_2$) [the events are almost independent, because the population is large]
P($H_1 \cap H_2$) ≈ P($H_1$) P($H_2$)

If events $A_1, A_2, \ldots, A_k$ are independent, then
P($A_1$ and $A_2$ and ... and $A_k$) = P($A_1$) P($A_2$) ... P($A_k$)
What is the probability of getting all heads when tossing a balanced coin four times?
P($H_1$) P($H_2$) P($H_3$) P($H_4$) = $(.5)^4$ = .0625

Binomial Probability
What is the probability of getting two 6's when casting a balanced die 5 times? Write S for a 6 and S' for any other face.
P(S S S' S' S') = $(1/6)^2 (5/6)^3$ = .016
P(S S' S S' S') = $(1/6)^2 (5/6)^3$
P(S S' S' S S') = $(1/6)^2 (5/6)^3$
...
How many such arrangements are there? $\binom{5}{2} = \frac{5!}{2!\,3!} = 10$
Probability (two 6's) = .016 x 10 = .16

Multiplication Rule (General Multiplication Rule)
For any two events A and B,
P(A and B) = P(A | B) P(B) = P(B | A) P(A),
which follows from
P(A | B) = P(A and B)/P(B) and P(B | A) = P(A and B)/P(A).

Example: if 50% of the people in the population smoke, and 40% of the smokers have lung cancer, what percentage of the population are smokers and have lung cancer?
P(S) = .5 (50% of the subjects smoke)
P(C | S) = .4 (40% of the smokers have cancer)
P(C and S) = P(C | S) P(S) = .4 x .5 = .20
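The counting argument for the die example generalizes to the binomial formula, sketched here in Python. The function name `binomial_probability` is illustrative; `math.comb` is the standard-library binomial coefficient.

```python
import math


def binomial_probability(n, k, p):
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


arrangements = math.comb(5, 2)                 # ways to place two 6's among 5 rolls
two_sixes = binomial_probability(5, 2, 1 / 6)  # exactly two 6's in 5 rolls
four_heads = binomial_probability(4, 4, 0.5)   # all heads in 4 tosses

print(arrangements, round(two_sixes, 3), four_heads)
```

Each arrangement has the same probability because the rolls are independent, so the total is the per-arrangement probability times the number of arrangements. The unrounded value is about .161; the .16 above comes from rounding the per-arrangement probability to .016 first.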

Data Description. Measure of Central Tendency. Data Description. Chapter x i

Data Description. Measure of Central Tendency. Data Description. Chapter x i Data Descriptio Describe Distributio with Numbers Example: Birth weights (i lb) of 5 babies bor from two groups of wome uder differet care programs. Group : 7, 6, 8, 7, 7 Group : 3, 4, 8, 9, Chapter 3

More information

Lecture 1. Statistics: A science of information. Population: The population is the collection of all subjects we re interested in studying.

Lecture 1. Statistics: A science of information. Population: The population is the collection of all subjects we re interested in studying. Lecture Mai Topics: Defiitios: Statistics, Populatio, Sample, Radom Sample, Statistical Iferece Type of Data Scales of Measuremet Describig Data with Numbers Describig Data Graphically. Defiitios. Example

More information

Chapter 2 Descriptive Statistics

Chapter 2 Descriptive Statistics Chapter 2 Descriptive Statistics Statistics Most commoly, statistics refers to umerical data. Statistics may also refer to the process of collectig, orgaizig, presetig, aalyzig ad iterpretig umerical data

More information

CHAPTER 2. Mean This is the usual arithmetic mean or average and is equal to the sum of the measurements divided by number of measurements.

CHAPTER 2. Mean This is the usual arithmetic mean or average and is equal to the sum of the measurements divided by number of measurements. CHAPTER 2 umerical Measures Graphical method may ot always be sufficiet for describig data. You ca use the data to calculate a set of umbers that will covey a good metal picture of the frequecy distributio.

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information

Quick Review of Probability

Quick Review of Probability Quick Review of Probability Berli Che Departmet of Computer Sciece & Iformatio Egieerig Natioal Taiwa Normal Uiversity Refereces: 1. W. Navidi. Statistics for Egieerig ad Scietists. Chapter & Teachig Material.

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Quick Review of Probability

Quick Review of Probability Quick Review of Probability Berli Che Departmet of Computer Sciece & Iformatio Egieerig Natioal Taiwa Normal Uiversity Refereces: 1. W. Navidi. Statistics for Egieerig ad Scietists. Chapter 2 & Teachig

More information

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers Chapter 4 4-1 orth Seattle Commuity College BUS10 Busiess Statistics Chapter 4 Descriptive Statistics Summary Defiitios Cetral tedecy: The extet to which the data values group aroud a cetral value. Variatio:

More information

Economics 250 Assignment 1 Suggested Answers. 1. We have the following data set on the lengths (in minutes) of a sample of long-distance phone calls

Economics 250 Assignment 1 Suggested Answers. 1. We have the following data set on the lengths (in minutes) of a sample of long-distance phone calls Ecoomics 250 Assigmet 1 Suggested Aswers 1. We have the followig data set o the legths (i miutes) of a sample of log-distace phoe calls 1 20 10 20 13 23 3 7 18 7 4 5 15 7 29 10 18 10 10 23 4 12 8 6 (1)

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +

More information

CHAPTER SUMMARIES MAT102 Dr J Lubowsky Page 1 of 13 Chapter 1: Introduction to Statistics

CHAPTER SUMMARIES MAT102 Dr J Lubowsky Page 1 of 13 Chapter 1: Introduction to Statistics CHAPTER SUMMARIES MAT102 Dr J Lubowsky Page 1 of 13 Chapter 1: Itroductio to Statistics Misleadig Iformatio: Surveys ad advertisig claims ca be biased by urepresetative samples, biased questios, iappropriate

More information

Median and IQR The median is the value which divides the ordered data values in half.

Median and IQR The median is the value which divides the ordered data values in half. STA 666 Fall 2007 Web-based Course Notes 4: Describig Distributios Numerically Numerical summaries for quatitative variables media ad iterquartile rage (IQR) 5-umber summary mea ad stadard deviatio Media

More information

What is Probability?

What is Probability? Quatificatio of ucertaity. What is Probability? Mathematical model for thigs that occur radomly. Radom ot haphazard, do t kow what will happe o ay oe experimet, but has a log ru order. The cocept of probability

More information

Formulas and Tables for Gerstman

Measurement and Study Design. Biostatistics is more than a compilation of computational techniques! Measurement scales: quantitative, ordinal, categorical. Information quality is primary

Statistics Lecture 27. Final review. Administrative Notes. Outline. Experiments. Sampling and Surveys. Administrative Notes

Final Exam is Tuesday, May 0th (3-5pm). Covers Chapters -8 and 0 in textbook. Bring ID cards to final! Allowed: calculators, double-sided 8.5 x cheat sheet. Exam Rooms:

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised

Question 1. (Topics 1-3) A population consists of all the members of a group about which you want to draw a conclusion (Greek letters (μ, σ, Ν) are used). A sample is the portion of the population selected for

Anna Janicka Mathematical Statistics 2018/2019 Lecture 1, Parts 1 & 2

1. Descriptive Statistics. By the term descriptive statistics we will mean the tools used for quantitative description of the properties of a sample

Chapter 12. Quick math overview

Chapter 12: Describing Distributions with Numbers. Quick math overview: $\sum$ = sum. These expressions are algebraically equivalent: $\sum (x - \bar{x})^2 = \sum x^2 - \dfrac{(\sum x)^2}{n}$. Examples: $x: \{2, 3, 5, 6, 6, 8\}$, $\sum x = 2 + 3 +$

Binomial Distribution

Overview. Example: coin tossed three times. Definition. Formula. Recall that a r.v. is discrete if there are either a finite number of possible

(6) Fundamental Sampling Distribution and Data Discription

34 Stat Lecture Notes (Book*: Chapter 8, pg5) Probability & Statistics for Engineers & Scientists, by Walpole, Myers, Myers, Ye. 8.1 Random Sampling: Population:

UNIT 2 DIFFERENT APPROACHES TO PROBABILITY THEORY

Structure: 2.1 Introduction, Objectives; 2.2 Relative Frequency Approach and Statistical Probability; 2.3 Problems Based on Relative Frequency; 2.4 Subjective Approach

Expectation and Variance of a random variable

Chapter 11. The aim of this lecture is to define and introduce mathematical expectation and variance of a function of discrete & continuous random variables and the distribution

Lecture 1 Probability and Statistics

Wikipedia: Benjamin Disraeli, British statesman and literary figure (1804-1881): "There are three kinds of lies: lies, damned lies, and statistics." Popularized in US by Mark

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522, Autumn 2005. Basic Terminology. Set: A set is a collection of (mutually exclusive) objects or events. The sample space is the (collectively exhaustive) collection

Axioms of Measure Theory

MATH 532, Dr. Neal, WKU. I. The Space. Throughout the course, we shall let X denote a generic non-empty set. In general, we shall not assume that any algebraic structure exists on X so that

Describing the Relation between Two Variables

Copyright 2010 Pearson Education, Inc. Tables and Formulas for Sullivan, Statistics: Informed Decisions Using Data. Chapter 2: Organizing and Summarizing Data. Relative frequency = frequency / sum of

ENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Solutions Descriptive Statistics. None at all!

1. If, in the set of values {1, 2, 3, 4, 5, 6, 7}, an error causes the value 5 to be replaced

STP 226 EXAMPLE EXAM #1

Instructor: Honor Statement: I have neither given nor received information regarding this exam, and I will not do so until all exams have been graded and returned. PRINTED NAME: Signed Date: DIRECTIONS:

Introduction to Probability and Statistics Twelfth Edition

Robert J. Beaver, Barbara M. Beaver, William Mendenhall. Presentation designed and written by: Barbara M. Beaver. Introduction to Probability and Statistics, Twelfth

Topic 10: Introduction to Estimation

June, 0. Introduction. In the simplest possible terms, the goal of estimation theory is to answer the question: What is that number? What is the length, the reaction rate, the fraction

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

Statistics is a field of study concerned with the data collection,

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

MODULE 5 STATISTICS II. 1. Mean and standard error of sample data. 2. Binomial distribution. 3. Normal distribution. 4. Sampling. 5. Confidence intervals

As stated by Laplace, Probability is common sense reduced to calculation.

Note: Handouts DO NOT replace the book. In most cases, they only provide a guideline on topics and an intuitive feel. The math details will be covered in class, so it is important to attend class and also you MUST

Chapter 1 (Definitions)

FINAL EXAM REVIEW. Qualitative: Nominal: Ordinal: Quantitative: Ordinal: Interval: Ratio: Observational Study: Designed Experiment: Sampling: Cluster: Stratified: Systematic: Convenience: Simple

Tables and Formulas for Sullivan, Fundamentals of Statistics, 2e Pearson Education, Inc.

2008 Pearson Education, Inc. Chapter 2: Organizing and Summarizing Data. Relative frequency = frequency / sum of all frequencies. Class midpoint: The sum of consecutive lower

Chapter 4 - Summarizing Numerical Data

15.075, Cynthia Rudin. Here are some ways we can summarize data numerically. Sample Mean: $\bar{x} := \dfrac{\sum_{i=1}^{n} x_i}{n}$. Note: in this class we will work with both the population mean µ and the

MEASURES OF DISPERSION (VARIABILITY)

POLI 300 Handout #7, N. R. Miller. While measures of central tendency indicate what value of a variable is (in one sense or other, e.g., mode, median, mean), average or central

Discrete probability distributions

In the chapter on probability we used the classical method to calculate the probability of various values of a random variable. In some cases, however, we may be able to develop

CHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS. 8.1 Random Sampling. 8.2 Some Important Statistics

8.1 Random Sampling. The basic idea of the statistical inference is that we are allowed to draw inferences or conclusions about a population based

Understanding Dissimilarity Among Samples

Announcements: Midterm is Wed. Review sheet is on class webpage (in the list of lectures) and will be covered in discussion on Monday. Two sheets of notes are allowed, same rules as for the one sheet last time. Office

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

MATH 532 Measurable Functions, Dr. Neal, WKU. Throughout, let $(X, \mathcal{F}, \mu)$ be a measure space and let $(\Omega, \mathcal{F}, P)$ denote the special case of a probability space. We shall now begin to study real-valued functions defined

Chapter 6 Sampling Distributions

In most experiments, we have more than one measurement for any given variable, each measurement being associated with one randomly selected member of a population. Hence we need to

Class 27. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

Copyright 2013 by D.B. Rowe. Agenda: Skip Recap Chapter 10.5 and 10.6. Lecture Chapter 11.1-11.2. Review Chapters 9 and 10

Computing Confidence Intervals for Sample Data

Topics: Use of statistics; Sources of errors; Accuracy, precision, resolution; A mathematical model of errors; Confidence intervals: for means, for variances, for proportions

1 Inferential Methods for Correlation and Regression Analysis

In the chapter on Correlation and Regression Analysis tools for describing bivariate continuous data were introduced. The sample Pearson Correlation Coefficient

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9

The Central Limit Theorem is very important in the realm of statistics, and today's lab will explore the application of it in both categorical and continuous

Inferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process.

Chapter 8: Point Estimation and Confidence Intervals. This Course Material by Maurice Geraghty is licensed under a Creative Commons Attribution-ShareAlike

Module 1 Fundamentals in statistics

Normal Distribution. Repeated observations that differ because of experimental error often vary about some central value in a roughly symmetrical distribution in which small deviations occur much more frequently

Statistics Independent (X) you can choose and manipulate. Usually on x-axis

Statistics-6000. Variables are characteristics that can take on different values with respect to persons, time, and place. Types of variables are as follows: Independent (X): you can choose and manipulate. Usually

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals

Section 1. Sampling Distribution. Using Statistics. Statistical Inference: Predict and forecast values of population parameters... Test hypotheses

Central Limit Theorem the Meaning and the Usage

Convention about notation. We are using notation $X \sim N(\mu, \sigma)$ in lieu of saying that X is a normal random variable with mean $\mu$ and standard deviation $\sigma$. Assume a sample of measurements

Lecture 24 Floods and flood frequency

One of the things we want to know most about rivers is: what's the probability that a flood of a given size will happen this year? In 100 years? There are two ways to do this empirically,

Analysis of Experimental Data

6544597.0479 ± 0.000005 g. Quantitative Uncertainty. Accuracy vs. Precision. When we make a measurement in the laboratory, we need to know how good it is. We want our measurements to be both

Summarizing Data. Major Properties of Numerical Data

Daniel A. Menascé, Ph.D., Dept of Computer Science, George Mason University. Central Tendency: arithmetic mean, geometric mean, median, mode. Variability: range, interquartile

Elementary Statistics

M. Ghamsary, Ph.D., Spring 2004. Chap 0 Descriptive Statistics. Raw Data: When data are collected in original form, they are called raw data. The following are the scores on the first test of

Example: Find the SD of the set $\{x_j\} = \{2, 4, 5, 8, 5, 11, 7\}$.

(*) If a lot of the data is far from the mean, then many of the $(x_j - \bar{x})^2$ terms will be quite large, so the mean of these terms will be large and the SD of the data will be large. (*) In particular, outliers

Understanding Samples

Will Monroe, CS 109, Sampling and Bootstrapping, Lecture Notes #17, August 2, 2017. Based on a handout by Chris Piech. In this chapter we are going to talk about statistics calculated on samples from a population. We

Probability and statistics: basic terms

M. Veeraraghavan, August 203. A random variable is a rule that assigns a numerical value to each possible outcome of an experiment. Outcomes of an experiment form the sample

Exam 2 Instructions not multiple versions

Remove this sheet of instructions from your exam. You may use the back of this sheet for scratch work. This is a closed book, closed notes exam. You are not allowed to use any materials other

Sets and Probabilistic Models

Berlin Chen, Department of Computer Science & Information Engineering, National Taiwan Normal University. Reference: D. P. Bertsekas, J. N. Tsitsiklis, Introduction to Probability, Sections 1.1-1.2

Final Review for MATH 3510

Calculation 5. Given a fairly simple probability mass function or probability density function of a random variable, you should be able to compute the expected value and variance of the variable

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE

TERRY SOO. Abstract. These notes are adapted from when I taught Math 526 and meant to give a quick introduction to confidence

Read through these prior to coming to the test and follow them when you take your test.

Math 143 Spring 2012 Test 2 Information. Test 2 will be given in class on Thursday April 5. Material Covered: The test is cumulative, but will emphasize the recent material (Chapters 6-8, 10-11, and Sections 12.1

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

MATH-252-01: Probability and Statistics II, Spring 2019. Contents: 1 Chi-Squared Tests with Known Probabilities; 1.1 Chi-Squared Testing

Chapter two: Hypothesis testing

Some basic concepts: Data: The raw material of statistics is data. For our purposes we may define data as numbers. The two kinds of numbers that we use in statistics are numbers that result

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

Empirically evaluating accuracy of hypotheses: important activity in ML. Three questions: Given observed accuracy over a sample set, how well does this estimate apply over additional samples?

STAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6)

A random sample is a sequence of random variables $X_1, X_2, \ldots, X_n$ that are independent and identically distributed. This property is often abbreviated

Elements of Statistical Methods Lots of Data or Large Samples (Ch 8)

Fritz Scholz, Spring Quarter 2010, February 26, 2010. $\bar{x}$ and $\bar{X}$. We introduced the sample mean $\bar{x}$ as the average of the observed sample values x

Biostatistics for Med Students. Lecture 2

John J. Chen, Ph.D., Professor & Director of Biostatistics Core, UH JABSOM. JABSOM MD7, February 22, 2017. Lecture Objectives: To understand basic research design principles

Final Review. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech

yao.xie@isye.gatech.edu. Random sampling model: random samples, population. Random samples: $x_1, \ldots, x_n$

CS 330 Discussion - Probability

March 24 2017. 1 Fundamentals of Probability. 1.1 Random Variables and Events. A random variable X is one whose value is non-deterministic. For example, suppose we flip a coin and set X =

2: Describing Data with Numerical Measures

a. The dotplot shown below plots the five measurements along the horizontal axis. Since there are two s, the corresponding dots are placed one above the other. The approximate

PRACTICE PROBLEMS FOR THE FINAL

Math 36Q Fall 25 Professor Hoh. Below is a list of practice questions for the Final Exam. I would suggest also going over the practice problems and exams for Exam 1 and Exam 2 to

AMS570 Lecture Notes #2

Review of Probability (continued). Probability distributions. (1) Binomial distribution. Binomial Experiment: 1) It consists of n trials. 2) Each trial results in 1 of 2 possible outcomes, S or F. 3)

Statisticians use the word population to refer the total number of (potential) observations under consideration

6 Sampling Distributions. The population is just the set of all possible outcomes in our sample space

MATH/STAT 352: Lecture 15

Sections 5.2 and 5.3. Large sample CI for a proportion and small sample CI for a mean. 5.2: Confidence Interval for a Proportion. Estimating proportion of successes in a binomial experiment

Lecture 7: Properties of Random Samples

1 Continued From Last Class. Theorem 1.1. Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

CS 70. I.I.D. Random Variables. Estimating the bias of a coin. Question: We want to estimate the proportion p of Democrats in the US population, by taking

Chapter 18 Summary Sampling Distribution Models

Unit 5: Introduction to Inference. What have we learned? Sample proportions and means will vary from sample to sample; that's sampling error (sampling variability). Sampling

Lecture 5. Random variable and distribution of probability

Introduction to theory of probability and statistics. prof. dr hab. inż. Katarzyna Zakrzewska, Katedra Elektroniki, AGH. e-mail: za@agh.edu.pl http://home.agh.edu.pl/~za

4. Partial Sums and the Central Limit Theorem

The central limit theorem and the law of large numbers are the two fundamental theorems

Chapter 11: Asking and Answering Questions About the Difference of Two Proportions

These notes reflect material from our text, Statistics, Learning from Data, First Edition, by Roxy Peck, published by CENGAGE Learning,

n outcome is $(+1, +1, -1, \ldots, -1)$. Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus $X = X_1 + X_2 + \cdots + X_n$,

CS 70 Discrete Mathematics for CS, Spring 2008, David Wagner, Note 9. Variance. Question: At each time step, I flip a fair coin. If it comes up Heads, I walk one step to the right; if it comes up Tails, I walk one

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

We have been dealing with situations in which we have full knowledge of the distribution of a random variable.

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p

Section 8-4. Summary for creating a 100(1-α)% CI for µ: When $\sigma^2$ is known and parent

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable

Statistics Chapter 4: Correlation and Regression. If we have two (or more) variables we are usually interested in the relationship between the variables. Association between Variables: Two variables are associated

Lecture 2: Probability, Random Variables and Probability Distributions. GENOME 560, Spring 2017 Doug Fowler, GS

(dfowler@uw.edu). Course Announcements: Problem Set 1 will be posted. Due next Thursday before class

$f_X(12) = \Pr(X = 12) = \Pr(\{(6, 6)\}) = 1/36$

Probability Distributions. An Example With Dice. If X is a random variable on sample space S, then the probability that X takes on the value c is $\Pr(X = c) = \Pr(\{s \in S \mid X(s) = c\})$. Similarly, $\Pr(X \le c) = \Pr(\{s \in S \mid X(s)$

Power and Type II Error

Statistical Methods I (EXST 7005). Since we don't actually know the value of the true mean (or we wouldn't be hypothesizing something else), we cannot know in practice the type II error

Topic 9: Sampling Distributions of Estimators

Course 003, 2018. Since our estimators are statistics (particular functions of random variables), their distribution can be

Section 9.2. Tests About a Population Proportion 12/17/2014. Carrying Out a Significance Test H A N T. Parameters & Hypothesis

P H A N T O M S: Parameters, Hypothesis, Assess Conditions, Name the Test, Test Statistic (Calculate), Obtain P-value, Make a decision, State conclusion. Section 9.2 Tests

DAWSON COLLEGE DEPARTMENT OF MATHEMATICS 201-BZS-05 PROBABILITY AND STATISTICS FALL 2015 FINAL EXAM

Name: Date: December 24th, 2015. Student Number: Time: 9:30-12:30. Grade: / 116. Examiner: Matthew MARCHANT

Economics Spring 2015

Economics 400, Spring 2015. 2/17/2015. pp. 30-38; Ch. 7.1.4-7. New Stata Assignment and new MyStatlab assignment, both due Feb 4th. Midterm Exam Thursday Feb 6th, Chapters 1-7 of Groebner text and all relevant lectures

PH 425 Quantum Measurement and Spin Winter SPINS Lab 1

Measure the spin projection $S_z$ along the z-axis. This is the experiment that is ready to go when you start the program, as shown below. Each atom is measured

STATS 200: Introduction to Statistical Inference. Lecture 1: Course introduction and polling

U.S. presidential election projections by state (Source: fivethirtyeight.com, 25 September 2016). Polling: Let's try to understand

MATHEMATICAL SCIENCES

SET7-Math.Sc.-II-D. Roll No. 57 (Write Roll Number from left side exactly as in the Admit Card). Subject Code: 5. PAPER II. Signature of Invigilators. Question Booklet Series. Question Booklet No. (Identical with