Statistics. Nicodème Paul Faculté de médecine, Université de Strasbourg. 9/5/2018 Statistics


1 Statistics Nicodème Paul Faculté de médecine, Université de Strasbourg file:///users/home/npaul/enseignement/esbs/ /cours/01/index.html#21 1/62

2 Course logistics
Statistics
- Course website
- Lecture slides and lecture notes
- Lectures, quizzes and practical exercises
- Exam

3 Course structure
- Descriptive statistics and probability
- Estimation
- Hypothesis testing (one-sample and two-sample tests)
- Independence tests (Chi-square, Fisher's exact)
- One-way ANOVA

4 Statistics - Definition
A statistic is a quantity or numerical value calculated from a set of data.
- Average height of people living in Strasbourg
Statistics refer to global characteristics of a population:
- Number of people who smoke
- Number of people owning a car
- Relation between smoking and owning a car
Statistics is the scientific discipline that provides methods to make sense of data.
- Descriptive statistics: collecting, summarizing and presenting data
- Inferential statistics: making inferences, hypothesis testing, determining relationships and making predictions

5 Statistics applications
- Biology: comparison between two populations of mice, knockout versus wildtype
- Medicine: performing clinical trials and data analysis
- Pharmacy: knowing whether a new drug is better than the current one
- Finance: pricing and portfolio management, risk modelling
- Agriculture: plant breeding, the study of the influence of particular factors on agricultural production, measuring the contribution of production factors, fertilizers and technical progress

6 Terminology
A population is the collection of individuals or objects about which information is desired.
A sample is a subset of the population selected for study.
A random sample of size n is a sample selected in such a way that every different possible sample of the desired size has the same chance of being selected.
A variable is any characteristic whose value may change from one individual or object to another.
A variable can be categorical:
- Nominal (color: red, black, green, white)
- Ordinal (size: small, medium, big)
A variable can be numerical:
- Discrete (number of emails received per day)
- Continuous (height, weight)

7 Data
Example. Consider different activity levels in sports such as weak (W), moderate (M) and intense (I). A sample of ten individuals results in different possible data sets.
Univariate data
- Activity: W M M W I W I M W W
- Age: 35 33 50 21 40 39 51 47 36 30
Bivariate data
- (Activity, Age): (W, 35) (M, 33) (M, 50) (W, 21) (I, 40) (W, 39) (I, 51) (M, 47) (W, 36) (W, 30)

8 Multivariate Data
Variables as columns and individuals as rows.
Multivariate sample data set, columns: ID, SBP, TOBACCO, LDL, ADIPOSITY, FAMHIST, OBESITY, ALCOHOL, AGE, CHD
Categorical values of the ten rows shown: Present, Absent, Present, Present, Present, Present, Absent, Present, Present, Present

9 Categorical data representation
Sample data for activity levels:
M M M I I M I M I I W M M I M I I M I M W M M M M M M M M W M I I M M M M M M I M I W M M M I M M M M I W M I I M W M M W M M M M M M W I M M M M M M M I M I W M I M M M M M M M M

10 Categorical data representation - barplot
Frequency distribution
ACTIVITY    FREQUENCY    RELATIVE FREQUENCY
Weak
Moderate
Intense
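
A frequency distribution like the one above can be sketched in a few lines of Python. This is only an illustration: it uses the ten-individual activity sample from the "Data" slide, and the helper name `frequency_table` is hypothetical.

```python
from collections import Counter

# Ten-individual activity sample from the "Data" slide:
# W = weak, M = moderate, I = intense.
activity = "W M M W I W I M W W".split()

def frequency_table(values):
    """Return {category: (frequency, relative frequency)}."""
    counts = Counter(values)
    n = len(values)
    return {cat: (k, k / n) for cat, k in counts.items()}

table = frequency_table(activity)
for cat in ("W", "M", "I"):
    freq, rel = table[cat]
    print(f"{cat}: frequency={freq}, relative frequency={rel:.2f}")
```

The relative frequencies sum to 1, which is what makes them comparable between groups of different sizes.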

11 Group comparison for categorical data

12 Continuous data representation - histogram
$B_j(x_0, h) = [x_0 + (j-1)h,\; x_0 + jh[, \quad j \in \mathbb{Z}$
$I\{x_i \in B_j(x_0, h)\} = \begin{cases} 1, & \text{if } x_i \in B_j(x_0, h) \\ 0, & \text{otherwise} \end{cases}$
$\mathrm{hist}(x) = \frac{1}{hn} \sum_{i=1}^{n} \sum_{j \in \mathbb{Z}} I\{x_i \in B_j(x_0, h)\}\, I\{x \in B_j(x_0, h)\}$
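
The histogram estimator can be sketched directly from its definition: find the bin that contains x, count the observations in that bin, and divide by hn. The function name `hist_density`, the origin x0 = 20 and the bin width h = 10 are arbitrary choices for illustration; the ages are those of the bivariate example.

```python
import math

def hist_density(x, data, x0, h):
    """Histogram estimate hist(x) with bins B_j = [x0 + (j-1)h, x0 + jh)."""
    j = math.floor((x - x0) / h) + 1          # index of the bin that contains x
    left, right = x0 + (j - 1) * h, x0 + j * h
    in_bin = sum(1 for xi in data if left <= xi < right)
    return in_bin / (h * len(data))

# Ages of the ten individuals from the bivariate example.
ages = [35, 33, 50, 21, 40, 39, 51, 47, 36, 30]
print(hist_density(35, ages, x0=20, h=10))
```

Dividing by hn rather than n makes the total area of the bars equal to 1, so the histogram can be read as a density.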

13 Histogram

14 Histogram
The histogram is a method of displaying data. It displays the shape of the distribution of data values.
The range of the data is divided into intervals $c_i = ]a_i, b_i]$, or bins, and the number or proportion of the observations falling in each bin is plotted.
A histogram is said to be unimodal if it has a single peak, bimodal if it has two peaks and multimodal if it has more than two peaks.
A histogram is symmetric if there is a vertical line of symmetry such that the part of the histogram to the left of the line is a mirror image of the part to the right.
A unimodal histogram that is not symmetric is said to be skewed.
- If the upper tail of the histogram stretches out much farther than the lower tail, then the distribution of values is positively skewed or right skewed.
- If the lower tail is much longer than the upper tail, the histogram is negatively skewed or left skewed.

15 Histogram

16 Comparison between groups for variable SBP

17 Comparison between groups for variable SBP

18 Comparison between groups for variable Age

19 Comparison between groups for variable Age

20 Measures of location
The sample mean of a sample consisting of numerical observations $x_1, x_2, \ldots, x_n$, denoted by $\bar{x}$, is:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
The population mean, denoted by $\mu$, is the average of all $x$ values in the entire population.
The sample median or $Q_2$ is obtained by first ordering the $n$ observations from smallest to largest as $x_{(1)} \le x_{(2)} \le \ldots \le x_{(n)}$. Then:
$\text{sample median} = \begin{cases} x_{((n+1)/2)}, & \text{if } n \text{ is odd} \\ \frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right), & \text{if } n \text{ is even} \end{cases}$

21 Measures of location
Data = 75, 69, 88, 93, 95, 54, 87, 88, 27
Ordered data = 27, 54, 69, 75, 87, 88, 88, 93, 95
Sample median = 87
If Data = 100, 75, 69, 88, 93, 95, 54, 87, 88, what is the median?
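
The definition of the sample median (odd and even case) can be sketched as follows and checked against the nine-value example. The function name `sample_median` is an illustrative choice; Python's standard library offers `statistics.median` for the same purpose.

```python
def sample_median(values):
    """Median of a list of numbers, following the odd/even definition."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # x_((n+1)/2) in 1-based notation
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of x_(n/2) and x_(n/2+1)

data = [75, 69, 88, 93, 95, 54, 87, 88, 27]
print(sample_median(data))  # 87
```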

22 Measures of location
For any particular number r between 0 and 100, the rth percentile is a value such that r percent of the observations in the data set fall at or below that value.
The lower quartile or 25th percentile or $Q_1$ is the median of the lower half of the sample.
The upper quartile or 75th percentile or $Q_3$ is the median of the upper half of the sample.
The mode is the most frequently observed value.
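
A minimal sketch of the quartiles as medians of the lower and upper halves. One assumption is made here that the slides do not state: when n is odd, the middle observation is left out of both halves. Other conventions exist and give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + (n % 2):]   # assumption: skip the middle value when n is odd
    return median(lower), median(ordered), median(upper)

data = [75, 69, 88, 93, 95, 54, 87, 88, 27]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)
```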

23 Robust statistics

24 Check yourself
The value $\frac{\text{mean}}{\text{median}}$ can be used as a measure of skewness (either right or left). If this statistic is less than 1, the distribution is most likely left skewed.
- True
- False

25 Measures of dispersion
The sample variance, denoted by $s^2$, is the sum of squared deviations from the mean divided by $n - 1$. That is,
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
The sample standard deviation is the positive square root of the sample variance and is denoted by $s$.
The population variance, denoted by $\sigma^2$, is the sum of squared deviations from the mean divided by $n$. That is,
$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$
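
The two variance formulas differ only in the divisor. The sketch below implements both and compares the sample version with `statistics.variance` from the standard library; the function names and the second data set are illustrative.

```python
import statistics

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def population_variance(xs):
    """Sum of squared deviations from the mean, divided by n."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

data = [75, 69, 88, 93, 95, 54, 87, 88, 27]   # sample from the median example
s2 = sample_variance(data)
print(f"s^2 = {s2:.2f}, s = {s2 ** 0.5:.2f}")
print(f"statistics.variance gives {statistics.variance(data):.2f}")
```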

26 Check yourself
Which of the data sets below has the lowest standard deviation? You do not need to calculate the exact standard deviations to answer this question.
- 0, 1, 2, 3, 4, 5, 6
- 0, 1, 3, 3, 3, 5, 6
- 100, 100, 100, 100, 100, 100, 101
- 0, 25, 50, 100, 125, 150, 1000

27 Measures of dispersion
The standard deviation of the population is the positive square root of the variance and is denoted by $\sigma$.
The interquartile range (IQR) is a measure of variability defined as:
IQR = upper quartile - lower quartile
An observation is an outlier if it is more than 1.5(IQR) away from the nearest quartile. An outlier is extreme if it is more than 3(IQR) from the nearest quartile, and mild otherwise.
The coefficient of variation (CV) is a normalized measure of variability defined as:
$CV = 100 \times \frac{s}{\bar{x}}$
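
The outlier rule and the coefficient of variation can be sketched as below. The helper names are hypothetical, and the quartiles are computed as medians of the lower and upper halves, leaving out the middle value when n is odd (an assumption, since several quartile conventions exist).

```python
from statistics import mean, median, stdev

def quartile_bounds(values):
    """Return (Q1, Q3, IQR) using medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + (n % 2):])   # assumption: middle value excluded when n is odd
    return q1, q3, q3 - q1

def classify(x, q1, q3):
    """Label x as 'none', 'mild' or 'extreme' with the 1.5(IQR) and 3(IQR) rules."""
    iqr = q3 - q1
    distance = max(q1 - x, x - q3, 0)       # distance to the nearest quartile, 0 inside the box
    if distance > 3 * iqr:
        return "extreme"
    if distance > 1.5 * iqr:
        return "mild"
    return "none"

def coefficient_of_variation(values):
    """CV = 100 * s / x-bar, with s the sample standard deviation."""
    return 100 * stdev(values) / mean(values)

data = [75, 69, 88, 93, 95, 54, 87, 88, 27]
q1, q3, iqr = quartile_bounds(data)
print(q1, q3, iqr, classify(27, q1, q3))
print(f"CV = {coefficient_of_variation(data):.1f}%")
```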

28 Boxplot: description

29 Example: boxplot comparison

30 Check yourself
Which of the following statements is supported by the plot?
- The mean of the distribution is smaller than its median
- It is not possible to estimate the median without knowing the sample size
- The distribution is multimodal
- The IQR of the distribution is roughly 10

31 Check yourself
Which of the following statements is not supported by the plot?
- Both distributions are unimodal
- B is more variable than A
- Median of A is higher than median of B
- Both distributions are roughly symmetric

32 Check yourself
Suppose we have a drug that we know, from long experience, cures a patient with some specific illness in 70% of cases. A new drug is proposed as having a higher cure rate than the present one. To assess this claim, the new drug is given to 1000 people suffering from the illness; among these, 741 are cured. Do we have significant evidence that this new drug is better than the current one?
- Yes
- No
- Need more information to decide

33 Probability - motivation
Suppose we have a drug that we know, from long experience, cures a patient with some specific illness in 70% of cases. A new drug is proposed as having a higher cure rate than the present one. To assess this claim, the new drug is given to 1000 people suffering from the illness; among these, 741 are cured. Do we have significant evidence that this new drug is better than the current one?
Consider the following hypotheses:
- $H_0$: the new drug is as effective as the current one (hypothesis of no effect, no difference, or not better)
- $H_1$: the new drug is better than the current one
Probability calculation
- If the new drug is as effective as the current one, how likely is it that, by chance, 741 or more people given the new drug will be cured?
Statistical inference
- Based on the above probability calculation, the data may provide convincing evidence that the new drug is better than the current one.
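
One way to carry out the probability calculation is sketched below. It assumes that, under $H_0$, each of the 1000 patients is cured independently with probability 0.7, so that the number of cured patients follows a binomial distribution (introduced later in these slides). The function name `binomial_tail` is illustrative.

```python
from math import comb

def binomial_tail(n, p, k_min):
    """P(X >= k_min) when X counts successes in n independent trials of probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Probability of 741 or more cures out of 1000 if the cure rate is still 70%.
p_value = binomial_tail(1000, 0.7, 741)
print(f"P(X >= 741 | p = 0.7) = {p_value:.4f}")
```

A small value means that 741 or more cures would rarely happen by chance alone if the new drug were no better than the current one.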

34 Check yourself
Suppose that the probability of observing 741 or more cured patients, under the assumption that the new medicine is no better than the old one, is [...]. Do the data provide convincing evidence that the new drug is better than the current one?
- Yes
- No
- Need more information

35 Probability - terminology
A random experiment is any activity or situation in which there is uncertainty about which of two or more possible outcomes will result.
A Bernoulli trial is a random experiment with exactly two possible outcomes: success or failure.
- Tossing a coin with Head (H) and Tail (T) as possible outcomes
- A patient can be cured by the new medicine or not
The collection of all possible outcomes of a random experiment is the sample space $\Omega$ for the experiment. An outcome from the sample space is denoted as $\omega$.
Examples of sample spaces: $\Omega = \{H, T\}$, $\Omega = \{HH, HT, TH, TT\}$
An event $E$ is any collection of outcomes from the sample space of a chance experiment. A simple event is an event consisting of exactly one outcome.
- Tossing a coin twice and obtaining at least one head: $E = \{HH, HT, TH\}$

36 Probability - axioms
A function $P$ that assigns a real number $P(A)$ to each event $A$ is a probability distribution or a probability measure if it satisfies the following three axioms:
1. $P(A) \ge 0$ for every $A$
2. $P(\Omega) = 1$
3. If $A_1, A_2, \ldots$ are disjoint, meaning $A_i \cap A_j = \emptyset$ for $i \ne j$, then $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$

37 Random variables - Definition
A random variable $X$ is a real-valued function defined on a sample space. In other words, a random variable associates a numerical value with each outcome of a random experiment.
A random variable $X$ is discrete if its set of possible values is discrete. Otherwise, it is continuous.
- Tossing a coin: $X(H) = 1$, $X(T) = 0$. We note $X = \{0, 1\}$
- Drug trial: number of patients cured by the new medicine in a sample of 1000 patients. $X = \{0, 1, 2, \ldots, 1000\}$

38 Discrete probability - distribution
The probability distribution of a discrete random variable $X$ taking values in $\{x_1, x_2, \ldots, x_n\}$ can be represented by a table:
Probability distribution of X
X    $x_1$    $x_2$    ...    $x_n$
P    $p_1$    $p_2$    ...    $p_n$
with $0 \le p_i \le 1$ and $\sum_{i=1}^{n} p_i = 1$
Drug trial with Cured = 1 and Not cured = 0
Probability distribution example: [table of X and P; values not preserved]

39 Couple of discrete random variables
Given two discrete random variables $X$ and $Y$, we can define a new random variable $(X, Y)$ whose joint distribution is defined by:
$p_{ij} = P(X = x_i, Y = y_j)$ with $0 \le p_{ij} \le 1$, $\sum_{i=1}^{n} \sum_{j=1}^{m} p_{ij} = 1$.
The distribution can be represented as a table:
Y \ X    $x_1$       $x_2$       ...    $x_n$
$y_1$    $p_{11}$    $p_{21}$    ...    $p_{n1}$
$y_2$    $p_{12}$    $p_{22}$    ...    $p_{n2}$
...
$y_m$    $p_{1m}$    $p_{2m}$    ...    $p_{nm}$
The marginal distribution of $X$: $P(X = x_i) = p_{i.} = \sum_{j=1}^{m} p_{ij}$
The marginal distribution of $Y$: $P(Y = y_j) = p_{.j} = \sum_{i=1}^{n} p_{ij}$
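
The marginal distributions are obtained by summing the joint probabilities over the other variable. A minimal sketch, with a made-up joint table (the numbers are hypothetical and only serve the illustration):

```python
def marginals(joint):
    """joint[i][j] = P(X = x_i, Y = y_j); return ([P(X = x_i)], [P(Y = y_j)])."""
    p_x = [sum(row) for row in joint]           # p_i. : sum over j
    p_y = [sum(col) for col in zip(*joint)]     # p_.j : sum over i
    return p_x, p_y

# Hypothetical joint distribution with n = 2 values of X and m = 3 values of Y.
joint = [
    [0.10, 0.20, 0.10],
    [0.30, 0.20, 0.10],
]
p_x, p_y = marginals(joint)
print(p_x, p_y)
```

Both marginal distributions must sum to 1, which is a quick sanity check on any joint table.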

40 Example - Diagnosing Tuberculosis (TB)
Before 1998, culturing was the existing gold standard for diagnosing TB. This method took 10 to 15 days to yield a positive or negative result. In 1998, investigators evaluated a DNA technique that turned out to be much faster ("LCx: A Diagnostic Alternative for the Early Detection of Mycobacterium tuberculosis Complex," Diagnostic Microbiology and Infectious Diseases [1998]).
- T models the outcome of the gold standard method: 1 indicates TB, 0 not TB
- N models the outcome of the DNA test: 1 indicates positive test, 0 negative test
The data are summarized in a table of T by N [values not preserved].

41 Example - Joint distribution calculation
[count table of T by N; values not preserved]
P(N = 0, T = 0) = [...], P(N = 0, T = 1) = [...]
P(N = 1, T = 0) = [...], P(N = 1, T = 1) = [...]
[joint distribution table of T by N; values not preserved]

42 Check yourself
Calculate P(T = 1). Choose the right answer.
- [numeric options not preserved]
- Not defined

43 Check yourself
Calculate P(N = 0). Choose the right answer.
- [numeric options not preserved]
- Not defined

44 Parameters of a random variable X
The expectation of a discrete random variable $X$ taking the values $x_1, x_2, \ldots, x_n$ with probabilities $p_1, p_2, \ldots, p_n$ is the number:
$\mu = E[X] = \sum_{i=1}^{n} x_i p_i$
We call variance of $X$ the number, if it exists:
$\sigma^2 = V(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2 = \sum_{i=1}^{n} p_i (x_i - \mu)^2$
$\sigma$ is called the standard deviation of $X$.
If $X$ and $Y$ are two random variables with expected values $E[X]$ and $E[Y]$, and $a$ and $b$ are two real numbers, we have the following:
$E[X + Y] = E[X] + E[Y]$, $E[aX + b] = aE[X] + b$
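
The formulas for the expectation and the variance translate directly into code. The sketch below uses the drug-trial variable (cured = 1 with probability 0.7, not cured = 0) and checks numerically that the two expressions of the variance agree; the function names and the constants a = 2, b = 3 are illustrative.

```python
def expectation(values, probs):
    """E[X] = sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))

def variance(values, probs):
    """V(X) = sum of p_i * (x_i - mu)^2."""
    mu = expectation(values, probs)
    return sum(p * (x - mu) ** 2 for x, p in zip(values, probs))

values, probs = [0, 1], [0.3, 0.7]
mu = expectation(values, probs)
var = variance(values, probs)

# Same variance through E[X^2] - E[X]^2.
var_alt = expectation([x ** 2 for x in values], probs) - mu ** 2

# Linearity: E[aX + b] = a E[X] + b, here with a = 2 and b = 3.
lin = expectation([2 * x + 3 for x in values], probs)
print(mu, var, var_alt, lin)
```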

45 Discrete distribution - Bernoulli
A random variable $X$ follows a Bernoulli distribution with parameter $p$, noted $\mathcal{L}(X) = \mathcal{B}(p)$, if it takes only two values, commonly noted 0 and 1, with probabilities:
$P(X = 1) = p$, $P(X = 0) = 1 - p$
- Example: drug trial where a patient is cured with probability 0.7
The expected value of $X$ is $p$, as:
$E[X] = 1 \times p + 0 \times (1 - p) = p$
The variance of $X$ is $p(1 - p)$, as:
$V(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2 = p - p^2 = p(1 - p)$

46 Discrete distribution - Binomial distribution
Given $n$ independent random variables $X_1, X_2, \ldots, X_n$ having the same distribution $\mathcal{B}(p)$, the random variable $Y = X_1 + X_2 + \ldots + X_n$, taking the values $0, 1, \ldots, n$, follows a binomial distribution noted $\mathcal{B}(n; p)$ with parameters $n$ and $p$. Its distribution is defined by:
$P(Y = k) = \binom{n}{k} p^k (1 - p)^{n-k}, \quad k = 0, 1, \ldots, n$
with $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ and $x! = x \times (x - 1) \times (x - 2) \times \ldots \times 1$
As a sum of independent Bernoulli random variables, we have:
$E[Y] = np$, $V(Y) = np(1 - p)$

47 Binomial distribution - Example
Sickle cell anemia is a genetic blood disorder where red blood cells lose their flexibility and assume an abnormal, rigid, "sickle" shape, which results in a risk of various complications. If both parents are carriers of the disease, then a child has a 25% chance of having the disease, a 50% chance of being a carrier, and a 25% chance of neither having the disease nor being a carrier.
If two parents who are carriers of the disease have 3 children, what is the probability that:
- (a) two will have the disease?
- (b) none will have the disease?
- (c) at least one will neither have the disease nor be a carrier?

48 Binomial distribution - Example
Let $X$ be a random variable that represents the number of children with the disease and $Y$ the number of children that neither have the disease nor are carriers. We have: $\mathcal{L}(X) = \mathcal{B}(3; 0.25)$ and $\mathcal{L}(Y) = \mathcal{B}(3; 0.25)$
Answers to the questions:
- (a) $P(X = 2) = \binom{3}{2} 0.25^2 (1 - 0.25) = 3 \times 0.0625 \times 0.75 \approx 0.14$
- (b) $P(X = 0) = \binom{3}{0} 0.25^0 (1 - 0.25)^3 = 0.75^3 \approx 0.42$
- (c) $P(Y = 1) + P(Y = 2) + P(Y = 3) = 1 - P(Y = 0) = 1 - 0.75^3 \approx 0.58$
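
The three answers can be checked numerically with the binomial formula $P(Y = k) = \binom{n}{k} p^k (1-p)^{n-k}$. This is a small sketch; the function and variable names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(Y = k) for a binomial distribution B(n; p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 3, 0.25
p_two_sick = binom_pmf(2, n, p)                 # (a) exactly two children with the disease
p_none_sick = binom_pmf(0, n, p)                # (b) no child with the disease
p_at_least_one_clear = 1 - binom_pmf(0, n, p)   # (c) at least one child neither sick nor carrier

print(f"(a) {p_two_sick:.4f}  (b) {p_none_sick:.4f}  (c) {p_at_least_one_clear:.4f}")

# The mean computed from the distribution matches E[Y] = np.
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
print(mean, n * p)
```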

49 Normal distribution
A random variable $X$ is said to follow a normal distribution with parameters $\mu \in \mathbb{R}$ and $\sigma^2 > 0$, noted $X \sim \mathcal{N}(\mu; \sigma^2)$, if its density is:
$f_X(t) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(t - \mu)^2}{2\sigma^2}\right), \quad t \in \mathbb{R}$
$E(X) = \mu$ and $Var(X) = \sigma^2$

50 Normal distribution rule

51 Check yourself
A doctor collects a large set of heart rate measurements that approximately follow a normal distribution. He only reports 3 statistics: the mean = 110 beats per minute, the minimum = 65 beats per minute, and the maximum = 155 beats per minute. Which of the following is most likely to be the standard deviation of the distribution?

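52 Calculate with the normal distribution
If $\mathcal{L}(X) = \mathcal{N}(\mu; \sigma^2)$, then the random variable $Z = \frac{X - \mu}{\sigma}$ has the standard normal distribution $\mathcal{N}(0; 1)$.
If $\mathcal{L}(X) = \mathcal{N}(\mu; \sigma^2)$ and given an interval $[a, b[$:
$P(a \le X < b) = P\left(\frac{a - \mu}{\sigma} \le \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right)$
$P(a \le X < b) = P\left(\frac{a - \mu}{\sigma} \le Z < \frac{b - \mu}{\sigma}\right)$
$P(a \le X < b) = P\left(Z < \frac{b - \mu}{\sigma}\right) - P\left(Z < \frac{a - \mu}{\sigma}\right)$
$P(a \le X < b) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)$
$\Phi$ is the cumulative distribution function of the standard normal, such that $\Phi(z) = P(Z \le z)$.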
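
Instead of reading $\Phi$ from a table, it can be computed with the error function from Python's standard library, using the identity $\Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}(z/\sqrt{2})\right)$. The sketch below standardizes the interval bounds exactly as in the derivation above; the function names and the example values ($\mu = 100$, $\sigma = 5$, interval from 90 to 105) are chosen for illustration.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution, Phi(z) = P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_interval_prob(a, b, mu, sigma):
    """P(a <= X < b) for X following N(mu; sigma^2), by standardization."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

prob = normal_interval_prob(90, 105, mu=100, sigma=5)
print(f"P(90 <= X < 105) = {prob:.4f}")
```

Values computed this way can differ from table-based results in the last digit, because tables round $\Phi$ to four decimals.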

53 Standard normal distribution table
$P(Z \le 0.14) = 0.5557$
$P(Z \le 0.58) = 0.7190$
$P(0.14 \le Z \le 0.58) = 0.7190 - 0.5557 = 0.1633$

54 Calculations
$P(Z > 0.23) = 1 - P(Z \le 0.23) = 1 - 0.5910 = 0.4090$

55 Calculations
$P(Z \le -0.53) = P(Z \ge 0.53) = 1 - P(Z \le 0.53) = 1 - 0.7019 = 0.2981$

56 Calculations with $\mathcal{L}(X) = \mathcal{N}(25; 16)$
$P(X \le 26.4) = P\left(\frac{X - 25}{4} \le \frac{26.4 - 25}{4}\right) = P(Z \le 0.35) = 0.6368$

57 Calculations
If $\mathcal{L}(X) = \mathcal{N}(100; 25)$, calculate $P(90 \le X \le 105)$.
$P\left(\frac{90 - 100}{5} \le \frac{X - 100}{5} \le \frac{105 - 100}{5}\right) = P(-2 \le Z \le 1)$
$P(-2 \le Z \le 1) = P(Z \le 1) - P(Z < -2)$
$P(Z \le 1) - P(Z < -2) = P(Z \le 1) - (1 - P(Z \le 2))$
$P(-2 \le Z \le 1) = P(Z \le 1) + P(Z \le 2) - 1$
$P(-2 \le Z \le 1) = 0.8413 + 0.9772 - 1 = 0.8185$
$P(90 \le X \le 105) = 0.8185$

58 Properties
If two random variables $X_1$ and $X_2$ are independent with distributions $\mathcal{N}(\mu_1; \sigma_1^2)$ and $\mathcal{N}(\mu_2; \sigma_2^2)$ respectively, and $\alpha$, $\beta$ are real numbers, then:
$\mathcal{L}(X_1 + X_2) = \mathcal{N}(\mu_1 + \mu_2;\; \sigma_1^2 + \sigma_2^2)$
$\mathcal{L}(X_1 - X_2) = \mathcal{N}(\mu_1 - \mu_2;\; \sigma_1^2 + \sigma_2^2)$
$\mathcal{L}(\alpha X_1 - \beta X_2) = \mathcal{N}(\alpha\mu_1 - \beta\mu_2;\; \alpha^2\sigma_1^2 + \beta^2\sigma_2^2)$
If $\mathcal{L}(X_1) = \mathcal{N}(15; 16)$ and $\mathcal{L}(X_2) = \mathcal{N}(10; 9)$, let $Y = X_1 - X_2$, so that $\mathcal{L}(Y) = \mathcal{N}(5; 25)$. We have:
$P(X_1 - X_2 \le 3) = P(Y \le 3)$
$P(X_1 - X_2 \le 3) = P\left(\frac{Y - 5}{5} \le -\frac{2}{5}\right)$
$P(X_1 - X_2 \le 3) = P(Z \le -0.4) = P(Z > 0.4)$
$P(X_1 - X_2 \le 3) = 1 - P(Z \le 0.4) = 0.3446$
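
The rule for a difference of independent normal variables can be sketched numerically: the means subtract, the variances add. The example uses $X_1$ with mean 15 and variance 16 and $X_2$ with mean 10 and variance 9, and evaluates $P(X_1 - X_2 \le 3)$; the function names are hypothetical, and $\Phi$ is computed with the error function rather than a table.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution, Phi(z) = P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def difference_params(alpha, mu1, var1, beta, mu2, var2):
    """Mean and variance of alpha*X1 - beta*X2 for independent normal X1, X2."""
    mean = alpha * mu1 - beta * mu2
    variance = alpha**2 * var1 + beta**2 * var2   # variances add, even for a difference
    return mean, variance

# Y = X1 - X2 with X1 ~ N(15; 16) and X2 ~ N(10; 9).
mu_y, var_y = difference_params(1, 15, 16, 1, 10, 9)
prob = phi((3 - mu_y) / sqrt(var_y))              # P(Y <= 3) after standardization
print(mu_y, var_y, f"{prob:.4f}")
```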

59 Check yourself
$X_1$, $X_2$ and $X_3$ are independent and normally distributed with the same normal distribution $\mathcal{N}(0, 1)$. $Y$ is defined from $X_1$, $X_2$ and $X_3$ [expression not preserved].
What is the distribution of $Y$?
- Binomial
- Normal
- Multivariate normal
- We cannot add random variables

60 Check yourself
$Y$ is defined from $X_1$, $X_2$ and $X_3$ [expression not preserved].
What is the expected value of $Y$?

61 Check yourself
$Y$ is defined from $X_1$, $X_2$ and $X_3$ [expression not preserved].
What is the variance of $Y$?

62 See you next time

Statistics. Nicodème Paul Faculté de médecine, Université de Strasbourg

Statistics. Nicodème Paul Faculté de médecine, Université de Strasbourg Statistics Nicodème Paul Faculté de médecine, Université de Strasbourg Course logistics Statistics & Experimental plani cation Course website: http://statnipa.appspot.com/ (http://statnipa.appspot.com/)

More information

Statistics - Lecture 04

Statistics - Lecture 04 Statistics - Lecture 04 Nicodème Paul Faculté de médecine, Université de Strasbourg file:///users/home/npaul/enseignement/esbs/2018-2019/cours/04/index.html#40 1/40 Correlation In many situations the objective

More information

Recap of Basic Probability Theory

Recap of Basic Probability Theory 02407 Stochastic Processes? Recap of Basic Probability Theory Uffe Høgsbro Thygesen Informatics and Mathematical Modelling Technical University of Denmark 2800 Kgs. Lyngby Denmark Email: uht@imm.dtu.dk

More information

Statistics - Lecture 05

Statistics - Lecture 05 Statistics - Lecture 05 Nicodème Paul Faculté de médecine, Université de Strasbourg http://statnipa.appspot.com/cours/05/index.html#47 1/47 Descriptive statistics and probability Data description and graphical

More information

Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions. Dr. Mohammed Alahmed Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

More information

Recap of Basic Probability Theory

Recap of Basic Probability Theory 02407 Stochastic Processes Recap of Basic Probability Theory Uffe Høgsbro Thygesen Informatics and Mathematical Modelling Technical University of Denmark 2800 Kgs. Lyngby Denmark Email: uht@imm.dtu.dk

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

Lecture 1: Descriptive Statistics

Lecture 1: Descriptive Statistics Lecture 1: Descriptive Statistics MSU-STT-351-Sum 15 (P. Vellaisamy: MSU-STT-351-Sum 15) Probability & Statistics for Engineers 1 / 56 Contents 1 Introduction 2 Branches of Statistics Descriptive Statistics

More information

Lecture 1: Probability Fundamentals

Lecture 1: Probability Fundamentals Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability

More information

An introduction to biostatistics: part 1

An introduction to biostatistics: part 1 An introduction to biostatistics: part 1 Cavan Reilly September 6, 2017 Table of contents Introduction to data analysis Uncertainty Probability Conditional probability Random variables Discrete random

More information

Exam 1 Review (Notes 1-8)

Exam 1 Review (Notes 1-8) 1 / 17 Exam 1 Review (Notes 1-8) Shiwen Shen Department of Statistics University of South Carolina Elementary Statistics for the Biological and Life Sciences (STAT 205) Basic Concepts 2 / 17 Type of studies:

More information

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke BIOL 51A - Biostatistics 1 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? FEV (l) 1 2 3 4 5 No Yes Smoke BIOL 51A - Biostatistics 1 2 Box Plot a.k.a box-and-whisker diagram or candlestick chart

More information

BNG 495 Capstone Design. Descriptive Statistics

BNG 495 Capstone Design. Descriptive Statistics BNG 495 Capstone Design Descriptive Statistics Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential statistical methods, with a focus

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

Homework 4 Solution, due July 23

Homework 4 Solution, due July 23 Homework 4 Solution, due July 23 Random Variables Problem 1. Let X be the random number on a die: from 1 to. (i) What is the distribution of X? (ii) Calculate EX. (iii) Calculate EX 2. (iv) Calculate Var

More information

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions

More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information

University of Jordan Fall 2009/2010 Department of Mathematics

University of Jordan Fall 2009/2010 Department of Mathematics handouts Part 1 (Chapter 1 - Chapter 5) University of Jordan Fall 009/010 Department of Mathematics Chapter 1 Introduction to Introduction; Some Basic Concepts Statistics is a science related to making

More information

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Fundamentals to Biostatistics Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Statistics collection, analysis, interpretation of data development of new

More information

Introduction to Probability and Statistics Slides 1 Chapter 1

Introduction to Probability and Statistics Slides 1 Chapter 1 1 Introduction to Probability and Statistics Slides 1 Chapter 1 Prof. Ammar M. Sarhan, asarhan@mathstat.dal.ca Department of Mathematics and Statistics, Dalhousie University Fall Semester 2010 Course outline

More information

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Histograms, Mean, Median, Five-Number Summary and Boxplots, Standard Deviation Thought Questions 1. If you were to

More information

Learning Objectives for Stat 225

Learning Objectives for Stat 225 Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:

More information

Chapter 1 Descriptive Statistics

Chapter 1 Descriptive Statistics MICHIGAN STATE UNIVERSITY STT 351 SECTION 2 FALL 2008 LECTURE NOTES Chapter 1 Descriptive Statistics Nao Mimoto Contents 1 Overview 2 2 Pictorial Methods in Descriptive Statistics 3 2.1 Different Kinds

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

Final Exam STAT On a Pareto chart, the frequency should be represented on the A) X-axis B) regression C) Y-axis D) none of the above

Final Exam STAT On a Pareto chart, the frequency should be represented on the A) X-axis B) regression C) Y-axis D) none of the above King Abdul Aziz University Faculty of Sciences Statistics Department Final Exam STAT 0 First Term 49-430 A 40 Name No ID: Section: You have 40 questions in 9 pages. You have 90 minutes to solve the exam.

More information

Biostatistics and Epidemiology, Midterm Review

Biostatistics and Epidemiology, Midterm Review Biostatistics and Epidemiology, Midterm Review New York Medical College By: Jasmine Nirody This review is meant to cover lectures from the first half of the Biostatistics course. The sections are not organised

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

Why Statistics? Deal with uncertainty in repeated scientific measurements; draw conclusions from data; design valid experiments and draw reliable conclusions

MATH4427 Notebook 4 Fall Semester 2017/2018

Prepared by Professor Jenny Baglivo. © Copyright 2009-2018 by Jenny A. Baglivo. All Rights Reserved. 4 MATH4427 Notebook 4. 4.1 Kth Order Statistics and Their

Stat 101 Exam 1 Important Formulas and Concepts 1

Chapter 1. 1.1 Definitions. 1. Data: Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables

Random Variables (RVs) are quantitative (discrete or continuous) or qualitative (nominal or ordinal). Notation and Definitions: a Sample is

UQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables

To be provided to students with STAT2201 or CIVIL-2530 (Probability and Statistics) Exam. Main exam date: Tuesday, 20 June 1

Statistics in medicine

Lecture 3: Bivariate association: Categorical variables. Proportion in one group. One group is measured one time: z test. Use the z distribution as an approximation to the binomial

Tastitsticsss? What's that? Principles of Biostatistics and Informatics. Variables, outcomes.

Statistics describes random mass phenomena. Principles of Biostatistics and Informatics, nd Lecture: Descriptive Statistics. 3 th September, Dániel VERES. Data Collecting (Sampling)

STAT 4385 Topic 01: Introduction & Review

Xiaogang Su, Ph.D., Department of Mathematical Science, University of Texas at El Paso, xsu@utep.edu. Spring, 2016. Outline: Welcome; What is Regression Analysis?; Basics

Lecture 2: Probability and Distributions

Ani Manichaikul, amanicha@jhsph.edu, 17 April 2007. Probability: Why do we care? Probability helps us by: Allowing us to translate scientific questions into

Statistical Theory 1

Set Theory and Probability. Paolo Bautista, September 12, 2017. Set Theory: We start by defining terms in Set Theory which will be used in the following sections. Definition 1: A set is

Lecture 2: Review of Basic Probability Theory

ECE 830 Fall 2010, Statistical Signal Processing. Instructor: R. Nowak, scribe: R. Nowak. Probabilistic models will be used throughout the course to represent

MS&E 226: Small Data

Lecture 9: Logistic regression (v2). Ramesh Johari, ramesh.johari@stanford.edu. Regression methods for binary outcomes. Binary outcomes: For the duration of this lecture suppose

2. AXIOMATIC PROBABILITY

IA Probability, Lent Term. 2. The axioms. The formulation for classical probability in which all outcomes or points in the sample space are equally likely is too restrictive to develop

Introduction to Statistical Analysis

Changyu Shen, Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Beth Israel Deaconess Medical Center, Harvard Medical School. Objectives: Descriptive

Chapter 3. Chapter 3 sections

3.1 Random Variables and Discrete Distributions; 3.2 Continuous Distributions; 3.4 Bivariate Distributions; 3.5 Marginal Distributions; 3.6 Conditional Distributions; 3.7 Multivariate Distributions

Probability Distributions.

http://www.pelagicos.net/classes_biometry_fa18.htm Probability: Measuring Discrete Outcomes. Plotting probabilities for discrete outcomes. NOTE: Area within

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Chapter 3: Statistics for Describing, Exploring, and Comparing Data. 3-1 Overview; 3-2 Measures

Lecture 4: Probability, Proof Techniques, Method of Induction Lecturer: Lale Özkahya

BBM 205 Discrete Mathematics, Hacettepe University, http://web.cs.hacettepe.edu.tr/ bbm205. Resources: Kenneth Rosen, Discrete

Answer keys for Assignment 10: Measurement of study variables (The correct answer is underlined in bold text)

1. A quick and easy indicator of dispersion is a. Arithmetic mean b. Variance c. Standard deviation

are the objects described by a set of data. They may be people, animals or things.

(c) Epstein, Carter and Bollinger 2016. CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. 5.1 Creating Histograms

MULTINOMIAL PROBABILITY DISTRIBUTION

MTH/STA 56. The multinomial probability distribution is an extension of the binomial probability distribution when the identical trial in the experiment has more than

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Lecture 2: Probability and Distributions. Sandy Eckel, seckel@jhsph.edu, 22 April 2008. Probability helps us by: Allowing us to translate scientific questions into mathematical

STATISTICS 141 Final Review

Bin Zou, bzou@ualberta.ca, Department of Mathematical & Statistical Sciences, University of Alberta, Winter 2015

Notes slides from before lecture. CSE 21, Winter 2017, Section A00. Lecture 16 Notes. Class URL:

Class URL: http://vlsicad.ucsd.edu/courses/cse21-w17/ Notes March 8 (1) This week: Days

Last time. Numerical summaries for continuous variables. Center: mean and median. Spread: Standard deviation and inter-quartile range

Lecture 4. Exploratory graphics: Histogram (revisit modes). Histograms. Histogram

STAT 200 Chapter 1 Looking at Data - Distributions

What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the

CHAPTER 1. Introduction

Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?!

Collecting Data: We collect data through observation, surveys and experiments. We can collect two different types of data: categorical and quantitative. Algebra 1 Table of

Probability Theory and Simulation Methods

Feb 28th, 2018. Lecture 10: Random variables. Countdown to midterm (March 21st): 28 days. Week 1, Chapter 1: Axioms of probability. Week 2, Chapter 3: Conditional probability and independence. Week 4, Chapters

Conditional Probability (cont'd)

April 26, 2006. Midterm Problems: In a ten-question true-false exam, find the probability that a student gets a grade of 70 percent or better

Conditional Probability (cont...) 10/06/2005

Independent Events: Two events E and F are independent if both E and F have positive probability and if P(E | F) = P(E), and P(F | E) = P(F). Theorem. If

Lecture 2. Descriptive Statistics: Measures of Center

Descriptive Statistics summarize or describe the important characteristics of a known set of data. Inferential Statistics use sample data to make inferences

1: PROBABILITY REVIEW

Marek Rutkowski, School of Mathematics and Statistics, University of Sydney, Semester 2, 2016. Outline: We will review the following

Chapter 2. Continuous random variables

Outline: Review of probability: events and probability; Random variable; Probability and Cumulative distribution function; Review of discrete random variable; Introduction

Chapter 3. Data Description

Graphical Methods. Pie chart: It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partitioning a circle.

MATH 19B FINAL EXAM PROBABILITY REVIEW PROBLEMS SPRING, 2010

This handout is meant to provide a collection of exercises that use the material from the probability and statistics portion of the course. The

Chapter 2 Solutions Page 15 of 28

2.50 a. The median is 55. The mean is about 105. b. The median is a more representative "average" than the mean here. Notice in the stem-and-leaf plot on p.3 of the text that

Useful material for the course

Suggested textbooks: Mood A.M., Graybill F.A., Boes D.C., Introduction to the Theory of Statistics. McGraw-Hill, New York, 1974. [very complete] M.C. Whitlock, D. Schluter,

Description of Samples and Populations

Random Variables: Data are generated by some underlying random process or phenomenon. Any datum (data point) represents the outcome of a random variable. We represent

Glossary for the Triola Statistics Series

Absolute deviation: The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values. Acceptance sampling

Statistics 1B

Lecture 1. Introduction and probability review. 1.1. What is Statistics?

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014

(All of the slides in this course have been adapted from previous versions

Contents. Acknowledgments. xix

Table of Contents: Preface; Acknowledgments; 1 Introduction; The Role of the Computer in Data Analysis; Statistics: Descriptive and Inferential; Variables and Constants; The Measurement of Variables

Salt Lake Community College MATH 1040 Final Exam Fall Semester 2011 Form E

Name: Instructor: Time Limit: 10 minutes. Any hand-held calculator may be used. Computers, cell phones, or other communication devices

Lecture 3. Measures of Relative Standing and. Exploratory Data Analysis (EDA)

Problem: The average weekly sales of a small company are $10,000 with a standard deviation of $450. This week their sales were

Descriptive Statistics-I. Dr Mahmoud Alhussami

Biostatistics: What is biostatistics? A branch of applied math that deals with collecting, organizing and interpreting data using well-defined procedures.

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Last Lecture: Distinguish Populations from Samples; Importance of identifying a population and well chosen sample; Knowing different Sampling Techniques; Distinguish Parameters from Statistics; Knowing different

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

Introduction to the Practice of STATISTICS, SEVENTH EDITION. Moore / McCabe / Craig. Chapter 7 Inference for Distributions. 7.1 Inference for

Chapter 7: Theoretical Probability Distributions Variable - Measured/Categorized characteristic

BSTT523: Pagano & Gavreau, Chapter 7. Random Variable (R.V.) X: Assumes values (x) by chance. Discrete R.V.

Discrete Multivariate Statistics

Univariate Discrete Random variables: Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

Outline PMF, CDF and PDF Mean, Variance and Percentiles Some Common Distributions. Week 5 Random Variables and Their Distributions

Week 5 Objectives: This week we give more general definitions of mean value, variance and percentiles, and introduce the first probability models for discrete

Lecture 4 --- Lecture 5

Lecture 4 --- Lecture 5. A. Basic Concepts (4.1-4.2) 1. Experiment: A process of observing a phenomenon that has variation in its outcome. Examples: (E1) Rolling a die, (E2) Drawing a card from a shuffled

Notation: X = random variable; x = particular value; P(X = x) denotes probability that X equals the value x.

Ch. 16 Random Variables. Definition: A random variable is a numerical measurement of the outcome of a random phenomenon. A discrete random variable is a random variable that assumes separate values. # of people

Bayesian statistics, simulation and software

Module 1: Course intro and probability brush-up. Department of Mathematical Sciences, Aalborg University. Course outline: Course consists of 12 half-days

Chapter 4.notebook. August 30, 2017

SOCS: When describing a distribution, make sure to always tell about four things: shape, outliers, center, and spread

Math 180B Problem Set 3

Problem 1. (Exercise 3.1.2) Solution. By the definition of conditional probabilities we have Pr{X_2 = 1, X_3 = 1 | X_1 = 0} = Pr{X_3 = 1 | X_2 = 1, X_1 = 0} Pr{X_2 = 1 | X_1 = 0} = P

3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability

3.1 Week 1 Review. Creativity is more than just being different. Anybody can play weird; that's easy. What's hard is to be

Discrete Structures for Computer Science

William Garrison, bill@cs.pitt.edu, 6311 Sennott Square. Lecture #24: Probability Theory. Based on materials developed by Dr. Adam Lee. Not all events are equally likely

4. Conditional Probability

Definitions and Interpretations. The Basic Definition: As usual, we start with a random experiment

dates given in your syllabus.

For exams (MD1, MD2, and Final): You may bring one 8.5 by 11 sheet of paper with formulas and notes written or typed on both sides to each exam. For the rest of the quizzes, you will take your

Unobservable Parameter. Observed Random Sample. Calculate Posterior. Choosing Prior. Conjugate prior. population proportion, p prior:

Pi Priors. Unobservable Parameter: population proportion, p; prior: π(p). Conjugate prior: π(p) ~ Beta(a, b); same PDF family; exponential family only. Posterior: π(p | y) ~ Beta(a + y, b + n - y). Observed

Lecture 01: Introduction

Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data, Spring 2011. Division of Biostatistics and Epidemiology, Medical University of South Carolina

Lecture 2: Repetition of probability theory and statistics

Algorithms for Uncertainty Quantification SS8, IN2345. Tobias Neckel, Scientific Computing in Computer Science, TUM. Concept of Building Block: Prerequisites:

Final Exam - Solutions

Ecn 102 - Analysis of Economic Data, University of California - Davis, March 19, 2010. Instructor: John Parman. You have until 5:30pm to complete this exam. Please remember to put your

Brief Review of Probability

Nuno Vasconcelos (Ken Kreutz-Delgado), ECE Department, UCSD. Probability: Probability theory is a mathematical language to deal with processes or experiments that are non-deterministic

Advanced Herd Management Probabilities and distributions

Anders Ringgaard Kristensen. Outline: Probabilities; Conditional probabilities; Bayes theorem; Distributions; Discrete; Continuous; Distribution

Conditional Probabilities

BIOST 514/517 Biostatistics I / Applied Biostatistics I. Kathleen Kerr, Ph.D., Associate Professor of Biostatistics, University of Washington. Lecture Outline: Probability; Diagnostic Testing; Random variables:

REVIEW: Midterm Exam. Spring 2012

Introduction. Important Definitions: Data; Statistics; A Population; A census; A sample. Types of Data. Parameter (Describing a characteristic of the Population). Statistic

Math 10 - Compilation of Sample Exam Questions + Answers

Sample Exam Question 1: We have a population of size N. Let p be the independent probability of a person in the population developing a disease. Answer the

Practice problems from chapters 2 and 3

Question-1. For each of the following variables, indicate whether it is quantitative or qualitative and specify which of the four levels of measurement (nominal, ordinal,

Dynamic Programming Lecture #4

Outline: Probability Review; Probability space; Conditional probability; Total probability; Bayes rule; Independent events; Conditional independence; Mutual independence; Probability

EXAM. Exam #1. Math 3342 Summer II, July 21, 2000 ANSWERS

i pts. Problem. Consider the following data: 7, 8, 9, 2,, 7, 2, 3. Find the first quartile, the median, and the third quartile. Make a box and whisker

1. Poisson distribution is widely used in statistics for modeling rare events.

Discrete probability distributions - Class 5. January 20, 2014. Debdeep Pati. Poisson distribution. 2. Ex. Infectious Disease