Introduction to Probability and Statistics


1 Introduction to Probability and Statistics Xi Kathy Zhou, PhD Division of Biostatistics and Epidemiology Department of Public Health Feb. 2008

2 Overview Statistics: "The mathematics of the collection, organization and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling" (definition in the American Heritage Dictionary). Why statistics: by studying the characteristics of a small collection of observations, proper inference about the entire population can be drawn. Probability theory is the basic tool for statistical inference.

3 Outline Basic concepts in probability: events and random variables; probability and probability distributions; means, variance and moments; joint, marginal and conditional probabilities; dependence and independence. Basic concepts in statistics: data; descriptive statistics; statistical inference (estimation); statistical inference (hypothesis testing).

4 Probability a measure of uncertainty Example: random experiments and possible outcomes: toss a coin {H}, {T}; roll a 6-sided die {3}, {5}, {1,2,3}. What do you think you'll get in the above experiments? How sure are you? Why? Each outcome is equally probable. Probability is used as a way to measure uncertainty.

5 Events Definitions: Random experiment: an experiment which can result in different outcomes, and for which the outcome is unknown in advance. Sample space Ω: the set of all possible elementary outcomes of an experiment. Event: a subset of the sample space Ω. Examples: toss a coin, sample space {H, T}, events {H}, {T}; roll a 6-sided die, sample space {1,2,3,4,5,6}, events {3}, {5}, {1,2,3}.

6 Probability measure Sigma field F: a collection of subsets of Ω that satisfies the following: 1. Ø ∈ F; 2. if A ∈ F, then Aᶜ ∈ F; 3. if A, B ∈ F, then A ∪ B ∈ F and A ∩ B ∈ F. Probability measure P on (Ω, F): a function P: F → [0,1] satisfying the following properties (Ø denotes the empty set): 1. P(A) ≥ 0 for any A ∈ F; 2. P(Ω) = 1; 3. if A, B ∈ F and A ∩ B = Ø, then P(A ∪ B) = P(A) + P(B). The 6-sided die example: sigma field {Ø, {1}, …, {6}, {1,2}, …, {1,2,3,4,5,6}} (all subsets); sigma field {Ø, {1,2,3}, {4,5,6}, {1,2,3,4,5,6}}.

7 Probability measure - some properties Comparing the uncertainty of events: if A, B ⊆ Ω and A ⊆ B, then P(A) ≤ P(B). Assessing the uncertainty associated with other events: P(Aᶜ) = 1 − P(A) for A ⊆ Ω; P(A₁ ∪ A₂ ∪ … ∪ A_k) = P(A₁) + … + P(A_k) for pairwise disjoint A₁, …, A_k ⊆ Ω; P(A ∪ B) = P(A) + P(B) − P(A ∩ B) for any A, B ⊆ Ω. (Venn diagram of Ω, A−B, A ∩ B, B−A illustrating the last rule.) Example (rolling a 6-sided die): if we know P({1}), …, P({6}), we can derive the uncertainty of more complicated events such as P({1,2,4}).
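The rules above can be checked by brute-force enumeration for a finite sample space. A minimal sketch for the die example (the events A and B are illustrative choices, not from the slides; note that `|` and `&` below are Python's set union and intersection):

```python
from fractions import Fraction

# Sample space for one roll of a fair 6-sided die; each outcome has probability 1/6.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event & omega), len(omega))

A = {1, 2, 4}
B = {2, 3}

# Additivity over elementary outcomes: P({1,2,4}) = P({1}) + P({2}) + P({4})
assert P(A) == P({1}) + P({2}) + P({4})

# Inclusion-exclusion: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
assert P(A | B) == P(A) + P(B) - P(A & B)

# Complement rule: P(A^c) = 1 − P(A)
assert P(omega - A) == 1 - P(A)
```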

8 Probabilities of events - Examples Experiment: randomly picking a DNA sequence of length 3. Event A: the picked sequence is ATG. P(A) = 1/4³ = 1/64. Experiment: randomly taking a DNA subsequence of length 20 from a length-100 sequence containing 20 A's. Event A: the drawn sequence consists of the 20 A's. What is the sample space, and what is the probability of event A? Answer: with all (100 choose 20) choices of 20 positions equally likely, P(A) = 1 / (100 choose 20).
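The first example can be verified by enumerating the whole sample space, a sketch of the "equally likely outcomes" counting argument:

```python
from fractions import Fraction
from itertools import product

# Enumerate all DNA sequences of length 3 over the alphabet {A, C, G, T}.
seqs = [''.join(s) for s in product('ACGT', repeat=3)]

# Exactly one of the 4**3 = 64 equally likely outcomes is ATG.
p_atg = Fraction(seqs.count('ATG'), len(seqs))
print(p_atg)  # 1/64, matching 1/4^3
```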

9 Relationship of two events Conditional probability: let Ω be an event space and let P be a probability measure on Ω. Let B ⊆ Ω be an event (on which we want to condition) with P(B) > 0. The function P(· | B): F → [0,1], P(A | B) = P(A ∩ B) / P(B), A ⊆ Ω, defines a probability measure on Ω, the conditional probability given B. (Proof as exercise.) Independence: let Ω be an event space and let P be a probability measure on Ω. Two events A, B ⊆ Ω with P(A) > 0, P(B) > 0 are called (stochastically) independent if one of the following equivalent conditions holds: P(A ∩ B) = P(A) P(B); P(A | B) = P(A); P(B | A) = P(B). Example: throwing a six-sided fair die, event A = even number, B = value smaller than 5, C = prime number, D = value smaller than 4. P(A) = ?, P(B) = ?, P(A ∩ B) = ? Are A and B independent? P(C) = ?, P(D) = ?, P(C ∩ D) = ? Are C and D independent?
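The die example can be worked out by counting; a sketch (taking "smaller than 5" for B, as the garbled "B=<5" appears to mean):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event), len(omega))

A = {2, 4, 6}        # even number
B = {1, 2, 3, 4}     # value smaller than 5
C = {2, 3, 5}        # prime number
D = {1, 2, 3}        # value smaller than 4

def cond(a, b):
    # P(a | b) = P(a ∩ b) / P(b)
    return P(a & b) / P(b)

# A and B are independent: P(A ∩ B) = 1/3 = P(A) P(B), and P(A | B) = P(A).
assert P(A & B) == P(A) * P(B)
assert cond(A, B) == P(A)

# C and D are not: P(C ∩ D) = 1/3, but P(C) P(D) = 1/4.
assert P(C & D) != P(C) * P(D)
```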

10 Random variable Random variable: a function X: Ω → R with the property that {ω ∈ Ω : X(ω) ≤ x} ∈ F for each x ∈ R. A more common description of the results of a random experiment: it takes on values from a set of mutually exclusive and collectively exhaustive states and represents each with a number. Usually denoted by capital letters, e.g. X, Y, Z, etc. Realizations of random variables are usually denoted in lower case, e.g. x, y, z, etc. Can be discrete or continuous.

11 Types of Random Variables Discrete random variable: any variable whose possible values either form a finite set or can be listed in a countably infinite sequence. Continuous random variable: any variable whose possible values consist of an entire interval on the number line.

12 Probability distributions Definition: the probability distribution of a random variable X is the function F: R → [0,1] given by F(x) = P(X ≤ x). It characterizes the uncertainty of a random experiment before the experiment is conducted, i.e. it tells us that some results are more likely than others.

13 Discrete Random Variable Probability mass function (pmf) A discrete random variable X with values x₁, x₂, …, x_k, … has probability mass function P(X = x_i) = p_i, i = 1, 2, …, k, …, where the probabilities p_i satisfy 0 ≤ p_i ≤ 1 and p₁ + p₂ + … + p_k + … = 1. Range of this random variable: {x₁, x₂, …, x_k, …}.

14 Discrete Random Variable Cumulative distribution function (cdf) The cumulative distribution function F(x) of a discrete random variable X is defined by F(x) = P(X ≤ x) = ∑_{i: x_i ≤ x} p_i. Properties of the cdf: 0 ≤ F(x) ≤ 1; if x ≤ y then F(x) ≤ F(y). Discrete case: a step function, continuous from the right, with jump discontinuities at x₁, x₂, …, x_k, … of heights p₁, p₂, …, p_k, …

15 Discrete Random Variable pmf and cdf example Random experiment: roll 2 dice. Random variable X: sum of both values. (Plots of the probability mass function and the cumulative distribution function.)
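The pmf and cdf of this example can be built by enumerating all 36 equally likely ordered outcomes, a minimal sketch:

```python
from fractions import Fraction
from itertools import product

# pmf of X = sum of two fair dice, by enumerating all 36 ordered outcomes.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, 0) + Fraction(1, 36)

def cdf(x):
    # F(x) = P(X <= x): cumulative sum of the pmf
    return sum((p for s, p in pmf.items() if s <= x), Fraction(0))

print(pmf[7])    # 1/6 — seven is the most likely sum
print(cdf(12))   # 1 — the cdf reaches 1 at the largest value
```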

16 Discrete Random Variable Probability calculation rules From the cdf, probabilities of other events follow, e.g. P(X > x) = 1 − F(x) and P(a < X ≤ b) = F(b) − F(a).

17 Discrete random variables Examples of common distributions Discrete uniform distribution (rolling a fair 6-sided die). Geometric distribution (repeat a Bernoulli experiment until the first success, i.e. the first occurrence of an event A).

18 Discrete random variable Discrete uniform distribution A discrete random variable X is called uniformly distributed on the range {x₁, x₂, …, x_k} if P(X = x_i) = 1/k for all i = 1, …, k. Example: roll a fair die. (Plot of the probability distribution of a uniform distribution.)

19 Discrete Random Variable Geometric distribution (1) Random experiment: repeat a Bernoulli experiment until the first success. Events: {H, TH, TTH, …}. Probability of a success: P(H) = π. Random variable X: number of trials until the first success, with range {1, 2, …}. X has a geometric distribution with parameter π. The probability mass function has the form P(X = k) = (1 − π)^(k−1) π, and the cumulative distribution function has the form F(k) = 1 − (1 − π)^k.
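These two formulas can be implemented and cross-checked directly (the cdf must equal the cumulative sum of the pmf); a sketch for a fair coin, π = 0.5:

```python
# Geometric distribution with success probability pi:
#   P(X = k) = (1 - pi)**(k - 1) * pi,   F(k) = 1 - (1 - pi)**k
pi = 0.5  # e.g. waiting for the first head of a fair coin

def geom_pmf(k, pi):
    return (1 - pi) ** (k - 1) * pi

def geom_cdf(k, pi):
    return 1 - (1 - pi) ** k

# Consistency check: the cdf is the cumulative sum of the pmf.
assert abs(geom_cdf(10, pi) - sum(geom_pmf(k, pi) for k in range(1, 11))) < 1e-12

print(geom_pmf(3, pi))  # P(TTH) = 0.125
```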

20 Discrete random variable Geometric distribution (2)

21 Discrete Random variable Geometric distribution (3)

22 Discrete random variable - Mean The value you expect to get in a random experiment is the mean. Example: if you toss a coin 10 times, you expect to get 5 heads and 5 tails, because the probability of getting "heads" is 0.5 and 10 × 0.5 = 5. Definition: the mean of a discrete random variable with values x₁, x₂, …, x_k, … and probability distribution p₁, p₂, …, p_k, … is E(X) = ∑_i x_i p_i. Note that E(X) characterizes the random experiment.

23 Discrete random variable Mean (Examples) Binary random variable X: assume P(X=1) = π and P(X=0) = 1 − π; then E(X) = 0·P(X=0) + 1·P(X=1) = π. Toss a fair coin, X = gain/loss of a monetary unit: if P(X=1) = P(X=0), E(X) = ? Roll a fair die once, X = value showing: E(X) = ? Roll it twice, X = sum of values: E(X) = ?
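The die questions can be answered by applying the definition E(X) = ∑ x_i p_i directly; a sketch:

```python
from fractions import Fraction

def mean(pmf):
    # E(X) = sum_i x_i * p_i for a pmf given as {value: probability}
    return sum(x * p for x, p in pmf.items())

# One roll of a fair die: E(X) = (1 + 2 + ... + 6)/6 = 7/2
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(mean(die))  # 7/2

# Sum of two rolls: E(X + Y) = E(X) + E(Y) = 7 (the linearity rule below)
two = {}
for a in range(1, 7):
    for b in range(1, 7):
        two[a + b] = two.get(a + b, 0) + Fraction(1, 36)
assert mean(two) == 7
```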

24 Discrete random variable Variance and Standard deviation The variance of a discrete random variable is Var(X) = ∑_i (x_i − E(X))² p_i = E[(X − E(X))²]. The standard deviation is the positive square root, σ = √Var(X).

25 Discrete random variables Independence Definition: two discrete random variables X with range {x₁, x₂, …, x_k, …} and Y with range {y₁, y₂, …, y_l, …} are called independent if P(X = x_i, Y = y_j) = P(X = x_i) P(Y = y_j) for all i and j. More generally, n discrete random variables X₁, X₂, …, X_n are called independent if for arbitrary values x₁, x₂, …, x_n in their respective ranges, P(X₁ = x₁, …, X_n = x_n) = P(X₁ = x₁) ⋯ P(X_n = x_n).

26 Discrete Random Variable Properties of the mean and calculation rules (1) Linear transformations: E(aX + b) = a E(X) + b. Nonlinear transformations: for a real function g, E(g(X)) = ∑_i g(x_i) p_i. Example: E(X²) = ∑_i x_i² p_i. Note: in general E(g(X)) ≠ g(E(X)). Example: E(X²) ≠ (E(X))² unless Var(X) = 0.

27 Discrete Random Variable Properties of the mean and calculation rules (2) Linearity rule, mean of a sum of (discrete) random variables: E(X + Y) = E(X) + E(Y), and E(a₁X₁ + … + a_nX_n) = a₁E(X₁) + … + a_nE(X_n). Product rule for independent (discrete) random variables: if X, Y are independent, then E(XY) = E(X)E(Y). Example: roll a die twice. What is the mean of the product of the values?

28 Discrete random variable Properties of the variance Linear transformations: Var(aX + b) = a² Var(X). For independent random variables X, Y and X₁, …, X_n respectively, we can show Var(X + Y) = Var(X) + Var(Y), and for any constants a₁, …, a_n, Var(a₁X₁ + … + a_nX_n) = a₁² Var(X₁) + … + a_n² Var(X_n).

29 Discrete random variable Variance (Examples) Binary random variable: Var(X) = π(1 − π) (proof as exercise). Roll a fair die once, X = value showing: Var(X) = ? Roll a fair die twice, X = sum of the values: Var(X) = ?
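Both the binary formula and the die questions can be checked against the definition Var(X) = ∑ (x_i − E(X))² p_i; a sketch:

```python
from fractions import Fraction

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    # Var(X) = sum_i (x_i - E(X))**2 * p_i
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

# One roll of a fair die: Var(X) = 35/12.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(var(die))  # 35/12

# Binary X with P(X=1) = pi: Var(X) = pi (1 - pi), as the slide claims.
pi = Fraction(1, 3)
binary = {1: pi, 0: 1 - pi}
assert var(binary) == pi * (1 - pi)
```

For the sum of two independent rolls, the rule of the previous slide gives Var = 35/12 + 35/12 = 35/6.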

30 Discrete Random variable Independence (Example) Random experiment: roll two dice. For all 1 ≤ i, j ≤ 6, P(X=i, Y=j) = 1/36 = 1/6 × 1/6 = P(X=i) P(Y=j), so the two values are independent. Random experiment: roll a die; Y = 1 if the value is a prime number, Z = 1 if the value is smaller than four. Are these two events independent? No: Y=1 and Z=1 means the value is 2 or 3, so P(Y=1, Z=1) = 2/6 ≠ 1/2 × 1/2 = P(Y=1) P(Z=1). Or equivalently: is P(Y=1 | Z=1) = P(Y=1)?

31 Continuous Random Variables

32 Continuous random variable Probability distribution Definition. If X: Ω → R is a random variable, the function F: R → R, F(x) = P(X ≤ x) is called the distribution function of X. If X is a continuous random variable with density f, the distribution function F can be expressed as F(x) = ∫_{−∞}^{x} f(t) dt. This formula is the continuous analogue of the discrete case, in which the distribution function was defined as F(x) = ∑_{x_j ≤ x} f(x_j).

33 Continuous random variables mean and variance The statistics mean and variance, already defined for discrete random variables, can be defined in an analogous way for continuous random variables. X discrete with values x_j ∈ {x₁, x₂, …}: pmf P(X = x_j), cdf P(X ≤ x_j); mean E(X) = ∑_j x_j P(X = x_j); variance Var(X) = ∑_j (x_j − E(X))² P(X = x_j). X continuous with density f: density function f(x), distribution function F(x); mean E(X) = ∫ x f(x) dx; variance Var(X) = ∫ (x − E(X))² f(x) dx.

34 Continuous random variables Example 1 Uniform distribution. A continuous random variable X is called uniform or uniformly distributed (on the interval [a,b]) if it has a density function of the form f(x) = 1/(b − a) for x ∈ [a, b], 0 otherwise, for some real values a < b. This is denoted by X ~ U(a,b). (Plots of the density f and the distribution function F on [a, b].)

35 Continuous random variables Example 2 Exponential distribution. A continuous random variable having a density f(x) = λ exp(−λx) for x ≥ 0, 0 otherwise, for some real parameter λ > 0 is called exponentially distributed, denoted X ~ Ex(λ). The corresponding distribution function is F(x) = 1 − exp(−λx) for x ≥ 0, 0 otherwise. (Density and distribution function of an exponentially distributed random variable X ~ Ex(λ) for λ = 0.3, 1, 3.)

36 Continuous random variables Example 3 Normal distribution. A continuous random variable X is called normally distributed or Gaussian (with mean μ and standard deviation σ > 0), written X ~ N(μ, σ²), if it has a density function of the form f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)). There is no closed form for the distribution function F of such a variable; it has to be computed numerically. The standard normal distribution has μ = 0, σ = 1. (Density and distribution function of some normally distributed random variables X ~ N(μ, σ²).)

37 Two more continuous distributions The χ²-distribution. If X₁, …, X_n are independent random variables that are N(0,1)-distributed, then the random variable Z = X₁² + … + X_n² is said to be chi-squared distributed with n degrees of freedom, for short Z ~ χ²(n). Student t-distribution (t-distribution). If X ~ N(0,1) and Z ~ χ²(n) are independent, then the random variable T = X / √(Z/n) is said to have a t-distribution with n degrees of freedom, for short T ~ t(n). This list of continuous random variables is by no means complete. For a survey, consult the statistics literature given in the reference list to this lecture series.

38 Continuous random variables Independence Definition. Let Ω be a probability space with probability measure P. Let X: Ω → R and Y: Ω → R be continuous random variables. X and Y are called independent if P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y) = F_X(x) F_Y(y) for all x, y ∈ R. Corollary. If the continuous random variables X and Y are independent, then P(a₁ ≤ X ≤ a₂, b₁ ≤ Y ≤ b₂) = P(a₁ ≤ X ≤ a₂) P(b₁ ≤ Y ≤ b₂) for all real values a₁ < a₂, b₁ < b₂.

39 Continuous random variables Joint and marginal probability distributions Let X and Y be two random variables on the same probability space Ω. If there exists a function f: R × R → R such that P(a₁ ≤ X ≤ a₂, b₁ ≤ Y ≤ b₂) = ∫_{a₁}^{a₂} ∫_{b₁}^{b₂} f(x, y) dy dx for all real values a₁ < a₂, b₁ < b₂, then X and Y are said to have a continuous joint (multivariate) distribution, and f is called their joint density. We will consider only this case here. The marginal distribution of X is given by P(a₁ ≤ X ≤ a₂) = ∫_{a₁}^{a₂} ∫_{−∞}^{∞} f(x, y) dy dx = ∫_{a₁}^{a₂} f_X(x) dx, where f_X(x) = ∫ f(x, y) dy is the density of the marginal distribution of X.

40 Continuous Random Variable Conditional probability distributions The conditional distribution of X, given Y = b, is given by P(a₁ ≤ X ≤ a₂ | Y = b) = ∫_{a₁}^{a₂} f_X(x | Y = b) dx, where f_X(x | Y = b) = f(x, b) / ∫ f(t, b) dt is the density of the conditional distribution of X, given Y = b. We mention equivalent conditions for independence: the random variables X and Y are independent if 1. f(x,y) = f_X(x) f_Y(y) for all x, y ∈ R; 2. f_X(x | Y=b) = f_X(x) for all x, b ∈ R; 3. f_Y(y | X=a) = f_Y(y) for all a, y ∈ R.

41 Basic Concepts in Statistics

42 Data, sampling and statistical inference Data: characteristics/properties of a random sample from a population, for example y₁, …, y_n (n realizations of a random variable Y). Sampling: ways to select the subjects for which the characteristics/properties of interest will be assessed; examples: SRS (simple random sampling), stratified, clustered. Statistical inference: learning from data, i.e. assuming these data are n draws from a distribution f_θ, what do we know about the population parameter? Probability theory reasons from f to y: if the experiment is like …, then f will be …, and (y₁, …, y_n) will be like …, or E(Y) must be … Statistics reasons from y to f: since (y₁, …, y_n) turned out to be …, it seems that f is likely to be …, or the parameter is likely to be around …

43 Types of Data There are different types of data: numerical data (discrete, continuous), categorical data (ordered, non-ordered), and mixtures of both. If the properties consist of multiple features (like Signal, Detection, p-value in the example), the data is called multivariate; otherwise it is called univariate. Example (Affymetrix GeneId / Signal / Detection): BioB-5_at 258 P; BioB-M_at 470 P; BioB-3_at 247 P; BioC-5_at 787 P; BioC-3_at 695 P; BioDn-5_at 939 P; BioDn-3_at 4356 P; CreX-5_at 9992 P; CreX-3_at P; DapX-5_at 5 N; DapX-M_at 14 N; DapX-3_at 1 N; LysX-5_at 4 N; LysX-M_at 3 N; LysX-3_at 2 N; PheX-5_at 14 N; PheX-M_at 65 N; PheX-3_at 9 N; ThrX-5_at 25 N; ThrX-M_at 4 N; ThrX-3_at 118 N.

44 Steps in statistical analysis of data Describe the data (descriptive statistics). Propose a reasonable probabilistic model. Make inference about parameters in the model. Check the model fit/assumptions. Report results.

45 Describing univariate categorical data Frequency table: simply list all object-property pairs in a table; count the number of objects in each category and display the result in a table; calculate the relative size of each category and display it in the table. (Example table.)

46 Describing univariate numerical data Assume we have a dataset with objects 1, …, n and their real-valued properties x₁, …, x_n. Histogram: choose intervals C₁ = [a₁,a₂), C₂ = [a₂,a₃), …, C_k = [a_k,a_{k+1}), with a₁ < a₂ < … < a_{k+1} (this process is called binning). Let y_j = C_i iff x_j ∈ C_i. Display the categorical dataset y₁, …, y_n as a bar plot, with the width of the bars proportional to the length of the intervals. Example dataset: the heights of a population of 10,000 people; histograms were plotted with k equidistant bins, k = 8, 16, 32, 64. A local maximum of the abundance distribution is called a mode, x_mode. Distributions with only one mode are called unimodal; distributions with more modes are called multimodal.

47 Descriptive statistics 1 The second and by far the most important way is to summarize the data by appropriate statistics. A statistic is a rule that assigns a number to a dataset; this number is meant to tell us something about the underlying dataset. Example: arithmetic mean. Given x₁, …, x_n, calculate the arithmetic mean as x̄ = (1/n) ∑_{j=1}^{n} x_j. The arithmetic mean is one of many statistics that aim to describe where the centre of the data is. It minimizes the sum of the quadratic distances to the data points, namely x̄ = argmin_x ∑_{j=1}^{n} (x − x_j)².

48 Descriptive statistics 2 Median. Let x₁, …, x_n be given in ascending order. The median x_med is defined as x_med = x_{(n+1)/2} if n is odd, and x_med = (x_{n/2} + x_{n/2+1})/2 if n is even. The median is a value such that the number of data points smaller than x_med equals the number of data points greater than x_med. Like the arithmetic mean, the median is a location measure for the centre of the data. (Figure: mean, median and mode of a unimodal distribution.)
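A sketch of both location measures, using an invented toy dataset with one extreme value to show why the median is the more robust of the two:

```python
def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def median(xs):
    s = sorted(xs)
    n = len(s)
    if n % 2 == 1:
        return s[n // 2]                       # x_((n+1)/2) for odd n
    return (s[n // 2 - 1] + s[n // 2]) / 2     # average of the two middle values

data = [1, 2, 2, 3, 100]        # one extreme value
print(arithmetic_mean(data))    # 21.6 — pulled toward the outlier
print(median(data))             # 2 — unaffected by the outlier
```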

49 Descriptive statistics 3 Symmetry. A frequency distribution is called symmetric if it has an axis of symmetry. Skewness. A frequency distribution is called skewed to the right if the right tail of the distribution falls off more slowly than the left tail; analogously for skewed to the left. Rules of thumb: left skew: x̄ < x_med < x_mode; symmetric: x̄ ≈ x_med ≈ x_mode; right skew: x̄ > x_med > x_mode. (Figure: mean, median and mode for left-skewed, symmetric and right-skewed distributions.)

50 Descriptive statistics 4 Quantiles. Let q ∈ (0,1). A q-quantile of a frequency distribution is a value x_q such that the fraction of data lying left of x_q is at least q, and the fraction lying right of x_q is at least 1 − q. If the data is ordered (x₁ ≤ x₂ ≤ … ≤ x_n), then x_q = x_{⌈qn⌉} if qn is not an integer, and x_q ∈ [x_{qn}, x_{qn+1}] if qn is an integer. Special quantiles are the quartiles x_{0.25}, x_{0.5}, x_{0.75} (which split up the data into four classes), and the quintiles x_{0.2}, x_{0.4}, x_{0.6}, x_{0.8}. They are frequently used to give a summary of the data distribution. (Figure: x_{0.05}, x_{0.25}, x_{0.5}, x_{0.75}, x_{0.95} marked on a distribution.)
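The quantile rule above translates directly into code; a sketch (when qn is an integer any value in [x_(qn), x_(qn+1)] qualifies, and this sketch arbitrarily takes the lower endpoint):

```python
import math

def quantile(xs, q):
    # q-quantile via the order-statistics rule above (1-indexed positions).
    s = sorted(xs)
    n = len(s)
    qn = q * n
    if qn == int(qn):
        return s[int(qn) - 1]          # any value in [x_(qn), x_(qn+1)] works
    return s[math.ceil(qn) - 1]        # x_(ceil(qn)) when qn is not an integer

data = list(range(1, 11))              # 1, 2, ..., 10
print(quantile(data, 0.25))  # qn = 2.5 is not an integer -> x_(3) = 3
print(quantile(data, 0.5))   # qn = 5 is an integer -> lower endpoint x_(5) = 5
```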

51 Descriptive statistics 5 Variance, standard deviation. The variance v = var(x₁, …, x_n) = Var(x) of a dataset x = (x₁, …, x_n) is defined as v = s² = (1/n) ∑_{j=1}^{n} (x_j − x̄)² (the average squared distance from the data points to x̄). The standard deviation s = s(x) is the positive square root of the variance, s² = v. The variance and the standard deviation are measures of the dispersion of the data. (Figure: relative-frequency histograms with small vs. high variance.)

52 Detailed description of univariate data Density plots. If the number of data points is large, it is often convenient to approximate a histogram (of the relative frequencies) by a density curve: a density function is a non-negative real-valued integrable function f such that ∫ f(x) dx = 1 (this condition says that the area enclosed by the graph of f and the x-axis is 1). Interpretation: the area of a segment enclosed by the x-axis, the graph of f and the vertical lines x = x₀ and x = x₁ equals the fraction of data points with values between x₀ and x₁.

53 Continuous univariate data An important distribution Normal distributions = Gaussian distributions. A very important family of density functions are the Gaussian densities, defined as f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)) with parameters μ and σ > 0. This distribution is symmetric (around x = μ), unimodal (with mode at x = μ) and shaped like a bell. The mean of Gaussian distributed data is μ, its variance is σ². The 68-95-99.7 rule: if a dataset has a Gaussian distribution with mean μ and variance σ², then 68% of the data lie within the interval [μ−σ, μ+σ], 95% within [μ−2σ, μ+2σ], and 99.7% within [μ−3σ, μ+3σ].
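The rule can be checked empirically by simulation; a sketch using arbitrarily chosen parameters μ = 10, σ = 2 and a fixed seed for reproducibility:

```python
import random

random.seed(0)
mu, sigma = 10.0, 2.0
data = [random.gauss(mu, sigma) for _ in range(100_000)]

def frac_within(k):
    # fraction of the sample inside [mu - k*sigma, mu + k*sigma]
    return sum(mu - k * sigma <= x <= mu + k * sigma for x in data) / len(data)

for k, target in [(1, 0.68), (2, 0.95), (3, 0.997)]:
    print(k, round(frac_within(k), 3))  # close to 0.68, 0.95, 0.997
```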

54 Summary Frequency tables, bar plots, pie charts, histograms and density plots are possible ways to display statistical data. Mean, median and quantiles are measures of location for numerical data. The variance is a measure of variation for numerical data; it has pleasant transformation properties. The Gaussian distribution is a very important density function.

55 Multivariate descriptive statistics

56 Multidimensional data (1) In many applications a set of properties/features is measured. If we want to learn facts about a single property, we use univariate statistical measures, e.g. mean, median, variance, quantiles. If we want to learn how two or more properties depend on each other we need multivariate statistical measures. Examples (multidimensional data) measure age and gender of the same person. microarray gene expression data are multidimensional Ways to describe these data

57 Multidimensional data (2) For each object i, i = 1, …, n, we measure simultaneously several features X, Y, Z, … (multidimensional or multivariate data), obtaining the values (x_i, y_i, z_i) of the features for object i. In the following, we consider two features. Question X <--> Y: what does the correlation between X and Y look like? Correlation (association). Question X --> Y: how does X affect the feature Y (response)? Regression.

58 Discrete/grouped data If a feature has only a finite or countably infinite number of possible values, we call it discrete, e.g. the number of A's in a DNA sequence. If a feature's possible values range in an interval, we call it continuous, e.g. the weight of a person. To know: how to describe the distribution of two discrete features, and how to evaluate whether the two features are correlated. This also includes continuous features grouped into categories.

59 General description: Contingency table Absolute frequencies A (k × m) contingency table of absolute frequencies has entries h_ij, the number of objects with X = a_i and Y = b_j. The contingency table describes the joint distribution of X and Y in terms of absolute frequencies.

60 Contingency table Marginal frequencies The column and row sums of the contingency table are called the marginal frequencies of the features X and Y. We write h_i· = h_i1 + … + h_im, i = 1, …, k, and h_·j = h_1j + … + h_kj, j = 1, …, m. The resulting sums h_1·, …, h_k· and h_·1, …, h_·m describe the univariate distributions of the features X and Y. Such a distribution is also called a marginal distribution.

61 Contingency table Relative frequencies A (k × m) contingency table of relative frequencies has the same form, with each cell count divided by the total number n of objects. The contingency table describes the joint distribution of X and Y; the margins describe the marginal distributions of X and Y.

62 Contingency table Conditional frequencies By looking at the absolute or relative frequencies alone it is not immediately possible to decide whether there is a correlation between features. Therefore: look at conditional frequencies, i.e. the distribution of one feature for a fixed value of the second feature.

63 Contingency table Conditional frequency distribution (1) The conditional frequency distribution of Y under the condition X = a_i, also written Y | X = a_i, is given by h_ij / h_i·, j = 1, …, m. The conditional frequency distribution of X under the condition Y = b_j, also written X | Y = b_j, is given by h_ij / h_·j, i = 1, …, k.

64 Contingency table Conditional frequency distribution (2) Because relative frequencies are the absolute frequencies divided by n, the same formulas hold for relative frequencies. The conditional distributions are computed by dividing the joint frequencies by the appropriate marginal frequencies.

65 Contingency table χ² coefficients Starting point: what should the joint frequencies look like, so that we could empirically assume independence between X and Y (given the marginal distributions)?

66 Contingency table Empirical independence Idea: X and Y are empirically independent if and only if the conditional frequencies are equal in each sub-population X = a_i, i.e. independent of a_i.

67 Contingency table Assessing empirical independence Idea: compare for each cell (i,j) the observed frequency h_ij with the theoretical frequency under the assumption of independence, e_ij = h_i· h_·j / n. χ² coefficient: χ² = ∑_i ∑_j (h_ij − e_ij)² / e_ij.

68 Contingency table Properties of the χ² coefficient χ² = 0 <==> X and Y are empirically independent; large χ² <==> strong correlation; small χ² <==> weak correlation. Disadvantage: χ² depends on the dimension of the table.

69 Graphical representation of quantitative features Graphical representation of the values (x_i, y_i), i = 1, …, n, from two continuous features X and Y: the simplest representation of (x₁,y₁), …, (x_n,y_n) in a coordinate system is called a scatterplot.

70 Correlation of continuous features Aim: find a measure that describes the correlation of two continuous features X and Y. (Scatterplots: no or only weak correlation, strong positive correlation, strong negative correlation.)

71 Pearson's correlation coefficient (1) The Pearson correlation coefficient for the data (x_i, y_i), i = 1, …, n, is defined as r = ∑_i (x_i − x̄)(y_i − ȳ) / √(∑_i (x_i − x̄)² ∑_i (y_i − ȳ)²). The range of r is [−1,1]: r > 0 means positive correlation, a positive linear relationship, i.e. values lie around a straight line with positive slope; r < 0 means negative correlation, a negative linear relationship, i.e. values lie around a straight line with negative slope; r = 0 means no correlation (uncorrelated).
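The definition can be implemented in a few lines; a sketch, checked on invented data with an exact linear relationship (where r must be ±1):

```python
import math

def pearson_r(xs, ys):
    # r = sum (x_i - xbar)(y_i - ybar) / sqrt( sum (x_i - xbar)^2 * sum (y_i - ybar)^2 )
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2 * v + 1 for v in x]))   # 1.0: exact positive linear relationship
print(pearson_r(x, [-3 * v for v in x]))      # -1.0: exact negative linear relationship
```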

72 Pearson s correlation coefficient (2) The correlation coefficient r measures the strength of a linear relationship

73 Pearson's correlation coefficient (3) Rule of thumb: |r| < 0.5 weak correlation, 0.5 ≤ |r| < 0.8 medium correlation, |r| ≥ 0.8 strong correlation. Linear transformations: the correlation coefficient between a + bX and c + dY equals the correlation coefficient between X and Y if bd > 0, and equals its negative if bd < 0.

74 Equivalent forms of r Multiplying out (remember the formula for variances!) yields r = s_xy / (s_x s_y), i.e. r is the covariance divided by the product of the standard deviations, with covariance s_xy = (1/n) ∑_i (x_i − x̄)(y_i − ȳ) and standard deviations s_x, s_y.

75 Statistical Inference Estimation: finding approximations of the model parameters (point estimation); finding the uncertainty associated with the population parameter (interval estimation, i.e. finding confidence intervals). Hypothesis testing.

76 Point Estimation Finding θ̂(x₁, …, x_n). Desired properties of the estimator: unbiasedness (bias is measured as the expected difference between the estimator and the population parameter); efficiency (could be described by the inverse of the variance of the estimator); small mean squared error (MSE), E(θ̂ − θ)²; other: consistency, etc. Common methods to find estimators: method of moments; maximum likelihood estimation.

77 Estimation: Method of Moments Method of moments: match the first (E(X)), second (E(X²)), …, order moments to the parameters and solve the equation system. If E(X^k) = g(θ), then θ̂ = g⁻¹(sample k-th moment). Maximum Likelihood Estimation (MLE): assuming the data come from a parametric family indexed by a population parameter θ, i.e. X₁, …, X_n ~ i.i.d. f(x|θ), the joint density of the data is f(x₁, …, x_n | θ) = ∏_i f(x_i | θ). The probability of observing the data, viewed as a function of θ, is the likelihood function of the parameter θ under the assumed probabilistic model: Likelihood = f(x₁, …, x_n | θ) = ∏_i f(x_i | θ).

78 Example: Binomial data Data: 6, 3, 5, 6, 8, the numbers of successes in 5 repeated experiments of tossing a coin 10 times. Is this a fair coin? What is going to come up for the 11th toss? Assume a probabilistic model: X ~ Binom(10, π). Estimating π by MOM: because E(X) = 10π, the estimate of π is the sample mean divided by 10, i.e. π̂ = ((6+3+5+6+8)/5) / 10. MLE: L(π | data) = P(x₁=6, …, x₅=8 | π) = P(x₁=6 | π) … P(x₅=8 | π); then find the value of π that maximizes the likelihood function.
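Both estimators can be computed numerically; a sketch that finds the MLE by a simple grid search over (0,1) (a crude stand-in for the calculus, not the lecture's method) and confirms it agrees with the MOM estimate here:

```python
from math import comb, log

data = [6, 3, 5, 6, 8]      # successes in 5 experiments of n = 10 tosses each
n = 10

# MOM: E(X) = n*pi, so pi_hat = sample mean / n
pi_mom = sum(data) / len(data) / n
print(pi_mom)               # 0.56

def log_likelihood(pi):
    # log of prod_i C(n, x_i) pi**x_i (1-pi)**(n-x_i)
    return sum(log(comb(n, x)) + x * log(pi) + (n - x) * log(1 - pi) for x in data)

# Grid search over (0,1); for the binomial model the MLE coincides with the
# MOM estimate, which the search confirms numerically.
grid = [i / 1000 for i in range(1, 1000)]
pi_mle = max(grid, key=log_likelihood)
assert abs(pi_mle - pi_mom) < 1e-3
```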

79 Example: Normal data x₁, x₂, …, x_n ~ iid N(μ, σ²) with pdf f(x | μ, σ) = (1/(√(2π)σ)) exp(−(x − μ)²/(2σ²)). Joint pdf for the whole random sample: f(x₁, x₂, …, x_n | μ, σ) = f(x₁ | μ, σ) f(x₂ | μ, σ) … f(x_n | μ, σ). The likelihood function is basically the joint pdf for the fixed sample: l(μ, σ | x₁, x₂, …, x_n) = f(x₁ | μ, σ) f(x₂ | μ, σ) … f(x_n | μ, σ). The maximum likelihood estimates of the model parameters μ and σ² are the numbers that maximize this likelihood function: μ̂ = ∑ x_i / n and σ̂² = ∑ (x_i − μ̂)² / n.

80 Hypothesis Testing Making inference about the value of the population parameter based on the data. Start with hypotheses about the population parameter (null and alternative). Use the data to assess the sampling variability under the null hypothesis. Conclusion: reject H0 if the data are highly unlikely to be generated from the probabilistic model defined by H0; fail to reject H0 if the data are not highly unlikely under H0.

81 Example: Hypothesis Testing X: the expression level of gene A under condition 1. Y: the expression level of gene A under condition 2. To decide: whether the average expression levels are equal. Null hypothesis H₀: both expression levels are equal. Alternative hypothesis H₁: the expression levels are unequal. Specify a method for deciding between these two alternatives: choose an appropriate statistic D that is able to discriminate between the two hypotheses, and choose a rejection region in which H₀ is rejected. The selection of the statistic defines the test.

82 Hypothesis Testing (Example) The biologist may proceed in the following way. He has n_x replicate measurements of the gene of interest in condition X, (x₁, …, x_{n_x}), and n_y replicate measurements in condition Y, (y₁, …, y_{n_y}). He might divide the average of the X measurements by the average of the Y measurements and obtain the statistic D = ((x₁ + x₂ + … + x_{n_x}) / n_x) / ((y₁ + y₂ + … + y_{n_y}) / n_y). Then the biologist might define the acceptance region as [1/2, 2], i.e. if |log₂ D| > 1, he rejects the null hypothesis in favour of the alternative hypothesis (differential gene expression); if |log₂ D| ≤ 1, he does not reject H₀. This test is not optimal (see the exercises), but it is still used by many researchers. The great advantage of this approach is that the choice of the confidence interval can be done implicitly by prescribing a significance level.

83 Hypothesis Testing Significance level (Example) Let α ∈ (0,1) be given; usually α is a number close to zero. The statistic D can be interpreted as a random variable. If we assume the null hypothesis is valid, we can find a (not necessarily unique) interval J on the real line such that P(D ∉ J | H₀) = α. This means that, given the null hypothesis is valid, the probability of observing a value of D outside the interval J is α (and hence small, if α is small). The complement of J in R is then taken as the rejection region for the test. In the biologist example, there are better ways to design a test for differential gene expression. Under the assumption that the expression values for X and Y follow normal distributions, we can conduct a t-test:

84 The two-sample Student t-test Assume that X = (x₁, …, x_{n_x}) and Y = (y₁, …, y_{n_y}) are two samples of independent normally distributed random variables, X ~ N(μ_x, σ_x²) and Y ~ N(μ_y, σ_y²), with means μ_x and μ_y and standard deviations σ_x and σ_y. The null hypothesis can be stated as H₀: μ_x = μ_y. The T statistic T = (X̄ − Ȳ) / √(S_X²/n_x + S_Y²/n_y) is perfectly designed to answer this question: if the null hypothesis is true, i.e. μ_x − μ_y is near 0, then T should be close to 0 except for random outcomes that are pretty unusual. The T statistic is a random variable with a somewhat complicated distribution: T has approximately a t-distribution with d degrees of freedom, where d is the closest integer to (S_X²/n_x + S_Y²/n_y)² / [ (1/(n_x − 1))(S_X²/n_x)² + (1/(n_y − 1))(S_Y²/n_y)² ].

85 Student t-test The density of the T statistic tells us how far from 0 we should expect T to be most of the time, given the null hypothesis is true. E.g. for k = 8 degrees of freedom and significance level α = 5%, we would expect T to be above t(0.975; 8) = 2.306 or below t(0.025; 8) = −2.306 only 5% of the time. Thus a typical decision rule in this case would be to reject H₀ in favour of H₁ if |T| > t(0.975; 8) = 2.306. (Figure: density of the t-statistic for k = 8, with the symmetric 95% region and 2.5% in each tail.) P-values: the probability of observing values of D that are at least as extreme as the observed value d. Calculate the p-value p = P(|D| > |d|). Given a significance level α, we reject the null hypothesis if p < α.
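The T statistic and the degrees of freedom from the previous slide can be computed directly; a sketch on invented replicate measurements (the expression values below are hypothetical, not from the lecture):

```python
import math

def welch_t(x, y):
    # Two-sample t statistic T = (xbar - ybar) / sqrt(sx2/nx + sy2/ny)
    nx, ny = len(x), len(y)
    xbar, ybar = sum(x) / nx, sum(y) / ny
    sx2 = sum((v - xbar) ** 2 for v in x) / (nx - 1)   # sample variances
    sy2 = sum((v - ybar) ** 2 for v in y) / (ny - 1)
    se2 = sx2 / nx + sy2 / ny
    t = (xbar - ybar) / math.sqrt(se2)
    # Welch-Satterthwaite degrees of freedom, as on the previous slide
    d = se2 ** 2 / ((sx2 / nx) ** 2 / (nx - 1) + (sy2 / ny) ** 2 / (ny - 1))
    return t, d

# Hypothetical replicate expression measurements for one gene in two conditions.
x = [5.1, 4.8, 5.3, 5.0, 4.9]
y = [6.0, 6.2, 5.8, 6.1, 6.3]
t, d = welch_t(x, y)
print(round(t, 2), round(d))
# Reject H0 at alpha = 5% if |T| exceeds the t-quantile t(0.975; d).
```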

86 Hypothesis Testing Error types
If we reject the null hypothesis when it is actually true, we have made what is called a type I error or a false positive. (Example: falsely declaring a gene as differentially expressed.) If we accept the null hypothesis when it is actually false, then we have made a type II error or a false negative. (Example: failing to identify a truly differentially expressed gene.)

                          H₀ true                          H₀ not true
Hypothesis not rejected   True negatives                   Type II error (false negatives)
Hypothesis rejected       Type I error (false positives)   True positives

87 Hypothesis Testing Error Types (Cont.) In hypothesis testing, the probability of a Type I error is controlled to be at most the significance level of the test. It is harder to control the probability of a Type II error because we usually do not have a test statistic for the alternative hypothesis. The smaller the true difference in expression levels, the larger the probability of a Type II error. Given a statistical testing procedure, it is impossible to make both error probabilities arbitrarily small by selecting a special significance level. There is a trade-off between type I and type II errors, as depicted in the next figure.

88 Two types of tests
Parametric tests: a parametric distribution is assumed for the measured random variables. E.g. the t-test assumes that the variables are normally distributed. (If this is not the case, the test yields wrong p-values or wrong confidence intervals.)
Non-parametric tests: no parametric distribution is assumed for the measured random variables. They are used when the distribution of the measured variables is not known, or when there is no appropriate test that can deal with it. Non-parametric tests merely rely on the relative order of the values, or on some very mild constraints concerning the shape of the probability distributions (e.g. unimodality, symmetry).
Often, prior to computing a test statistic, the data are transformed in order to produce random variables that are easier to handle (e.g. to produce approximately normally distributed data). We mention one parametric and one non-parametric test which are commonly used.

89 Wilcoxon rank sum test Given two samples x = (x₁, …, x_n) and y = (y₁, …, y_m) drawn independently from the random variables X and Y respectively, we test whether the distributions of X and Y are identical. For large sample sizes it is almost as sensitive as the two-sample Student t-test. For small samples with unknown distributions this test can even be more sensitive than the Student t-test. The only requirement for the Wilcoxon test to be applicable is that the distributions are symmetric.

90 Wilcoxon rank sum test (Cont.)
1. State the hypotheses. Null hypothesis: the two variables X and Y have the same distribution. Alternative hypothesis: the two variables X and Y do not have the same distribution.
2. Choose a significance level α.
3. Compute the test statistic: rank order all N = n + m values from both samples combined; sum the ranks of the smaller sample and call this value w.
4. Calculate the p-value: look up the level of significance in a table using w, m and n. Calculating the exact p-value is based on enumerating all permutations of ranks over both samples. (This is infeasible for n, m > 10. Fortunately, there are approximations available, and implemented in R.)
5. Compare the p-value with α and state the conclusion: p-value < α: reject H₀; p-value ≥ α: fail to reject H₀.
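Steps 3 and 4 can be sketched directly (a toy implementation assuming no ties; as the slide notes, exact enumeration is only feasible for small n and m):

```python
from itertools import combinations

def rank_sum(x, y):
    """Rank all N = n + m pooled values (assuming no ties) and
    return the rank sum w of sample x."""
    rank = {v: i + 1 for i, v in enumerate(sorted(x + y))}
    return sum(rank[v] for v in x)

def wilcoxon_exact_p(x, y):
    """Exact two-sided p-value: the fraction of all C(N, n) possible
    rank assignments whose rank sum lies at least as far from its
    null mean n(N + 1)/2 as the observed w."""
    n, N = len(x), len(x) + len(y)
    w = rank_sum(x, y)
    mu = n * (N + 1) / 2
    hits = total = 0
    for combo in combinations(range(1, N + 1), n):
        total += 1
        if abs(sum(combo) - mu) >= abs(w - mu):
            hits += 1
    return hits / total

p = wilcoxon_exact_p([1.2, 1.5], [3.1, 4.0])
# 2 of the 6 possible rank assignments are at least as extreme: p = 1/3
```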

91 Summary
Null hypothesis, test statistic
Significance level, rejection region, p-value
Type I and type II errors
5-step testing procedure
Parametric tests: t-test, χ²-test, ANOVA
Non-parametric tests: Wilcoxon rank sum test, Kruskal–Wallis test

92 Multiple hypothesis testing Golub et al. (1999) were interested in identifying genes that are differentially expressed in patients with two types of leukemia: acute lymphoblastic leukemia (ALL, class 0) and acute myeloid leukemia (AML, class 1). Gene expression levels were measured using Affymetrix chips containing g = 6817 human genes. n = 38 samples = 27 ALL cases + 11 AML cases.

93 Multiple hypothesis testing Following Golub et al., three preprocessing steps were applied to the normalized matrix of intensity values available on the website: (i) thresholding: floor of 100 and a ceiling; (ii) filtering: exclusion of genes with (max/min) ≤ 5 or (max − min) ≤ 500, where max and min refer respectively to the maximum and minimum intensities for a particular gene across all mRNA samples; (iii) base-10 logarithmic transformation. The data were then summarized by a 3051 × 38 matrix. A two-sample t-test was computed for each of the 3051 genes.
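The three preprocessing steps might be sketched as follows. The ceiling value did not survive transcription, so the 16000 below is only a placeholder assumption, as is the function name:

```python
import math

def preprocess(genes, floor=100.0, ceiling=16000.0):
    """Golub-style preprocessing, one expression row per gene:
    (i) clip each value to [floor, ceiling],
    (ii) drop low-variation genes (max/min <= 5 or max - min <= 500),
    (iii) take base-10 logarithms of the surviving rows.
    The ceiling default is a placeholder, not the slide's value."""
    kept = []
    for row in genes:
        clipped = [min(max(v, floor), ceiling) for v in row]
        hi, lo = max(clipped), min(clipped)
        if hi / lo <= 5 or hi - lo <= 500:
            continue  # gene filtered out in step (ii)
        kept.append([math.log10(v) for v in clipped])
    return kept

rows = preprocess([[50, 200, 1000], [100, 120, 130]])
# first gene survives (max/min = 10, max - min = 900); second is filtered
```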

94 Multiple hypothesis testing Did you expect that?
[Figures: histogram of the test statistics, and histogram of the p-values computed as 2·(1 − pnorm(abs(teststat))).]

95 Multiple Comparison
p-value: probability of finding a difference equal to or greater than the observed one just by chance under the null hypothesis. A measure of the false positive rate (F/m₀).
The commonly used significance level, 5% (±1.96 s.d.), is arbitrary.
In multiple comparisons, a 5% significance level for each comparison often results in a too large overall significance level.
p-values do not involve the alternative hypothesis.

                   Called significant   Called not significant   Total
Null true          F                    m₀ − F                   m₀
Alternative true   T                    m₁ − T                   m₁
Total              S                    m − S                    m

96 Multiple Comparison (Cont. 1)
Family-wise error rate (FWER): the probability of having at least one false positive in multiple comparisons. Many versions of controlling procedures exist: Bonferroni, Holm (1979), Hochberg (1988), Hommel (1988). Can be too conservative for genomic studies.

Table: FWER (expected number of false positives in parentheses) for different numbers of comparisons N at different α levels; for independent tests, FWER = 1 − (1 − α)^N and the expected count is αN.

α        N = 1          N = 5          N = 10         N = 50         N = 100       N = 1000
0.01     0.010 (0.01)   0.049 (0.05)   0.096 (0.1)    0.395 (0.5)    0.634 (1)     1.000 (10)
0.05     0.050 (0.05)   0.226 (0.25)   0.401 (0.5)    0.923 (2.5)    0.994 (5)     1.000 (50)
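Under independence of the tests, the FWER has the closed form above, which is easy to check; the Bonferroni rule shown alongside is the simplest of the controlling procedures listed:

```python
def fwer(alpha, n):
    """P(at least one false positive) among n independent tests,
    each performed at level alpha: 1 - (1 - alpha)^n."""
    return 1.0 - (1.0 - alpha) ** n

def bonferroni(alpha, n):
    """Per-test level that keeps the family-wise rate at most alpha."""
    return alpha / n

# With 10 tests at 5% each, the family-wise rate is already about 40%:
rate = fwer(0.05, 10)
```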

97 Multiple Comparison (Cont. 2)
False discovery rate (FDR / pFDR): the proportion of hits that are false (F/S). Several versions of controlling procedures exist (Benjamini & Hochberg (1995), Benjamini & Yekutieli (2001)).
A significance measure based on the pFDR: the q-value (Storey & Tibshirani (2003)). The q-value is the minimum false discovery rate that can be attained when calling a feature significant. It requires estimating the proportion of true nulls (m₀/m).
For FDRs estimated using Benjamini's and Storey's approaches, the same cut-off results in different numbers of significant genes. There is no single formula describing which quantities relate to the FDR and how they are related.
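A minimal sketch of the Benjamini–Hochberg step-up procedure cited above, assuming the p-values arrive as a plain list:

```python
def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg (1995) step-up procedure: sort the m
    p-values, find the largest rank k with p_(k) <= (k/m) * q, and
    reject the hypotheses with the k smallest p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k_max = rank
    rejected = set(order[:k_max])
    return [i in rejected for i in range(m)]

flags = benjamini_hochberg([0.01, 0.02, 0.03, 0.5])
# rejects the three smallest p-values at q = 0.05
```

Note the step-up character: a p-value may exceed its own threshold yet still be rejected if some larger-ranked p-value passes.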

98 Summary This only provides some flavor of probability, statistics and their usage. To learn more, take a full course! Introduction to biostatistics for clinical investigators; Statistical methods for observational studies.

99 References and some useful info
Statistical Methods in Bioinformatics course slides developed by Dr. Christian Gieger and Dr. Achim Tresch
Statistical Methods in Bioinformatics by Warren Ewens and Gregory Grant
Introduction to Statistical Thought by Michael Lavine
The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani and Jerome Friedman
Statistical software package and programming language R


More information

Review of Basic Probability Theory

Review of Basic Probability Theory Review of Basic Probability Theory James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 35 Review of Basic Probability Theory

More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

2. Variance and Covariance: We will now derive some classic properties of variance and covariance. Assume real-valued random variables X and Y.

2. Variance and Covariance: We will now derive some classic properties of variance and covariance. Assume real-valued random variables X and Y. CS450 Final Review Problems Fall 08 Solutions or worked answers provided Problems -6 are based on the midterm review Identical problems are marked recap] Please consult previous recitations and textbook

More information

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities PCMI 207 - Introduction to Random Matrix Theory Handout #2 06.27.207 REVIEW OF PROBABILITY THEORY Chapter - Events and Their Probabilities.. Events as Sets Definition (σ-field). A collection F of subsets

More information

High-Throughput Sequencing Course

High-Throughput Sequencing Course High-Throughput Sequencing Course DESeq Model for RNA-Seq Biostatistics and Bioinformatics Summer 2017 Outline Review: Standard linear regression model (e.g., to model gene expression as function of an

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

1 INFO Sep 05

1 INFO Sep 05 Events A 1,...A n are said to be mutually independent if for all subsets S {1,..., n}, p( i S A i ) = p(a i ). (For example, flip a coin N times, then the events {A i = i th flip is heads} are mutually

More information

Naïve Bayes classification

Naïve Bayes classification Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss

More information

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna

More information

Discrete Probability Refresher

Discrete Probability Refresher ECE 1502 Information Theory Discrete Probability Refresher F. R. Kschischang Dept. of Electrical and Computer Engineering University of Toronto January 13, 1999 revised January 11, 2006 Probability theory

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable Distributions of Functions of Random Variables 5.1 Functions of One Random Variable 5.2 Transformations of Two Random Variables 5.3 Several Random Variables 5.4 The Moment-Generating Function Technique

More information

Masters Comprehensive Examination Department of Statistics, University of Florida

Masters Comprehensive Examination Department of Statistics, University of Florida Masters Comprehensive Examination Department of Statistics, University of Florida May 6, 003, 8:00 am - :00 noon Instructions: You have four hours to answer questions in this examination You must show

More information

Review of Probabilities and Basic Statistics

Review of Probabilities and Basic Statistics Alex Smola Barnabas Poczos TA: Ina Fiterau 4 th year PhD student MLD Review of Probabilities and Basic Statistics 10-701 Recitations 1/25/2013 Recitation 1: Statistics Intro 1 Overview Introduction to

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

1 Probability theory. 2 Random variables and probability theory.

1 Probability theory. 2 Random variables and probability theory. Probability theory Here we summarize some of the probability theory we need. If this is totally unfamiliar to you, you should look at one of the sources given in the readings. In essence, for the major

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation

More information

MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability

MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability STA301- Statistics and Probability Solved MCQS From Midterm Papers March 19,2012 MC100401285 Moaaz.pk@gmail.com Mc100401285@gmail.com PSMD01 MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability

More information