Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda


1 Review DS GA 1002 Statistical and Mathematical Models Carlos Fernandez-Granda

2 Probability and statistics Probability: Framework for dealing with uncertainty. Statistics: Framework for extracting information from data by making probabilistic assumptions.

3 Probability Probability basics: probability spaces, conditional probability, independence, conditional independence Random variables: pmf, cdf, pdf, important distributions, functions of random variables Multivariate random variables: joint pmf, joint cdf, joint pdf, marginal distributions, conditional distributions, independence, joint distribution of discrete/continuous random variables

4 Probability Expectation: definition, mean, median, variance, Markov and Chebyshev inequalities, covariance, correlation coefficient, covariance matrix, conditional expectation Random processes: definition, mean, autocovariance, important processes (iid, Gaussian, Poisson, random walk), Markov chains Convergence: types of convergence, law of large numbers, central limit theorem, convergence of Markov chains Simulation: motivation, inverse-transform sampling, rejection sampling, Markov-chain Monte Carlo

5 Statistics Descriptive statistics: histogram, empirical mean/variance, order statistics, empirical covariance, principal component analysis Statistical estimation: frequentist perspective, mean square error, consistency, confidence intervals Learning models: method of moments, maximum likelihood, empirical cdf, kernel density estimation

6 Statistics Hypothesis testing: definitions (null/alternative hypothesis, Type I/II errors), significance level, power, p value, parametric testing, power function, likelihood-ratio test, permutation test, multiple testing, Bonferroni's method Bayesian statistics: prior, likelihood, posterior, posterior mean/mode Linear regression: linear models, least squares, geometric interpretation, probabilistic interpretation, overfitting

7 Random walk with a drift We define the random walk X as the discrete-state discrete-time random process
X(0) := 0, X(i) := X(i−1) + S(i) + 1, i = 1, 2, ...
where
S(i) = +1 with probability 1/2, −1 with probability 1/2
is an iid sequence of steps

8 Random walk with a drift What is the mean of this random process?
E(X(i)) = E( ∑_{j=1}^{i} (S(j) + 1) ) = ∑_{j=1}^{i} E(S(j)) + i = i
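As a quick sanity check (not part of the slides), the mean can be verified by Monte Carlo simulation; the walk length and number of simulated walks below are arbitrary choices:

```python
import numpy as np

# Simulate the drifted random walk X(i) = X(i-1) + S(i) + 1 and compare
# the empirical mean of X(i) with the theoretical value E(X(i)) = i.
rng = np.random.default_rng(0)
n_walks, n_steps = 10000, 20
steps = rng.choice([-1, 1], size=(n_walks, n_steps))  # S(i) = +-1 w.p. 1/2
X = np.cumsum(steps + 1, axis=1)                      # X(i) = sum_j (S(j) + 1)
empirical_mean = X[:, -1].mean()                      # estimate of E[X(20)]
```

With 10000 walks the empirical mean of X(20) should be very close to 20.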

12 Random walk with a drift What is the autocovariance? Use the fact that the autocovariance of the random walk without drift W that we studied in the lecture notes is R W (i, j) = min {i, j}

13 Random walk with a drift
X(i) = W(i) + i, where W is the random walk without drift, so E(W(i)) = 0. Then
R_X(i, j) := E(X(i) X(j)) − E(X(i)) E(X(j))
= E( (W(i) + i)(W(j) + j) ) − E(W(i) + i) E(W(j) + j)
= E(W(i) W(j)) + i E(W(j)) + j E(W(i)) + ij − i E(W(j)) − j E(W(i)) − ij
= min {i, j}
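The autocovariance formula can also be checked empirically (a sketch, not part of the slides; the times i = 4 and j = 7 are arbitrary):

```python
import numpy as np

# Empirical check that R_X(i, j) = min{i, j}: the covariance of the drifted
# walk at two fixed times should match that of the driftless walk.
rng = np.random.default_rng(1)
n_walks, n_steps = 200000, 10
X = np.cumsum(rng.choice([-1, 1], size=(n_walks, n_steps)) + 1, axis=1)
i, j = 4, 7   # 1-indexed times, i.e. columns i-1 and j-1
cov_ij = np.mean(X[:, i-1] * X[:, j-1]) - X[:, i-1].mean() * X[:, j-1].mean()
# theory: min{4, 7} = 4
```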

22 Random walk with a drift Compute the first-order pmf of X(i). Recall that the first-order pmf of the random walk W equals
p_{W(i)}(x) = C(i, (i+x)/2) (1/2)^i if i + x is even and −i ≤ x ≤ i, and 0 otherwise

23 Random walk with a drift
p_{X(i)}(x) = P(X(i) = x) = P(W(i) = x − i) = p_{W(i)}(x − i)
= C(i, x/2) (1/2)^i if x is even and 0 ≤ x ≤ 2i, and 0 otherwise
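The first-order pmf can be verified against a simulation (a sketch, not from the slides; i = 6 and x = 8 are arbitrary test values):

```python
import numpy as np
from math import comb

# Check p_{X(i)}(x) = C(i, x/2) (1/2)^i for even x in [0, 2i].
i = 6
rng = np.random.default_rng(2)
X_i = np.cumsum(rng.choice([-1, 1], size=(100000, i)) + 1, axis=1)[:, -1]
x = 8
empirical = np.mean(X_i == x)               # fraction of walks ending at x
theoretical = comb(i, x // 2) * 0.5 ** i    # C(6, 4) / 2^6 = 15/64
```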

31 Random walk with a drift Does the process satisfy the Markov condition?
p_{X(i+1) | X(1), X(2), ..., X(i)}(x_{i+1} | x_1, x_2, ..., x_i) = p_{X(i+1) | X(i)}(x_{i+1} | x_i)

32 Random walk with a drift
p_{X(i+1) | X(1), X(2), ..., X(i)}(x_{i+1} | x_1, x_2, ..., x_i) = P( x_i + S(i+1) + 1 = x_{i+1} ) = p_{X(i+1) | X(i)}(x_{i+1} | x_i)
so the process satisfies the Markov condition.

35 Random walk with a drift We observe that X (10) = 16 and X (20) = 30. What is the best estimator for X (21) in terms of probability of error?

36 Random walk with a drift
p_{X(21) | X(10), X(20)}(x | 16, 30) = p_{X(21) | X(20)}(x | 30)
= 1/2 if x = 32, 1/2 if x = 30, 0 otherwise
Both 30 and 32 have conditional probability 1/2, so either value minimizes the probability of error.

39 Markov chain Consider a Markov chain X with transition matrix
T_X := [ a 1 ; 1−a 0 ],
where a is a constant between 0 and 1. We label the two states 0 and 1. The transition matrix T_X has two eigenvectors
q_1 := [ 1 ; 1−a ], q_2 := [ 1−a ; a−1 ]
The corresponding eigenvalues are λ_1 := 1 and λ_2 := a − 1

40 Markov chain For what values of a is the Markov chain irreducible?

41 Markov chain For what values of a is the Markov chain periodic?

42 Markov chain Express the stationary distribution of X in terms of a p stat

43 Markov chain Express the stationary distribution of X in terms of a
p_stat = q_1 / ( (q_1)_1 + (q_1)_2 ) = (1/(2−a)) [ 1 ; 1−a ]
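The stationary distribution can be checked numerically (a minimal sketch; the value a = 0.3 is an arbitrary choice, not from the slides):

```python
import numpy as np

# Verify that p_stat = [1, 1-a] / (2-a) is invariant under the transition
# matrix T_X, whose columns are the conditional state distributions.
a = 0.3
T = np.array([[a, 1.0],
              [1 - a, 0.0]])      # columns sum to 1
p_stat = np.array([1.0, 1 - a]) / (2 - a)
invariant = T @ p_stat           # should equal p_stat
```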

45 Markov chain Does the Markov chain always converge in probability for all values of a? Justify that this is the case or provide a counterexample.

46 Markov chain Express the conditional pmf of X (i) conditioned on X (1) = 0 as a function of a and i. (Hint: Computing q 1 + q 2 could be a helpful first step.) Evaluate the expression at a = 0 and a = 1. Does the result make sense?

47 Markov chain We have
q_1 + q_2 = [ 1 ; 1−a ] + [ 1−a ; a−1 ] = [ 2−a ; 0 ]
so the initial distribution
p_X(0) = [ 1 ; 0 ] = (1/(2−a)) ( q_1 + q_2 )

53 Markov chain
p_X(i) = T_X^i p_X(0)
= T_X^i (1/(2−a)) ( q_1 + q_2 )
= (1/(2−a)) ( λ_1^i q_1 + λ_2^i q_2 )
= (1/(2−a)) ( [ 1 ; 1−a ] + (a−1)^i [ 1−a ; a−1 ] )
= (1/(2−a)) [ 1 − (a−1)^{i+1} ; (1−a)(1 − (a−1)^i) ]
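The closed-form expression can be checked against a direct matrix power (a sketch, not from the slides; a = 0.3 and i = 7 are arbitrary test values):

```python
import numpy as np

# Compare p_X(i) computed two ways: by applying T_X^i to p_X(0) = [1, 0],
# and via the eigendecomposition-based closed form.
a, i = 0.3, 7
T = np.array([[a, 1.0], [1 - a, 0.0]])
p0 = np.array([1.0, 0.0])
direct = np.linalg.matrix_power(T, i) @ p0
closed = np.array([1 - (a - 1) ** (i + 1),
                   (1 - a) * (1 - (a - 1) ** i)]) / (2 - a)
```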

59 Markov chain For a = 1 we have p_X(i) = [ 1 ; 0 ]

60 Markov chain For a = 0 we have
p_X(i) = (1/2) [ 1 − (−1)^{i+1} ; 1 − (−1)^i ] = [ 0 ; 1 ] if i is odd, [ 1 ; 0 ] if i is even.

62 Sampling from multivariate distributions We are interested in generating samples from the joint distribution of two random variables X and Y. If we generate a sample x according to the pdf f X and a sample y according to the pdf f Y, are these samples a realization of the joint distribution of X and Y? Explain your answer with a simple example.

63 Sampling from multivariate distributions Now, assume that X is discrete and Y is continuous. Propose a method to generate a sample from the joint distribution using the pmf of X and the conditional cdf of Y given X using two independent samples from a distribution that is uniform between 0 and 1. Assume that the conditional cdf is invertible.

64 Sampling from multivariate distributions
1. Obtain two independent samples u_1 and u_2 from the uniform distribution.
2. Set x to equal the smallest value a such that p_X(a) ≠ 0 and u_1 ≤ F_X(a).
3. Define F_x(·) := F_{Y|X}(· | x) and set y := F_x^{−1}(u_2).
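The three steps above can be sketched in code. This is an illustrative example, not from the slides: X is taken to be Bernoulli with parameter 1/2 and Y | X = x exponential with a made-up rate lambda_x:

```python
import numpy as np

# Two-uniform-sample procedure for a discrete X and continuous Y.
rng = np.random.default_rng(3)
u1, u2 = rng.uniform(size=2)     # step 1: two independent uniforms

# Step 2: x is the smallest value a with p_X(a) > 0 and u1 <= F_X(a).
p_X = {0: 0.5, 1: 0.5}           # assumed pmf (illustrative)
F = 0.0
for a in sorted(p_X):
    F += p_X[a]
    if u1 <= F:
        x = a
        break

# Step 3: invert the conditional cdf F_{Y|X}(y|x) = 1 - exp(-lambda_x y).
lam = {0: 1.0, 1: 3.0}[x]        # assumed conditional rates (illustrative)
y = np.log(1 / (1 - u2)) / lam
```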

67 Sampling from multivariate distributions Explain how to generate samples from a random variable with pdf
f_W(w) = 0.1 λ_1 exp(−λ_1 w) + 0.9 λ_2 exp(−λ_2 w), w ≥ 0,
where λ_1 and λ_2 are positive constants, using two iid uniform samples between 0 and 1.

68 Sampling from multivariate distributions Let us define a Bernoulli random variable X with parameter 0.9, such that if X = 0 then Y is exponential with parameter λ_1 and if X = 1 then Y is exponential with parameter λ_2. The marginal distribution of Y is
f_Y(w) = p_X(0) f_{Y|X}(w | 0) + p_X(1) f_{Y|X}(w | 1) = 0.1 λ_1 exp(−λ_1 w) + 0.9 λ_2 exp(−λ_2 w)

71 Sampling from multivariate distributions
1. We obtain two independent samples u_1 and u_2 from the uniform distribution.
2. If u_1 ≤ 0.1 we set
w := (1/λ_1) log( 1 / (1 − u_2) )
otherwise we set
w := (1/λ_2) log( 1 / (1 − u_2) )
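A vectorized sketch of this mixture sampler (the rate values below are arbitrary choices for illustration, not from the slides):

```python
import numpy as np

# With probability 0.1 draw from an exponential with rate lam1, otherwise
# with rate lam2, using inverse-transform sampling in both branches.
rng = np.random.default_rng(4)
lam1, lam2 = 0.5, 2.0
u1 = rng.uniform(size=100000)
u2 = rng.uniform(size=100000)
lam = np.where(u1 <= 0.1, lam1, lam2)    # pick the mixture component
w = np.log(1 / (1 - u2)) / lam           # invert F(w) = 1 - exp(-lam w)
mean_theory = 0.1 / lam1 + 0.9 / lam2    # mixture mean = 0.65
```

The empirical mean of the samples should match the mixture mean 0.1/λ_1 + 0.9/λ_2.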

73 Convergence Let U be a random variable uniformly distributed between 0 and 1. If we define the discrete random process X by X(i) := U for all i, does X converge to 1 − U in probability?

74 Convergence Does X converge to 1 − U in distribution?

75 Convergence You draw some iid samples x_1, x_2, ... from a Cauchy random variable. Will the empirical mean (1/n) ∑_{i=1}^{n} x_i converge in probability as n grows large? Explain why briefly and if the answer is yes state what it converges to.

76 Convergence You draw m iid samples x_1, x_2, ..., x_m from a Cauchy random variable. Then you draw iid samples y_1, y_2, ... uniformly from {x_1, x_2, ..., x_m} (each y_i is equal to each element of {x_1, x_2, ..., x_m} with probability 1/m). Will the empirical mean (1/n) ∑_{i=1}^{n} y_i converge in probability as n grows large? Explain why very briefly and if the answer is yes state what it converges to.

77 Earthquake We are interested in learning a model for the occurrence of earthquakes. We decide to model the time between earthquakes as an exponential random variable with parameter λ. Compute the maximum-likelihood estimate of λ given t 1, t 2,..., t n, which are interarrival times for past earthquakes. Assume that the data are iid.

78 Earthquake
L(λ) := f_{T(1), ..., T(n)}(t_1, ..., t_n) = ∏_{i=1}^{n} λ exp(−λ t_i) = λ^n exp( −λ ∑_{i=1}^{n} t_i )
log L(λ) = n log λ − λ ∑_{i=1}^{n} t_i

84 Earthquake
d log L_{t_1, ..., t_n}(λ) / dλ = n/λ − ∑_{i=1}^{n} t_i
d² log L_{t_1, ..., t_n}(λ) / dλ² = −n/λ² < 0
The log-likelihood is concave, so setting the derivative to zero yields the maximizer
λ_ML = n / ∑_{i=1}^{n} t_i = 1 / ( (1/n) ∑_{i=1}^{n} t_i )
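A quick numerical sanity check of the ML estimate (not part of the slides; the true rate and sample size are arbitrary):

```python
import numpy as np

# With many iid exponential interarrival times, lambda_ML = n / sum(t_i)
# should be close to the true rate parameter.
rng = np.random.default_rng(5)
true_lam = 2.0
t = rng.exponential(scale=1 / true_lam, size=100000)  # simulated data
lam_ml = len(t) / t.sum()                             # = 1 / empirical mean
```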

90 Earthquake Find an approximate 0.95 confidence interval based on the central limit theorem for the value of λ. Assume that you know a bound b on the standard deviation (i.e. the variance of the exponential 1/λ 2 is bounded by b 2 ) and express your answer using the Q function. (Hint: Express the ML estimate in terms of the empirical mean.) (See solutions.)

91 Earthquake What is the posterior distribution of the parameter Λ if we model it as a random variable with a uniform distribution between 0 and u? Express your answer in terms of the sum ∑_{i=1}^{n} t_i, u and the marginal pdf of the data evaluated at t_1, t_2, ..., t_n,
c := f_{T(1), ..., T(n)}(t_1, ..., t_n).

92 Earthquake
f_{Λ | T(1), ..., T(n)}(λ | t_1, ..., t_n) = f_Λ(λ) λ^n exp( −λ ∑_{i=1}^{n} t_i ) / f_{T(1), ..., T(n)}(t_1, ..., t_n)
= (1/(u c)) λ^n exp( −λ ∑_{i=1}^{n} t_i )
for 0 ≤ λ ≤ u and zero otherwise
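The posterior can be checked numerically (a sketch, not from the slides; n, u and the value of ∑ t_i are arbitrary test inputs):

```python
import numpy as np

# The posterior (1/(u c)) lam^n exp(-lam * sum_t) on [0, u] is a truncated
# Gamma shape; once normalized numerically it should integrate to one.
n, u, sum_t = 5, 3.0, 4.0
lam = np.linspace(0.0, u, 10001)
unnorm = lam ** n * np.exp(-lam * sum_t)            # lam^n exp(-lam sum_t)
dlam = lam[1] - lam[0]
Z = np.sum((unnorm[1:] + unnorm[:-1]) / 2) * dlam   # trapezoid rule; plays the role of u*c
posterior = unnorm / Z
total = np.sum((posterior[1:] + posterior[:-1]) / 2) * dlam  # should be 1
```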

96 Earthquake [Figure: plot of the posterior density f_{Λ | T(1), ..., T(n)}(λ | t_1, ..., t_n) as a function of λ]

97 Earthquake Explain how you would use the answer in the previous question to construct a confidence interval for the parameter.

98 Chad You hate a coworker and want to predict when he is in the office from the temperature. [Figure: temperature observations labeled Chad / No Chad] You model his presence using a random variable C which is equal to 1 if he is there and 0 if he is not. Estimate p_C.

99 Chad The empirical pmf is
p_C(0) = 5/15 = 1/3, p_C(1) = 10/15 = 2/3.

100 Chad You model the temperature using a random variable T. Sketch the kernel density estimator of the conditional distribution of T given C using a rectangular kernel with width equal to 2.

101 Chad [Figure: kernel density estimates f_{T|C}(t | 0) and f_{T|C}(t | 1) plotted as functions of the temperature t]

102 Chad If T = 68 what is the ML estimate of C?
f_{T|C}(68 | 0) = 0.2, f_{T|C}(68 | 1) = 0
so the ML estimate is C = 0.

104 Chad If T = 64 what is the MAP estimate of C?
p_{C|T}(0 | 64) = p_C(0) f_{T|C}(64 | 0) / ( p_C(0) f_{T|C}(64 | 0) + p_C(1) f_{T|C}(64 | 1) ) = 1/2
(the values of the conditional densities are read off the kernel density estimates), and
p_{C|T}(1 | 64) = 1 − p_{C|T}(0 | 64) = 1/2
so both values of C are equally likely a posteriori.

112 Chad What happens if the temperature is 57? Explain how using parametric estimation may alleviate this problem.

113 3-point shooting The New York Knicks hire you as a data analyst. Your first task is to come up with a way to determine whether a 3-point shooter is any good. You will use the following graph of the function g(θ, n) = θ^n. [Figure: g(θ, n) as a function of θ for n = 4, 9, 14, 19, 24]

114 3-point shooting
1. Interpret g(θ, n).
g(θ, n) = θ^n is the probability that a shooter who makes each shot independently with probability θ makes n shots in a row.
2. The coach tells you: I want to make sure that the guy has a shooting percentage over 80%. What is your null hypothesis?
The null hypothesis is that the shooting percentage is at most 80%, i.e. θ ≤ 0.8.
3. What number of shots does a player need to make in a row for you to reject the null hypothesis with a confidence level of 5%?
14, since g(0.8, 14) = 0.8^14 ≈ 0.044 ≤ 0.05 while g(0.8, 13) > 0.05.
4. A player makes 9 shots in a row. What is the corresponding p value? Do you declare him as a good shooter?
The p value is g(0.8, 9) = 0.8^9 ≈ 0.13 > 0.05, so you do not declare him a good shooter.
5. What is the probability that you do not declare a player who has a shooting percentage of 90% as a good shooter?
1 − g(0.9, 14) ≈ 0.77
6. You apply the test on 10 players. You adapt the threshold applying Bonferroni's method. What is the new threshold?
The corrected significance level is 0.05/10 = 0.005, which requires n = 24 shots in a row.
7. With the correction, what is the probability that you do not declare a player who has a shooting percentage of 90% as a good shooter?
1 − g(0.9, 24) ≈ 0.92
8. What is the advantage of adapting the threshold? What is the disadvantage?
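The numeric answers above follow from g(θ, n) = θ^n and can be reproduced directly:

```python
from math import log, ceil

# Reproduce the 3-point shooting test numbers: rejection threshold, p value,
# and the effect of the Bonferroni correction over 10 players.
alpha = 0.05
n_reject = ceil(log(alpha) / log(0.8))    # smallest n with 0.8**n <= 0.05
p_value_9 = 0.8 ** 9                      # player makes 9 shots in a row
power_miss = 1 - 0.9 ** n_reject          # P(90% shooter not declared good)
alpha_bonf = alpha / 10                   # Bonferroni-corrected level
n_bonf = ceil(log(alpha_bonf) / log(0.8))
power_miss_bonf = 1 - 0.9 ** n_bonf       # same probability after correction
```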


STAT 461/561- Assignments, Year 2015 STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and

More information

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain

More information

Machine learning: Hypothesis testing. Anders Hildeman

Machine learning: Hypothesis testing. Anders Hildeman Location of trees 0 Observed trees 50 100 150 200 250 300 350 400 450 500 0 100 200 300 400 500 600 700 800 900 1000 Figur: Observed points pattern of the tree specie Beilschmiedia pendula. Location of

More information

HYPOTHESIS TESTING: FREQUENTIST APPROACH.

HYPOTHESIS TESTING: FREQUENTIST APPROACH. HYPOTHESIS TESTING: FREQUENTIST APPROACH. These notes summarize the lectures on (the frequentist approach to) hypothesis testing. You should be familiar with the standard hypothesis testing from previous

More information

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition Preface Preface to the First Edition xi xiii 1 Basic Probability Theory 1 1.1 Introduction 1 1.2 Sample Spaces and Events 3 1.3 The Axioms of Probability 7 1.4 Finite Sample Spaces and Combinatorics 15

More information

Expectation. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda

Expectation. DS GA 1002 Probability and Statistics for Data Science.   Carlos Fernandez-Granda Expectation DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean,

More information

Stat 516, Homework 1

Stat 516, Homework 1 Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida First Year Examination Department of Statistics, University of Florida August 19, 010, 8:00 am - 1:00 noon Instructions: 1. You have four hours to answer questions in this examination.. You must show your

More information

Problem 1 (20) Log-normal. f(x) Cauchy

Problem 1 (20) Log-normal. f(x) Cauchy ORF 245. Rigollet Date: 11/21/2008 Problem 1 (20) f(x) f(x) 0.0 0.1 0.2 0.3 0.4 0.0 0.2 0.4 0.6 0.8 4 2 0 2 4 Normal (with mean -1) 4 2 0 2 4 Negative-exponential x x f(x) f(x) 0.0 0.1 0.2 0.3 0.4 0.5

More information

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions

More information

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.

More information

Probability and Stochastic Processes

Probability and Stochastic Processes Probability and Stochastic Processes A Friendly Introduction Electrical and Computer Engineers Third Edition Roy D. Yates Rutgers, The State University of New Jersey David J. Goodman New York University

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics

Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics The candidates for the research course in Statistics will have to take two shortanswer type tests

More information

CS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas

CS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas CS839: Probabilistic Graphical Models Lecture 7: Learning Fully Observed BNs Theo Rekatsinas 1 Exponential family: a basic building block For a numeric random variable X p(x ) =h(x)exp T T (x) A( ) = 1

More information

. Find E(V ) and var(v ).

. Find E(V ) and var(v ). Math 6382/6383: Probability Models and Mathematical Statistics Sample Preliminary Exam Questions 1. A person tosses a fair coin until she obtains 2 heads in a row. She then tosses a fair die the same number

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our

More information

Class 26: review for final exam 18.05, Spring 2014

Class 26: review for final exam 18.05, Spring 2014 Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event

More information

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks Recap Probability, stochastic processes, Markov chains ELEC-C7210 Modeling and analysis of communication networks 1 Recap: Probability theory important distributions Discrete distributions Geometric distribution

More information

6 Markov Chain Monte Carlo (MCMC)

6 Markov Chain Monte Carlo (MCMC) 6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution

More information

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis Lecture 3 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Nonparametric Bayesian Methods - Lecture I

Nonparametric Bayesian Methods - Lecture I Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics

More information

Masters Comprehensive Examination Department of Statistics, University of Florida

Masters Comprehensive Examination Department of Statistics, University of Florida Masters Comprehensive Examination Department of Statistics, University of Florida May 10, 2002, 8:00am - 12:00 noon Instructions: 1. You have four hours to answer questions in this examination. 2. There

More information

STA414/2104. Lecture 11: Gaussian Processes. Department of Statistics

STA414/2104. Lecture 11: Gaussian Processes. Department of Statistics STA414/2104 Lecture 11: Gaussian Processes Department of Statistics www.utstat.utoronto.ca Delivered by Mark Ebden with thanks to Russ Salakhutdinov Outline Gaussian Processes Exam review Course evaluations

More information

ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process

ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process Department of Electrical Engineering University of Arkansas ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process Dr. Jingxian Wu wuj@uark.edu OUTLINE 2 Definition of stochastic process (random

More information

Statistical Data Analysis

Statistical Data Analysis DS-GA 0 Lecture notes 8 Fall 016 1 Descriptive statistics Statistical Data Analysis In this section we consider the problem of analyzing a set of data. We describe several techniques for visualizing the

More information

Previously Monte Carlo Integration

Previously Monte Carlo Integration Previously Simulation, sampling Monte Carlo Simulations Inverse cdf method Rejection sampling Today: sampling cont., Bayesian inference via sampling Eigenvalues and Eigenvectors Markov processes, PageRank

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1 Lecture 5 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Fundamental Probability and Statistics

Fundamental Probability and Statistics Fundamental Probability and Statistics "There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are

More information

Estimation of Quantiles

Estimation of Quantiles 9 Estimation of Quantiles The notion of quantiles was introduced in Section 3.2: recall that a quantile x α for an r.v. X is a constant such that P(X x α )=1 α. (9.1) In this chapter we examine quantiles

More information

Non-parametric Inference and Resampling

Non-parametric Inference and Resampling Non-parametric Inference and Resampling Exercises by David Wozabal (Last update. Juni 010) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend surfing

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 00 MODULE : Statistical Inference Time Allowed: Three Hours Candidates should answer FIVE questions. All questions carry equal marks. The

More information

Need for Sampling in Machine Learning. Sargur Srihari

Need for Sampling in Machine Learning. Sargur Srihari Need for Sampling in Machine Learning Sargur srihari@cedar.buffalo.edu 1 Rationale for Sampling 1. ML methods model data with probability distributions E.g., p(x,y; θ) 2. Models are used to answer queries,

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Supplementary Note on Bayesian analysis

Supplementary Note on Bayesian analysis Supplementary Note on Bayesian analysis Structured variability of muscle activations supports the minimal intervention principle of motor control Francisco J. Valero-Cuevas 1,2,3, Madhusudhan Venkadesan

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume II: Probability Emlyn Lloyd University oflancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester - New York - Brisbane

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

8 Basics of Hypothesis Testing

8 Basics of Hypothesis Testing 8 Basics of Hypothesis Testing 4 Problems Problem : The stochastic signal S is either 0 or E with equal probability, for a known value E > 0. Consider an observation X = x of the stochastic variable X

More information

Chapter 6: Random Processes 1

Chapter 6: Random Processes 1 Chapter 6: Random Processes 1 Yunghsiang S. Han Graduate Institute of Communication Engineering, National Taipei University Taiwan E-mail: yshan@mail.ntpu.edu.tw 1 Modified from the lecture notes by Prof.

More information

EXAM # 3 PLEASE SHOW ALL WORK!

EXAM # 3 PLEASE SHOW ALL WORK! Stat 311, Summer 2018 Name EXAM # 3 PLEASE SHOW ALL WORK! Problem Points Grade 1 30 2 20 3 20 4 30 Total 100 1. A socioeconomic study analyzes two discrete random variables in a certain population of households

More information

Statistical Methods in HYDROLOGY CHARLES T. HAAN. The Iowa State University Press / Ames

Statistical Methods in HYDROLOGY CHARLES T. HAAN. The Iowa State University Press / Ames Statistical Methods in HYDROLOGY CHARLES T. HAAN The Iowa State University Press / Ames Univariate BASIC Table of Contents PREFACE xiii ACKNOWLEDGEMENTS xv 1 INTRODUCTION 1 2 PROBABILITY AND PROBABILITY

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 3 More Markov Chain Monte Carlo Methods The Metropolis algorithm isn t the only way to do MCMC. We ll

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

10-701/15-781, Machine Learning: Homework 4

10-701/15-781, Machine Learning: Homework 4 10-701/15-781, Machine Learning: Homewor 4 Aarti Singh Carnegie Mellon University ˆ The assignment is due at 10:30 am beginning of class on Mon, Nov 15, 2010. ˆ Separate you answers into five parts, one

More information

Bayesian GLMs and Metropolis-Hastings Algorithm

Bayesian GLMs and Metropolis-Hastings Algorithm Bayesian GLMs and Metropolis-Hastings Algorithm We have seen that with conjugate or semi-conjugate prior distributions the Gibbs sampler can be used to sample from the posterior distribution. In situations,

More information

Linear Models. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.

Linear Models. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis. Linear Models DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Linear regression Least-squares estimation

More information

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume

More information

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

More information

6.1 Moment Generating and Characteristic Functions

6.1 Moment Generating and Characteristic Functions Chapter 6 Limit Theorems The power statistics can mostly be seen when there is a large collection of data points and we are interested in understanding the macro state of the system, e.g., the average,

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information

An Introduction to Bayesian Linear Regression

An Introduction to Bayesian Linear Regression An Introduction to Bayesian Linear Regression APPM 5720: Bayesian Computation Fall 2018 A SIMPLE LINEAR MODEL Suppose that we observe explanatory variables x 1, x 2,..., x n and dependent variables y 1,

More information

1 Exercises for lecture 1

1 Exercises for lecture 1 1 Exercises for lecture 1 Exercise 1 a) Show that if F is symmetric with respect to µ, and E( X )

More information

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course.

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course. Name of the course Statistical methods and data analysis Audience The course is intended for students of the first or second year of the Graduate School in Materials Engineering. The aim of the course

More information

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring Lecture 8 A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2015 http://www.astro.cornell.edu/~cordes/a6523 Applications: Bayesian inference: overview and examples Introduction

More information

Lecture 1: Bayesian Framework Basics

Lecture 1: Bayesian Framework Basics Lecture 1: Bayesian Framework Basics Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de April 21, 2014 What is this course about? Building Bayesian machine learning models Performing the inference of

More information

MATH c UNIVERSITY OF LEEDS Examination for the Module MATH2715 (January 2015) STATISTICAL METHODS. Time allowed: 2 hours

MATH c UNIVERSITY OF LEEDS Examination for the Module MATH2715 (January 2015) STATISTICAL METHODS. Time allowed: 2 hours MATH2750 This question paper consists of 8 printed pages, each of which is identified by the reference MATH275. All calculators must carry an approval sticker issued by the School of Mathematics. c UNIVERSITY

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Probability Models in Electrical and Computer Engineering Mathematical models as tools in analysis and design Deterministic models Probability models

Probability Models in Electrical and Computer Engineering Mathematical models as tools in analysis and design Deterministic models Probability models Probability Models in Electrical and Computer Engineering Mathematical models as tools in analysis and design Deterministic models Probability models Statistical regularity Properties of relative frequency

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

Swarthmore Honors Exam 2012: Statistics

Swarthmore Honors Exam 2012: Statistics Swarthmore Honors Exam 2012: Statistics 1 Swarthmore Honors Exam 2012: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having six questions. You may

More information

Basic Sampling Methods

Basic Sampling Methods Basic Sampling Methods Sargur Srihari srihari@cedar.buffalo.edu 1 1. Motivation Topics Intractability in ML How sampling can help 2. Ancestral Sampling Using BNs 3. Transforming a Uniform Distribution

More information