Simulations
Simulations

Computer simulation of realizations of random variables has become indispensable as a supplement to theoretical investigations and practical applications.

- We can easily investigate a large number of scenarios (different P's) and a large number of replications.
- We can compute transformed distributions numerically.
- We can compute (complicated) mean values; this is known as Monte Carlo simulation.
- We can investigate the behaviour of methods for statistical inference.

But how can a deterministic computer generate an outcome from a probability measure?
Generic simulation

A two-step procedure is behind the simulation of random variables:

- The computer emulates the generation of independent, identically distributed random variables with the uniform distribution on the unit interval [0, 1].
- The emulated uniformly distributed random variables are turned into variables with the desired distribution by a transformation.
Theorem behind

The following result underlies the generic procedure for simulation from any P on E:

Theorem: Let P_0 denote the uniform distribution on [0, 1] and let h : [0, 1] -> E be a map such that the transformed probability measure on E is P = h(P_0). If U_1, U_2, ..., U_n are iid uniform on [0, 1], then X_1, X_2, ..., X_n defined by X_i = h(U_i) are n iid random variables, each with distribution P.
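As a small sketch of the theorem (in Python rather than R, with a hypothetical map h): taking h(u) = 1 if u <= p and 0 otherwise transforms the uniform distribution on [0, 1] into the Bernoulli(p) distribution, since P_0(u <= p) = p.

```python
import random

def h(u, p=0.3):
    """Map a uniform variable on [0, 1] to a Bernoulli(p) variable."""
    return 1 if u <= p else 0

random.seed(1)
n = 100_000
xs = [h(random.random()) for _ in range(n)]
freq = sum(xs) / n
print(round(freq, 2))  # the relative frequency of 1's should be close to p = 0.3
```

The same construction, with a suitable h, works for any target distribution; the rest of the lecture is about finding h.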
The real problem

What we need in practice is thus the construction of a transformation that turns the uniform distribution on [0, 1] into the desired probability distribution. We focus here on two cases:

- A general method for discrete distributions.
- A general method for probability measures on R given in terms of the distribution function.
But what about the simulation of the independent, uniformly distributed random variables?

That's a completely different story. Read D. E. Knuth, The Art of Computer Programming, Chapter 3, or trust that R behaves well and that runif works correctly. We rely on a sufficiently good pseudo random number generator: as long as we cannot statistically detect a difference between what the generator produces and true iid uniformly distributed random variables on [0, 1], we live happily in ignorance.
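To give a flavour of what such a generator looks like inside, here is a minimal linear congruential generator, one of the classical constructions treated by Knuth, as a Python sketch. The constants are the well-known Numerical Recipes choices; this is not the generator behind runif (R uses the far better Mersenne Twister by default), just an illustration of the idea of a deterministic recursion whose output looks uniform.

```python
class LCG:
    """Linear congruential generator: u_{k+1} = (a*u_k + c) mod m."""

    def __init__(self, seed=42):
        self.state = seed
        # Classic "Numerical Recipes" constants; illustrative only.
        self.a, self.c, self.m = 1664525, 1013904223, 2**32

    def uniform(self):
        """Return a pseudo random number in [0, 1)."""
        self.state = (self.a * self.state + self.c) % self.m
        return self.state / self.m

gen = LCG(seed=42)
us = [gen.uniform() for _ in range(10_000)]
print(round(sum(us) / len(us), 2))  # the sample mean should be near 0.5
```

The whole sequence is a deterministic function of the seed; "randomness" only means that standard statistical tests fail to distinguish the output from true iid uniforms.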
Discrete random variables

If P is a probability measure on a discrete sample space E given by point probabilities p(x), x ∈ E, choose for each x ∈ E an interval

    I(x) = (a(x), b(x)] ⊆ [0, 1]

such that

- the length, b(x) − a(x), of I(x) equals p(x), and
- the intervals I(x) are mutually disjoint: I(x) ∩ I(y) = ∅ for x ≠ y.

Letting u_1, ..., u_n be generated by a pseudo random number generator, we define x_i = x if u_i ∈ I(x), for i = 1, ..., n. Then x_1, ..., x_n is a realization of n iid random variables with the distribution having point probabilities p(x), x ∈ E.
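The interval construction above can be sketched in Python (the deck itself uses R; the function names here are made up). Consecutive intervals are built from the cumulative sums of the point probabilities, and each uniform is mapped to the point whose interval contains it.

```python
import random
from itertools import accumulate

def make_intervals(point_probs):
    """Partition (0, 1] into consecutive intervals I(x) = (a(x), b(x)],
    one per point x, with length b(x) - a(x) = p(x)."""
    xs = list(point_probs)
    bounds = [0.0] + list(accumulate(point_probs[x] for x in xs))
    bounds[-1] = 1.0  # guard against floating point rounding in the cumulative sum
    return {x: (bounds[i], bounds[i + 1]) for i, x in enumerate(xs)}

def sample(point_probs, n, rng=random):
    """Draw n realizations by checking which interval each uniform falls in."""
    intervals = make_intervals(point_probs)
    out = []
    for _ in range(n):
        u = rng.random()
        for x, (a, b) in intervals.items():
            if a < u <= b:
                out.append(x)
                break
    return out

random.seed(0)
p = {"A": 0.5, "C": 0.2, "G": 0.2, "T": 0.1}
xs = sample(p, 10_000)
print(round(xs.count("A") / len(xs), 2))  # relative frequency of "A", close to 0.5
```

The linear scan over intervals is the simplest choice; for a large sample space one would use a binary search over the cumulative bounds instead.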
Generalized inverse

Definition: Let F : R -> [0, 1] be a distribution function. A function F⁻ : (0, 1) -> R that satisfies

    F(x) ≥ y  ⇔  x ≥ F⁻(y)

for all x ∈ R and y ∈ (0, 1) is called a generalized inverse of F.
Generalized inverse

If F has a true inverse (F is strictly increasing and continuous), then F⁻ equals the inverse, F⁻¹, of F. All distribution functions have a generalized inverse; we find it by solving the inequality F(x) ≥ y.
Continuous sample space

We will simulate from P on R having distribution function F. First find the generalized inverse, F⁻ : (0, 1) -> R, of F. Then let u_1, ..., u_n be generated by a pseudo random number generator and define x_i = F⁻(u_i) for i = 1, ..., n. Then x_1, ..., x_n is a realization of n iid random variables with the distribution having distribution function F.
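As a worked case: for the exponential distribution with intensity θ, F(x) = 1 − exp(−θx), and solving F(x) ≥ y for x gives F⁻(y) = −log(1 − y)/θ. A Python sketch of the resulting simulation (the deck uses R; the function name is made up):

```python
import math
import random

def exp_generalized_inverse(y, theta):
    """F⁻(y) = -log(1 - y)/theta, obtained by solving 1 - exp(-theta*x) >= y."""
    return -math.log(1.0 - y) / theta

random.seed(2)
theta = 2.0
xs = [exp_generalized_inverse(random.random(), theta) for _ in range(100_000)]
mean = sum(xs) / len(xs)
print(round(mean, 2))  # the exponential mean is 1/theta = 0.5
```

The same recipe works for any distribution function whose generalized inverse can be computed, exactly or numerically.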
Local alignments

Assume that X_1, ..., X_n and Y_1, ..., Y_m are in total n + m iid random variables with values in the 20-letter amino acid alphabet

    E = {A, R, N, D, C, E, Q, G, H, I, L, K, M, F, P, S, T, W, Y, V}.

We want to find an optimal local alignment, and in particular we are interested in the score of the optimal local alignment. This score is a function h : E^(n+m) -> R. An X- and a Y-subsequence are matched letter by letter; matched letters are given a score, positive or negative, and gaps in the subsequences are given a penalty.
Local alignment scores

Denote by S_{n,m} = h(X_1, ..., X_n, Y_1, ..., Y_m) the transformed, real valued random variable. What is the distribution of S_{n,m}?

- We can in principle compute its discrete distribution from the distribution of the X- and Y-variables, but this is futile and not possible in practice.
- It is possible to use simulations, but that may be quite time-consuming and is not a practical solution for current database usage.
- Instead: develop a good theoretical approximation.
Local alignment scores

Under certain conditions on the scoring mechanism and the letter distribution, a valid approximation for n and m large is, for parameters λ, K > 0,

    P(S_{n,m} ≤ x) ≈ exp(−Knm exp(−λx)).

This is a scale-location transformation: S_{n,m} = log(Knm)/λ + S'_{n,m}/λ, where S'_{n,m} has a Gumbel distribution.
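Given values of λ and K, the approximation yields tail probabilities for alignment scores directly. A Python sketch; the parameter values below are made up for illustration (in practice λ and K are estimated for the particular scoring scheme, as in the statistics reported by BLAST):

```python
import math

def local_alignment_pvalue(x, n, m, lam, K):
    """P(S_{n,m} > x) ~ 1 - exp(-K*n*m*exp(-lam*x)) under the Gumbel approximation."""
    return 1.0 - math.exp(-K * n * m * math.exp(-lam * x))

# Illustrative (made-up) parameter values for two sequences of lengths 300 and 400.
p = local_alignment_pvalue(x=50.0, n=300, m=400, lam=0.27, K=0.04)
print(f"{p:.3g}")
```

Note how the sequence lengths enter only through the product nm, shifting the Gumbel location by log(Knm)/λ.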
Statistical models

Example: We measure the expression of our favorite gene number i on a microarray. The additive noise model says that our measurement can be written as

    X_i = μ_i + σ_i ε_i,

where μ_i ∈ R, σ_i > 0, and ε_i has mean 0 and variance 1. If ε_i ~ N(0, 1), then X_i ~ N(μ_i, σ_i²), and we have fully specified our model with unknown parameters (μ_i, σ_i) ∈ R × (0, ∞).
Statistical models

Example: We want to consider pairs of nucleotides (X_i, Y_i) that are evolutionarily related. We assume that the pairs are independent and identically distributed and that

    P(X_1 = x, Y_1 = y) = p(x) P_t(x, y)
                        = p(x) (1/4 + 3/4 exp(−4αt))   if x = y
                        = p(x) (1/4 − 1/4 exp(−4αt))   if x ≠ y.

The unknown parameters are α > 0 and the four-dimensional probability vector p. Perhaps t is also an unknown parameter.
Statistical models

- We need a sample space E.
- We need a parameter space Θ of unknown parameters.
- And for each θ ∈ Θ we need a probability measure P_θ on E.

We call (P_θ)_{θ ∈ Θ} a parameterized family of probability measures.
Exponential distribution

Let E_0 = [0, ∞), let θ ∈ (0, ∞), and let P_θ be the distribution of n iid exponentially distributed random variables X_1, ..., X_n with intensity parameter θ. The distribution of X_i has density

    f_θ(x) = θ exp(−θx)   for x ≥ 0.

The probability measure P_θ on E = E_0^n has density

    f_θ(x_1, ..., x_n) = θ exp(−θx_1) ⋯ θ exp(−θx_n) = θ^n exp(−θ(x_1 + ... + x_n)).

With Θ = (0, ∞), the family (P_θ)_{θ ∈ Θ} of probability measures is a statistical model on E.
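The joint density is easiest to work with on the log scale. A Python sketch with hypothetical names: viewed as a function of θ for fixed data, log f_θ(x_1, ..., x_n) = n log θ − θ(x_1 + ... + x_n) is the log-likelihood, and a short calculation shows it is maximized at θ = n/(x_1 + ... + x_n).

```python
import math

def log_density(xs, theta):
    """log f_theta(x_1, ..., x_n) = n*log(theta) - theta*(x_1 + ... + x_n)."""
    return len(xs) * math.log(theta) - theta * sum(xs)

xs = [0.3, 1.2, 0.7, 2.1, 0.5]  # a made-up observation in E = E_0^5
theta_hat = len(xs) / sum(xs)   # the maximizer n / (x_1 + ... + x_n)
print(round(theta_hat, 3))
```

This connects the model directly to the estimators defined on the next slide.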
Estimators

Definition: An estimator is a map θ̂ : E -> Θ. For a given observation x ∈ E, the value of θ̂ at x, ϑ̂ = θ̂(x), is called the estimate of θ. If X has distribution P_θ, the transformed random variable θ̂(X) is also called the estimator; it has distribution θ̂(P_θ).
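For the exponential model above, θ̂(x_1, ..., x_n) = n/(x_1 + ... + x_n) is such a map from E to Θ, and simulation lets us look at the distribution θ̂(P_θ) of the estimator. A Python sketch (function names made up), using inversion to simulate the exponentials:

```python
import math
import random

def theta_hat(xs):
    """An estimator: a map from the sample space E = [0, inf)^n to Theta = (0, inf)."""
    return len(xs) / sum(xs)

def rexp(n, theta, rng):
    """Simulate n iid exponentials by inversion: x = -log(1 - u)/theta."""
    return [-math.log(1.0 - rng.random()) / theta for _ in range(n)]

rng = random.Random(3)
theta = 2.0  # the "true" parameter used to generate the data
estimates = [theta_hat(rexp(50, theta, rng)) for _ in range(2000)]
mean_est = sum(estimates) / len(estimates)
print(round(mean_est, 1))  # the estimates concentrate around the true theta = 2.0
```

Studying the distribution of θ̂(X) in this way is exactly the "behaviour of methods for statistical inference" use of simulation mentioned on the first slide.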
Identifiability

Definition: The parameter θ is said to be identifiable if the map θ -> P_θ is one-to-one. That is, for two different parameters θ_1 and θ_2, the corresponding measures P_{θ_1} and P_{θ_2} differ. We cannot in a meaningful way estimate an unknown parameter that is not identifiable!
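A toy illustration of non-identifiability (a made-up parametrization, not from the slides): parameterize a normal observation by θ = (a, b) with mean a + b and variance 1. Only the sum a + b enters the measure, so different parameters can give the exact same distribution:

```python
import math

def normal_density(x, a, b, sigma=1.0):
    """Density of N(a + b, sigma^2); only the sum a + b enters."""
    mu = a + b
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

# theta_1 = (0, 3) and theta_2 = (1, 2) are different parameters but give
# identical densities at every point: the map theta -> P_theta is not one-to-one.
points = [-1.0, 0.0, 2.5, 3.0, 5.0]
assert all(math.isclose(normal_density(x, 0.0, 3.0), normal_density(x, 1.0, 2.0))
           for x in points)
print("(0, 3) and (1, 2) give the same measure")
```

No amount of data can distinguish (0, 3) from (1, 2) here; only the sum a + b is identifiable.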
Simulations

Write a function, my.rexp, that takes two parameters such that

    > tmp <- my.rexp(10, 1)

generates a realization of 10 iid random variables with the exponential distribution with parameter λ = 1. How do you make the second parameter equal to 1 by default, such that

    > tmp <- my.rexp(10)

produces the same result?
Solution

    > my.rexp <- function(n, lambda) {
    +     -log(runif(n))/lambda
    + }

To make λ = 1 by default we define instead

    > my.rexp <- function(n, lambda = 1) {
    +     -log(runif(n))/lambda
    + }

Note that we have used that if U is uniformly distributed on [0, 1], then 1 − U is also uniformly distributed on [0, 1]; this lets us write −log(U) instead of −log(1 − U) and saves an unnecessary computation.
Maximum of random variables

Use

    > tmp <- replicate(1000, max(rexp(10, 1)))

to generate 1000 replications of the maximum of 10 independent exponential random variables. Plot the distribution function for the Gumbel distribution with location parameter log(10) and compare it with the empirical distribution function

    > emdf <- function(x) sapply(x, function(x) sum(tmp <= x)/1000)

What if we take the maximum of 100 exponential random variables?
Solutions

    > x <- seq(0, 5, by = 0.1)
    > plot(x, exp(-exp(-(x - log(10)))), type = "l")
    > points(x, emdf(x), type = "p", pch = 20, col = "red")
Solutions

[Figure: the empirical distribution function (points) plotted against the Gumbel distribution function exp(−exp(−(x − log 10))) (line).]
Solutions

    > tmp <- replicate(1000, max(rexp(100, 1)))
    > emdf <- function(x) sapply(x, function(x) sum(tmp <= x)/1000)
    > x <- seq(0, 8, by = 0.1)
    > plot(x, exp(-exp(-(x - log(100)))), type = "l")
    > points(x, emdf(x), type = "p", pch = 20, col = "red")
Solutions

[Figure: the empirical distribution function (points) plotted against the Gumbel distribution function exp(−exp(−(x − log 100))) (line).]
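The same comparison can be sketched numerically in Python (rather than R, and without the plot): the empirical distribution function of simulated maxima of n = 100 exponentials is compared with the Gumbel distribution function with location log(n) at a few points.

```python
import math
import random

random.seed(4)
n, reps = 100, 5000

def max_of_exponentials(n, rng):
    """Maximum of n iid standard exponentials, simulated by inversion."""
    return max(-math.log(1.0 - rng.random()) for _ in range(n))

tmp = [max_of_exponentials(n, random) for _ in range(reps)]

def emdf(x):
    """Empirical distribution function of the simulated maxima."""
    return sum(t <= x for t in tmp) / reps

def gumbel_cdf(x, location):
    return math.exp(-math.exp(-(x - location)))

# The empirical distribution should track the Gumbel with location log(n).
for x in (3.5, 4.5, 5.5, 6.5):
    print(f"x={x}: empirical={emdf(x):.3f}, Gumbel={gumbel_cdf(x, math.log(n)):.3f}")
```

The agreement reflects that the exact distribution function, (1 − exp(−x))^n, is close to exp(−n exp(−x)) for large n.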
Curriculum, second lecture: Niels Richard Hansen, November 23, 2011. NRH: Handout pages 1-13. PD: Pages 55-75. (NRH: Sections 2.6, 2.7, 2.11, 2.12; at this point in the course the sections will be difficult to follow.)
MGMT 69000: Topics in High-dimensional Data Analysis Falll 2016 Lecture 14: Information Theoretic Methods Lecturer: Jiaming Xu Scribe: Hilda Ibriga, Adarsh Barik, December 02, 2016 Outline f-divergence
More informationIEOR 4703: Homework 2 Solutions
IEOR 4703: Homework 2 Solutions Exercises for which no programming is required Let U be uniformly distributed on the interval (0, 1); P (U x) = x, x (0, 1). We assume that your computer can sequentially
More informationLecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary
ECE 830 Spring 207 Instructor: R. Willett Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we saw that the likelihood
More informationSolutions to Homework Set #3 Channel and Source coding
Solutions to Homework Set #3 Channel and Source coding. Rates (a) Channels coding Rate: Assuming you are sending 4 different messages using usages of a channel. What is the rate (in bits per channel use)
More information13. Parameter Estimation. ECE 830, Spring 2014
13. Parameter Estimation ECE 830, Spring 2014 1 / 18 Primary Goal General problem statement: We observe X p(x θ), θ Θ and the goal is to determine the θ that produced X. Given a collection of observations
More informationCOMP2610/COMP Information Theory
COMP2610/COMP6261 - Information Theory Lecture 9: Probabilistic Inequalities Mark Reid and Aditya Menon Research School of Computer Science The Australian National University August 19th, 2014 Mark Reid
More informationDirected and Undirected Graphical Models
Directed and Undirected Davide Bacciu Dipartimento di Informatica Università di Pisa bacciu@di.unipi.it Machine Learning: Neural Networks and Advanced Models (AA2) Last Lecture Refresher Lecture Plan Directed
More informationRobustness to Parametric Assumptions in Missing Data Models
Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice
More informationMA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems
MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions
More informationChapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued
Chapter 3 sections Chapter 3 - continued 3.1 Random Variables and Discrete Distributions 3.2 Continuous Distributions 3.3 The Cumulative Distribution Function 3.4 Bivariate Distributions 3.5 Marginal Distributions
More informationparameter space Θ, depending only on X, such that Note: it is not θ that is random, but the set C(X).
4. Interval estimation The goal for interval estimation is to specify the accurary of an estimate. A 1 α confidence set for a parameter θ is a set C(X) in the parameter space Θ, depending only on X, such
More informationf(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain
0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More informationLikelihoods. P (Y = y) = f(y). For example, suppose Y has a geometric distribution on 1, 2,... with parameter p. Then the pmf is
Likelihoods The distribution of a random variable Y with a discrete sample space (e.g. a finite sample space or the integers) can be characterized by its probability mass function (pmf): P (Y = y) = f(y).
More informationEE376A: Homeworks #4 Solutions Due on Thursday, February 22, 2018 Please submit on Gradescope. Start every question on a new page.
EE376A: Homeworks #4 Solutions Due on Thursday, February 22, 28 Please submit on Gradescope. Start every question on a new page.. Maximum Differential Entropy (a) Show that among all distributions supported
More informationPart IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More informationMath Review Sheet, Fall 2008
1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the
More informationCS281A/Stat241A Lecture 17
CS281A/Stat241A Lecture 17 p. 1/4 CS281A/Stat241A Lecture 17 Factor Analysis and State Space Models Peter Bartlett CS281A/Stat241A Lecture 17 p. 2/4 Key ideas of this lecture Factor Analysis. Recall: Gaussian
More informationChapter 2: Entropy and Mutual Information. University of Illinois at Chicago ECE 534, Natasha Devroye
Chapter 2: Entropy and Mutual Information Chapter 2 outline Definitions Entropy Joint entropy, conditional entropy Relative entropy, mutual information Chain rules Jensen s inequality Log-sum inequality
More informationEconometrics I, Estimation
Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the
More informationLecture 7 Introduction to Statistical Decision Theory
Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7
More informationInformation Theory and Communication
Information Theory and Communication Ritwik Banerjee rbanerjee@cs.stonybrook.edu c Ritwik Banerjee Information Theory and Communication 1/8 General Chain Rules Definition Conditional mutual information
More informationSummary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing
Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Statistics Journal Club, 36-825 Beau Dabbs and Philipp Burckhardt 9-19-2014 1 Paper
More informationLecture 11: Continuous-valued signals and differential entropy
Lecture 11: Continuous-valued signals and differential entropy Biology 429 Carl Bergstrom September 20, 2008 Sources: Parts of today s lecture follow Chapter 8 from Cover and Thomas (2007). Some components
More informationSpring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n =
Spring 2012 Math 541A Exam 1 1. (a) Let Z i be independent N(0, 1), i = 1, 2,, n. Are Z = 1 n n Z i and S 2 Z = 1 n 1 n (Z i Z) 2 independent? Prove your claim. (b) Let X 1, X 2,, X n be independent identically
More informationTHE QUEEN S UNIVERSITY OF BELFAST
THE QUEEN S UNIVERSITY OF BELFAST 0SOR20 Level 2 Examination Statistics and Operational Research 20 Probability and Distribution Theory Wednesday 4 August 2002 2.30 pm 5.30 pm Examiners { Professor R M
More informationEconometría 2: Análisis de series de Tiempo
Econometría 2: Análisis de series de Tiempo Karoll GOMEZ kgomezp@unal.edu.co http://karollgomez.wordpress.com Segundo semestre 2016 II. Basic definitions A time series is a set of observations X t, each
More informationPart IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More informationThe sample complexity of agnostic learning with deterministic labels
The sample complexity of agnostic learning with deterministic labels Shai Ben-David Cheriton School of Computer Science University of Waterloo Waterloo, ON, N2L 3G CANADA shai@uwaterloo.ca Ruth Urner College
More informationProbability and Measure
Probability and Measure Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Convergence of Random Variables 1. Convergence Concepts 1.1. Convergence of Real
More informationHT Introduction. P(X i = x i ) = e λ λ x i
MODS STATISTICS Introduction. HT 2012 Simon Myers, Department of Statistics (and The Wellcome Trust Centre for Human Genetics) myers@stats.ox.ac.uk We will be concerned with the mathematical framework
More informationBrief Review on Estimation Theory
Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on
More informationMachine Learning CSE546 Carlos Guestrin University of Washington. September 30, What about continuous variables?
Linear Regression Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2014 1 What about continuous variables? n Billionaire says: If I am measuring a continuous variable, what
More informationCalibration Estimation of Semiparametric Copula Models with Data Missing at Random
Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Institute of Statistics
More informationCS281A/Stat241A Lecture 22
CS281A/Stat241A Lecture 22 p. 1/4 CS281A/Stat241A Lecture 22 Monte Carlo Methods Peter Bartlett CS281A/Stat241A Lecture 22 p. 2/4 Key ideas of this lecture Sampling in Bayesian methods: Predictive distribution
More information