Notes on the Multivariate Normal and Related Topics
- Charlene Cameron
Version: July 10, 2013

Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions and sampling distributions. One might say that anyone worth knowing, knows these distinctions. We start with the simple univariate framework and then move on to the multivariate context.

A) Begin by positing a population of interest that is operationalized by some random variable, say $X$. In this Theory World framework, $X$ is characterized by parameters, such as the expectation of $X$, $\mu = E(X)$, or its variance, $\sigma^2 = V(X)$. The random variable $X$ has a (population) distribution, which for us will typically be assumed normal.

B) A sample is generated by taking observations on $X$, say, $X_1, \ldots, X_n$, considered independent and identically distributed as $X$, i.e., they are exact copies of $X$. In this Data World context, statistics are functions of the sample, and therefore, characterize the sample: the sample mean, $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i$; the sample variance, $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \hat{\mu})^2$, with some possible variation in dividing by $n - 1$ to generate an unbiased estimator of $\sigma^2$. The statistics, $\hat{\mu}$ and $\hat{\sigma}^2$, are point estimators of $\mu$ and $\sigma^2$. They are random variables by themselves, so they have distributions that are called sampling distributions. The general problem of statistical inference is to ask what the sample statistics, $\hat{\mu}$ and $\hat{\sigma}^2$, tell us about their population counterparts,
such as $\mu$ and $\sigma^2$. Can we obtain some notion of accuracy from the sampling distribution, e.g., confidence intervals?

The multivariate problem is generally the same but with more notation:

A) The population (Theory World) is characterized by a collection of $p$ random variables, $\mathbf{X} = [X_1, X_2, \ldots, X_p]'$, with parameters: $\mu_i = E(X_i)$; $\sigma_i^2 = V(X_i)$; $\sigma_{ij} = \mathrm{Cov}(X_i, X_j) = E[(X_i - E(X_i))(X_j - E(X_j))]$; or the correlation between $X_i$ and $X_j$, $\rho_{ij} = \sigma_{ij}/\sigma_i\sigma_j$.

B) The sample (Data World) is defined by $n$ independent observations on the random vector $\mathbf{X}$, with the observations placed into an $n \times p$ data matrix (e.g., subject by variable) that we also (with a slight abuse of notation) denote by a bold-face capital letter, $\mathbf{X}_{n \times p}$:

$$\mathbf{X}_{n \times p} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} = \begin{bmatrix} \mathbf{x}_1' \\ \mathbf{x}_2' \\ \vdots \\ \mathbf{x}_n' \end{bmatrix}.$$

The statistics corresponding to the parameters of the population are: $\hat{\mu}_i = \frac{1}{n}\sum_{k=1}^{n} x_{ki}$; $\hat{\sigma}_i^2 = \frac{1}{n}\sum_{k=1}^{n} (x_{ki} - \hat{\mu}_i)^2$, with again some possible variation to a division by $n - 1$ for an unbiased estimate; $\hat{\sigma}_{ij} = \frac{1}{n}\sum_{k=1}^{n} (x_{ki} - \hat{\mu}_i)(x_{kj} - \hat{\mu}_j)$; and $\hat{\rho}_{ij} = \hat{\sigma}_{ij}/\hat{\sigma}_i\hat{\sigma}_j$.

To obtain a good sense of what the estimators tell us about the population parameters, we will have to make some assumption about the population, e.g., $[X_1, \ldots, X_p]$ has a multivariate normal distribution. As we will see, this assumption leads to some very nice results.
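As a concrete illustration of these Data World statistics, here is a minimal numpy sketch (the data matrix is simulated and purely hypothetical, not from the notes): the sample means, variances, covariances, and correlations are computed directly from an $n \times p$ data matrix, using divisions by $n$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 3
X = rng.normal(size=(n, p))            # hypothetical n x p data matrix

mu_hat = X.mean(axis=0)                # sample mean of each variable
# sample variance-covariance matrix with division by n
Sigma_hat = (X - mu_hat).T @ (X - mu_hat) / n
sd_hat = np.sqrt(np.diag(Sigma_hat))   # sample standard deviations
rho_hat = Sigma_hat / np.outer(sd_hat, sd_hat)   # sample correlation matrix

# numpy's built-ins agree once the divisor is matched (bias=True divides by n)
assert np.allclose(Sigma_hat, np.cov(X, rowvar=False, bias=True))
assert np.allclose(rho_hat, np.corrcoef(X, rowvar=False))
```

Note that the correlation matrix is the same whether the covariances are computed with $n$ or $n - 1$ in the denominator, since the factor cancels in the ratio $\hat{\sigma}_{ij}/\hat{\sigma}_i\hat{\sigma}_j$.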
0.1 Developing the Multivariate Normal Distribution

Suppose $X_1, \ldots, X_p$ are $p$ continuous random variables with density functions, $f_1(x_1), \ldots, f_p(x_p)$, and distribution functions, $F_1(x_1), \ldots, F_p(x_p)$, where $P(X_i \le x_i) = F_i(x_i) = \int_{-\infty}^{x_i} f_i(x)\,dx$. We define a $p$-dimensional random variable (or random vector) as the vector $\mathbf{X} = [X_1, \ldots, X_p]'$; $\mathbf{X}$ has the joint cumulative distribution function

$$F(x_1, \ldots, x_p) = \int_{-\infty}^{x_p} \cdots \int_{-\infty}^{x_1} f(x_1, \ldots, x_p)\,dx_1 \cdots dx_p.$$

If the random variables $X_1, \ldots, X_p$ are independent, then the joint density and cumulative distribution functions factor:

$$f(x_1, \ldots, x_p) = f_1(x_1) \cdots f_p(x_p) \quad \text{and} \quad F(x_1, \ldots, x_p) = F_1(x_1) \cdots F_p(x_p).$$

The independence property will not be assumed in the following; in fact, the whole idea is to investigate what type of dependency exists in the random vector $\mathbf{X}$.

When $\mathbf{X} = [X_1, X_2, \ldots, X_p]' \sim \mathrm{MVN}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, the joint density function has the form

$$\phi(x_1, \ldots, x_p) = \frac{1}{(2\pi)^{p/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\{-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x} - \boldsymbol{\mu})\}.$$

In the univariate case, the density provides the usual bell-shaped curve:

$$\phi(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\{-\tfrac{1}{2}[(x - \mu)/\sigma]^2\}.$$

Given the MVN assumption on the generation of the data matrix,
$\mathbf{X}_{n \times p}$, we know the sampling distribution of the vector of means:

$$\hat{\boldsymbol{\mu}} = \begin{bmatrix} \hat{\mu}_1 \\ \hat{\mu}_2 \\ \vdots \\ \hat{\mu}_p \end{bmatrix} \sim \mathrm{MVN}(\boldsymbol{\mu}, (1/n)\boldsymbol{\Sigma}).$$

In the univariate case, we have that $\hat{\mu} \sim N(\mu, (1/n)\sigma^2)$. As might be expected, a Central Limit Theorem (CLT) states that these results also hold asymptotically when the MVN assumption is relaxed. Also, if we estimate the variance-covariance matrix, $\boldsymbol{\Sigma}$, with divisions by $n - 1$ rather than $n$, and denote the result as $\hat{\boldsymbol{\Sigma}}$, then $E(\hat{\boldsymbol{\Sigma}}) = \boldsymbol{\Sigma}$.

Linear combinations of random variables form the backbone of all of multivariate statistics. For example, suppose $\mathbf{X} = [X_1, X_2, \ldots, X_p]' \sim \mathrm{MVN}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, and consider the following $q$ linear combinations:

$$Z_1 = c_{11}X_1 + \cdots + c_{1p}X_p$$
$$\vdots$$
$$Z_q = c_{q1}X_1 + \cdots + c_{qp}X_p$$

Then the vector $\mathbf{Z} = [Z_1, Z_2, \ldots, Z_q]' \sim \mathrm{MVN}(\mathbf{C}\boldsymbol{\mu}, \mathbf{C}\boldsymbol{\Sigma}\mathbf{C}')$, where

$$\mathbf{C}_{q \times p} = \begin{bmatrix} c_{11} & \cdots & c_{1p} \\ \vdots & & \vdots \\ c_{q1} & \cdots & c_{qp} \end{bmatrix}.$$

These same results hold in the sample if we observe the $q$ linear combinations, $[Z_1, Z_2, \ldots, Z_q]$: i.e., the sample mean vector is $\mathbf{C}\hat{\boldsymbol{\mu}}$ and the sample variance-covariance matrix is $\mathbf{C}\hat{\boldsymbol{\Sigma}}\mathbf{C}'$.
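The sample version of the linear-combination result can be checked numerically. In this sketch (with a hypothetical data matrix and a hypothetical coefficient matrix $\mathbf{C}$), the mean vector and variance-covariance matrix of $\mathbf{Z}$ computed directly from the transformed data agree exactly with $\mathbf{C}\hat{\boldsymbol{\mu}}$ and $\mathbf{C}\hat{\boldsymbol{\Sigma}}\mathbf{C}'$, because the identities are algebraic, not merely asymptotic.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 400, 4, 2
X = rng.normal(size=(n, p))                      # hypothetical n x p data matrix
C = rng.normal(size=(q, p))                      # hypothetical q x p coefficient matrix

Z = X @ C.T                                      # row k holds the q linear combinations

mu_hat = X.mean(axis=0)
Sigma_hat = np.cov(X, rowvar=False, bias=True)   # divide by n

# direct statistics of Z versus the transformation rule
assert np.allclose(Z.mean(axis=0), C @ mu_hat)
assert np.allclose(np.cov(Z, rowvar=False, bias=True), C @ Sigma_hat @ C.T)
```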
0.2 Multiple Regression and Partial Correlation (in the Population)

Suppose we partition our vector of $p$ random variables as follows:

$$\mathbf{X} = [X_1, X_2, \ldots, X_p]' = [[X_1, X_2, \ldots, X_k], [X_{k+1}, X_{k+2}, \ldots, X_p]]' \equiv [\mathbf{X}_1', \mathbf{X}_2']'.$$

Partitioning the mean vector and variance-covariance matrix in the same way,

$$\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}; \qquad \boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{12}' & \boldsymbol{\Sigma}_{22} \end{bmatrix},$$

we have $\mathbf{X}_1 \sim \mathrm{MVN}(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_{11})$ and $\mathbf{X}_2 \sim \mathrm{MVN}(\boldsymbol{\mu}_2, \boldsymbol{\Sigma}_{22})$. $\mathbf{X}_1$ and $\mathbf{X}_2$ are statistically independent vectors if and only if $\boldsymbol{\Sigma}_{12} = \mathbf{0}_{k \times (p-k)}$, the $k \times (p-k)$ zero matrix.

What I call the Master Theorem refers to the conditional density of $\mathbf{X}_1$ given $\mathbf{X}_2 = \mathbf{x}_2$; it is multivariate normal with mean vector of order $k \times 1$:

$$\boldsymbol{\mu}_1 + \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{x}_2 - \boldsymbol{\mu}_2),$$

and variance-covariance matrix of order $k \times k$:

$$\boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{12}'.$$

If we denote the $(i,j)^{\text{th}}$ partial covariance element in this latter matrix ("holding all variables in the second set constant") as $\sigma_{ij \cdot (k+1) \cdots p}$, then the partial correlation of variables $i$ and $j$, holding all variables in the second set constant, is

$$\rho_{ij \cdot (k+1) \cdots p} = \sigma_{ij \cdot (k+1) \cdots p} \big/ \sqrt{\sigma_{ii \cdot (k+1) \cdots p}\,\sigma_{jj \cdot (k+1) \cdots p}}.$$
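A small numeric sketch of the Master Theorem (the $3 \times 3$ covariance matrix below is hypothetical, chosen only for illustration): partition $\boldsymbol{\Sigma}$ with $k = 2$, form the conditional (partial) covariance matrix, and read off the partial correlation of variables 1 and 2 holding variable 3 constant.

```python
import numpy as np

# hypothetical population covariance matrix; X_1 = [X1, X2], X_2 = [X3]
Sigma = np.array([[4.0, 2.0, 1.0],
                  [2.0, 3.0, 1.5],
                  [1.0, 1.5, 2.0]])
k = 2
S11, S12 = Sigma[:k, :k], Sigma[:k, k:]
S22 = Sigma[k:, k:]

# conditional covariance matrix: Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_12'
S_cond = S11 - S12 @ np.linalg.inv(S22) @ S12.T

# partial correlation of variables 1 and 2, holding variable 3 constant
rho_12_3 = S_cond[0, 1] / np.sqrt(S_cond[0, 0] * S_cond[1, 1])
print(round(rho_12_3, 4))
```

The same number comes out of the scalar recursion formula $\rho_{12\cdot 3} = (\rho_{12} - \rho_{13}\rho_{23})/\sqrt{(1-\rho_{13}^2)(1-\rho_{23}^2)}$, which is a useful sanity check on the matrix computation.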
Notice that in the formulas just given, the inverse of $\boldsymbol{\Sigma}_{22}$ must exist; otherwise, we have what is called multicollinearity, and solutions don't exist (or are very unstable numerically if the inverse almost doesn't exist). So, as one moral: don't use both total and subtest scores in the same set of independent variables.

If the set $\mathbf{X}_1$ contains just one random variable, $X_1$, then the mean vector of the conditional distribution can be given as

$$E(X_1) + \boldsymbol{\sigma}_{12}'\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{x}_2 - \boldsymbol{\mu}_2),$$

where $\boldsymbol{\sigma}_{12}'$ is $1 \times (p-1)$ and of the form $[\mathrm{Cov}(X_1, X_2), \ldots, \mathrm{Cov}(X_1, X_p)]$. This is nothing but our old regression equation written out in matrix notation. If we let $\boldsymbol{\beta}' = \boldsymbol{\sigma}_{12}'\boldsymbol{\Sigma}_{22}^{-1}$, then the conditional mean vector (predicted $X_1$) is equal to

$$E(X_1) + \boldsymbol{\beta}'(\mathbf{x}_2 - \boldsymbol{\mu}_2) = E(X_1) + \beta_1(x_2 - \mu_2) + \cdots + \beta_{p-1}(x_p - \mu_p).$$

The covariance matrix, $\boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{12}'$, takes on the form $\mathrm{Var}(X_1) - \boldsymbol{\sigma}_{12}'\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\sigma}_{12}$, i.e., the variation in $X_1$ that is not explained. If we take the explained variation, $\boldsymbol{\sigma}_{12}'\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\sigma}_{12}$, and consider its proportion to the total variance, $\mathrm{Var}(X_1)$, we define the squared multiple correlation coefficient:

$$\rho^2 = \frac{\boldsymbol{\sigma}_{12}'\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\sigma}_{12}}{\mathrm{Var}(X_1)}.$$

In fact, the linear combination, $\boldsymbol{\beta}'\mathbf{X}_2$, has the highest correlation of any linear combination with $X_1$; and this correlation is the positive square root of the squared multiple correlation coefficient.

0.3 Moving to the Sample

Up to this point, the concepts of regression, and kindred ideas, have been discussed only in terms of population parameters. We now have
the task of obtaining estimates of these various quantities based on a sample; also, associated inference procedures need to be developed. Generally, we will rely on maximum-likelihood estimation and the related likelihood-ratio tests.

To define how maximum-likelihood estimation proceeds, we will first give a series of general steps, and then operationalize them for a univariate normal distribution example.

A) Let $X_1, \ldots, X_n$ be univariate observations on $X$ (independent and continuous), depending on some parameters, $\theta_1, \ldots, \theta_k$. The density function of $X_i$ is denoted as $f(x_i; \theta_1, \ldots, \theta_k)$.

B) The likelihood of observing $x_1, \ldots, x_n$ for values of $X_1, \ldots, X_n$ is the joint density of the $n$ observations, and because the observations are independent,

$$L(\theta_1, \ldots, \theta_k) \equiv \prod_{i=1}^{n} f(x_i; \theta_1, \ldots, \theta_k),$$

i.e., we assume $x_1, \ldots, x_n$ are already observed, and that $L$ is a function of the parameters only. Now, choose parameter values such that $L(\theta_1, \ldots, \theta_k)$ is at a maximum, i.e., the probability of observing that particular sample is maximized by the choice of $\theta_1, \ldots, \theta_k$. Generally, it is easier and equivalent to maximize the log-likelihood, using $l(\theta_1, \ldots, \theta_k) \equiv \log(L(\theta_1, \ldots, \theta_k))$.

C) Generally, the maximum values are found through differentiation, and by setting the partial derivatives equal to zero. Explicitly,
we find $\theta_1, \ldots, \theta_k$ such that

$$\frac{\partial}{\partial \theta_j}\, l(\theta_1, \ldots, \theta_k) = 0, \quad \text{for } j = 1, \ldots, k.$$

(We should probably check second derivatives to see if we have a maximum, but this is almost always true, anyway.)

Example: Suppose $X_i \sim N(\mu, \sigma^2)$, with density

$$f(x_i; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\{-(x_i - \mu)^2/2\sigma^2\}.$$

The likelihood has the form

$$L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\{-(x_i - \mu)^2/2\sigma^2\} = (1/\sqrt{2\pi})^n (1/\sigma^2)^{n/2} \exp\{-(1/2\sigma^2)\textstyle\sum_{i=1}^{n}(x_i - \mu)^2\}.$$

The log-likelihood reduces:

$$l(\mu, \sigma^2) = \log L(\mu, \sigma^2) = \log(1/\sqrt{2\pi})^n + \log(1/\sigma^2)^{n/2} - (1/2\sigma^2)\sum_{i=1}^{n}(x_i - \mu)^2$$
$$= \text{constant} - (n/2)\log\sigma^2 - (1/2\sigma^2)\sum_{i=1}^{n}(x_i - \mu)^2.$$

The partial derivatives have the form:

$$\frac{\partial l(\mu, \sigma^2)}{\partial \mu} = -(1/2\sigma^2)\sum_{i=1}^{n} 2(x_i - \mu)(-1),$$
and

$$\frac{\partial l(\mu, \sigma^2)}{\partial \sigma^2} = -(n/2)(1/\sigma^2) - \bigl(1/2(\sigma^2)^2\bigr)(-1)\sum_{i=1}^{n}(x_i - \mu)^2.$$

Setting these two expressions to zero gives

$$\hat{\mu} = \sum_{i=1}^{n} x_i/n; \qquad \hat{\sigma}^2 = (1/n)\sum_{i=1}^{n}(x_i - \hat{\mu})^2.$$

Maximum likelihood (ML) estimates are generally consistent, asymptotically normal (for large $n$), and efficient; they are functions of the sufficient statistics; and they have an invariance to operations performed on them, e.g., the ML estimate for $\sigma$ is just $\sqrt{\hat{\sigma}^2}$. As the ML estimator for $\sigma^2$ shows, ML estimators are not necessarily unbiased.

Another Example: Suppose $X_1, \ldots, X_n$ are observations on the Poisson discrete random variable $X$ (having outcomes $0, 1, \ldots$):

$$P(X_i = x_i) = \frac{\exp(-\lambda)\lambda^{x_i}}{x_i!}.$$

The likelihood is

$$L(\lambda) = \prod_{i=1}^{n} P(X_i = x_i) = \frac{\exp(-n\lambda)\lambda^{\sum_i x_i}}{\prod_i x_i!},$$

and the log-likelihood

$$l(\lambda) = \log L(\lambda) = -n\lambda + \log\bigl(\lambda^{\sum_i x_i}\bigr) - \log\prod_i x_i!.$$

The partial derivative

$$\frac{\partial l(\lambda)}{\partial \lambda} = -n + (1/\lambda)\sum_{i=1}^{n} x_i,$$
and when set to zero, gives

$$\hat{\lambda} = \sum_{i=1}^{n} x_i/n.$$

0.3.1 Likelihood Ratio Tests

Besides using the likelihood concept to find good point estimates, the likelihood can also be used to find good tests of hypotheses. We will develop this idea in terms of a simple example: Suppose $X_1 = x_1, \ldots, X_n = x_n$ denote values obtained on $n$ independent observations on a $N(\mu, \sigma^2)$ random variable, and where $\sigma^2$ is assumed known. The standard test of $H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$ would compare $(\bar{x} - \mu_0)^2/(\sigma^2/n)$ to a $\chi^2$ random variable with one degree of freedom. Or, if we chose the significance level to be .05, we could reject $H_0$ if $|\bar{x} - \mu_0|/(\sigma/\sqrt{n}) \ge 1.96$.

We now develop this same test using likelihood ideas: The likelihood of observing $x_1, \ldots, x_n$, $L(\mu)$, is only a function of $\mu$ (because $\sigma^2$ is assumed to be known):

$$L(\mu) = (1/\sigma\sqrt{2\pi})^n \exp\{-(1/2\sigma^2)\textstyle\sum_{i=1}^{n}(x_i - \mu)^2\}.$$

Under $H_0$, $L(\mu)$ is at a maximum for $\mu = \mu_0$; under $H_1$, $L(\mu)$ is at a maximum for $\mu = \hat{\mu}$, the maximum likelihood estimator. If $H_0$ is true, $L(\hat{\mu})$ and $L(\mu_0)$ should be close to each other; if $H_0$ is false, $L(\hat{\mu})$ should be much larger than $L(\mu_0)$. Thus, the decision rule would be to reject $H_0$ if $L(\mu_0)/L(\hat{\mu}) \le \lambda$, where $\lambda$ is some number less than 1.00 and chosen to obtain a particular $\alpha$ level. The ratio

$$L(\mu_0)/L(\hat{\mu}) = \exp\{-\tfrac{1}{2}(\hat{\mu} - \mu_0)^2/(\sigma^2/n)\}.$$

Thus, we could rephrase the decision rule: reject $H_0$ if $(\hat{\mu} - \mu_0)^2/(\sigma^2/n) \ge$
$-2\log(\lambda)$, or if $|\bar{x} - \mu_0|/(\sigma/\sqrt{n}) \ge \sqrt{-2\log\lambda}$. Thus, for an .05 level of significance, choose $\lambda$ so that $1.96 = \sqrt{-2\log\lambda}$.

Generally, we can phrase likelihood ratio tests as:

$$-2\log(\text{likelihood ratio}) \sim \chi^2_{\nu - \nu_0},$$

where $\nu$ is the dimension of the parameter space generally, and $\nu_0$ is the dimension under $H_0$.

Estimation

To obtain estimates of the various quantities we need, merely replace the variances and covariances by their maximum likelihood estimates. This process generates sample partial correlations or covariances; sample multiple squared correlations; sample regression parameters; and so on.
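The algebra of the likelihood-ratio development can be verified in a short simulation sketch (the sample below is hypothetical): for a $N(\mu, \sigma^2)$ sample with $\sigma^2$ known, $-2\log$ of the likelihood ratio computed from the log-likelihood function coincides exactly with the classical statistic $(\bar{x} - \mu_0)^2/(\sigma^2/n)$.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 4.0                      # known variance
mu0 = 0.0                         # H0: mu = mu0
x = rng.normal(loc=0.5, scale=np.sqrt(sigma2), size=50)   # hypothetical sample
n, xbar = len(x), x.mean()

def loglik(mu):
    # log-likelihood of the N(mu, sigma2) sample, additive constants dropped
    return -np.sum((x - mu) ** 2) / (2 * sigma2)

# -2 log(likelihood ratio): H0 value mu0 versus the MLE xbar
lr_stat = -2 * (loglik(mu0) - loglik(xbar))
z2 = (xbar - mu0) ** 2 / (sigma2 / n)
assert np.isclose(lr_stat, z2)    # the two statistics agree exactly

reject = lr_stat >= 1.96 ** 2     # .05-level test against chi^2 with 1 df
```

The agreement is exact rather than asymptotic in this example because the model is normal with known variance; in general, the $\chi^2$ reference distribution for $-2\log(\text{likelihood ratio})$ is a large-sample approximation.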
More informationProbabilities & Statistics Revision
Probabilities & Statistics Revision Christopher Ting Christopher Ting http://www.mysmu.edu/faculty/christophert/ : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 January 6, 2017 Christopher Ting QF
More informationHypothesis Testing. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA
Hypothesis Testing Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA An Example Mardia et al. (979, p. ) reprint data from Frets (9) giving the length and breadth (in
More informationNotes on Random Vectors and Multivariate Normal
MATH 590 Spring 06 Notes on Random Vectors and Multivariate Normal Properties of Random Vectors If X,, X n are random variables, then X = X,, X n ) is a random vector, with the cumulative distribution
More informationStatistical Inference
Statistical Inference Bernhard Klingenberg Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Outline Estimation: Review of concepts
More informationLoglikelihood and Confidence Intervals
Stat 504, Lecture 2 1 Loglikelihood and Confidence Intervals The loglikelihood function is defined to be the natural logarithm of the likelihood function, l(θ ; x) = log L(θ ; x). For a variety of reasons,
More informationTwo hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45
Two hours MATH20802 To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER STATISTICAL METHODS 21 June 2010 9:45 11:45 Answer any FOUR of the questions. University-approved
More informationLecture 11. Multivariate Normal theory
10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances
More informationTopic 17: Simple Hypotheses
Topic 17: November, 2011 1 Overview and Terminology Statistical hypothesis testing is designed to address the question: Do the data provide sufficient evidence to conclude that we must depart from our
More informationParametric Techniques Lecture 3
Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More informationLectures on Statistics. William G. Faris
Lectures on Statistics William G. Faris December 1, 2003 ii Contents 1 Expectation 1 1.1 Random variables and expectation................. 1 1.2 The sample mean........................... 3 1.3 The sample
More informationTAMS39 Lecture 2 Multivariate normal distribution
TAMS39 Lecture 2 Multivariate normal distribution Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content Lecture Random vectors Multivariate normal distribution
More informationLecture 5: ANOVA and Correlation
Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions
More informationApplied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013
Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis
More informationQuick Tour of Basic Probability Theory and Linear Algebra
Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions
More informationParameter estimation: ACVF of AR processes
Parameter estimation: ACVF of AR processes Yule-Walker s for AR processes: a method of moments, i.e. µ = x and choose parameters so that γ(h) = ˆγ(h) (for h small ). 12 novembre 2013 1 / 8 Parameter estimation:
More informationMAS223 Statistical Modelling and Inference Examples
Chapter MAS3 Statistical Modelling and Inference Examples Example : Sample spaces and random variables. Let S be the sample space for the experiment of tossing two coins; i.e. Define the random variables
More informationGeneralized Linear Models Introduction
Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,
More informationBias Variance Trade-off
Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]
More informationEstimation, Inference, and Hypothesis Testing
Chapter 2 Estimation, Inference, and Hypothesis Testing Note: The primary reference for these notes is Ch. 7 and 8 of Casella & Berger 2. This text may be challenging if new to this topic and Ch. 7 of
More information