Lecture 3. Inference about multivariate normal distribution
|
|
- Valerie Clarke
- 5 years ago
- Views:
Transcription
1 Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates of µ and Σ. Recall that the joint density of X 1 is f(x) = 2πΣ 1 2 exp [ 1 ] 2 (x µ) Σ 1 (x µ), for x R p. The negative log likelihood function, given observations x n 1 := {x 1,..., x n }, up to an additive constant independent of µ and Σ, is then l (µ, Σ x n 1) = n 2 log Σ + 1 n (x i µ) Σ 1 (x i µ) 2 = n 2 log Σ + n 2 ( x µ) Σ 1 ( x µ) n (x i x) Σ 1 (x i x). On this end, denote the centered data matrix by X = [ x 1,..., x n ] p n, where x i = x i x. Let S 0 = 1 n X X = 1 n x i x n i. Proposition 1. The MLEs of µ and Σ, that jointly minimize l (µ, Σ x n 1), are ˆµ MLE = x, Σ MLE = S 0. Note that S 0 is a biased estimator of Σ. The sample variance covariance matrix S = n S n 1 0 is unbiased. For interval estimation of µ, we largely follow Section 7.1 of Härdle and Simar (2012). First note that since µ R p, we need to generalize the notion of intervals (primarily defined for R 1 ) to a higher dimensional space. A simple extension is a direct product of marginal intervals: for example, for intervals a < x < b and c < y < d, we obtain a rectangular region {(x, y) R 2 : a < x < b, c < y < d}. A confidence region A is a subset of R p and it depends on (random) observations X 1,..., X n. A = A(X n 1) is a confidence region of size 1 α (0, 1) for parameter µ R p if P (µ A(X n 1)) 1 α where P is with respect to the distribution of X n 1. (Elliptical confidence region) Corollary 7 in lecture 2 provides a pivot which paves a way to construct a confidence region for µ.
2 Since n p ( X µ) S 1 p 0 ( X µ) F p,n p and ( P ( X µ) S 1 0 ( X µ) < p ) n p F 1 α;p,n p = 1 α, we have that A := { ν R p : ( X ν) S 1 0 ( X ν) < p } n p F 1 α;p,n p is a confidence region of size 1 α for parameter µ. Note: F 1 α;p,n p is defined as the constant such that P (F < F 1 α;p,n p ) = 1 α where F is F p,n p distributed. The resulting confidence region for a given sample is an ellipse. (Simultaneous confidence intervals) Simultaneous confidence intervals for all linear combinations of elements of µ, i.e., a µ for arbitrary a R p, provide confidence of size 1 α for any interval in this framework, which covers a µ. This includes the marginal means µ 1,..., µ p (when choosing a wisely). We are interested in evaluating lower and upper bounds L(a) and U(a) satisfying P (L(a) < a µ < U(a), for all a R p ) 1 α Note: the lower and upper limits of the interval depend on the data (hence is random) and is also specified by a. First consider a single confidence interval by fixing a particular vector a. To evaluate a confidence interval for a µ, write new random variables Y i = a X i N 1 (a µ, a Σa) (i = 1,..., n), whose squared t-statistic is t 2 (a) := n (a µ a X) 2 a Sa fixed a, F 1,n 1. Thus, for any P (t 2 (a) F 1 α,1,n 1 ) = 1 α. (1) Next, consider many projection vectors a 1, a 2,..., a M (M is finite only for convenience). The simultaneous confidence intervals of the type similar to (1) are then ( M ) P {t 2 (a i ) h(α)} 1 α, for some h( ). We collect some facts below. 1. max a t 2 (a) h(α) implies t 2 (a i ) h(α) for all i. 2. max a t 2 (a) = n(µ X) S 1 (µ X), HS Theorem 2.5 (note the rank) 2
3 3. Corollary 7 in lecture 2. These suggest that we have h(α) = (n 1)p F n p 1 α;p,n p and ( M ) ( P {t 2 (a i ) h(α)} P max a Proposition 2. Simultaneously for all a R p, the interval contains a µ with probability 1 α. a X ± h(α)a Sa/n ) t2 (a) h(α) = 1 α. Example 1. From the Golub gene expression data, with dimension d = 7129, take the first and 1674th variables (genes), to focus on the bivariate case (p = 2). There are two populations: 11 observations from AML, 27 from ALL. Figure 1 illustrates the elliptical confidence region of size 95% and 99%. Figure 2 compares the elliptical confidence region with the simultaneous confidence intervals for a 1 = (1, 0) and a 2 = (0, 1). 2.5 x 104 Golub data, ALL (red) and AML(blue) gene # gene #1 Figure 1: Elliptical confidence regions of size 95% and 99%. 3
4 2.5 x 104 Golub data, ALL (red) and AML(blue) gene # gene #1 Figure 2: Simultaneous confidence intervals of size 95% 3.2 Hypotheses testing Consider testing a null hypothesis H 0 : θ Ω 0 against an alternative hypothesis H 1 : θ Ω 1. The principle of likelihood ratio test is as follows: Let L 0 be the maximized likelihood under H 0 : θ Ω 0, and L 1 be the maximized likelihood under θ Ω 0 Ω 1. The likelihood ratio statistic, or sometimes called Wilks statistic, is then W = 2 log( L 0 L 1 ) 0 The null hypothesis is rejected if the observed value of W is large. In some cases the exact distribution of W under H 0 can be evaluated. In other cases, Wilks theorem states that for large n (sample size), W = χ L 2 ν, where ν is the number of free parameters in H 1 but not in H 0. If the degrees of freedom in Ω 0 is q and the degrees of freedom in Ω 0 Ω 1 is r, then ν = r q. Consider testing hypotheses on µ and Σ of multivariate normal distribution, based on n-sample X 1,..., X n. case I: H 0 : µ = µ 0, H 1 : µ µ 0, Σ is known. In this case, we know the exact distribution of the likelihood ratio statistic W = n( x µ 0 ) Σ 1 ( x µ 0 ) χ 2 p, 4
5 under H 0. [Show on blackboard.] It is worthwhile to first write 2 log(l(x; µ, Σ)) = 2l(x; µ, Σ) = n log 2πΣ + n tr{σ 1 S} + n( x µ) T Σ 1 ( x µ) case II: H 0 : µ = µ 0, H 1 : µ µ 0, Σ is unknown. The MLEs under H 1 are ˆµ = x and Σ = S 0. The restricted MLE of Σ under H 0 is Σ (0) = 1 n n (x i µ 0 )(x i µ 0 ) = S 0 + δδ, where δ = n( x µ 0 ). The likelihood ratio statistic is then W = n log S 0 + δδ n log S 0. It turns out that W is a monotone increasing function of δ S 1 δ = n( x µ 0 )S 1 ( x µ 0 ), which is the Hotelling s T 2 (n 1) statistic. case III: H 0 : Σ = Σ 0, H 1 : Σ Σ 0, µ is unknown. We have the likelihood ratio statistic W = n log Σ 1 0 S 0 np + ntrace(σ 1 0 S 0 ). This is the case where the exact distribution of W is difficult to evaluate. For large n, use Wilks theorem to approximate the distribution of W by χ 2 ν with the degrees of freedom ν = p(p + 1)/2. Next, consider testing the equality of two mean vectors. Let X 11,..., X 1n1 be i.i.d. N p (µ 1, Σ) and X 21,..., X 2n2 be i.i.d. N p (µ 2, Σ). case IV: H 0 : µ 1 = µ 2, H 1 : µ 1 µ 2, Σ is unknown. Since X 1 X 2 N p (µ 1 µ 2, n 1 + n 2 Σ), n 1 n 2 (n 1 + n 2 2)S P W p (n 1 + n 2 2, Σ), where (n 1 + n 2 2)S P := (n 1 1)S 1 + (n 2 1)S 2, we have Hotelling s T 2 statistic for two-sample problem and by Theorem 5 in lecture 2 T 2 (n 1 + n 2 2) = n 1n 2 n 1 + n 2 ( X 1 X 2 ) S 1 P ( X 1 X 2 ), n 1 + n 2 p 1 (n 1 + n 2 2)p T 2 (n 1 + n 2 2) F p,n1 +n 2 p 1. Similar to case II above, the likelihood ratio statistic is a monotone function of T 2 (n 1 + n 2 2). 5
6 3.3 Hypothesis testing when p > n In the high dimensional situation where the dimension p is larger than sample size (p > n 1 or p > n 1 +n 2 2), the sample covariance S is not invertable, thus the Hotelling s T 2 statistic, which is essential in the testing procedures above, cannot be computed. We survey important proposals for testing hypotheses on means in High-Dimension, Low-Sample Size (HDLSS) data. A basic idea in generalizing a test procedure for the p > n case is to base the test on a computable test statistic which is also an estimator for µ µ 0 or µ 1 µ 2. In case II (one sample), Dempster (1960) proposed to replace S 1 in Hotelling s statistic by (trace(s)i p ) 1. He showed that under H 0 : µ = 0, T D = n X X trace(s) F r,(n 1)r, approximately, for r = (trace(σ))2, a measure of sphericity of Σ. An estimator ˆr of r is used in testing. trace(σ 2 ) Bai and Saranadasa (1996) proposed to simply replace S 1 in Hotelling s statistic by I p, yielding T B = n X X. However X X is not an unbiased estimator of µ µ since E( X X) = 1 n trace(σ) + µ µ. They showed that the standardized statistic M B = n X X trace(s) ŝd(n X X trace(s)) = 2(n 1)n (n 2)(n+1) has asymptotic N(0, 1) distribution for p, n. n X X trace(s) ( trace(s2 ) 1 (trace(s))2) n Srivastava and Du (2008) proposed to replace S in Hotelling s statistic by D S = diag(s). Then T S = n X D 1 X S n 1 n(n 1) p can be used to estimate D 1 2 n 3 n 3 Σ µ 2, which is zero under H 0 : µ = 0. Srivastava and Du s test statistic is then M S = T S ŝd(t S ) = n X D 1 S X n 1 n 3 p 2trace(R 2 ) p2 n 1, which has asymptotic N(0, 1) distribution for p, n. Here R = D S SD 2 S correlation matrix. is the sample Chen and Qin (2010) improves the two-sample test for mean vectors from that of Bai and Saranadasa (1996). In testing H 1 : µ 1 = µ 2, Bai and Saranadasa (1996) proposed to use T B = X X 1 2 n 1+n 2 n 1 n 2 trace(s P ). The substraction of trace(s P ) is to make sure that E(T B ) = µ 1 µ 2 2. Chen and Qin (2010) proposed to not use trace(s P ), by considering n1 i j T C = X n2 1iX 1j i j + X n1 n2 2iX 2j j=1 2 X 1iX 2j. n 1 (n 1 1) n 2 (n 2 1) n 1 n 2 Since E(T C ) = µ 1 µ 2 2, Chen and Qin proposed to test based on T C. There are many other ideas, including: 6
7 1. The test statistic is essentially the maximum of p normalized marginal mean differences (Cai et al., 2013); 2. Use a generalized inverse of S, denoted by S or S, to replace S 1 ; 3. Estimate Σ in a way that is invertible; 4. Reduce the dimension p of the random vector X by Z = h(x) R d, for d < n, then apply the traditional theory of hypothesis testing. Next lecture is on linear dimension reduction principal component analysis. References Bai, Z. and Saranadasa, H. (1996), Effect of high dimension: by an example of a two sample problem, Statist. Sinica, 6, Cai, T. T., Liu, W., and Xia, Y. (2013), Two-Sample Test of High Dimensional Means under Dependency, To appear in Journal of the Royal Statistical Society: Series B. Chen, S. X. and Qin, Y.-L. (2010), A two-sample test for high-dimensional data with applications to gene-set testing, The Annals of Statistics, 38, Dempster, A. P. (1960), A significance test for the separation of two highly multivariate small samples, Biometrics, 16, Srivastava, M. S. and Du, M. (2008), A test for the mean vector with fewer observations than the dimension, Journal of Multivariate Analysis, 99,
8 In evaluating the MLE of Σ for MVN, one can use the following famous result. Note: Y Y = Y (Y 1 ) T Y trace(ay 1 B) = (Y 1 BAY 1 ) T Σ {log Σ + trace(σ 1 S 0 )} = Σ 1 Σ (Σ 1 ) (Σ 1 S 0 Σ 1 ) = (Σ 1 ) (Σ 1 S 0 Σ 1 ) The first order condition leads to ˆΣ MLE = S 0 8
Multivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Addressing ourliers 1 Addressing ourliers 2 Outliers in Multivariate samples (1) For
More informationAn Introduction to Multivariate Statistical Analysis
An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents
More informationMATH5745 Multivariate Methods Lecture 07
MATH5745 Multivariate Methods Lecture 07 Tests of hypothesis on covariance matrix March 16, 2018 MATH5745 Multivariate Methods Lecture 07 March 16, 2018 1 / 39 Test on covariance matrices: Introduction
More informationStatistical Inference
Statistical Inference Classical and Bayesian Methods Revision Class for Midterm Exam AMS-UCSC Th Feb 9, 2012 Winter 2012. Session 1 (Revision Class) AMS-132/206 Th Feb 9, 2012 1 / 23 Topics Topics We will
More informationHypothesis Testing. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA
Hypothesis Testing Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA An Example Mardia et al. (979, p. ) reprint data from Frets (9) giving the length and breadth (in
More informationHigh-dimensional two-sample tests under strongly spiked eigenvalue models
1 High-dimensional two-sample tests under strongly spiked eigenvalue models Makoto Aoshima and Kazuyoshi Yata University of Tsukuba Abstract: We consider a new two-sample test for high-dimensional data
More informationTAMS39 Lecture 2 Multivariate normal distribution
TAMS39 Lecture 2 Multivariate normal distribution Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content Lecture Random vectors Multivariate normal distribution
More informationNotes on the Multivariate Normal and Related Topics
Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions
More informationTesting linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices
Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Takahiro Nishiyama a,, Masashi Hyodo b, Takashi Seo a, Tatjana Pavlenko c a Department of Mathematical
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Two sample T 2 test 1 Two sample T 2 test 2 Analogous to the univariate context, we
More informationInferences about a Mean Vector
Inferences about a Mean Vector Edps/Soc 584, Psych 594 Carolyn J. Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbana-champaign c Board of Trustees, University
More informationOn testing the equality of mean vectors in high dimension
ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.
More informationSTAT 730 Chapter 5: Hypothesis Testing
STAT 730 Chapter 5: Hypothesis Testing Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 28 Likelihood ratio test def n: Data X depend on θ. The
More informationComparing two independent samples
In many applications it is necessary to compare two competing methods (for example, to compare treatment effects of a standard drug and an experimental drug). To compare two methods from statistical point
More informationStatistics 3858 : Maximum Likelihood Estimators
Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,
More informationStatistical Inference On the High-dimensional Gaussian Covarianc
Statistical Inference On the High-dimensional Gaussian Covariance Matrix Department of Mathematical Sciences, Clemson University June 6, 2011 Outline Introduction Problem Setup Statistical Inference High-Dimensional
More informationComposite Hypotheses and Generalized Likelihood Ratio Tests
Composite Hypotheses and Generalized Likelihood Ratio Tests Rebecca Willett, 06 In many real world problems, it is difficult to precisely specify probability distributions. Our models for data may involve
More informationStatement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationAsymptotic Statistics-VI. Changliang Zou
Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous
More informationMultivariate Statistics
Multivariate Statistics Chapter 2: Multivariate distributions and inference Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2016/2017 Master in Mathematical
More information(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.
Problem 1 (21 points) An economist runs the regression y i = β 0 + x 1i β 1 + x 2i β 2 + x 3i β 3 + ε i (1) The results are summarized in the following table: Equation 1. Variable Coefficient Std. Error
More informationSTAT 730 Chapter 4: Estimation
STAT 730 Chapter 4: Estimation Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 23 The likelihood We have iid data, at least initially. Each datum
More informationSTT 843 Key to Homework 1 Spring 2018
STT 843 Key to Homework Spring 208 Due date: Feb 4, 208 42 (a Because σ = 2, σ 22 = and ρ 2 = 05, we have σ 2 = ρ 2 σ σ22 = 2/2 Then, the mean and covariance of the bivariate normal is µ = ( 0 2 and Σ
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More informationSTA 437: Applied Multivariate Statistics
Al Nosedal. University of Toronto. Winter 2015 1 Chapter 5. Tests on One or Two Mean Vectors If you can t explain it simply, you don t understand it well enough Albert Einstein. Definition Chapter 5. Tests
More informationEconomics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,
Economics 520 Lecture Note 9: Hypothesis Testing via the Neyman-Pearson Lemma CB 8., 8.3.-8.3.3 Uniformly Most Powerful Tests and the Neyman-Pearson Lemma Let s return to the hypothesis testing problem
More information2014/2015 Smester II ST5224 Final Exam Solution
014/015 Smester II ST54 Final Exam Solution 1 Suppose that (X 1,, X n ) is a random sample from a distribution with probability density function f(x; θ) = e (x θ) I [θ, ) (x) (i) Show that the family of
More informationChapter 7. Hypothesis Testing
Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function
More informationExam 2. Jeremy Morris. March 23, 2006
Exam Jeremy Morris March 3, 006 4. Consider a bivariate normal population with µ 0, µ, σ, σ and ρ.5. a Write out the bivariate normal density. The multivariate normal density is defined by the following
More informationReview. December 4 th, Review
December 4 th, 2017 Att. Final exam: Course evaluation Friday, 12/14/2018, 10:30am 12:30pm Gore Hall 115 Overview Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 6: Statistics and Sampling Distributions Chapter
More informationStatistical Inference with Monotone Incomplete Multivariate Normal Data
Statistical Inference with Monotone Incomplete Multivariate Normal Data p. 1/4 Statistical Inference with Monotone Incomplete Multivariate Normal Data This talk is based on joint work with my wonderful
More informationTUTORIAL 8 SOLUTIONS #
TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level
More informationF & B Approaches to a simple model
A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys
More informationparameter space Θ, depending only on X, such that Note: it is not θ that is random, but the set C(X).
4. Interval estimation The goal for interval estimation is to specify the accurary of an estimate. A 1 α confidence set for a parameter θ is a set C(X) in the parameter space Θ, depending only on X, such
More informationHypothesis testing: theory and methods
Statistical Methods Warsaw School of Economics November 3, 2017 Statistical hypothesis is the name of any conjecture about unknown parameters of a population distribution. The hypothesis should be verifiable
More informationSummer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.
Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall
More informationExercises and Answers to Chapter 1
Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean
More informationLecture 15. Hypothesis testing in the linear model
14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma
More informationThe purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.
Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That
More informationCentral Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions
Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions Tiefeng Jiang 1 and Fan Yang 1, University of Minnesota Abstract For random samples of size n obtained
More informationMath 181B Homework 1 Solution
Math 181B Homework 1 Solution 1. Write down the likelihood: L(λ = n λ X i e λ X i! (a One-sided test: H 0 : λ = 1 vs H 1 : λ = 0.1 The likelihood ratio: where LR = L(1 L(0.1 = 1 X i e n 1 = λ n X i e nλ
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationMULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA
J. Japan Statist. Soc. Vol. 37 No. 1 2007 53 86 MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA M. S. Srivastava* In this article, we develop a multivariate theory for analyzing multivariate datasets
More informationLecture 5: Hypothesis tests for more than one sample
1/23 Lecture 5: Hypothesis tests for more than one sample Måns Thulin Department of Mathematics, Uppsala University thulin@math.uu.se Multivariate Methods 8/4 2011 2/23 Outline Paired comparisons Repeated
More informationOn corrections of classical multivariate tests for high-dimensional data
On corrections of classical multivariate tests for high-dimensional data Jian-feng Yao with Zhidong Bai, Dandan Jiang, Shurong Zheng Overview Introduction High-dimensional data and new challenge in statistics
More informationSHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan
A New Test on High-Dimensional Mean Vector Without Any Assumption on Population Covariance Matrix SHOTA KATAYAMA AND YUTAKA KANO Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama,
More informationTAMS39 Lecture 10 Principal Component Analysis Factor Analysis
TAMS39 Lecture 10 Principal Component Analysis Factor Analysis Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content - Lecture Principal component analysis
More informationIntroduction to Normal Distribution
Introduction to Normal Distribution Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 17-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Introduction
More informationMultilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2
Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do
More informationProbability Theory and Statistics. Peter Jochumzen
Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................
More informationElliptically Contoured Distributions
Elliptically Contoured Distributions Recall: if X N p µ, Σ), then { 1 f X x) = exp 1 } det πσ x µ) Σ 1 x µ) So f X x) depends on x only through x µ) Σ 1 x µ), and is therefore constant on the ellipsoidal
More informationApplied Multivariate and Longitudinal Data Analysis
Applied Multivariate and Longitudinal Data Analysis Chapter 2: Inference about the mean vector(s) Ana-Maria Staicu SAS Hall 5220; 919-515-0644; astaicu@ncsu.edu 1 In this chapter we will discuss inference
More informationLecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2
Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y
More informationPart IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015
Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationCOMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS
Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department
More informationStatistics. Statistics
The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More information2.6.3 Generalized likelihood ratio tests
26 HYPOTHESIS TESTING 113 263 Generalized likelihood ratio tests When a UMP test does not exist, we usually use a generalized likelihood ratio test to verify H 0 : θ Θ against H 1 : θ Θ\Θ It can be used
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth
More informationSIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA
SIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA Kazuyuki Koizumi Department of Mathematics, Graduate School of Science Tokyo University of Science 1-3, Kagurazaka,
More informationEconometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018
Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate
More informationy ˆ i = ˆ " T u i ( i th fitted value or i th fit)
1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u
More informationSTAT 461/561- Assignments, Year 2015
STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and
More informationIf we want to analyze experimental or simulated data we might encounter the following tasks:
Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction
More informationis a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications.
Stat 811 Lecture Notes The Wald Consistency Theorem Charles J. Geyer April 9, 01 1 Analyticity Assumptions Let { f θ : θ Θ } be a family of subprobability densities 1 with respect to a measure µ on a measurable
More informationThe Delta Method and Applications
Chapter 5 The Delta Method and Applications 5.1 Local linear approximations Suppose that a particular random sequence converges in distribution to a particular constant. The idea of using a first-order
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationLecture 11. Multivariate Normal theory
10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances
More informationReview: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.
1 Review: Let X 1, X,..., X n denote n independent random variables sampled from some distribution might not be normal!) with mean µ) and standard deviation σ). Then X µ σ n In other words, X is approximately
More informationFinal Exam. 1. (6 points) True/False. Please read the statements carefully, as no partial credit will be given.
1. (6 points) True/False. Please read the statements carefully, as no partial credit will be given. (a) If X and Y are independent, Corr(X, Y ) = 0. (b) (c) (d) (e) A consistent estimator must be asymptotically
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationAnalysis of variance, multivariate (MANOVA)
Analysis of variance, multivariate (MANOVA) Abstract: A designed experiment is set up in which the system studied is under the control of an investigator. The individuals, the treatments, the variables
More informationTesting Some Covariance Structures under a Growth Curve Model in High Dimension
Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping
More informationSimple and Multiple Linear Regression
Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where
More informationHypothesis Test. The opposite of the null hypothesis, called an alternative hypothesis, becomes
Neyman-Pearson paradigm. Suppose that a researcher is interested in whether the new drug works. The process of determining whether the outcome of the experiment points to yes or no is called hypothesis
More informationMean Vector Inferences
Mean Vector Inferences Lecture 5 September 21, 2005 Multivariate Analysis Lecture #5-9/21/2005 Slide 1 of 34 Today s Lecture Inferences about a Mean Vector (Chapter 5). Univariate versions of mean vector
More informationCanonical Correlation Analysis of Longitudinal Data
Biometrics Section JSM 2008 Canonical Correlation Analysis of Longitudinal Data Jayesh Srivastava Dayanand N Naik Abstract Studying the relationship between two sets of variables is an important multivariate
More informationChapter 3: Maximum Likelihood Theory
Chapter 3: Maximum Likelihood Theory Florian Pelgrin HEC September-December, 2010 Florian Pelgrin (HEC) Maximum Likelihood Theory September-December, 2010 1 / 40 1 Introduction Example 2 Maximum likelihood
More informationChapters 9. Properties of Point Estimators
Chapters 9. Properties of Point Estimators Recap Target parameter, or population parameter θ. Population distribution f(x; θ). { probability function, discrete case f(x; θ) = density, continuous case The
More informationHypothesis Testing One Sample Tests
STATISTICS Lecture no. 13 Department of Econometrics FEM UO Brno office 69a, tel. 973 442029 email:jiri.neubauer@unob.cz 12. 1. 2010 Tests on Mean of a Normal distribution Tests on Variance of a Normal
More informationProblems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B
Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2
More information1. Density and properties Brief outline 2. Sampling from multivariate normal and MLE 3. Sampling distribution and large sample behavior of X and S 4.
Multivariate normal distribution Reading: AMSA: pages 149-200 Multivariate Analysis, Spring 2016 Institute of Statistics, National Chiao Tung University March 1, 2016 1. Density and properties Brief outline
More informationIntroduction to Maximum Likelihood Estimation
Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:
More informationOn a General Two-Sample Test with New Critical Value for Multivariate Feature Selection in Chest X-Ray Image Analysis Problem
Applied Mathematical Sciences, Vol. 9, 2015, no. 147, 7317-7325 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2015.510687 On a General Two-Sample Test with New Critical Value for Multivariate
More informationSummary of Chapters 7-9
Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two
More informationx. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).
.8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the
More informationQualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama
Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Instructions This exam has 7 pages in total, numbered 1 to 7. Make sure your exam has all the pages. This exam will be 2 hours
More informationTesting equality of two mean vectors with unequal sample sizes for populations with correlation
Testing equality of two mean vectors with unequal sample sizes for populations with correlation Aya Shinozaki Naoya Okamoto 2 and Takashi Seo Department of Mathematical Information Science Tokyo University
More informationexp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1
4 Hypothesis testing 4. Simple hypotheses A computer tries to distinguish between two sources of signals. Both sources emit independent signals with normally distributed intensity, the signals of the first
More informationMcGill University. Faculty of Science. Department of Mathematics and Statistics. Part A Examination. Statistics: Theory Paper
McGill University Faculty of Science Department of Mathematics and Statistics Part A Examination Statistics: Theory Paper Date: 10th May 2015 Instructions Time: 1pm-5pm Answer only two questions from Section
More informationChapter 9: Hypothesis Testing Sections
Chapter 9: Hypothesis Testing Sections 9.1 Problems of Testing Hypotheses 9.2 Testing Simple Hypotheses 9.3 Uniformly Most Powerful Tests Skip: 9.4 Two-Sided Alternatives 9.6 Comparing the Means of Two
More informationYou can compute the maximum likelihood estimate for the correlation
Stat 50 Solutions Comments on Assignment Spring 005. (a) _ 37.6 X = 6.5 5.8 97.84 Σ = 9.70 4.9 9.70 75.05 7.80 4.9 7.80 4.96 (b) 08.7 0 S = Σ = 03 9 6.58 03 305.6 30.89 6.58 30.89 5.5 (c) You can compute
More informationMaximum Likelihood Estimation; Robust Maximum Likelihood; Missing Data with Maximum Likelihood
Maximum Likelihood Estimation; Robust Maximum Likelihood; Missing Data with Maximum Likelihood PRE 906: Structural Equation Modeling Lecture #3 February 4, 2015 PRE 906, SEM: Estimation Today s Class An
More informationRandomized Complete Block Designs
Randomized Complete Block Designs David Allen University of Kentucky February 23, 2016 1 Randomized Complete Block Design There are many situations where it is impossible to use a completely randomized
More informationPrimer on statistics:
Primer on statistics: MLE, Confidence Intervals, and Hypothesis Testing ryan.reece@gmail.com http://rreece.github.io/ Insight Data Science - AI Fellows Workshop Feb 16, 018 Outline 1. Maximum likelihood
More informationTest Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics
Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics The candidates for the research course in Statistics will have to take two shortanswer type tests
More informationSYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions
SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationSTAT 512 sp 2018 Summary Sheet
STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}
More information