sei@mist.i.u-tokyo.ac.jp

In R, see ?boxplot. The actual computation is carried out by boxplot.stats, whose help page explains which version of the quartiles is used:

"The two hinges are versions of the first and third quartile, i.e., close to quantile(x, c(1,3)/4). The hinges equal the quartiles for odd n (where n <- length(x)) and differ for even n. Whereas the quartiles only equal observations for n %% 4 == 1 (n = 1 mod 4), the hinges do so additionally for n %% 4 == 2 (n = 2 mod 4), and are in the middle of two observations otherwise."

The hinges (lower hinge, median, upper hinge), to be compared with quantile(x, c(1,3)/4), are:

    x                          hinges
    1                          1, 1, 1
    1, 2                       1, 1.5, 2
    1, 2, 3                    1.5, 2.0, 2.5
    1, 2, 3, 4                 1.5, 2.5, 3.5
    1, 2, 3, 4, 5              2, 3, 4
    1, 2, 3, 4, 5, 6           2, 3.5, 5
    1, 2, 3, 4, 5, 6, 7        2.5, 4.0, 5.5
    1, 2, 3, 4, 5, 6, 7, 8     2.5, 4.5, 6.5
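As a cross-check of the table above, the hinges can be computed directly as medians of the lower and upper halves of the sorted data, where both halves include the overall median when $n$ is odd. The text works in R; the following is an equivalent sketch in Python (the function names here are ours, not from the text):

```python
def median(v):
    """Median of a list of numbers."""
    v = sorted(v)
    n, mid = len(v), len(v) // 2
    return v[mid] if n % 2 == 1 else (v[mid - 1] + v[mid]) / 2

def hinges(x):
    """Tukey's hinges: medians of the lower/upper halves.
    For odd n the overall median belongs to both halves."""
    x = sorted(x)
    n = len(x)
    half = (n + 1) // 2          # lower half includes the median when n is odd
    return median(x[:half]), median(x), median(x[n - half:])

for n in range(1, 9):
    print(list(range(1, n + 1)), hinges(list(range(1, n + 1))))
```

Running this reproduces every row of the table, e.g. hinges([1, 2, 3, 4]) is (1.5, 2.5, 3.5).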
1.4 One can have $a_1 d_1 > b_1 c_1$ and $a_2 d_2 > b_2 c_2$, and yet $(a_1 + a_2)(d_1 + d_2) < (b_1 + b_2)(c_1 + c_2)$, for suitable positive $a_i, b_i, c_i, d_i$ (Simpson's paradox). (Figure: the vectors $(a_1, b_1)$, $(a_2, b_2)$ and their sum $(a_1 + a_2, b_1 + b_2)$, together with $(c_1, d_1)$, $(c_2, d_2)$ and $(c_1 + c_2, d_1 + d_2)$; the inequalities compare the slopes of these vectors.)

With $p = 1/n$, $\binom{n}{k} p^k (1-p)^{n-k} \to \frac{1}{k!} e^{-1}$ as $n \to \infty$ (the Poisson limit).

2.4 We obtain an estimate of the probability and its standard error as follows: $\hat p = 0.3118$ and $\sqrt{\hat p(1-\hat p)/N} = 0.0046$, which depend on the random seed. Here $N = 10^4$ denotes the number of experiments. The value we want to compute is
$$p = \sum_{\substack{i+j+k+l+m+r = 10 \\ \max(i,j,k,l,m,r) = 4}} \frac{10!}{i!\,j!\,k!\,l!\,m!\,r!} \left(\frac{1}{6}\right)^{10}.$$
One can obtain $p$ exactly by a brute-force method. If you are interested in a faster algorithm, refer to C. J. Corrado (2011), "The exact distribution of the maximum, minimum and the range of multinomial/Dirichlet and multivariate hypergeometric frequencies", Stat. Comput., 21.

The fitted regression equations are $\hat y(x) = \cdots + \cdots x$ and $\hat y(t) = \cdots + \cdots \cos(2\pi t/12) + \cdots \sin(2\pi t/12)$.

3.5 $\hat a = \bar y$, $\hat b_i = r_{x_i y}\, s_y / s_{x_i}$.
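The Monte Carlo estimate in 2.4 is easy to reproduce. The following Python sketch (the text presumably uses R; the seed and loop structure are our own choices) estimates the probability that the maximum frequency among the six faces in ten rolls of a fair die equals 4:

```python
import random

random.seed(1)
N = 10_000                        # number of experiments
hits = 0
for _ in range(N):
    counts = [0] * 6
    for _ in range(10):           # ten rolls of a fair die
        counts[random.randrange(6)] += 1
    if max(counts) == 4:          # event: the maximum frequency equals 4
        hits += 1

p_hat = hits / N
se = (p_hat * (1 - p_hat) / N) ** 0.5
print(p_hat, se)                  # should be near 0.3118 and 0.0046
```

The exact answer is the multinomial sum displayed above; the simulation agrees with it to within a couple of standard errors.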
By direct computation, the regression equations are $\hat y(x_1, x_2) = \cdots + \cdots x_1 + 3 x_2$ and $\hat y(x_1) = \cdots + \cdots x_1$, respectively. The sign of the coefficient of $x_1$ is changed.

$P^2 = P$, $P^\top = P$, and $PX = X$.

Let $X = QR$ be the QR decomposition of $X$. Then the regression coefficient vector is $\hat\beta = (X^\top X)^{-1} X^\top y = (R^\top Q^\top Q R)^{-1} R^\top Q^\top y = R^{-1} Q^\top y$. Let $z = Q^\top y$. Since $R$ is an upper triangular matrix, the equation $R\hat\beta = z$ is quickly solved by backward substitution. This algorithm is numerically more stable than solving the normal equations directly. In terms of numerical linear algebra, the condition number(1) of $R$ is much smaller than that of $X^\top X$. Here we only give an example: let $X = \cdots$ and $y = \cdots$. Then the two equations $R\hat\beta = Q^\top y$ and $X^\top X\hat\beta = X^\top y$ are $\cdots$, respectively. Examine the Gaussian elimination method: what happens if $\cdots$ is rounded to 1.000?

4.1 See the following table.

                                          R function   applicable to
    spectral decomposition                eigen        square
    singular value decomposition (SVD)    svd          any
    Cholesky decomposition                chol         symmetric positive definite
    QR decomposition                      qr           any

Other decompositions include the Jordan canonical form, the Schur canonical form, the LU decomposition, and the Sylvester canonical form. The spectral decomposition is available only if the eigenvectors span the whole space.

(1) The condition number of a square matrix is defined as the ratio of the maximum singular value to the minimum singular value. A linear equation with a large condition number is hard to solve numerically.
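The claim that the QR route and the normal equations give the same coefficients, and that the condition number of $X^\top X$ is the square of that of $X$, can be checked numerically. A Python/NumPy sketch with an arbitrary well-conditioned matrix (not the example elided above):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
y = rng.standard_normal(20)

# QR route: X = QR, then solve the triangular system R beta = Q^T y
Q, R = np.linalg.qr(X)                  # reduced QR: Q is 20x3, R is 3x3
beta_qr = np.linalg.solve(R, Q.T @ y)   # back-substitution on triangular R

# Normal-equation route: (X^T X) beta = X^T y
beta_ne = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(beta_qr, beta_ne))    # same solution on this example

# cond(X^T X) = cond(X)^2, which is why the QR route is more stable
print(np.allclose(np.linalg.cond(X.T @ X), np.linalg.cond(X) ** 2))
```

For an ill-conditioned $X$ the two routes can differ substantially, which is the point of the rounding question above.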
4.4 Denote the spectral decomposition of $K$ by $K = \sum_{i=1}^n \lambda_i q_i q_i^\top$. Let $r = \min(n, p)$ and assume that $\lambda_1 > \cdots > \lambda_r > 0$. Then, for $1 \le i \le r$, the scores of the $i$-th principal component are given by $\sqrt{\lambda_i}\, q_i$. Indeed, let $X = \sum_{i=1}^r d_i u_i v_i^\top$ be the singular value decomposition. Then we have $K = \sum_{i=1}^r d_i^2 u_i u_i^\top$ and therefore $d_i = \sqrt{\lambda_i}$ and $u_i = q_i$ for $1 \le i \le r$.

4.5, 4.6 $f(x) = x_1 + \cdots$

(Figure: an ROC curve, with the false positive rate on the horizontal axis and the true positive rate on the vertical axis; AUC = 0.75.)

5.8 If $(x, y)$ is a point of the ROC curve of $(X, Y)$, then $(1 - y, 1 - x)$ is a point of the ROC curve with the roles exchanged: the curve is reflected across the line $y = 1 - x$, and the AUC is preserved accordingly.

5.9 Let $\hat h = \hat h(X_1, \ldots, X_n)$ be an unbiased estimator of $h(\theta)$:
$$h(\theta) = E_\theta[\hat h] = \sum_{x \in \{0,1\}^n} \hat h(x_1, \ldots, x_n) \prod_{t=1}^n \theta^{x_t} (1-\theta)^{1-x_t}, \quad \theta \in (0, 1).$$
For $h(\theta) = 1/\theta$: as $\theta \to 0$ the right-hand side converges to $\hat h(0, \ldots, 0)$, which is finite, whereas $1/\theta \to \infty$. Hence no unbiased estimator exists.

6.3 $E[\hat\mu] = \sum_{i=1}^n w_i \mu = \mu$ because $\sum_{i=1}^n w_i = 1$. The variance is minimized subject to this constraint by the method of Lagrange multipliers.

6.4
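The identities $d_i^2 = \lambda_i$ and $u_i = q_i$ in 4.4 can be verified numerically. A Python/NumPy sketch with arbitrary data (note that singular vectors and eigenvectors are only determined up to a sign):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 5
X = rng.standard_normal((n, p))
X = X - X.mean(axis=0)             # centered data matrix

K = X @ X.T                        # Gram matrix K = X X^T
lam, Q = np.linalg.eigh(K)         # eigh returns ascending eigenvalues
lam, Q = lam[::-1], Q[:, ::-1]     # reorder to descending

U, d, Vt = np.linalg.svd(X, full_matrices=False)

r = min(n, p)
print(np.allclose(d[:r] ** 2, lam[:r]))   # d_i^2 = lambda_i
for i in range(r):
    # scores d_i u_i agree with sqrt(lambda_i) q_i up to sign
    s1, s2 = d[i] * U[:, i], np.sqrt(lam[i]) * Q[:, i]
    assert np.allclose(s1, s2) or np.allclose(s1, -s2)
```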
6.5 (i) $1$, (ii) $1/\theta$, (iii) $1/2$.

6.6 With the change of variables $x = g(y)$ (so that $f_Y(y;\theta) = f_X(g(y);\theta)\, g'(y)$, where the Jacobian does not depend on $\theta$),
$$I_Y(\theta) = \int f_Y(y;\theta)\{\partial_\theta \log f_Y(y;\theta)\}^2\, dy = \int f_X(g(y);\theta)\, g'(y)\{\partial_\theta \log f_X(g(y);\theta)\}^2\, dy = \int f_X(x;\theta)\{\partial_\theta \log f_X(x;\theta)\}^2\, dx = I_X(\theta).$$

6.7
$$E[\{\partial_\theta \log f(X;\theta)\}^2] = \int \{\partial_\theta f(x;\theta)\}\{\partial_\theta \log f(x;\theta)\}\, dx = \partial_\theta\left\{\int f(x;\theta)\, \partial_\theta \log f(x;\theta)\, dx\right\} - \int f(x;\theta)\, \partial_\theta^2 \log f(x;\theta)\, dx = -E[\partial_\theta^2 \log f(X;\theta)],$$
where the first term vanishes because $\int f(x;\theta)\, \partial_\theta \log f(x;\theta)\, dx = \int \partial_\theta f(x;\theta)\, dx = 0$.

6.9 (i) $E_\theta[X] = \{(\theta - 1) + \theta + (\theta + 1)\}/3 = \theta$. (ii) An estimator $\phi(x)$ is unbiased iff $\{\phi(\theta-1) + \phi(\theta) + \phi(\theta+1)\}/3 = \theta$ for every $\theta \in \mathbb{Z} = \{0, \pm 1, \ldots\}$. For example, starting from $\phi(-1) = \phi(0) = \phi(1) = 0$, one is forced to $\phi(2) = \phi(3) = \phi(4) = 3$, $\phi(5) = \phi(6) = \phi(7) = 6$, and so on. An MVUE $\hat\theta$ would have to attain variance $V_\theta[\hat\theta] = 0$ at both $\theta = 0$ and $\theta = 1$, which forces $\hat\theta(-1) = \hat\theta(0) = \hat\theta(1) = 0$ and $\hat\theta(0) = \hat\theta(1) = \hat\theta(2) = 1$ simultaneously, a contradiction; hence no MVUE exists.

6.10 For $N(\theta, 1)$, $\hat\theta = \bar X$ is unbiased for $\theta$, but $\hat\theta^2$ is not unbiased for $\theta^2$: $E[\hat\theta^2] = \theta^2 + 1/n$. For any $\theta \ne \hat\theta$ we have $L(\theta) < L(\hat\theta)$. If $\phi = h(\theta)$ is a one-to-one reparametrization, the likelihood as a function of $\phi$ is $L(h^{-1}(\phi))$, and $L(h^{-1}(\phi)) \le L(h^{-1}(\hat\phi))$ holds exactly when $h^{-1}(\hat\phi) = \hat\theta$; hence the MLE transforms as $\hat\phi = h(\hat\theta)$.

7.3 Using $\Gamma(\alpha + 1) = \alpha\Gamma(\alpha)$:
$$E[X] = \int_0^\infty \frac{\beta^\alpha x^\alpha e^{-\beta x}}{\Gamma(\alpha)}\, dx = \frac{1}{\beta\,\Gamma(\alpha)} \int_0^\infty z^\alpha e^{-z}\, dz = \frac{\Gamma(\alpha+1)}{\beta\,\Gamma(\alpha)} = \frac{\alpha}{\beta},$$
$$V[X] = E[X^2] - E[X]^2 = \int_0^\infty \frac{\beta^\alpha x^{\alpha+1} e^{-\beta x}}{\Gamma(\alpha)}\, dx - \frac{\alpha^2}{\beta^2} = \frac{(\alpha+1)\alpha}{\beta^2} - \frac{\alpha^2}{\beta^2} = \frac{\alpha}{\beta^2}.$$
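The moments $E[X] = \alpha/\beta$ and $V[X] = \alpha/\beta^2$ derived in 7.3 can be checked by simulation. A Python sketch using the standard library's gamma sampler (note that random.gammavariate takes the scale $1/\beta$, not the rate $\beta$; the parameter values are arbitrary):

```python
import random

random.seed(0)
alpha, beta = 3.5, 2.0            # shape and rate

n = 200_000
xs = [random.gammavariate(alpha, 1 / beta) for _ in range(n)]  # scale = 1/rate
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n

print(mean, alpha / beta)         # E[X] = alpha/beta = 1.75
print(var, alpha / beta ** 2)     # V[X] = alpha/beta^2 = 0.875
```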
7.4 (i) $\cdots$ (ii) The negative binomial distribution is an exponential family:
$$f(x; p) = \binom{r+x-1}{x} \exp\{x \log(1-p) + r \log p\},$$
with $\theta = \log(1-p)$, $s(x) = x$, and $\psi(\theta) = -r \log p = -r \log(1 - e^\theta)$. (iii) For the multinomial distribution, with $x_k = 1 - \sum_{i=1}^{k-1} x_i$,
$$f(x; p) = \exp\left\{\sum_{i=1}^{k-1} x_i \log(p_i/p_k) + \log p_k\right\},$$
with $\theta_i = \log(p_i/p_k)$, $s_i(x) = x_i$ ($1 \le i \le k-1$), and $\psi(\theta) = -\log p_k = \log(1 + \sum_{i=1}^{k-1} e^{\theta_i})$.

7.7 Let $f(x;\theta) = a(x)\, e^{\theta s(x) - \psi(\theta)}$. (i) $I(\theta) = -E_\theta[\partial_\theta^2 \log f(X;\theta)] = E_\theta[\psi''(\theta)] = \psi''(\theta)$. (ii) From $E_\theta[\partial_\theta \log f(X;\theta)] = 0$, $\mu(\theta) := E_\theta[s(X)] = \psi'(\theta)$. By 7.1, $\psi''(\theta) > 0$, so $\mu(\theta)$ is strictly increasing. (iii) $I(\mu) = I(\theta)/(d\mu/d\theta)^2$; by (i) and (ii), $I(\mu) = 1/\psi''(\theta)$. (iv) $V_\theta[s(X_t)] = \psi''(\theta) = 1/I(\mu)$, so the sample mean of the $s(X_t)$ attains the Cramér–Rao bound for $\mu$.

$E[\cos(2\pi X_1)] = \int_0^1 \cos(2\pi x)\, dx = 0$ and $V[\cos(2\pi X_1)] = E[\cos^2(2\pi X_1)] = \int_0^1 \cos^2(2\pi x)\, dx = \frac{1}{2}$. Hence $Z_n/\sqrt{n} \to N(0, 1/2)$ in distribution.

(i) $N(0, p(1-p))$. (ii) $\hat p \pm 1.96\sqrt{\hat p(1-\hat p)/n}$.

With $\bar X = 0.99$, $\hat\theta = 2(1 - \bar X) = 0.02$. Since $V[\hat\theta] = \frac{4}{n} V[X_1] = \frac{4}{n}(1 - \theta/2)(\theta/2)$, plugging in $\hat\theta = 0.02$ gives the 95% confidence interval $0.02 \pm 0.053$.

9.1 For significance levels $0.05$, $0.01$, $0.001$, the two-sided rejection regions are $R = \{|\bar X| \ge c\}$ with $c = 1.96/\sqrt{n}$, $2.58/\sqrt{n}$, $3.29/\sqrt{n}$, respectively; for the one-sided test, $c = 1.64/\sqrt{n}$, $2.33/\sqrt{n}$, $3.09/\sqrt{n}$.

9.2
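The critical values 1.96, 2.58, 3.29 (and 1.64, 2.33, 3.09 for the one-sided test) quoted in 9.1 are standard normal quantiles. A quick check in Python using the standard library:

```python
from statistics import NormalDist

z = NormalDist()                          # standard normal
for alpha in (0.05, 0.01, 0.001):
    two_sided = z.inv_cdf(1 - alpha / 2)  # c with P(|Z| >= c) = alpha
    one_sided = z.inv_cdf(1 - alpha)      # c with P(Z >= c) = alpha
    print(alpha, round(two_sided, 2), round(one_sided, 2))
# prints 1.96/1.64, 2.58/2.33, 3.29/3.09, matching the text
```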
9.3 (i) $L(\theta) = \prod_{t=1}^n \theta^{x_t}(1-\theta)^{1-x_t}$ and $\hat\theta = n^{-1}\sum_{t=1}^n x_t$, so
$$\mathrm{LLR} = 2\log\frac{L(\hat\theta)}{L(\theta_0)} = 2\sum_{t=1}^n\left\{x_t \log\frac{\hat\theta}{\theta_0} + (1-x_t)\log\frac{1-\hat\theta}{1-\theta_0}\right\} = 2n\left\{\hat\theta\log\frac{\hat\theta}{\theta_0} + (1-\hat\theta)\log\frac{1-\hat\theta}{1-\theta_0}\right\}.$$
(ii) $\mathrm{LLR} = 2n\{\hat\theta\log(\hat\theta/\theta_0) - \hat\theta + \theta_0\}$ (Poisson). (iii) $\mathrm{LLR} = 2n\{\log(\hat\theta/\theta_0) - 1 + \theta_0/\hat\theta\}$ (exponential). (iv) $\mathrm{LLR} = n\{\log(\sigma_0^2/\hat\sigma^2) - 1 + (\hat\sigma^2 + (\hat\mu - \mu_0)^2)/\sigma_0^2\}$ (normal). For the exponential family $f(x;\theta) = a(x)e^{\theta s(x) - \psi(\theta)}$, the LLR for testing $\theta = \theta_0$ is
$$\mathrm{LLR} = 2\sum_{t=1}^n \log\frac{f(x_t;\hat\theta)}{f(x_t;\theta_0)} = 2n\{(\hat\theta - \theta_0)\psi'(\hat\theta) - \psi(\hat\theta) + \psi(\theta_0)\}, \quad \text{where } \psi'(\hat\theta) = n^{-1}\sum_{t=1}^n s(x_t).$$

9.4, 9.5 (i) The MLE is $\hat\theta = x/n$. Under the restriction $\theta_1 = \theta_3$, the MLE is $\bar\theta_1 = \bar\theta_3 = (x_1 + x_3)/(2n)$, $\bar\theta_2 = 1 - 2\bar\theta_1$. (ii) With $x = (17, 10, 13)$ and $n = 40$, the MLE is $\hat\theta = x/n = (17/40, 10/40, 13/40)$ and $\bar\theta = ((x_1+x_3)/(2n),\ x_2/n,\ (x_1+x_3)/(2n)) = (15/40, 10/40, 15/40)$. The test statistic is
$$T(x) = 2\left(17\log\frac{17}{15} + 10\log\frac{10}{10} + 13\log\frac{13}{15}\right) = 0.535,$$
which is smaller than the 5% critical value $3.84$ of the $\chi_1^2$ distribution, so the null hypothesis is not rejected.

10.1, 10.2 The likelihood function is $L(\mu, \sigma^2) = (2\pi\sigma^2)^{-n/2} e^{-\|y-\mu\|^2/(2\sigma^2)}$, $\mu \in M$, $\sigma^2 > 0$. The maximum likelihood estimator (MLE) of $\mu \in M$ and $\sigma^2 > 0$ is given by $\hat\mu = Py$ and $\hat\sigma^2 = \|y - Py\|^2/n$. Note that $\hat\sigma^2$ is not unbiased. Similarly, the MLE under the null hypothesis $\mu \in M_0$ is $\hat\mu_0 = P_0 y$ and $\hat\sigma_0^2 = \|y - P_0 y\|^2/n$. Then the log-likelihood ratio test statistic is
$$2\log\frac{L(\hat\mu, \hat\sigma^2)}{L(\hat\mu_0, \hat\sigma_0^2)} = -n\log\hat\sigma^2 - \frac{\|y - \hat\mu\|^2}{\hat\sigma^2} + n\log\hat\sigma_0^2 + \frac{\|y - \hat\mu_0\|^2}{\hat\sigma_0^2} = -n\log\hat\sigma^2 + n\log\hat\sigma_0^2 = n\log\frac{\|y - P_0 y\|^2}{\|y - P y\|^2},$$
where the middle step uses $\|y - \hat\mu\|^2/\hat\sigma^2 = \|y - \hat\mu_0\|^2/\hat\sigma_0^2 = n$.
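The statistic $T(x) = 0.535$ in 9.5 above can be reproduced directly (a Python sketch):

```python
import math

x = [17, 10, 13]                  # observed counts, n = 40
n = sum(x)

theta_hat = [xi / n for xi in x]                 # unrestricted MLE
t1 = (x[0] + x[2]) / (2 * n)                     # MLE under theta_1 = theta_3
theta_bar = [t1, x[1] / n, t1]                   # = (15/40, 10/40, 15/40)

# log-likelihood ratio statistic: 2 * sum_i x_i log(theta_hat_i / theta_bar_i)
T = 2 * sum(xi * math.log(h / b) for xi, h, b in zip(x, theta_hat, theta_bar))
print(round(T, 3))                # 0.535, below the 5% critical value 3.84
```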
10.3 Since $P P_0 = P_0$, we have $\|y - P_0 y\|^2 = \|y - Py\|^2 + \|Py - P_0 y\|^2$. With $R^2 = \|Py - P_0 y\|^2/\|y - P_0 y\|^2$ and the F statistic $F(y) = \frac{\|Py - P_0 y\|^2/(p - p_0)}{\|y - Py\|^2/(n - p)}$, it follows that
$$R^2 = \frac{\frac{p - p_0}{n - p} F(y)}{1 + \frac{p - p_0}{n - p} F(y)},$$
so $R^2$ is an increasing function of $F(y)$.

10.4 A statistical model for a paired sample is $X_i \sim N(\mu_i, \sigma^2/2)$ and $Y_i \sim N(\mu_i + a, \sigma^2/2)$, where $\mu_i$ and $a$ are unknown. The null hypothesis is $a = 0$. The t-test statistic is
$$T(x, y) = \frac{\sqrt{n}(\bar y - \bar x)}{\hat\sigma}, \quad \hat\sigma^2 = \frac{1}{n-1}\sum_{i=1}^n \{y_i - x_i - (\bar y - \bar x)\}^2,$$
with $n - 1$ degrees of freedom. A statistical model for unpaired two samples is $X_i \sim N(\mu, \sigma^2)$ and $Y_j \sim N(\mu + a, \sigma^2)$, where $\mu$ and $a$ are unknown. The null hypothesis is $a = 0$. Note that $\mu$ cannot depend on the index $i$, in contrast to the paired samples. The t-test statistic is
$$T'(x, y) = \sqrt{\frac{n_1 n_2}{n_1 + n_2}}\, \frac{\bar y - \bar x}{\hat\sigma'}, \quad \hat\sigma'^2 = \frac{1}{n_1 + n_2 - 2}\left\{\sum_{i=1}^{n_1}(x_i - \bar x)^2 + \sum_{j=1}^{n_2}(y_j - \bar y)^2\right\},$$
with $n_1 + n_2 - 2$ degrees of freedom. The estimate $\hat\sigma'^2$ is called the pooled variance. Even if $n_1 = n_2$, the statistic $T'(x, y)$ is different from $T(x, y)$. Indeed, if $n_1 = n_2 = n$,
$$T'(x, y) = \frac{\sqrt{n}(\bar y - \bar x)}{\hat\tau}, \quad \hat\tau^2 = \frac{1}{n-1}\sum_{i=1}^n\{(x_i - \bar x)^2 + (y_i - \bar y)^2\}.$$
It is easy to see that $|T(x, y)| > |T'(x, y)|$ if and only if $x$ and $y$ have positive sample correlation. For example, let $n_1 = n_2 = 2$, $(x_1, y_1) = (0, 0)$ and $(x_2, y_2) = (50, 51)$. Then $T(x, y) = 1$ and $T'(x, y) = 0.014$. The p-value for each statistic is $0.25$ and $0.495$, respectively.

10.5 Let $y_{it}$ ($1 \le i \le 3$, $1 \le t \le 4$) be the observed data. The statistical model is $Y_{it} = a_i + \varepsilon_{it}$, $\varepsilon_{it} \sim N(0, \sigma^2)$. The F-test statistic for the null hypothesis $a_1 = a_2 = a_3$ is
$$F = \frac{\sum_{i=1}^3 \sum_{t=1}^4 (\bar y_i - \bar y)^2/(3-1)}{\sum_{i=1}^3 \sum_{t=1}^4 (y_{it} - \bar y_i)^2/(12-3)} = \frac{83.5/2}{50.5/9} = \frac{41.75}{5.61} = 7.44.$$
In summary, we obtain the following analysis-of-variance (ANOVA) table:

                 sum of squares   degrees of freedom   variance   F-value   p-value
    motor             83.5                2              41.75      7.44     0.012
    residuals         50.5                9               5.61
    total            134.0               11

The p-value is smaller than 0.05 (i.e., significant at the level 0.05), and therefore we reject the null hypothesis $a_1 = a_2 = a_3$. In fact, the motor A3 seems to have better performance than the others, since $\bar y_1 = 15.52$, $\bar y_2 = 15.72$ and $\bar y_3 = \cdots$
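The numerical example in 10.4 ($T = 1$ for the paired test, $T' = 0.014$ for the unpaired test) can be reproduced as follows (a Python sketch of the formulas above):

```python
import math

x = [0.0, 50.0]
y = [0.0, 51.0]
n = len(x)

# paired t statistic: work with the differences d_i = y_i - x_i
d = [yi - xi for xi, yi in zip(x, y)]
dbar = sum(d) / n
s2 = sum((di - dbar) ** 2 for di in d) / (n - 1)
T_paired = math.sqrt(n) * dbar / math.sqrt(s2)

# unpaired t statistic with the pooled variance
xbar, ybar = sum(x) / n, sum(y) / n
pooled = (sum((xi - xbar) ** 2 for xi in x)
          + sum((yi - ybar) ** 2 for yi in y)) / (2 * n - 2)
T_unpaired = math.sqrt(n * n / (2 * n)) * (ybar - xbar) / math.sqrt(pooled)

print(T_paired, round(T_unpaired, 3))   # 1.0 and 0.014
```

The strongly correlated pairs make the paired test far more sensitive here, which is exactly the point of the example.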
11.1 The likelihood functions are
$$L(\beta) = \prod_{t=1}^n \frac{(e^{\beta^\top x_t})^{y_t}}{1 + e^{\beta^\top x_t}} \quad \text{(logistic regression)}, \qquad L(\beta) = \prod_{t=1}^n \frac{(e^{\beta^\top x_t})^{y_t}\, e^{-e^{\beta^\top x_t}}}{y_t!} \quad \text{(Poisson regression)}.$$

11.3 If $Y_1$ and $Y_2$ are independent Poisson variables with means $\mu_1$ and $\mu_2$, then $Y_1 + Y_2$ is Poisson with mean $\mu_1 + \mu_2$, and conditionally on $Y_1 + Y_2$, the variable $Y_1$ is binomial:
$$\frac{P(Y_1 = y_1, Y_2 = y_2)}{P(Y_1 + Y_2 = y_1 + y_2)} = \frac{\frac{\mu_1^{y_1}}{y_1!}e^{-\mu_1}\,\frac{\mu_2^{y_2}}{y_2!}e^{-\mu_2}}{\frac{(\mu_1+\mu_2)^{y_1+y_2}}{(y_1+y_2)!}e^{-(\mu_1+\mu_2)}} = \frac{(y_1+y_2)!}{y_1!\, y_2!}\left(\frac{\mu_1}{\mu_1+\mu_2}\right)^{y_1}\left(\frac{\mu_2}{\mu_1+\mu_2}\right)^{y_2}.$$

The exponential-family form of the generalized linear models:

    Model           $f(y)$                                       $\phi$       $a(y,\phi)$                           $\psi(\eta)$         $(\psi')^{-1}(\mu)$
    Normal linear   $(2\pi\phi)^{-1/2}e^{-(y-\eta)^2/(2\phi)}$   $\sigma^2$   $(2\pi\phi)^{-1/2}e^{-y^2/(2\phi)}$   $\eta^2/2$           $\mu$
    Logistic        $e^{\eta y}/(e^\eta + 1)$                    $1$          $1$                                   $\log(e^\eta + 1)$   $\log(\mu/(1-\mu))$
    Poisson         $e^{\eta y}/(y!\, e^{e^\eta})$               $1$          $1/y!$                                $e^\eta$             $\log\mu$

Here is a part of the output:

    Coefficients:
                 Estimate  Std. Error  z value  Pr(>|z|)
    (Intercept)     ...        ...       ...      ...    *
    stadiumHome     ...        ...       ...      ...
    rank1           ...        ...       ...      ...    *
    rank2           ...        ...       ...      ...
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05

The z value is the ratio of the estimate to the standard error. For example, the z value of the intercept is (estimate)/(standard error) = 2.007. Its p-value is $P(|Z| \ge 2.007) = 0.0447$, where $Z \sim N(0, 1)$. The variable stadium is a factor object and is automatically encoded as 1 if stadium == "Home" and 0 if stadium == "Away". Of the three explanatory variables, only rank1 is 5% significant.

$r(g) - r(f) = 2\int f(x)\log\frac{f(x)}{g(x)}\, dx \ge 0$, with equality if and only if $g = f$.

Let $\hat y_t^{(k)}$ be the fitted values (predicted values) of $y_t$ for each model $k = 0, 1, \ldots, 5$. The squared prediction error is $n^{-1}\sum_{t=1}^n (\tilde y_t - \hat y_t^{(k)})^2$, where $n = 12$. The AIC of the model $k$ is given by $\mathrm{AIC}(k) = n\log\hat\sigma_k^2 + 2(k+2)$, where $\hat\sigma_k^2 = n^{-1}\sum_{t=1}^n (y_t - \hat y_t^{(k)})^2$ is the MLE of the variance parameter $\sigma^2$. By numerical computation, we obtain the following table of the prediction error and the AIC:

    k     prediction error     AIC
    0           ...            ...
    ...         ...            ...
    5           ...            ...
The number $k$ which minimizes the prediction error is 5, and the $k$ which minimizes the AIC is also 5. However, there is a large gap between the two models $k = 0$ and $k = 1$. Furthermore, in practice, the number of parameters of the model minimizing the AIC is recommended to be at most $n/2$, where $n$ is the sample size. Then we may select the model $k = 1$.

12.3 The AIC values (up to an additive constant) of all submodels are shown in the following table, where 123 denotes the submodel using the variables $x_1, x_2, x_3$, and so on:

    model   AIC     model   AIC     model   AIC
     ...    ...      ...    ...      ...    ...

The submodel selected by the backward selection method is 23, and the linear predictor is
$$\log\frac{\mu}{1-\mu} = \cdots + \cdots \times (\text{GDP per capita}) + \cdots \times (\text{population density}),$$
where $\mu$ denotes the probability that the country is in Asia.

12.5 We first show that $E[\|P(Y - \mu)\|^2] = p$ for any orthogonal projection matrix $P$ onto a $p$-dimensional subspace. Indeed,
$$E[\|P(Y-\mu)\|^2] = E[(Y-\mu)^\top P^\top P (Y-\mu)] = E[\mathrm{tr}\{P(Y-\mu)(Y-\mu)^\top P^\top\}] \quad (\mathrm{tr}(AB) = \mathrm{tr}(BA))$$
$$= \mathrm{tr}\{P\, E[(Y-\mu)(Y-\mu)^\top]\, P^\top\} = \mathrm{tr}(P P^\top) \quad (Y - \mu \sim N(0, I_n))$$
$$= \mathrm{tr}(P^2) = \mathrm{tr}(P) = p.$$
(i) Since $Y$ and $\tilde Y$ are i.i.d., we have
$$E[\|\tilde Y - PY\|^2] = E[\|(\tilde Y - \mu) + (\mu - P\mu) + (P\mu - PY)\|^2] = E[\|\tilde Y - \mu\|^2] + \|\mu - P\mu\|^2 + E[\|P(Y - \mu)\|^2] = n + \|\mu - P\mu\|^2 + p.$$
(ii) In a similar manner, we obtain
$$E[\|Y - PY\|^2] = E[\|(I_n - P)Y\|^2] = E[\|(I_n - P)(Y - \mu)\|^2] + \|(I_n - P)\mu\|^2 = n - p + \|\mu - P\mu\|^2.$$
(iii) The log-likelihood function is
$$\log L(\mu) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\|Y - \mu\|^2.$$
The MLE of $\mu$ in the subspace $M$ is $\hat\mu = PY$. Therefore the AIC of the model $M$ is the same as $\|Y - PY\|^2 + 2p$ except for the constant term $n\log(2\pi)$. Finally, we obtain from the results of (ii) and (i):
$$E[\|Y - PY\|^2 + 2p] = \|\mu - P\mu\|^2 + n - p + 2p = E[\|\tilde Y - PY\|^2].$$
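The key step $E\|P(Y - \mu)\|^2 = \mathrm{tr}(P) = p$ in 12.5 can be illustrated numerically: the trace identity is exact for any orthogonal projection, and a simulation recovers the expectation (a Python/NumPy sketch with an arbitrary projection):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 3

# orthogonal projection onto the column space of an arbitrary n x p matrix
A = rng.standard_normal((n, p))
Q, _ = np.linalg.qr(A)            # columns of Q: orthonormal basis of col(A)
P = Q @ Q.T

# P is symmetric and idempotent, and tr(P) = p
assert np.allclose(P, P.T) and np.allclose(P @ P, P)
print(round(np.trace(P), 10))     # 3.0

# Monte Carlo check of E||P(Y - mu)||^2 = tr(P) = p
Z = rng.standard_normal((50_000, n))      # rows are Y - mu ~ N(0, I_n)
norms = ((Z @ P) ** 2).sum(axis=1)
print(norms.mean())               # close to 3
```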
More informationChapter 8: Hypothesis Testing Lecture 9: Likelihood ratio tests
Chapter 8: Hypothesis Testing Lecture 9: Likelihood ratio tests Throughout this chapter we consider a sample X taken from a population indexed by θ Θ R k. Instead of estimating the unknown parameter, we
More informationBayesian Inference. Chapter 9. Linear models and regression
Bayesian Inference Chapter 9. Linear models and regression M. Concepcion Ausin Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in Mathematical Engineering
More informationMath 152. Rumbos Fall Solutions to Assignment #12
Math 52. umbos Fall 2009 Solutions to Assignment #2. Suppose that you observe n iid Bernoulli(p) random variables, denoted by X, X 2,..., X n. Find the LT rejection region for the test of H o : p p o versus
More informationChapter 17: Undirected Graphical Models
Chapter 17: Undirected Graphical Models The Elements of Statistical Learning Biaobin Jiang Department of Biological Sciences Purdue University bjiang@purdue.edu October 30, 2014 Biaobin Jiang (Purdue)
More informationTopic 19 Extensions on the Likelihood Ratio
Topic 19 Extensions on the Likelihood Ratio Two-Sided Tests 1 / 12 Outline Overview Normal Observations Power Analysis 2 / 12 Overview The likelihood ratio test is a popular choice for composite hypothesis
More informationSection 4.6 Simple Linear Regression
Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval
More informationBeyond GLM and likelihood
Stat 6620: Applied Linear Models Department of Statistics Western Michigan University Statistics curriculum Core knowledge (modeling and estimation) Math stat 1 (probability, distributions, convergence
More informationOptimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.
Optimization Background: Problem: given a function f(x) defined on X, find x such that f(x ) f(x) for all x X. The value x is called a maximizer of f and is written argmax X f. In general, argmax X f may
More informationCh. 5 Hypothesis Testing
Ch. 5 Hypothesis Testing The current framework of hypothesis testing is largely due to the work of Neyman and Pearson in the late 1920s, early 30s, complementing Fisher s work on estimation. As in estimation,
More information7. Estimation and hypothesis testing. Objective. Recommended reading
7. Estimation and hypothesis testing Objective In this chapter, we show how the election of estimators can be represented as a decision problem. Secondly, we consider the problem of hypothesis testing
More informationMATH5745 Multivariate Methods Lecture 07
MATH5745 Multivariate Methods Lecture 07 Tests of hypothesis on covariance matrix March 16, 2018 MATH5745 Multivariate Methods Lecture 07 March 16, 2018 1 / 39 Test on covariance matrices: Introduction
More informationEconometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018
Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate
More informationSimple and Multiple Linear Regression
Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where
More informationwhere x and ȳ are the sample means of x 1,, x n
y y Animal Studies of Side Effects Simple Linear Regression Basic Ideas In simple linear regression there is an approximately linear relation between two variables say y = pressure in the pancreas x =
More informationNotes on the Multivariate Normal and Related Topics
Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions
More informationMIT Spring 2016
Generalized Linear Models MIT 18.655 Dr. Kempthorne Spring 2016 1 Outline Generalized Linear Models 1 Generalized Linear Models 2 Generalized Linear Model Data: (y i, x i ), i = 1,..., n where y i : response
More informationModel comparison and selection
BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)
More informationSTA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population
More information