Dependencies in Interval-valued Symbolic Data. Lynne Billard, University of Georgia
1 Dependencies in Interval-valued Symbolic Data. Lynne Billard, University of Georgia. Tribute to Professor Edwin Diday: Paris, France; 5 September 2007
2 Naturally occurring Symbolic Data -- Mushrooms
3 Patient Records -- Single Hospital, Cardiology

Patient     Hospital    Age   Smoker
Patient 1   Fontaines   74    heavy
Patient 2   Fontaines   78    light
Patient 3   Beaune      69    no
Patient 4   Beaune      73    heavy
Patient 5   Beaune      80    light
Patient 6   Fontaines   70    heavy
Patient 7   Fontaines   82    heavy
...
4 Patient Records by Hospital -- aggregate over patients. Result: Symbolic Data

Patient     Hospital    Age   Smoker
Patient 1   Fontaines   74    heavy
Patient 2   Fontaines   78    light
Patient 3   Beaune      69    no
Patient 4   Beaune      73    heavy
Patient 5   Beaune      80    light
Patient 6   Fontaines   70    heavy
Patient 7   Fontaines   82    heavy
...

Hospital    Age        Smoker
Fontaines   [70, 82]   {light ¼, heavy ¾}
Beaune      [69, 80]   {no, light, heavy}
...
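The aggregation step itself is mechanical. Here is a minimal sketch (assuming pandas; the data-frame layout is illustrative, not from the slides) that turns the classical records into one symbolic description per hospital: the quantitative variable becomes an interval, the categorical one a relative-frequency distribution.

```python
import pandas as pd

records = pd.DataFrame({
    "hospital": ["Fontaines", "Fontaines", "Beaune", "Beaune",
                 "Beaune", "Fontaines", "Fontaines"],
    "age":      [74, 78, 69, 73, 80, 70, 82],
    "smoker":   ["heavy", "light", "no", "heavy", "light", "heavy", "heavy"],
})

def to_symbolic(group):
    # interval for the quantitative variable, distribution for the categorical one
    return pd.Series({
        "age": (group["age"].min(), group["age"].max()),
        "smoker": group["smoker"].value_counts(normalize=True).to_dict(),
    })

print(records.groupby("hospital").apply(to_symbolic))
# Fontaines: age (70, 82), smoker {'heavy': 0.75, 'light': 0.25}
# Beaune:    age (69, 80), smoker {'no': 1/3, 'heavy': 1/3, 'light': 1/3}
```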
5 Histogram-valued Data -- Weight by Age Distribution:
6 Logical dependency rule. E.g., Y1 = age, Y2 = # children.
Classical: Y_a = (10, 0), Y_b = (20, 2), Y_c = (18, 1).
Aggregation gives the symbolic description ξ = ([10, 20], {0, 1, 2}).
I.e., ξ implies the classical value Y_d = (10, 2) is possible.
Need rule ν: {If Y1 < 15, then Y2 = 0}.
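A sketch of how such a rule prunes the virtual description space of ξ (hypothetical Python, with age discretised to integers purely for illustration):

```python
from itertools import product

ages = range(10, 21)        # Y1 in [10, 20], discretised for illustration
children = [0, 1, 2]        # Y2 in {0, 1, 2}

def rule_v(y1, y2):
    """If Y1 < 15 then Y2 = 0; otherwise no constraint."""
    return y2 == 0 if y1 < 15 else True

virtual = list(product(ages, children))
valid = [d for d in virtual if rule_v(*d)]
print((10, 2) in virtual, (10, 2) in valid)   # True False: the rule removes it
```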
7 Interval-valued data

u (Team)   Y1 = # At-Bats   Y2 = # Hits
1          (289, 538)       (75, 162)
2          (88, 422)        (49, 149)
3          (189, 223)       (201, 254)
4          (184, 476)       (46, 148)
5          (283, 447)       (86, 115)
6          (24, 26)         (133, 141)
7          (168, 445)       (37, 135)
8          (123, 148)       (137, 148)
9          (256, 510)       (78, 124)
10         (101, 126)       (101, 132)
11         (212, 492)       (57, 151)
12         (177, 245)       (189, 238)
13         (342, 614)       (121, 206)
14         (120, 439)       (35, 102)
15         (80, 468)        (55, 115)
16         (75, 110)        (75, 110)
17         (116, 557)       (95, 163)
18         (197, 507)       (52, 53)
19         (167, 203)       (48, 232)

ξ(2): Y2 = 149 is not possible when Y1 < 149, since a team cannot have more hits than at-bats.
8 Observation ξ(2). [Figure: the rectangle for ξ(2) in the (Y1, Y2) plane, cut by the line Y2 = αY1 into regions R1, R2, R3, R4.]
9 [Figure]
10 Dependencies between Variables: Interval-valued Variables. E.g., Regression Analysis.
Dependent variable: Y = (Y1, ..., Yq), e.g., q = 1
Predictor/regressor variables: X = (X1, ..., Xp)
Multiple regression model: Y = β0 + β1 X1 + ... + βp Xp + e
Error e: E(e) = 0, Var(e) = σ², Cov(e_i, e_k) = 0 for i ≠ k.
11 Multiple Regression Model: Y = β0 + β1 X1 + ... + βp Xp + e
In vector terms, Y = Xβ + e, with
Observation vector: Y' = (Y1, ..., Yn)
Design matrix:
      | 1  X11 ... X1p |
  X = | .   .       .  |
      | 1  Xn1 ... Xnp |
Regression coefficient vector: β' = (β0, β1, ..., βp)
Error vector: e' = (e1, ..., en)
12 Model: Y = Xβ + e
Least squares estimator of β: β̂ = (X'X)⁻¹ X'Y
When p = 1,
  β̂1 = Σ_{i=1}^n (Xi - X̄)(Yi - Ȳ) / Σ_{i=1}^n (Xi - X̄)² = Cov(X, Y)/Var(X),
  β̂0 = Ȳ - β̂1 X̄,
where Ȳ = (1/n) Σ_{i=1}^n Yi and X̄ = (1/n) Σ_{i=1}^n Xi.
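These identities are easy to check numerically; a small numpy sketch (simulated data, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=50)

X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
beta = np.linalg.solve(X.T @ X, X.T @ y)    # beta_hat = (X'X)^{-1} X'Y

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # Cov(X, Y)/Var(X)
b0 = y.mean() - b1 * x.mean()
assert np.allclose(beta, [b0, b1])          # the two routes agree
```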
13 Model: Y = β0 + β1 X1 + ... + βp Xp + e
Or, write as
  Y - Ȳ = β1(X1 - X̄1) + ... + βp(Xp - X̄p) + e,
where X̄j = (1/n) Σ_{i=1}^n Xij, j = 1, ..., p, and
  β0 = Ȳ - (β1 X̄1 + ... + βp X̄p).
14 Y - Ȳ = β1(X1 - X̄1) + ... + βp(Xp - X̄p) + e
Least squares estimator of β: β̂ = [(X - X̄)'(X - X̄)]⁻¹ (X - X̄)'(Y - Ȳ), where

  (X - X̄)'(X - X̄) = | Σ(Xi1 - X̄1)²           ...  Σ(Xi1 - X̄1)(Xip - X̄p) |
                     | ...                          ...                    |
                     | Σ(Xip - X̄p)(Xi1 - X̄1)  ...  Σ(Xip - X̄p)²          |

i.e., with (j1, j2) entry Σ_i (X_{ij1} - X̄_{j1})(X_{ij2} - X̄_{j2}), j1, j2 = 1, ..., p, and
(X - X̄)'(Y - Ȳ) with jth entry Σ_i (Xij - X̄j)(Yi - Ȳ), j = 1, ..., p.
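A quick numerical check (simulated data) that the centred formulation recovers the same coefficients, with the identity β0 = Ȳ - Σ_j βj X̄j from slide 13 asserted at the end:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = 1.0 + X @ np.array([0.5, -2.0, 1.5]) + rng.normal(scale=0.1, size=100)

Xc = X - X.mean(axis=0)                      # centred predictors
beta_centred = np.linalg.solve(Xc.T @ Xc, Xc.T @ (y - y.mean()))

X1 = np.column_stack([np.ones(len(y)), X])   # full design with intercept
beta_full = np.linalg.solve(X1.T @ X1, X1.T @ y)
assert np.allclose(beta_centred, beta_full[1:])          # same slopes
assert np.isclose(beta_full[0],
                  y.mean() - X.mean(axis=0) @ beta_centred)  # intercept identity
```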
15 Interval-valued data: Y_uj = [a_uj, b_uj], j = 1, ..., p; u ∈ E = {w1, ..., wu, ..., wm}
Bertrand and Goupil (2000):
Symbolic sample mean:
  Ȳj = (1/2m) Σ_{u∈E} (b_uj + a_uj)
Symbolic sample variance:
  Sj² = (1/3m) Σ_{u∈E} (b_uj² + b_uj a_uj + a_uj²) - (1/4m²) [Σ_{u∈E} (a_uj + b_uj)]²
Notice, e.g., with m = 1 and Y = Weight:
  Y1 = [132, 138]: Ȳ = 135, S² = 3
  Y2 = [129, 141]: Ȳ = 135, S² = 12
(same mean, different internal variation)
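A direct transcription of the two formulas into Python, checked against the slide's example (both intervals have mean 135, but variances 3 and 12):

```python
def symbolic_mean(intervals):
    """Bertrand-Goupil symbolic sample mean of a list of (a, b) intervals."""
    m = len(intervals)
    return sum(a + b for a, b in intervals) / (2 * m)

def symbolic_variance(intervals):
    """Bertrand-Goupil symbolic sample variance."""
    m = len(intervals)
    s1 = sum(b * b + a * b + a * a for a, b in intervals) / (3 * m)
    s2 = (sum(a + b for a, b in intervals) / (2 * m)) ** 2
    return s1 - s2

print(symbolic_mean([(132, 138)]), symbolic_variance([(132, 138)]))  # 135.0 3.0
print(symbolic_mean([(129, 141)]), symbolic_variance([(129, 141)]))  # 135.0 12.0
```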
16 Can rewrite
  Sj² = (1/3m) Σ_{u∈E} [(a_uj - Ȳj)² + (a_uj - Ȳj)(b_uj - Ȳj) + (b_uj - Ȳj)²]
Then, by analogy, for interval-valued variables Y1 and Y2 (j = 1, 2), the empirical covariance function Cov(Y1, Y2) is
  Cov(Y1, Y2) = (1/3m) Σ_{u∈E} G1 G2 [Q1 Q2]^{1/2}
  Qj = (a_uj - Ȳj)² + (a_uj - Ȳj)(b_uj - Ȳj) + (b_uj - Ȳj)²
  Gj = -1 if Ȳ_uj ≤ Ȳj; +1 if Ȳ_uj > Ȳj;  Ȳ_uj = (a_uj + b_uj)/2
Notice the special cases:
(i) Cov(Yj, Yj) = Sj²
(ii) if a_uj = b_uj = y_uj for all u, i.e., classical data,
  Cov(Y1, Y2) = (1/m) Σ_u (y_u1 - Ȳ1)(y_u2 - Ȳ2)
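A sketch of this covariance, reusing symbolic_mean and symbolic_variance from the block above; the final assertion checks special case (i), Cov(Yj, Yj) = Sj², on hypothetical intervals:

```python
def symbolic_cov(iv1, iv2):
    m = len(iv1)
    y1bar, y2bar = symbolic_mean(iv1), symbolic_mean(iv2)

    def Q(a, b, ybar):   # same quadratic form as in the rewritten variance
        return (a - ybar) ** 2 + (a - ybar) * (b - ybar) + (b - ybar) ** 2

    def G(a, b, ybar):   # sign: interval midpoint below or above the overall mean
        return -1.0 if (a + b) / 2 <= ybar else 1.0

    total = sum(G(a1, b1, y1bar) * G(a2, b2, y2bar)
                * (Q(a1, b1, y1bar) * Q(a2, b2, y2bar)) ** 0.5
                for (a1, b1), (a2, b2) in zip(iv1, iv2))
    return total / (3 * m)

iv = [(44, 68), (60, 72), (56, 90)]
assert abs(symbolic_cov(iv, iv) - symbolic_variance(iv)) < 1e-9  # special case (i)
```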
17 Back to Bertrand and Goupil (2000). The sample variance
  Sj² = (1/3m) Σ_{u∈E} (b_uj² + b_uj a_uj + a_uj²) - (1/4m²) [Σ_{u∈E} (a_uj + b_uj)]²
is a total variance. Take Total Sum of Squares = Total SSj = m Sj². Then we can show
  Total SSj = Within Objects SSj + Between Objects SSj
18 Within Objects SSj = (1/3) Σ_{u∈E} [(a_uj - Ȳ_uj)² + (a_uj - Ȳ_uj)(b_uj - Ȳ_uj) + (b_uj - Ȳ_uj)²]
Between Objects SSj = Σ_{u∈E} [(a_uj + b_uj)/2 - Ȳj]²
with Ȳ_uj = (a_uj + b_uj)/2 and Ȳj = (1/2m) Σ_{u∈E} (a_uj + b_uj).
Classical data: a_uj = b_uj = Y_uj, so Within Objects SSj = 0.
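A numerical check of the decomposition on hypothetical intervals, continuing the sketches above:

```python
def ss_decomposition(intervals):
    m = len(intervals)
    ybar = symbolic_mean(intervals)
    # Within: each interval about its own midpoint; since a and b are symmetric
    # about the midpoint, the slide's cross-product form collapses to (b-a)^2/12.
    within = sum((b - a) ** 2 for a, b in intervals) / 12
    # Between: midpoints about the overall symbolic mean.
    between = sum(((a + b) / 2 - ybar) ** 2 for a, b in intervals)
    total = m * symbolic_variance(intervals)
    return total, within, between

total, within, between = ss_decomposition([(44, 68), (60, 72), (56, 90)])
assert abs(total - (within + between)) < 1e-9   # Total SS = Within SS + Between SS
```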
19 So, for Yj, we have the Sum of Squares (SS) decomposition
  Total SSj = Within Objects SSj + Between Objects SSj
Likewise, for (Yi, Yj), we have a Sum of Products (SP) decomposition
  Total SPij = Within Objects SPij + Between Objects SPij
20 (Repeat of slide 16.) Can rewrite Sj² in terms of (a_uj - Ȳj) and (b_uj - Ȳj); by analogy, Cov(Y1, Y2) = (1/3m) Σ_{u∈E} G1 G2 [Q1 Q2]^{1/2} with Qj, Gj, and Ȳ_uj as defined there.
21 (Repeat of slide 16, with one addition.) The (Total) SP part can be replaced by
  Total SP = (1/6) Σ_u [2(a_u - Ȳ)(c_u - X̄) + (a_u - Ȳ)(d_u - X̄) + (b_u - Ȳ)(c_u - X̄) + 2(b_u - Ȳ)(d_u - X̄)]
where Y_u = [a_u, b_u] and X_u = [c_u, d_u].
22 How is this obtained? Recall that for a Uniform distribution, Y ~ U(a, b),
  Var(Y) = (b - a)²/12.
By analogy, we can show, for u = 1, ..., m observations with Y_u1 = [a_u, b_u] and Y_u2 = [c_u, d_u]:
  Within SP = (1/12) Σ_{u=1}^m (b_u - a_u)(d_u - c_u)
  Between SP = Σ_{u=1}^m [(a_u + b_u)/2 - Ȳ1] [(c_u + d_u)/2 - Ȳ2]
where Ȳ1 = (1/m) Σ_{u=1}^m (a_u + b_u)/2 and Ȳ2 = (1/m) Σ_{u=1}^m (c_u + d_u)/2.
23 Within SP = (1/12) Σ_{u=1}^m (b_u - a_u)(d_u - c_u)
Between SP = Σ_{u=1}^m [(a_u + b_u)/2 - Ȳ1] [(c_u + d_u)/2 - Ȳ2]
Hence,
  Total SP = Within SP + Between SP
           = (1/6) Σ_{u=1}^m [2(a_u - Ȳ1)(c_u - Ȳ2) + (a_u - Ȳ1)(d_u - Ȳ2) + (b_u - Ȳ1)(c_u - Ȳ2) + 2(b_u - Ȳ1)(d_u - Ȳ2)]
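The same check for sums of products; the per-observation identity behind the last display follows by expanding (b_u - a_u)(d_u - c_u)/12 plus the between term:

```python
def sp_decomposition(iv1, iv2):
    y1bar, y2bar = symbolic_mean(iv1), symbolic_mean(iv2)
    pairs = list(zip(iv1, iv2))
    within = sum((b - a) * (d - c) for (a, b), (c, d) in pairs) / 12
    between = sum(((a + b) / 2 - y1bar) * ((c + d) / 2 - y2bar)
                  for (a, b), (c, d) in pairs)
    total = sum(2 * (a - y1bar) * (c - y2bar) + (a - y1bar) * (d - y2bar)
                + (b - y1bar) * (c - y2bar) + 2 * (b - y1bar) * (d - y2bar)
                for (a, b), (c, d) in pairs) / 6
    return total, within, between

t, w, b = sp_decomposition([(44, 68), (60, 72), (56, 90)],
                           [(90, 110), (90, 130), (140, 180)])
assert abs(t - (w + b)) < 1e-9   # Total SP = Within SP + Between SP
```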
24
u    Y = Pulse Rate   X1 = Systolic Pressure   X2 = Diastolic Pressure
1    [44, 68]         [90, 110]                [50, 70]
2    [60, 72]         [90, 130]                [70, 90]
3    [56, 90]         [140, 180]               [90, 100]
4    [70, 112]        [110, 142]               [80, 108]
5    [54, 72]         [90, 100]                [50, 70]
6    [70, 100]        [134, 142]               [80, 110]
7    [72, 100]        [130, 160]               [76, 90]
8    [76, 98]         [110, 190]               [70, 110]
9    [86, 96]         [138, 180]               [90, 110]
10   [86, 100]        [110, 150]               [78, 100]
11   [63, 75]         [60, 100]                [140, 150]

Rule: X2 = Diastolic Pressure < Systolic Pressure = X1 (note that observation 11 violates this rule).
25 [Figure]
26 The regression equation becomes, for Y = Pulse Rate, X1 = Systolic Pressure:
  Ŷ = ___ + ___ X1
Ȳ = ___, X̄ = ___, Std Devn(Y) = ___, Std Devn(X1) = ___, Cov(Y, X1) = ___, rho(Y, X1) = 0.725
27 Y = Pulse Rate, X1 = Systolic Pressure: Ŷ = ___ + ___ X1
Prediction with Ŷ_u = [â_u1, b̂_u1], where â_u1 = β̂0 + β̂1 a_u2 and b̂_u1 = β̂0 + β̂1 b_u2, i.e., the fitted line evaluated at the endpoints of the X1 interval [a_u2, b_u2].
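One way to assemble the pieces, consistent with slides 12 and 15-16: estimate the slope from the symbolic covariance and variance, then map the X1 interval endpoints through the fitted line. This is a sketch reusing the helper functions above; the deck's own coefficients did not survive transcription, so its numbers are not guaranteed to match the slides.

```python
def symbolic_regression(y_iv, x_iv):
    """Slope from symbolic Cov/Var, intercept from the symbolic means."""
    b1 = symbolic_cov(x_iv, y_iv) / symbolic_variance(x_iv)
    b0 = symbolic_mean(y_iv) - b1 * symbolic_mean(x_iv)
    return b0, b1

def predict_interval(b0, b1, x_interval):
    a2, b2 = x_interval
    lo, hi = b0 + b1 * a2, b0 + b1 * b2
    return (min(lo, hi), max(lo, hi))    # min/max guards against a negative slope

pulse    = [(44, 68), (60, 72), (56, 90), (70, 112), (54, 72)]   # first rows
systolic = [(90, 110), (90, 130), (140, 180), (110, 142), (90, 100)]
b0, b1 = symbolic_regression(pulse, systolic)
print(predict_interval(b0, b1, (90, 110)))
```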
28 Symbolic Prediction Equation
29 Symbolic Prediction Intervals
30 Symbolic Prediction Intervals and Equation
31 [Figure: Original Intervals vs. Prediction Intervals]
32 [Figure: Data Intervals vs. Prediction Intervals]
33 Predicted Pulse Rates and Residuals

u    Observed Y    Observed X1   Predicted Y       Residuals
     (Pulse Rate)  (Systolic)    [â_u, b̂_u]        [Res_a, Res_b]
1    [44, 68]      [90, 100]     [62.099, ___]     [___, 1.805]
2    [60, 72]      [90, 130]     [62.099, ___]     [-2.099, ___]
3    [56, 90]      [140, 180]    [___, ___]        [___, ___]
4    [70, 112]     [110, 142]    [70.292, ___]     [-0.292, ___]
5    [54, 72]      [90, 100]     [62.099, ___]     [-8.099, 5.805]
6    [70, 100]     [130, 160]    [78.486, ___]     [-8.486, ___]
7    [72, 100]     [130, 160]    [78.486, ___]     [-6.486, ___]
8    [76, 98]      [110, 190]    [70.292, ___]     [5.708, ___]
9    [86, 96]      [138, 180]    [81.763, ___]     [4.237, ___]
10   [86, 100]     [110, 150]    [70.292, ___]     [15.708, ___]

Y_u = Pulse Rate = [a_u, b_u]; Ŷ_u = Predicted Pulse Rate = [â_u, b̂_u]; Residual = [Res_a, Res_b]
34 Sum of Residuals for Symbolic Fit:
  Sum of Min Residuals Σ_u Res_au = ___;  Sum of Max Residuals Σ_u Res_bu = ___
Sum of Squared Residuals for Symbolic Fit:
  Sum of Min Squared Residuals = ___;  Sum of Max Squared Residuals = ___
35 Classical Regression on Midpoints
  Y_u^c = (a_u1 + b_u1)/2, X_ju^c = (a_uj + b_uj)/2, j = 1, 2
  Ŷ^c = ___ + ___ X1
  Ŷ^c = [â^c, b̂^c], with â_u^c = β̂0^c + β̂1^c a_u2 and b̂_u^c = β̂0^c + β̂1^c b_u2.
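A sketch of the midpoint approach (assuming numpy): collapse each interval to its centre, fit classical OLS, then predict interval bounds from the X1 endpoints as above.

```python
import numpy as np

def midpoint_regression(y_iv, x_iv):
    yc = np.array([(a + b) / 2 for a, b in y_iv])   # Y midpoints
    xc = np.array([(a + b) / 2 for a, b in x_iv])   # X1 midpoints
    b1 = np.cov(xc, yc, ddof=1)[0, 1] / np.var(xc, ddof=1)
    b0 = yc.mean() - b1 * xc.mean()
    return b0, b1

b0c, b1c = midpoint_regression(pulse, systolic)     # data from the sketch above
print([(b0c + b1c * a, b0c + b1c * b) for a, b in systolic[:2]])
```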
36 Classical Regression through Midpoints
37 Symbolic Regression ---- Classical regression ----
38 Comparison of Regression Fits
Sum of Residuals for Symbolic Fit: Sum of Min Residuals = ___; Sum of Max Residuals = ___
Sum of Squared Residuals for Symbolic Fit: Sum of Min Squared Residuals = ___; Sum of Max Squared Residuals = ___
Sum of Residuals for Classical Fit: Sum of Min Residuals = ___; Sum of Max Residuals = ___
Sum of Squared Residuals for Classical Fit: Sum of Min Squared Residuals = ___; Sum of Max Squared Residuals = ___
39 Centers and Range Regression (De Carvalho, Lima Neto, Tenorio, Freire, ...; 2004, 2005, ...)
  Midpoint: Y^c = (a + b)/2, X^c = (c + d)/2
  Range: Y^r = (b - a)/2, X^r = (d - c)/2
  Ŷ^c = ___ + ___ X^c;  Ŷ^r = ___ + ___ X^r
40 Centers and Range Regression (De Carvalho, Lima Neto, Tenorio, Freire, ...; 2004, 2005, ...)
  Midpoint: Y^c = (a + b)/2, X^c = (c + d)/2; Range: Y^r = (b - a)/2, X^r = (d - c)/2
  Single:   Ŷ^c = ___ + ___ X^c;  Ŷ^r = ___ + ___ X^r
  Multiple: Ŷ^c = ___ + ___ X1^c + ___ X1^r;  Ŷ^r = ___ + ___ X1^c + ___ X1^r
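A sketch of the centre-and-range idea (assuming numpy): fit one least-squares line to the midpoints and a separate one to the half-ranges, then recombine as [centre - range, centre + range]. This is the simplest single-predictor variant, not a faithful reimplementation of the cited papers.

```python
import numpy as np

def centre_range_regression(y_iv, x_iv):
    yc = np.array([(a + b) / 2 for a, b in y_iv])   # Y centres
    yr = np.array([(b - a) / 2 for a, b in y_iv])   # Y half-ranges
    xc = np.array([(c + d) / 2 for c, d in x_iv])   # X centres
    xr = np.array([(d - c) / 2 for c, d in x_iv])   # X half-ranges
    bc1, bc0 = np.polyfit(xc, yc, 1)                # centre model
    br1, br0 = np.polyfit(xr, yr, 1)                # range model

    def predict(x_interval):
        c, d = x_interval
        centre = bc0 + bc1 * (c + d) / 2
        half = br0 + br1 * (d - c) / 2              # may need truncation at 0
        return (centre - half, centre + half)

    return predict

predict = centre_range_regression(pulse, systolic)  # data from the sketch above
print(predict((90, 110)))
```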
41 Centers and Range Regression -- Predictions

Obs   Y           Single [Ŷ_a, Ŷ_b]    Multiple [Ŷ_a, Ŷ_b]
1     [44, 68]    [52.572, 77.439]     [53.195, 75.299]
2     [60, 72]    [59.230, 82.365]     [63.089, 81.937]
3     [56, 90]    [78.537, ___]        [75.334, ___]
4     [70, 112]   [65.178, 88.774]     [65.349, 88.470]
5     [54, 72]    [52.572, 77.439]     [53.195, 75.299]
6     [70, 100]   [72.457, 96.168]     [69.587, 96.331]
7     [72, 100]   [72.457, 96.168]     [69.587, 96.331]
8     [76, 98]    [75.831, 96.655]     [81.180, 99.092]
9     [86, 96]    [78.209, ___]        [75.504, ___]
10    [86, 100]   [66.953, 90.087]     [67.987, 90.241]
42 Symbolic Principal Components -- BATS
Y1 = Head, Y2 = Tail, Y3 = Height, Y4 = Forearm

Obs   [Y1a, Y1b]   [Y2a, Y2b]   [Y3a, Y3b]   [Y4a, Y4b]
1     [33, 52]     [26, 33]     [4, 7]       [27, 32]
2     [38, 50]     [30, 40]     [7, 8]       [32, 37]
3     [43, 48]     [34, 39]     [6, 7]       [31, 38]
4     [44, 48]     [34, 44]     [7, 8]       [31, 36]
5     [41, 51]     [30, 39]     [8, 11]      [33, 41]
6     [40, 45]     [39, 44]     [9, 9]       [36, 42]
7     [45, 53]     [35, 38]     [10, 12]     [39, 44]
8     [44, 58]     [41, 54]     [6, 8]       [35, 41]
9     [47, 53]     [43, 53]     [7, 9]       [37, 41]
10    [50, 69]     [30, 43]     [11, 13]     [51, 61]
11    [65, 80]     [48, 60]     [12, 16]     [55, 68]
12    [82, 87]     [46, 57]     [11, 12]     [58, 63]
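One common way to compute interval-valued principal components is the "centres" method: run classical PCA on the interval midpoints, then project each observation's hyper-rectangle of vertices onto the component axes and record the min/max per axis. A sketch assuming numpy; this is a generic variant, not necessarily the exact method behind the slides.

```python
import numpy as np
from itertools import product

def centres_pca(intervals):
    """intervals: list of observations, each a list of (a, b) pairs."""
    lo = np.array([[a for a, b in row] for row in intervals], dtype=float)
    hi = np.array([[b for a, b in row] for row in intervals], dtype=float)
    centres = (lo + hi) / 2
    mean = centres.mean(axis=0)
    _, _, vt = np.linalg.svd(centres - mean, full_matrices=False)  # rows = PCs
    pcs = []
    for l, h in zip(lo, hi):
        verts = np.array(list(product(*zip(l, h)))) - mean  # all 2^p vertices
        proj = verts @ vt.T                                 # project onto PCs
        pcs.append(list(zip(proj.min(axis=0), proj.max(axis=0))))
    return pcs

bats = [[(33, 52), (26, 33), (4, 7), (27, 32)],
        [(38, 50), (30, 40), (7, 8), (32, 37)],
        [(43, 48), (34, 39), (6, 7), (31, 38)]]   # first three rows of the table
print(centres_pca(bats))
```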
43 Symbolic Principal Components -- BATS. Y1 = Head, Y2 = Tail, Y3 = Height, Y4 = Forearm. [Table: the interval principal components [PC1a, PC1b], [PC2a, PC2b], [PC3a, PC3b] for each observation]
44 Symbolic Principal Component Analysis -- BATS