The Missing-Index Problem in Process Control


1 The Missing-Index Problem in Process Control. Joseph G. Voelkel, CQAS, KGCOE, RIT. October 2009.

2 Topics: (1) Examples, introduction to the problem; (2) Models, inference, likelihoods, EM algorithm; (3) Examples of the method; (4) Summary, future work. Paper at: [link not preserved in the transcription].

3 Example 1: 8-spindle machine. CP = Cyclic Permutation case. Sample J = 8 consecutive parts. Sampled in order of production, but the first index value is not known. Index j = 1, ..., J unknown, so there are J possibilities.

4 Example 2: 4-cavity machine. GP = General Permutation case. Sample the J = 4 parts from the shot. No order to the parts. Index j = 1, ..., J unknown, so there are J! possibilities.

5 Example 3: 64-cavity mold. SP = Sampled Permutation case. Sample J = 4 parts at random from the N = 64 parts. Index j = 1, ..., J unknown, so there are N!/(N - J)! possibilities. Will not be discussed in this talk.

6 Models: Index-Known Case. The general model we consider, if index j were known, is Y_ij = μ + τ_i + β_j + ε_{j(i)} + δ_i, for i = 1, 2, ..., I and j = 1, 2, ..., J, with Σ_i τ_i = Σ_j β_j = 0, ε_{j(i)} ~ N(0, σ²), δ_i ~ N(0, σ_δ²), independent r.v.'s (τ_i: for non-stochastic events, e.g. mean drifts/shifts). (1) Injection molding: β_j = cavity effects, ε_{j(i)} = within-shot variation, δ_i = across-shot variation. (2) Multiple-spindle machine: β_j = spindle effects. Spindle parts sampled as a group: ε_{j(i)} and δ_i are analogous to the above. Spindle parts sampled one at a time: the model excludes δ_i. Our interest here lies with (β, σ).
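
As a concrete illustration of this index-known model, here is a minimal simulation sketch; it is not from the talk, and all parameter values (and the use of Python/NumPy) are assumptions made only for illustration:

```python
import numpy as np

# Simulate Y_ij = mu + tau_i + beta_j + eps_j(i) + delta_i with the indices known.
rng = np.random.default_rng(1)
I, J = 24, 4                                  # e.g. 24 shots from a 4-cavity machine
mu = 10.0
beta = np.array([0.6, -0.2, -0.4, 0.0])       # hypothetical cavity effects (sum to 0)
tau = np.linspace(-0.1, 0.1, I)               # slow non-stochastic drift across shots
sigma, sigma_delta = 0.3, 0.2                 # within-shot and across-shot std. dev.

delta = rng.normal(0.0, sigma_delta, size=I)  # across-shot variation delta_i
eps = rng.normal(0.0, sigma, size=(I, J))     # within-shot variation eps_j(i)
Y = mu + tau[:, None] + beta[None, :] + eps + delta[:, None]
print(Y.shape)                                # (24, 4): row i holds the J parts from shot i
```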

7 Models: Index Unknown, CP Case. If J = 3, the index order is either (1, 2, 3), (3, 1, 2), or (2, 3, 1). The ordered set of these 3 vectors is CP(3), indexed by k. The model is the same as before except that Y_ij = μ + τ_i + β_j + ε_{j(i)} + δ_i becomes Y_ij = μ + τ_i + β_{j_k(i)} + ε_{j_k(i)} + δ_i. Here, (1_k, 2_k, ..., J_k) is the (unknown) k-th member of CP(J). E.g., k = 2 gives (1_k, 2_k, 3_k) = (3, 1, 2), so Y_i1 = μ + τ_i + β_3 + ε_{3(i)} + δ_i. Reasonable assumption: k is discrete uniform on 1, 2, ..., J for each i.
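
To make the CP(J) structure concrete, here is a small sketch (an illustration, not the talk's code) that lists the cyclic permutations and applies a hidden, uniformly chosen one to each row:

```python
import numpy as np

def cp_members(J):
    """The J members of CP(J); member k gives the true index at each observed position."""
    base = np.arange(1, J + 1)
    return [tuple(np.roll(base, k - 1)) for k in range(1, J + 1)]

def apply_cp(Y_true, rng):
    """Cyclically permute each row of Y_true by an unknown k, uniform on 1..J."""
    I, J = Y_true.shape
    members = cp_members(J)
    k = rng.integers(1, J + 1, size=I)              # the hidden permutation index per row
    Y_obs = np.empty_like(Y_true)
    for i in range(I):
        perm = np.array(members[k[i] - 1]) - 1      # 0-based true index at each position
        Y_obs[i, :] = Y_true[i, perm]
    return Y_obs, k

print(cp_members(3))   # [(1, 2, 3), (3, 1, 2), (2, 3, 1)], matching the slide
```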

8 Example, CP Case. [Tables: indices in the (unknown) correct order (Y1, Y2, Y3) and the actual Y data after cyclic permutation (Y1, Y2, Y3); numeric values not preserved in the transcription.]

9 CP Case: Inference. Y_ij = μ + τ_i + β_{j_k(i)} + ε_{j_k(i)} + δ_i. Objective: inference on β = (β_1, β_2, ..., β_J) and σ. (Inference on {τ_i} and σ_δ (or σ_δ² + σ²/J) is made from the row means Ȳ_i.) Note: information on β is contained in contrasts within each time i, e.g. in Y_ij - Ȳ_i for j = 1, 2, ..., J. Restriction on β: we will use β_J = 0, not Σ_j β_j = 0. β is still identifiable (at best) only up to a cyclic permutation.

10 CP Case: Inference. We will not use the J (correlated) contrasts Y_ij - Ȳ_i for inference. Instead, we will use the J - 1 Helmert contrasts:
Z_i1 = c_1 (Y_i2 - Y_i1)
Z_i2 = c_2 (Y_i3 - (Y_i1 + Y_i2)/2)
...
Z_{i,J-1} = c_{J-1} (Y_iJ - Σ_{j=1}^{J-1} Y_ij / (J - 1)).
Here, c_j = 1/√(1 + 1/j) is defined so that var(Z_ij) = σ² when the indices are correctly aligned. Note: Y is I × J, Z is I × (J - 1).
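
A sketch of this Helmert-contrast transform (illustrative Python, assuming Y is stored as an I × J array):

```python
import numpy as np

def helmert_contrasts(Y):
    """Z_ij = c_j * (Y_i,j+1 - mean(Y_i1, ..., Y_ij)), with c_j = 1/sqrt(1 + 1/j)."""
    I, J = Y.shape
    Z = np.empty((I, J - 1))
    for j in range(1, J):
        c_j = 1.0 / np.sqrt(1.0 + 1.0 / j)
        Z[:, j - 1] = c_j * (Y[:, j] - Y[:, :j].mean(axis=1))
    return Z

# With correctly aligned indices, each column of Z has variance sigma^2 and a mean
# determined only by the beta contrasts (mu, tau_i and delta_i all cancel).
```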

11 CP Case: Inference. The J - 1 Helmert contrasts: Z_i1 = c_1 (Y_i2 - Y_i1), Z_i2 = c_2 (Y_i3 - (Y_i1 + Y_i2)/2). [Tables: the actual Y data (cyclically permuted), Y1 Y2 Y3, and the corresponding Z data (without the c_j), Z1 Z2; numeric values not preserved.]

12 CP Case: Likelihood. Y_ij = μ + τ_i + β_{j_k(i)} + ε_{j_k(i)} + δ_i. We will estimate (β, σ) using likelihood methods (focus: estimation). Let f(z; θ) denote the density of N(θ, σ²). For J = 3: Z_i1 = c_1 (Y_i2 - Y_i1) and Z_i2 = c_2 (Y_i3 - (Y_i1 + Y_i2)/2), with E[Z_i1] = c_1 (β_{2_k(i)} - β_{1_k(i)}) and E[Z_i2] = c_2 (β_{3_k(i)} - (β_{1_k(i)} + β_{2_k(i)})/2). k is discrete uniform on 1, 2, 3, so the i-th likelihood contribution is 1/3 of
f(z_i1; c_1(β_2 - β_1)) f(z_i2; c_2(β_3 - (β_1 + β_2)/2))
+ f(z_i1; c_1(β_1 - β_3)) f(z_i2; c_2(β_2 - (β_3 + β_1)/2))
+ f(z_i1; c_1(β_3 - β_2)) f(z_i2; c_2(β_1 - (β_2 + β_3)/2)).
Recall β_3 = 0, but it is kept here for symmetry. Note: this is a sum of J terms, and each term is a product of J - 1 terms.

13 CP Case: Likelihood. Note: a sum of J terms, each a product of J - 1 terms. General J case: the contribution at time i to the likelihood is
(1/J) Σ_{k=1}^{J} Π_{j=1}^{J-1} f( z_ij ; c_j [ β_{(j+1)_k(i)} - Σ_{l=1}^{j} β_{l_k(i)} / j ] ).
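
A sketch of this general-J contribution in code (illustrative only; cp_members is the hypothetical helper from the earlier sketch, and the normal density comes from scipy.stats):

```python
import numpy as np
from scipy.stats import norm

def contrast_means(beta, members):
    """eta[j-1, k-1] = c_j * (beta_{(j+1)_k} - mean(beta_{1_k}, ..., beta_{j_k}))."""
    J = len(beta)
    eta = np.empty((J - 1, len(members)))
    for k, perm in enumerate(members):
        b = beta[np.array(perm) - 1]              # beta rearranged to observed positions
        for j in range(1, J):
            c_j = 1.0 / np.sqrt(1.0 + 1.0 / j)
            eta[j - 1, k] = c_j * (b[j] - b[:j].mean())
    return eta

def cp_row_likelihood(z_i, beta, sigma, members):
    """(1/J) * sum over k of the product over j of f(z_ij; eta_jk, sigma)."""
    eta = contrast_means(beta, members)
    products = norm.pdf(z_i[:, None], loc=eta, scale=sigma).prod(axis=0)
    return products.mean()
```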

14 GP Case: Likelihood. The GP case is analogous to the CP case. For J = 3, we now have J! = 6 permutations: (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), and (3, 2, 1). The contribution to the likelihood at time i is 1/6 times
f(z_i1; c_1(β_2 - β_1)) f(z_i2; c_2(β_3 - (β_1 + β_2)/2))
+ f(z_i1; c_1(β_3 - β_1)) f(z_i2; c_2(β_2 - (β_1 + β_3)/2))
+ f(z_i1; c_1(β_1 - β_2)) f(z_i2; c_2(β_3 - (β_2 + β_1)/2))
+ f(z_i1; c_1(β_3 - β_2)) f(z_i2; c_2(β_1 - (β_2 + β_3)/2))
+ f(z_i1; c_1(β_1 - β_3)) f(z_i2; c_2(β_2 - (β_3 + β_1)/2))
+ f(z_i1; c_1(β_2 - β_3)) f(z_i2; c_2(β_1 - (β_3 + β_2)/2)).
General J: a sum of J! terms, each a product of J - 1 terms. Next: back to the CP case.
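
In the sketch above, the only change needed for the GP case is the permutation set (a hedged illustration; cp_row_likelihood is the hypothetical helper defined earlier):

```python
from itertools import permutations

def gp_members(J):
    """All J! permutations of (1, ..., J); each maps observed position to true index."""
    return [tuple(p) for p in permutations(range(1, J + 1))]

# cp_row_likelihood(z_i, beta, sigma, gp_members(J)) then averages the J! products,
# which is exactly the 1/J! weighting written out above for J = 3.
```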

15 Finding MLEs. The log-likelihood is very complex, even in the simpler CP case:
Σ_{i=1}^{I} ln [ (1/J) Σ_{k=1}^{J} Π_{j=1}^{J-1} f( z_ij ; c_j [ β_{(j+1)_k(i)} - Σ_{l=1}^{j} β_{l_k(i)} / j ] ) ].
Direct maximization usually fails for all but the simplest CP cases. There are some similarities to estimating the parameters of a normal mixture, so we consider the EM algorithm. However, our problem is both more and less complex than the normal-mixture problem. More complex: each likelihood term is a sum of products, not just a sum. Less complex: the mixture probabilities are known.

16 EM Algorithm. For compactness, define (this is E[Z_ij] under permutation k): η_jk = η_jk(β) = c_j ( β_{(j+1)_k} - Σ_{l=1}^{j} β_{l_k} / j ). Then the i-th contribution to the likelihood is (1/J) Σ_{k=1}^{J} Π_{j=1}^{J-1} f(z_ij; η_jk). Idea behind the EM algorithm: consider how the likelihood would look if all the CPs were known ("no missing data"). Define k*(i) to be the actual permutation index at time i, and collect all of these into k* = (k*(1), k*(2), ..., k*(I)). Then the incomplete data are Z = (Z_1, Z_2, ..., Z_I), where Z_i = (Z_i1, Z_i2, ..., Z_{i,J-1}) is the observed (transformed) data at time i, and the complete data are (Z, k*).

17 EM Algorithm. The i-th contribution to the incomplete-data log likelihood is the complex ln[ (1/J) Σ_{k=1}^{J} Π_{j=1}^{J-1} f(z_ij; η_jk) ]. However, the i-th contribution to the complete-data log likelihood is simply Σ_{j=1}^{J-1} ln f(z_ij; η_{j,k*(i)}(θ)). That is the simple index-known case: easy to maximize with respect to (β, σ). For example, if the data indices are rearranged so that k*(i) = 1 for each i, then the MLE of β_2 - β_1 is simply Ȳ_2 - Ȳ_1, the difference of the column means. Idea of the EM algorithm: estimate the correct indices so that the complete-data log likelihood can be used.

18 Cycle of the EM Algorithm. Each cycle of the EM algorithm requires that we (1) use the current estimates θ̂_c, at iteration c, of θ = (β, σ); (2) find the conditional expectation of the complete-data (i.e., (Z, k*)) log likelihood, conditional on the incomplete data Z (using θ̂_c to obtain the conditional expectation); (3) maximize the result in (2) with respect to θ, to get θ̂_{c+1}; (4) continue until convergence. Questions: initial estimates; the expectation step (interesting!); the maximization step (also interesting!); convergence criteria.

19 EM Algorithm: Expectation step. For the EM algorithm, it is useful to write the complete-data log-likelihood ℓ(θ) = Σ_{i=1}^{I} Σ_{j=1}^{J-1} ln f(z_ij; η_{j,k*(i)}(θ)) as
ℓ(θ) = Σ_{i=1}^{I} Σ_{k(i)=1}^{J} χ(k(i), k*(i)) Σ_{j=1}^{J-1} ln f(z_ij; η_{j,k(i)}(θ)),
where χ(k, k*) = 1 if k = k* and 0 otherwise. This makes the conditional expectation easier to obtain: k*(i) now enters only through a linear term.

20 EM Algorithm: Expectation step. The expectation step requires finding ℓ_EM(θ) = E[ℓ(θ) | Z] (the conditional expectation evaluated at θ̂_c). Here, we need to find
ℓ_EM(θ) = Σ_{i=1}^{I} Σ_{k(i)=1}^{J} E[χ(k(i), k*(i)) | Z_i] Σ_{j=1}^{J-1} ln f(z_ij; η_{j,k(i)}(θ)).

21 EM Algorithm: Expectation step. We need ℓ_EM(θ) as above, and we can solve for its weights. We find that
E[χ(k(i), k*(i)) | Z_i] = P(χ(k(i), k*(i)) = 1 | Z_i) = Π_{j=1}^{J-1} f(z_ij; η_{j,k(i)}(θ)) / Σ_{k(i)=1}^{J} Π_{j=1}^{J-1} f(z_ij; η_{j,k(i)}(θ)) ≡ γ(i, k(i)),
where γ(i, k(i)) = P(the i-th CP has index k) is called the responsibility of permutation k for observation i. Evaluating at θ̂_c, we obtain estimates γ̂_c(i, k(i)) of the γ(i, k(i)).
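
A sketch of this E-step (illustrative; contrast_means is the hypothetical helper from the likelihood sketch above):

```python
import numpy as np
from scipy.stats import norm

def responsibilities(Z, beta, sigma, members):
    """gamma[i, k] = P(row i arose from permutation k | Z_i); each row sums to 1."""
    eta = contrast_means(beta, members)           # (J-1) x K matrix of contrast means
    # densities f(z_ij; eta_jk, sigma), multiplied over j for every row i and member k
    dens = norm.pdf(Z[:, :, None], loc=eta[None, :, :], scale=sigma).prod(axis=1)
    return dens / dens.sum(axis=1, keepdims=True)
```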

22 EM Algorithm: Expectation step. Responsibility: γ̂(i, k(i)) = estimated P(the i-th CP has index k). Assume θ̂_c = (β̂_c, σ̂_c) = ((4, 1, 0), 1). The three candidate terms are [index k=1] f(z_i1; c_1(β_2 - β_1)) f(z_i2; ...), [index k=2] f(z_i1; c_1(β_1 - β_3)) f(z_i2; ...), and [index k=3] f(z_i1; c_1(β_3 - β_2)) f(z_i2; ...). Looking at the Z_i data (without the c_j), consider Z_i1 for a couple of rows i: is it closest to β_2 - β_1 = -3, to β_1 - β_3 = 4, or to β_3 - β_2 = -1? [Numeric Z values not preserved in the transcription.]

23 EM Algorithm: Maximization step. Next, we need to maximize ℓ_EM(θ) = Σ_{i=1}^{I} Σ_{k(i)=1}^{J} γ̂(i, k(i)) Σ_{j=1}^{J-1} ln f(z_ij; η_{j,k(i)}(θ)) with respect to θ = (β, σ). A close look at this sum: the β portion is mathematically equivalent to a weighted sum of squares in a linear-regression framework, so the associated matrix methods can be applied directly. However, instead of the usual I terms, we now have I·J·(J - 1) terms.

24 EM Algorithm: Maximization step. For the matrix machinery, define X (predictor) and W (weight) matrices and a U (response) vector. For J = 3, here are the portions for a given i:
(k(i), j)   X portion                 diag(W) portion   U portion
(1, 1)      (-c_1,    c_1,    0)      γ(i, 1)           Z_i1
(1, 2)      (-c_2/2, -c_2/2,  c_2)    γ(i, 1)           Z_i2
(2, 1)      ( c_1,    0,     -c_1)    γ(i, 2)           Z_i1
(2, 2)      (-c_2/2,  c_2,   -c_2/2)  γ(i, 2)           Z_i2
(3, 1)      ( 0,     -c_1,    c_1)    γ(i, 3)           Z_i1
(3, 2)      ( c_2,   -c_2/2, -c_2/2)  γ(i, 3)           Z_i2
The solution is β̂ = (X′WX)⁻ X′WU (a generalized inverse, since X is not of full column rank).
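
A sketch of this M-step in code (illustrative; it stacks the I·J·(J - 1) weighted rows described above and solves with a generalized inverse; the σ update shown is an assumption, not taken from the slide):

```python
import numpy as np

def m_step(Z, gamma, members):
    """Weighted least squares for beta, plus a weighted-RMS update for sigma."""
    I, Jm1 = Z.shape
    J = Jm1 + 1
    rows, w, u = [], [], []
    for k, perm in enumerate(members):
        idx = np.array(perm) - 1                        # 0-based true index at each position
        for j in range(1, J):
            c_j = 1.0 / np.sqrt(1.0 + 1.0 / j)
            x = np.zeros(J)
            x[idx[j]] += c_j                            # coefficient on beta_{(j+1)_k}
            x[idx[:j]] -= c_j / j                       # minus the average of the first j betas
            for i in range(I):
                rows.append(x); w.append(gamma[i, k]); u.append(Z[i, j - 1])
    X, W, U = np.array(rows), np.diag(w), np.array(u)
    beta = np.linalg.pinv(X.T @ W @ X) @ (X.T @ W @ U)  # X is not of full column rank
    beta -= beta[-1]                                    # impose the beta_J = 0 restriction
    resid = U - X @ beta
    sigma = np.sqrt(np.sum(np.array(w) * resid**2) / np.sum(w))
    return beta, sigma
```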

25 EM Algorithm: J = 2 case. Some algebra leads to β̂_1 = (1/(I c_1)) Σ_{i=1}^{I} Z_i1 (1 - 2γ(i, 1)) = (1/I) Σ_{i=1}^{I} (Y_i2 - Y_i1)(1 - 2γ(i, 1)). An extreme case: say all permutations just happen to be (1, 2) and the separation via β is much larger than the noise via σ. Then the estimated values γ(i, 1) ≈ 1, so β̂_1 ≈ Σ_{i=1}^{I} (Y_i1 - Y_i2)/I. If the permutations have clean separation, each γ(i, 1) weight is close to either 0 or 1, an indicator of which permutation took place. If the separation is poor, the γ(i, k(i)) tend to be closer to 0.5; in the extreme case (all weights = 0.5) we get β̂_1 = 0: an estimated separation of 0.
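
A tiny numeric illustration of the J = 2 formula (made-up numbers, chosen so that β_1 ≈ -4 relative to β_2 = 0):

```python
import numpy as np

Y = np.array([[0.1, 3.9], [4.2, -0.1], [0.0, 4.1]])    # hypothetical J = 2 data, 3 rows
gamma1 = np.array([0.99, 0.02, 0.98])                  # P(row i is un-permuted)
print(np.mean((Y[:, 1] - Y[:, 0]) * (1 - 2 * gamma1)))       # about -3.9: clean separation

gamma_flat = np.full(3, 0.5)                           # no information about the permutations
print(np.mean((Y[:, 1] - Y[:, 0]) * (1 - 2 * gamma_flat)))   # exactly 0.0
```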

26 EM Algorithm: Implementation. Expectation step; maximization step. Questions: initial estimates; convergence criteria; initial permutation of rows.

27 EM Algorithm: Implementation. Initial permutation of rows? We can get initial estimates if we have a reasonable idea of how to permute the rows; if there is some signal in the data, this is possible. Best signal = the row with the maximum variance? Align the other rows to this one. [Tables: data Y (Y1 Y2 Y3) and permuted data Y; numeric values not preserved.]
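
A sketch of this row-alignment heuristic (illustrative; it centers each row, takes the maximum-variance row as the reference, and chooses the cyclic shift of every other row that matches it best in least squares):

```python
import numpy as np

def initial_alignment(Y):
    Yc = Y - Y.mean(axis=1, keepdims=True)          # row-mean centering
    ref = Yc[np.argmax(Yc.var(axis=1))]             # the "best signal" row
    J = Y.shape[1]
    aligned = np.empty_like(Y)
    for i, row in enumerate(Y):
        rowc = row - row.mean()
        errs = [np.sum((np.roll(rowc, s) - ref) ** 2) for s in range(J)]
        aligned[i] = np.roll(row, int(np.argmin(errs)))
    return aligned
```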

28 EM Algorithm: Implementation. Advantages of the initial permutation of rows? (1) Initial estimates of the parameters: for β, the estimate is (Ȳ_1 - Ȳ_J, Ȳ_2 - Ȳ_J, ..., Ȳ_{J-1} - Ȳ_J, 0), where the Ȳ_j are column means. (2) Best-guess row alignments are in place. If the signal is reasonably strong, the most likely permutation in CP(J) to be correct for each i is k = 1 (no permutation), so {γ̂_c(i, k(i)); i = 1, 2, ..., I} ≈ 1 for k(i) = 1 and ≈ 0 for the other k(i), across iterations c = 1, 2, .... (3) If the signal is very weak, we will observe that the {γ̂_c(i, 1); i = 1, 2, ..., I} tend to decrease with c: the algorithm indicates that the initial row alignments could be other ones.

29 EM Algorithm: Implementation. Convergence criteria? (1) Stability of the {γ̂_c(i, k(i))} across iterations c determines the stability of the estimates. (2) So a reasonable stopping rule is the first c such that max_{i,k} |γ̂_c(i, k) - γ̂_{c-1}(i, k)| < ε, for some small cutoff ε.
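
Putting the pieces together, here is a minimal EM driver sketch using the hypothetical helpers from the earlier sketches (cp_members, initial_alignment, helmert_contrasts, responsibilities, m_step); the cutoff value and the initial σ are assumptions:

```python
import numpy as np

def em_missing_index(Y, max_iter=200, tol=1e-4):
    I, J = Y.shape
    members = cp_members(J)
    Y0 = initial_alignment(Y)                       # best-guess row alignments (slides 27-28)
    Z = helmert_contrasts(Y0)
    colmeans = Y0.mean(axis=0)
    beta = colmeans - colmeans[-1]                  # initial beta: column means minus the last
    sigma = Z.std()                                 # rough initial scale (an assumption)
    gamma_old = np.full((I, J), 1.0 / J)
    for c in range(1, max_iter + 1):
        gamma = responsibilities(Z, beta, sigma, members)   # E-step
        beta, sigma = m_step(Z, gamma, members)             # M-step
        if np.max(np.abs(gamma - gamma_old)) < tol:         # stopping rule from the slide
            break
        gamma_old = gamma
    return beta, sigma, gamma
```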

30 Example 1: 8-spindle machine, CP case. 24 rows of data (Y1, ..., Y8; values not preserved). Boxplots, after mean-centering each row, and then row-aligning...

31 [Figure: distribution of Y (row-mean centered) at each index.]

32 [Figure: reference row 4 (Y1, ..., Y8); distribution after row-alignment permutations, Y vs. index (permuted).]

33 [Figure: β̂_c vs. iteration c.]

34 [Figure: σ̂_c vs. iteration c.]

35-40 [Figures: responsibilities γ̂_c(·, k) vs. permutation index k at successive iterations c; the iteration numbers are not preserved in the transcription.]

41 [Figure: β̂ and actual Y means (both centered), vs. index.]

42 [Figures: β̂ and actual Y means (both centered), and β̂ and actual Y means (centered and aligned), vs. index.]

43 [Figures: Y density estimates with indices known (row adjusted), and Y normal density estimates with indices unknown.]

44 Example 2: 4-cavity machine, GP case. 24 rows of data, again (Y1, ..., Y4; values not preserved). Results...

45 [Figures: distribution of Y at each index (row-mean centered), and distribution after row-alignment permutations.]

46 [Figure: β̂_c vs. iteration c.]

47 [Figure: σ̂_c vs. iteration c.]

48-54 [Figures: responsibilities γ̂_c(·, k) vs. permutation index k at iterations c = 2, 3, 5, 10, 20, 50, and 112.]

55 [Figure: β̂ and actual Y means (centered and aligned), vs. index.]

56 Example 4: Random Data, CP case. I = 100 rows, J = 8, N(0, 1) data. Results...

57 [Figure: I = 100, J = 8, N(0, 1) data; distribution of Y (row-mean centered) at each index.]

58 [Figure: distribution after row-alignment permutations, Y vs. index (permuted).]

59 [Figure: β̂_c vs. iteration c.]

60 [Figure: σ̂_c vs. iteration c.]

61-63 [Figures: responsibilities γ̂_c(·, k) vs. permutation index k at successive iterations c; the iteration numbers are not preserved.]

64 [Figure: β̂ and actual Y means (both centered), vs. index.]

65 Example 4a: Random Data, CP case. I = 100 rows, J = 8, N(0, 1) data. Shrinkage of the initial location (β_j) estimates? Expansion of the initial scale estimate? Results...

66 [Figure: β̂_c vs. iteration c, after shrinking the initial β estimates by 2 and enlarging the initial σ estimate by 2.]

67 [Figures: new solution vs. old solution (!): β̂ and actual Y means (both centered), vs. index, two panels.]

68 [Figures: β̂_c vs. iteration c under alternative starting values (two panels; the starting-value details are not preserved in the transcription).]

69 Summary (and future work). Both the CP and GP methods appear to work well in cases with good signals. Under H_0 (e.g., pure-noise data, as in Example 4), the results might be optimistic. Future work: (1) approximate the standard errors of the estimates; (2) implement likelihood-ratio tests; (3) investigate asymptotics; (4) investigate the H_0 case in detail; (5) consider shrinkage of the final estimates; (6) improve the (linear) convergence rate, e.g. via Aitken's acceleration technique.

70 Questions?
