D-optimal Designs for Factorial Experiments under Generalized Linear Models

1 D-optimal Designs for Factorial Experiments under Generalized Linear Models. Jie Yang, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago. Joint research with Abhyuday Mandal and Dibyen Majumdar. October 20, 2012.

2 Outline
1. Introduction: a motivating example; preliminary setup.
2. Locally D-optimal designs: characterization of locally D-optimal designs; saturated designs; the lift-one algorithm for searching locally D-optimal designs.
3. EW D-optimal designs: definition; examples.
4. Robustness: robustness to misspecification of w; robustness of the uniform design and the EW design.
5. Example: the motivating example revisited; conclusions.

3 A motivating example: Hamada and Nelder (1997). Two-level factors: (A) poly-film thickness, (B) oil mixture ratio, (C) material of gloves, and (D) condition of metal blanks. Response: whether the windshield molding was good or not. (Table omitted: runs indexed by Row with factor levels A, B, C, D, the number of Replicates, and the count of good moldings.) Question: Can we do something better?

4 Preliminary setup. Consider an experiment with m fixed and distinct design points:
$$X = \begin{pmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_m^T \end{pmatrix} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1d} \\ x_{21} & x_{22} & \cdots & x_{2d} \\ \vdots & \vdots & & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{md} \end{pmatrix}.$$
For example, a $2^k$ factorial experiment with a main-effects model implies $m = 2^k$, $d = k+1$, $x_{i1} = 1$, and $x_{i2}, \ldots, x_{id} \in \{-1, 1\}$.
Exact design problem: suppose n is given; find optimal $n_i$'s such that $n_i \ge 0$ and $\sum_{i=1}^m n_i = n$.
Approximate design problem: let $p_i = n_i/n$; find optimal $p_i$'s such that $p_i \ge 0$ and $\sum_{i=1}^m p_i = 1$.
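As a concrete illustration (not part of the slides), a short Python sketch building the model matrix X of a $2^k$ main-effects model; the helper name build_main_effects_matrix is ours:

```python
import itertools
import numpy as np

def build_main_effects_matrix(k):
    """Model matrix of a 2^k main-effects experiment:
    m = 2^k distinct rows, d = k + 1 columns (intercept + k main effects)."""
    levels = itertools.product([1, -1], repeat=k)
    return np.array([[1, *row] for row in levels], dtype=float)

X = build_main_effects_matrix(2)   # the 4 x 3 matrix of the 2^2 example below
print(X)
```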

5 Generalized linear model: single parameter. Consider independent univariate responses $Y_1, \ldots, Y_n$:
$$Y_i \sim f(y; \theta_i) = \exp\{y\, b(\theta_i) + c(\theta_i) + d(y)\}.$$
For example:
$\exp\left\{y \log\frac{\theta}{1-\theta} + \log(1-\theta)\right\}$, Bernoulli($\theta$);
$\exp\{y \log\theta - \theta - \log y!\}$, Poisson($\theta$);
$\exp\left\{-\frac{y}{\theta} + (k-1)\log y - k\log\theta - \log\Gamma(k)\right\}$, Gamma($k, \theta$), fixed $k > 0$;
$\exp\left\{\frac{y\theta}{\sigma^2} - \frac{\theta^2}{2\sigma^2} - \frac{y^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2)\right\}$, $N(\theta, \sigma^2)$, fixed $\sigma^2 > 0$.
Generalized linear model (McCullagh and Nelder 1989, Dobson 2008): a link function g and parameters of interest $\beta = (\beta_1, \ldots, \beta_d)^T$, such that $E(Y_i) = \mu_i$ and $\eta_i = g(\mu_i) = x_i^T \beta$.

6 Information matrix and D-optimal design. Recall that there are m distinct predictor combinations $x_1, \ldots, x_m$ with numbers of replicates $n_1, \ldots, n_m$, respectively. The maximum likelihood estimator of $\beta$ has an asymptotic covariance matrix that is the inverse of the information matrix
$$I = n X^T W X,$$
where $X = (x_1, \ldots, x_m)^T$ is an $m \times d$ matrix and $W = \mathrm{diag}(p_1 w_1, \ldots, p_m w_m)$ with
$$w_i = \frac{1}{\mathrm{var}(Y_i)} \left(\frac{\partial \mu_i}{\partial \eta_i}\right)^2.$$
A D-optimal design is the $p = (p_1, \ldots, p_m)$ which maximizes $|X^T W X|$ for given $w_1, \ldots, w_m$.
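A minimal sketch of the D-criterion evaluation, assuming numpy; the function name d_criterion is ours:

```python
import numpy as np

def d_criterion(X, w, p):
    """Evaluate |X^T W X| with W = diag(p_1 w_1, ..., p_m w_m)."""
    W = np.diag(np.asarray(p, float) * np.asarray(w, float))
    return np.linalg.det(X.T @ W @ X)

# Example: 2^2 main-effects model with a uniform allocation and equal weights.
X = np.array([[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1]], dtype=float)
print(d_criterion(X, w=[0.25, 0.25, 0.25, 0.25], p=[0.25, 0.25, 0.25, 0.25]))
```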

7 w as a function of β. Suppose the link function g is one-to-one and differentiable, and suppose further that $\mu_i$ itself determines $\mathrm{var}(Y_i)$. Then $w_i = \nu(\eta_i) = \nu(x_i^T \beta)$ for some function $\nu$.
Binary response, logit link: $\nu(\eta) = \frac{1}{2 + e^{\eta} + e^{-\eta}}$.
Poisson count, log link: $w = \nu(\eta) = e^{\eta}$.
Gamma response, reciprocal link: $w = \nu(\eta) = k/\eta^2$.
Normal response, identity link: $w = \nu(\eta) \equiv 1/\sigma^2$.
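These weight functions are straightforward to code; a sketch (function names ours), including a sanity check that the logit-link formula equals the familiar $\mu(1-\mu)$:

```python
import numpy as np

def nu_logit(eta):                 # binary response, logit link
    return 1.0 / (2.0 + np.exp(eta) + np.exp(-eta))

def nu_log(eta):                   # Poisson count, log link
    return np.exp(eta)

def nu_reciprocal(eta, k=1.0):     # Gamma response, reciprocal link
    return k / eta**2

# For the logit link, nu(eta) = mu(1 - mu) with mu = 1/(1 + exp(-eta)):
eta = 0.7
mu = 1.0 / (1.0 + np.exp(-eta))
assert np.isclose(nu_logit(eta), mu * (1.0 - mu))
```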

8 $w_i = \nu(\eta_i) = \nu(x_i^T \beta)$ for binary response. (Figure: $\nu(\eta)$ plotted against $\eta$ for the logit, probit, complementary log-log, and log-log links.)

9 Example: $2^2$ experiment with main-effects model.
$$X = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & -1 \\ 1 & -1 & 1 \\ 1 & -1 & -1 \end{pmatrix}, \qquad W = \mathrm{diag}(p_1 w_1, p_2 w_2, p_3 w_3, p_4 w_4).$$
The optimization problem is maximizing
$$|X^T W X| = 16\, w_1 w_2 w_3 w_4\, L(p),$$
where $v_i = 1/w_i$ and
$$L(p) = v_4 p_1 p_2 p_3 + v_3 p_1 p_2 p_4 + v_2 p_1 p_3 p_4 + v_1 p_2 p_3 p_4.$$
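The closed form is easy to verify numerically against a direct determinant computation; a sketch under the row ordering of X shown above:

```python
import numpy as np

X = np.array([[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1]], dtype=float)
rng = np.random.default_rng(0)
w = rng.uniform(0.05, 0.25, size=4)     # arbitrary positive weights
p = rng.dirichlet(np.ones(4))           # arbitrary allocation on the simplex
v = 1.0 / w

det_direct = np.linalg.det(X.T @ np.diag(p * w) @ X)
L = (v[3] * p[0] * p[1] * p[2] + v[2] * p[0] * p[1] * p[3]
     + v[1] * p[0] * p[2] * p[3] + v[0] * p[1] * p[2] * p[3])
assert np.isclose(det_direct, 16 * w.prod() * L)
```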

10 General case: order-d polynomial. In the general case, X is an $m \times d$ matrix with distinct rows and $W = \mathrm{diag}(p_1 w_1, \ldots, p_m w_m)$. Based on González-Dávila, Dorta-Guerra and Ginebra (2007) and Yang, Mandal and Majumdar (2012b), we have the following lemma.
Lemma. Let $X[i_1, i_2, \ldots, i_d]$ be the $d \times d$ sub-matrix consisting of the $i_1$th, ..., $i_d$th rows of the design matrix X. Then
$$f(p) = |X^T W X| = \sum_{1 \le i_1 < \cdots < i_d \le m} |X[i_1, \ldots, i_d]|^2\, p_{i_1} w_{i_1} \cdots p_{i_d} w_{i_d}.$$
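The lemma is a Cauchy-Binet expansion of the determinant and can be checked by brute-force enumeration of the d-point subsets; a sketch:

```python
import itertools
import numpy as np

def f_by_expansion(X, w, p):
    """f(p) = |X^T W X| via the subset expansion in the lemma."""
    m, d = X.shape
    total = 0.0
    for idx in itertools.combinations(range(m), d):
        sub = X[list(idx), :]                  # the sub-matrix X[i_1, ..., i_d]
        total += np.linalg.det(sub) ** 2 * np.prod(p[list(idx)] * w[list(idx)])
    return total

X = np.array([[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1]], dtype=float)
w = np.array([0.10, 0.15, 0.20, 0.25])
p = np.array([0.3, 0.3, 0.2, 0.2])
direct = np.linalg.det(X.T @ np.diag(p * w) @ X)
assert np.isclose(f_by_expansion(X, w, p), direct)
```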

11 Characterization of locally D-optimal designs. For each $i = 1, \ldots, m$ and $0 \le x < 1$, we define
$$f_i(x) = f\left(\frac{1-x}{1-p_i} p_1, \ldots, \frac{1-x}{1-p_i} p_{i-1},\; x,\; \frac{1-x}{1-p_i} p_{i+1}, \ldots, \frac{1-x}{1-p_i} p_m\right) = a\, x(1-x)^{d-1} + b\,(1-x)^d.$$
If $p_i > 0$, then $b = f_i(0)$ and $a = \frac{f(p) - b(1-p_i)^d}{p_i (1-p_i)^{d-1}}$; otherwise, $b = f(p)$ and $a = 2^d f_i\!\left(\tfrac12\right) - b$.
Theorem. Suppose $f(p) > 0$. Then p is D-optimal if and only if for each $i = 1, \ldots, m$, one of the two conditions below is satisfied:
(i) $p_i = 0$ and $f_i\!\left(\tfrac12\right) \le \frac{d+1}{2^d} f(p)$;
(ii) $0 < p_i \le \frac{1}{d}$ and $f_i(0) = \frac{1 - d\,p_i}{(1-p_i)^d}\, f(p)$.
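The theorem translates directly into a numerical optimality check; a sketch assuming numpy, with $f_i$ evaluated by rescaling the allocation as in the definition above (helper names and tolerances are ours):

```python
import numpy as np

def f(X, w, p):
    return np.linalg.det(X.T @ np.diag(p * w) @ X)

def f_i(X, w, p, i, x):
    """f_i(x): put mass x on point i and rescale the rest proportionally."""
    q = p * (1.0 - x) / (1.0 - p[i])
    q[i] = x
    return f(X, w, q)

def is_locally_d_optimal(X, w, p, tol=1e-8):
    m, d = X.shape
    fp = f(X, w, p)
    for i in range(m):
        if p[i] < tol:     # condition (i)
            if f_i(X, w, p, i, 0.5) > (d + 1) / 2**d * fp + tol:
                return False
        else:              # condition (ii)
            target = (1 - d * p[i]) / (1 - p[i])**d * fp
            if p[i] > 1.0 / d + tol or abs(f_i(X, w, p, i, 0.0) - target) > tol:
                return False
    return True

# With equal weights, the uniform design on the 2^2 factorial is D-optimal:
X = np.array([[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1]], dtype=float)
print(is_locally_d_optimal(X, np.full(4, 0.25), np.full(4, 0.25)))   # True
```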

12 Saturated designs.
Theorem. Let $I = \{i_1, \ldots, i_d\} \subseteq \{1, \ldots, m\}$ satisfy $|X[i_1, \ldots, i_d]| \ne 0$. Then $p_{i_1} = p_{i_2} = \cdots = p_{i_d} = \frac{1}{d}$ is D-optimal if and only if for each $i \notin I$,
$$\sum_{j \in I} \frac{|X[\{i\} \cup I \setminus \{j\}]|^2}{w_j} \le \frac{|X[i_1, i_2, \ldots, i_d]|^2}{w_i}.$$
$2^2$ main-effects model: $p_1 = p_2 = p_3 = 1/3$ is D-optimal if and only if $v_1 + v_2 + v_3 \le v_4$, where $v_i = 1/w_i$.
$2^3$ main-effects model: $p_1 = p_4 = p_6 = p_7 = 1/4$ is D-optimal if and only if $v_1 + v_4 + v_6 + v_7 \le 4 \min\{v_2, v_3, v_5, v_8\}$.

13 Saturated designs: more examples. $2^3$ factorial design: suppose the design matrix X consists of six rows of the $2^3$ factorial (design matrix omitted in the transcription).
$p_1 = p_2 = p_3 = p_4 = 1/4$ is D-optimal if and only if $v_1 + v_2 + v_4 \le v_5$ and $v_1 + v_3 + v_4 \le v_6$.
$p_2 = p_3 = p_4 = p_5 = 1/4$ is D-optimal if and only if $v_2 + v_4 + v_5 \le v_1$ and $v_2 + v_3 + v_5 \le v_6$.
$p_3 = p_4 = p_5 = p_6 = 1/4$ is D-optimal if and only if $v_3 + v_4 + v_6 \le v_1$ and $v_3 + v_5 + v_6 \le v_2$.

14 Saturation condition in terms of β: logit link, $2^2$ design. For fixed $\beta_0$, a pair $(\beta_1, \beta_2)$ satisfies the saturation condition if and only if the corresponding point lies above the curve labelled by $\beta_0$. (Figure: boundary curves in the $(\beta_1, \beta_2)$ plane labelled by $\beta_0 = \pm 0.5$, $\pm 1$, ...)

15 Lift-one algorithm (Yang, Mandal and Majumdar, 2012b). Given X and $w_1, \ldots, w_m$, search for $p = (p_1, \ldots, p_m)$ maximizing $f(p) = |X^T W X|$:
1. Start with an arbitrary $p_0 = (p_1, \ldots, p_m)$ satisfying $0 < p_i < 1$, $i = 1, \ldots, m$, and compute $f(p_0)$.
2. Set up a random order of i going through $\{1, 2, \ldots, m\}$.
3. For each i, determine $f_i(z)$. In this step, either $f_i(0)$ or $f_i\!\left(\tfrac12\right)$ needs to be calculated.
4. Define $p^{(i)} = \left(\frac{1-z_*}{1-p_i} p_1, \ldots, \frac{1-z_*}{1-p_i} p_{i-1},\; z_*,\; \frac{1-z_*}{1-p_i} p_{i+1}, \ldots, \frac{1-z_*}{1-p_i} p_m\right)$, where $z_*$ maximizes $f_i(z)$ over $0 \le z \le 1$. Here $f(p^{(i)}) = f_i(z_*)$.
5. Replace $p_0$ with $p^{(i)}$ and $f(p_0)$ with $f(p^{(i)})$.
6. Repeat steps 2-5 until convergence, that is, until $f(p_0) = f(p^{(i)})$ for each i.
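A compact implementation of this search, assuming numpy; a sketch, with the maximizer of $f_i(z) = a\,z(1-z)^{d-1} + b(1-z)^d$ taken in closed form ($z_* = \frac{a - db}{d(a - b)}$ when $a > db$, and $z_* = 0$ otherwise):

```python
import numpy as np

def lift_one(X, w, p0=None, max_sweeps=1000, tol=1e-12, seed=0):
    """Lift-one search for p maximizing f(p) = |X^T W X| (a sketch)."""
    m, d = X.shape
    rng = np.random.default_rng(seed)
    p = np.full(m, 1.0 / m) if p0 is None else np.asarray(p0, float)

    def f(q):
        return np.linalg.det(X.T @ np.diag(q * w) @ X)

    def f_i(i, x):                      # put mass x on i, rescale the others
        q = p * (1.0 - x) / (1.0 - p[i])
        q[i] = x
        return f(q)

    fp = f(p)
    for _ in range(max_sweeps):
        improved = False
        for i in rng.permutation(m):    # step 2: random order of i
            if p[i] > 0:                # step 3: recover the coefficients a, b
                b = f_i(i, 0.0)
                a = (fp - b * (1 - p[i])**d) / (p[i] * (1 - p[i])**(d - 1))
            else:
                b = fp
                a = f_i(i, 0.5) * 2**d - b
            # step 4: closed-form maximizer of f_i(z) on [0, 1]
            z = (a - d * b) / (d * (a - b)) if a > d * b else 0.0
            fz = a * z * (1 - z)**(d - 1) + b * (1 - z)**d
            if fz > fp * (1 + tol):     # step 5: accept an improvement
                p = p * (1.0 - z) / (1.0 - p[i])
                p[i] = z
                fp = fz
                improved = True
        if not improved:                # step 6: converged
            break
    return p, fp
```

For instance, with X the $2^2$ main-effects matrix and $w = (0.25, 0.25, 0.25, 0.05)$ we have $v_1 + v_2 + v_3 = 12 \le v_4 = 20$, so by the saturated-design condition of slide 12 the search should return $p \approx (1/3, 1/3, 1/3, 0)$.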

16 Comments on the lift-one algorithm. The basic idea of the lift-one algorithm is that, for a randomly chosen $i \in \{1, \ldots, m\}$, we update $p_i$ to $p_i^*$ and all the other $p_j$'s to $p_j^* = p_j \frac{1 - p_i^*}{1 - p_i}$. The major advantage of the lift-one algorithm is that in order to determine an optimal $p_i^*$, we need to calculate $|X^T W X|$ only once. The algorithm is motivated by the coordinate descent algorithm (Zangwill, 1969). It is also similar in spirit to the one-point correction methods in the literature (Wynn, 1970; Fedorov, 1972; Müller, 2007), where design points are added or adjusted one at a time.

17 Convergence of the lift-one algorithm. Modified lift-one algorithm: (1) For every (10m)th iteration, we repeat steps 3-5 with a fixed order of $i = 1, \ldots, m$; if $p^{(i)}$ is a better allocation than $p_0$, instead of updating $p_0$ to $p^{(i)}$ immediately, we obtain $p^{(i)}$ for each i and replace $p_0$ with the best $p^{(i)}$ only. (2) For iterations other than the (10m)th, we follow the original lift-one update.
Theorem. When the lift-one algorithm or the modified lift-one algorithm converges, the converged allocation p maximizes $|X^T W X|$ on the set of feasible allocations. Furthermore, the modified lift-one algorithm is guaranteed to converge.

18 Performance of lift-one algorithm: comparison. (Table: CPU time in seconds for 100 simulated β, comparing the Nelder-Mead, quasi-Newton, conjugate gradient, and simulated annealing algorithms with lift-one on $2^2$ and larger designs; most entries were lost in the transcription, the surviving fragment showing three NA entries alongside 1.45 seconds for lift-one.)

19 Performance of lift-one algorithm: number of nonzero $p_i$'s. Under the $2^k$ main-effects model, binary response, logit link, with $\beta_i$'s i.i.d. from Uniform(-3, 3): (Figure: histograms of the number of support points of the resulting D-optimal designs, one panel for each of $k = 2, 3, 4, 5, 6, 7$.)

20 Performance of lift-one algorithm: time cost. Under the $2^k$ main-effects model, binary response, logit link, we simulate $\beta_i$'s i.i.d. from a uniform distribution and check the time cost in seconds of the lift-one algorithm: (Figure: (a) average time cost in seconds and (b) average number of nonzero $p_i$'s, plotted against $m = 2^k$ for $\beta \sim U(-0.5, 0.5)$, $U(-1, 1)$, and $U(-3, 3)$.)

21 EW D-optimal designs. For experiments under generalized linear models, we may need to specify the $\beta_i$'s, which give us the $w_i$'s, to get D-optimal designs; this is known as local D-optimality. An EW D-optimal design is an allocation p that maximizes $|X^T E(W) X|$. It is one of several alternatives suggested by Atkinson, Donev and Tobias (2007). EW D-optimal designs are often approximately as efficient as Bayesian D-optimal designs. EW D-optimal designs can be obtained easily using the lift-one algorithm. In general, EW D-optimal designs are robust.
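Since E(W) enters the criterion only through the diagonal weights, an EW design can be computed by estimating each $E(w_i)$, e.g. by Monte Carlo over the prior, and then running the same lift-one search; a sketch reusing the lift_one function from the earlier sketch, with the priors of the $2^2$ example of slide 23:

```python
import numpy as np

X = np.array([[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1]], dtype=float)
rng = np.random.default_rng(1)
n_draws = 100_000
beta = np.column_stack([rng.uniform(-1, 1, n_draws),    # beta_0 ~ U(-1, 1)
                        rng.uniform(0, 1, n_draws),     # beta_1 ~ U[0, 1)
                        rng.uniform(0, 1, n_draws)])    # beta_2 ~ U[0, 1)
eta = beta @ X.T                                        # n_draws x m predictors
Ew = (1.0 / (2.0 + np.exp(eta) + np.exp(-eta))).mean(axis=0)   # E(w_i), logit

p_ew, _ = lift_one(X, Ew)   # lift_one as sketched after the algorithm slide
print(np.round(p_ew, 3))    # approx. (0.239, 0.261, 0.261, 0.239), cf. slide 23
```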

22 EW D-optimal design and Bayesian D-optimal design.
Theorem. For any given link function, if the regression coefficients $\beta_0, \beta_1, \ldots, \beta_d$ are independent with finite expectation, and $\beta_1, \ldots, \beta_d$ all have distributions symmetric about 0 (not necessarily the same distribution), then the uniform design is an EW D-optimal design.
A Bayes D-optimal design maximizes $E(\log|X^T W X|)$, where the expectation is taken over the prior distribution of the $\beta_i$'s. Note that, by Jensen's inequality,
$$E\left(\log|X^T W X|\right) \le \log|X^T E(W) X|,$$
since $\log|X^T W X|$ is concave in w. Thus an EW D-optimal design maximizes an upper bound to the Bayesian D-optimality criterion.

23 Example: $2^2$ main-effects model. Suppose $\beta_0, \beta_1, \beta_2$ are independent, $\beta_0 \sim U(-1, 1)$, and $\beta_1, \beta_2 \sim U[0, 1)$. Under the logit link, the EW D-optimal design is $p_e = (0.239, 0.261, 0.261, 0.239)$. The Bayes optimal design, which maximizes $\phi(p) = E \log|X^T W X|$, is $p_o = (0.235, 0.265, 0.265, 0.235)$. The relative efficiency of $p_e$ with respect to $p_o$ is
$$\exp\left\{\frac{\phi(p_e) - \phi(p_o)}{k+1}\right\} \times 100\% = 99.99\%$$
for the logit link; it is 99.94% for the probit link, 99.77% for the log-log link, and % for the complementary log-log link. The time cost for EW is 0.11 seconds, while it is 5.45 seconds for maximizing $\phi(p)$. It should also be noted that the relative efficiency of the uniform design $p_u = (1/4, 1/4, 1/4, 1/4)$ with respect to $p_o$ is 99.88% for the logit link and 89.6% for the complementary log-log link.

24 Example: $2^3$ main-effects model. Suppose $\beta_0, \beta_1, \beta_2, \beta_3$ are independent, and the experimenter has the following prior information for the parameters: $\beta_0 \sim U(-3, 3)$ and $\beta_1, \beta_2, \beta_3 \sim U[0, 3)$. For the logit link the uniform design is not EW D-optimal. In this case, the EW solution is $p_e = (0, 1/6, 1/6, 1/6, 1/6, 1/6, 1/6, 0)$, while $p_o = (0.004, 0.165, 0.166, 0.165, 0.165, 0.166, 0.165, 0.004)$ maximizes $\phi(p)$. The relative efficiency of $p_e$ with respect to $p_o$ is 99.98%. On the other hand, the relative efficiency of the uniform design with respect to $p_o$ is 94.39%. It takes about 2.39 seconds to find an EW solution, while it takes seconds to find $p_o$. The difference in computational time is even more prominent for the $2^4$ case (24 seconds versus 3147 seconds).

25 Robustness to misspecification of w. Denote the D-criterion value as $\psi(p, w) = |X^T W X|$ for given $w = (w_1, \ldots, w_m)$ and $p = (p_1, \ldots, p_m)$. Define the relative loss of efficiency of p with respect to w as
$$R(p, w) = 1 - \left(\frac{\psi(p, w)}{\psi(p_w, w)}\right)^{1/d},$$
where $p_w$ is a D-optimal allocation with respect to w. Define the maximum relative loss of efficiency of a given design p with respect to a specified region W of w by
$$R_{\max}(p) = \max_{w \in W} R(p, w).$$
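A sketch of these quantities, reusing the lift_one function from the earlier sketch to obtain $p_w$; in practice the maximization over $w \in W$ is approximated by a grid or a sample from W (function names ours):

```python
import numpy as np

def psi(X, w, p):
    return np.linalg.det(X.T @ np.diag(np.asarray(p) * np.asarray(w)) @ X)

def relative_loss(X, w, p):
    """R(p, w) = 1 - (psi(p, w) / psi(p_w, w))^(1/d)."""
    d = X.shape[1]
    p_w, _ = lift_one(X, np.asarray(w, float))   # D-optimal allocation for w
    return 1.0 - (psi(X, w, p) / psi(X, w, p_w)) ** (1.0 / d)

def max_relative_loss(X, p, w_samples):
    """R_max(p) approximated over a finite set of weight vectors."""
    return max(relative_loss(X, w, p) for w in w_samples)
```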

26 Robustness of uniform design: $2^2$ main-effects model. Yang, Mandal and Majumdar (2012a) showed that under a $2^2$ experiment with a main-effects model, if $w_i \in [a, b]$, $i = 1, 2, 3, 4$, with $0 < a \le b$, then the uniform design $p_u = (1/4, 1/4, 1/4, 1/4)$ is the most robust one in terms of maximum relative loss of efficiency. On the other hand, if the experimenter has some prior knowledge about the model parameters, for example, if one believes that $\beta_0 \sim$ Uniform(-1, 1), $\beta_1, \beta_2 \sim$ Uniform[0, 1), and $\beta_0, \beta_1, \beta_2$ are independent, then the theoretical $R_{\max}$ of the uniform design is 0.134, while the theoretical $R_{\max}$ of the design $p = (0.19, 0.31, 0.31, 0.19)$ is smaller. That is, the uniform design may not be the most robust one.

27 Robustness of uniform design: $2^4$ main-effects model. Simulate $\beta_0, \ldots, \beta_4$ 1000 times and calculate the corresponding w's; for each simulated w, obtain a D-optimal allocation $p_s$. Four prior settings are compared: (1) $\beta_0 \sim U(-3, 3)$ and $\beta_1, \ldots, \beta_4 \sim U(-1, 1)$; (2) $\beta_0 \sim U(-1, 1)$ and $\beta_1, \ldots, \beta_4 \sim U(0, 1)$; (3) $\beta_0 \sim U(-3, 0)$, $\beta_1, \beta_2 \sim U(1, 3)$, and $\beta_3, \beta_4 \sim U(-3, -1)$; (4) $\beta_0 \sim N(0, 5)$, $\beta_1 \sim N(0, 1)$, $\beta_2 \sim N(2, 1)$, and $\beta_3, \beta_4 \sim N(0.5, 2)$. (Table: percentiles $R_{100\alpha}$ of the relative loss of efficiency under each setting for (I) $R_{100\alpha}(p_u)$, the uniform design; (II) $\min_{1 \le s \le 1000} R_{100\alpha}(p_s)$, the best among the 1000; and (III) $R_{100\alpha}(p_e)$, the EW design; the numeric entries are not recoverable.)

28 Windshield experiment revisited (Yang, Mandal and Majumdar, 2012b). Based on the data presented in Hamada and Nelder (1997), $\hat\beta = (1.77, -1.57, 0.13, -0.80, -0.14)$ under the logit link. The efficiency of the original resolution III design $p_{HN}$ is 78% of the locally D-optimal design if $\hat\beta$ were the true value. It might be reasonable to consider an initial guess of $\beta = (2, -1.5, 0.1, -1, -0.1)$; this leads to the locally D-optimal half-fraction design $p_a$ with relative efficiency 99%. Another reasonable option is to consider a range, for example, $\beta_0 \sim$ Unif(1, 3), $\beta_1 \sim$ Unif(-3, -1), $\beta_2, \beta_4 \sim$ Unif(-0.5, 0.5), and $\beta_3 \sim$ Unif(-1, 0); the relative efficiency of the EW D-optimal half-fraction design $p_e$ is then 98%.

29 Windshield experiment: optimal half-fraction design. (Table: the design points with factor levels A, B, C, D, linear predictor η, response probability π, and the allocations $p_{HN}$, $p_a$, and $p_e$; the numeric entries are not recoverable.)

30 Conclusions. We consider the problem of obtaining locally D-optimal designs for experiments with a fixed finite set of design points under generalized linear models. We obtain a characterization for a design to be locally D-optimal, and based on this characterization we develop efficient numerical techniques to search for locally D-optimal designs. We suggest the use of EW D-optimal designs: these are much easier to compute and still highly efficient compared with Bayesian D-optimal designs. We investigate the properties of fractional factorial designs (not presented here; see Yang, Mandal and Majumdar, 2012b). We also study the robustness of the D-optimal designs.

31 References
1. Hamada, M. and Nelder, J. A. (1997). Generalized linear models for quality-improvement experiments. Journal of Quality Technology, 29.
2. McCullagh, P. and Nelder, J. (1989). Generalized Linear Models, Second Edition.
3. Dobson, A. J. (2008). An Introduction to Generalized Linear Models, 3rd edition. Chapman and Hall/CRC.
4. González-Dávila, E., Dorta-Guerra, R. and Ginebra, J. (2007). On the information in two-level experiments. Model Assisted Statistics and Applications, 2.
5. Yang, J., Mandal, A. and Majumdar, D. (2012a). Optimal designs for two-level factorial experiments with binary response. Statistica Sinica, 22, 885-907.
6. Yang, J., Mandal, A. and Majumdar, D. (2012b). Optimal designs for 2^k factorial experiments with binary response. Submitted for publication; available at arXiv:1109.5320.

32 More references
1. Zangwill, W. (1969). Nonlinear Programming: A Unified Approach. Prentice-Hall, New Jersey.
2. Wynn, H. P. (1970). The sequential generation of D-optimum experimental designs. Annals of Mathematical Statistics, 41.
3. Fedorov, V. V. (1972). Theory of Optimal Experiments. Academic Press, New York.
4. Müller, W. G. (2007). Collecting Spatial Data: Optimum Design of Experiments for Random Fields, 3rd edition. Springer.
5. Atkinson, A. C., Donev, A. N. and Tobias, R. D. (2007). Optimum Experimental Designs, with SAS. Oxford University Press.
