Introduction to the R Statistical Computing Environment: R Programming

Introduction to the R Statistical Computing Environment: R Programming
John Fox, McMaster University, ICPSR 2018

Programming Basics: Topics

- Function definition
- Control structures:
  - Conditionals: if, ifelse, switch
  - Iteration: for, while, repeat
  - Recursion
- Avoiding iteration: vectorization and functions in the apply() family
- Large data sets

There are two latent classes of cases:

- Those for which the response variable y_i is necessarily zero
- Those for which the response, conditional on the predictors (the x's), is Poisson distributed and thus may be zero or a positive integer

The probability \pi_i that a particular case i is in the first (necessarily zero) latent class may depend on potentially distinct predictors, the z's, according to a binary logistic-regression model:

\log_e \frac{\pi_i}{1 - \pi_i} = \gamma_0 + \gamma_1 z_{i1} + \cdots + \gamma_p z_{ip}

For an individual i in the second latent class, y_i follows a Poisson regression model with log link,

\log_e \mu_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}

where \mu_i = E(y_i), and conditional distribution

p(y_i \mid x_{i1}, \ldots, x_{ik}) = \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!} \quad \text{for } y_i = 0, 1, 2, \ldots

The probability of observing a zero count for case i, not knowing to which latent class the case belongs, is therefore

p_i(0) = \Pr(y_i = 0) = \pi_i + (1 - \pi_i) e^{-\mu_i}

and the probability of observing a particular nonzero count y_i > 0 is

p_i(y_i) = (1 - \pi_i) \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!}

The log-likelihood for the ZIP model combines the two components, for y_i = 0 and for y_i > 0:

\log_e L(\beta, \gamma) = \sum_{y_i = 0} \log_e \left[ \pi_i + (1 - \pi_i) e^{-\mu_i} \right] + \sum_{y_i > 0} \log_e \left[ (1 - \pi_i) \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!} \right]

where

- \beta = (\beta_0, \beta_1, \ldots, \beta_k)' is the vector of parameters from the Poisson-regression component of the model (on which the \mu_i depend)
- \gamma = (\gamma_0, \gamma_1, \ldots, \gamma_p)' is the vector of parameters from the logistic-regression component of the model (on which the \pi_i depend)
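As a sketch, the ZIP log-likelihood above can be translated directly into an R function. The function name zipNegLogLik and the choice to pack beta and gamma into a single vector theta (with the Poisson coefficients first) are my own conventions, not from the slides; the negation anticipates a minimizer.

```r
# Negative ZIP log-likelihood, with beta (Poisson part) followed by
# gamma (logit part) packed into the single parameter vector theta.
zipNegLogLik <- function(theta, X, Z, y) {
  k <- ncol(X)                # X and Z include columns of 1s for intercepts
  beta  <- theta[1:k]         # Poisson-regression coefficients
  gamma <- theta[-(1:k)]      # logistic-regression coefficients
  mu <- exp(X %*% beta)       # E(y | x) in the Poisson latent class
  pi <- plogis(Z %*% gamma)   # Pr(case is in the necessarily-zero class)
  zero <- y == 0
  loglik <- sum(log(pi[zero] + (1 - pi[zero]) * exp(-mu[zero]))) +
    sum(log(1 - pi[!zero]) + dpois(y[!zero], mu[!zero], log = TRUE))
  -loglik                     # return the negative for minimization
}
```

The two sums correspond exactly to the y_i = 0 and y_i > 0 components of the log-likelihood; dpois(..., log = TRUE) supplies y_i log mu_i - mu_i - log y_i!.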

In maximizing the likelihood, it helps (but isn't essential) to have the gradient (vector of partial derivatives with respect to the parameters) of the log-likelihood. For the ZIP model the gradient is complicated:

\frac{\partial \log_e L}{\partial \beta} = -\sum_{y_i = 0} \frac{\exp[-\exp(x_i'\beta)] \exp(x_i'\beta)}{\exp(z_i'\gamma) + \exp[-\exp(x_i'\beta)]} x_i + \sum_{y_i > 0} [y_i - \exp(x_i'\beta)] x_i

\frac{\partial \log_e L}{\partial \gamma} = \sum_{y_i = 0} \frac{\exp(z_i'\gamma)}{\exp(z_i'\gamma) + \exp[-\exp(x_i'\beta)]} z_i - \sum_{i=1}^{n} \frac{\exp(z_i'\gamma)}{1 + \exp(z_i'\gamma)} z_i

And the Hessian (the matrix of second-order partial derivatives, from which the covariance matrix of the coefficients is computed) is even more complicated (thankfully we won't need it):

\frac{\partial^2 \log_e L}{\partial \beta \, \partial \beta'} = \sum_{y_i = 0} \frac{\exp(x_i'\beta)[\exp(x_i'\beta) - 1] \exp[\exp(x_i'\beta) + z_i'\gamma] - \exp(x_i'\beta)}{\{\exp[\exp(x_i'\beta) + z_i'\gamma] + 1\}^2} x_i x_i' - \sum_{y_i > 0} \exp(x_i'\beta) x_i x_i'
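The analytic gradient can also be coded for use by an optimizer. The following is a sketch (the name zipNegGradient and the theta packing are my own, matching the log-likelihood function above); it returns the gradient of the negative log-likelihood.

```r
# Gradient of the negative ZIP log-likelihood with respect to c(beta, gamma).
zipNegGradient <- function(theta, X, Z, y) {
  k <- ncol(X)
  beta  <- theta[1:k]
  gamma <- theta[-(1:k)]
  mu  <- as.vector(exp(X %*% beta))    # Poisson means
  ezg <- as.vector(exp(Z %*% gamma))   # exp(z_i' gamma)
  zero <- y == 0
  # case weights multiplying x_i in the two sums of d logL / d beta
  w.beta <- ifelse(zero, -exp(-mu) * mu / (ezg + exp(-mu)), y - mu)
  gbeta  <- colSums(w.beta * X)
  # case weights multiplying z_i: the first sum is over y_i = 0 only,
  # the second over all n cases
  w.gamma <- ifelse(zero, ezg / (ezg + exp(-mu)), 0) - ezg / (1 + ezg)
  ggamma  <- colSums(w.gamma * Z)
  -c(gbeta, ggamma)   # negate to match the negative log-likelihood
}
```

Each term mirrors the displayed formulas: the y_i = 0 weight for beta is the first fraction in the gradient, and the y_i > 0 weight is the usual Poisson score y_i - mu_i.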

(Hessian, continued):

\frac{\partial^2 \log_e L}{\partial \gamma \, \partial \gamma'} = \sum_{y_i = 0} \frac{\exp[\exp(x_i'\beta) + z_i'\gamma]}{\{\exp[\exp(x_i'\beta) + z_i'\gamma] + 1\}^2} z_i z_i' - \sum_{i=1}^{n} \frac{\exp(z_i'\gamma)}{[\exp(z_i'\gamma) + 1]^2} z_i z_i'

\frac{\partial^2 \log_e L}{\partial \beta \, \partial \gamma'} = \sum_{y_i = 0} \frac{\exp[x_i'\beta + \exp(x_i'\beta) + z_i'\gamma]}{\{\exp[\exp(x_i'\beta) + z_i'\gamma] + 1\}^2} x_i z_i'

We can let a general-purpose optimizer do the work of maximizing the log-likelihood:

- Optimizers work by evaluating the gradient of the objective function (the log-likelihood) at the current estimates of the parameters, either numerically or analytically.
- They iteratively improve the parameter estimates using the information in the gradient.
- Iteration ceases when the gradient is sufficiently close to zero.
- The covariance matrix of the coefficients is the inverse of the matrix of second derivatives of the log-likelihood, called the Hessian, which measures the curvature of the log-likelihood at the maximum.
- There is generally no advantage in using an analytic Hessian during optimization.

I'll use the optim() function to fit the ZIP model. It takes several arguments, including:

- par, a vector of start values for the parameters
- fn, the objective function to be minimized (in our case the negative of the log-likelihood), whose first argument is the parameter vector; there may be other arguments
- gr (optional), the gradient, also a function of the parameter vector (and possibly of other arguments)
- ... (optional), any other arguments to be passed to fn and gr
- method; I'll use "BFGS"
- hessian, set to TRUE to return the numerical Hessian at the solution

See ?optim for details and other optional arguments.

optim() returns a list with several elements, including:

- par, the values of the parameters that minimize the objective function
- value, the value of the objective function at the minimum
- convergence, a code indicating whether the optimization has converged: 0 means that convergence occurred
- hessian, a numerical approximation to the Hessian at the solution

Again, see ?optim for details.
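A self-contained sketch of the whole recipe follows. The simulated data, the true parameter values, and the start values are invented for illustration; the slides' own example data are not reproduced here.

```r
# Simulate data from a ZIP model, then fit it with optim() as described.
set.seed(123)
n <- 500
x <- rnorm(n); z <- rnorm(n)
pi <- plogis(-1 + 1 * z)                  # Pr(necessarily-zero class)
mu <- exp(0.5 + 0.8 * x)                  # Poisson means
y <- ifelse(runif(n) < pi, 0, rpois(n, mu))
X <- cbind(1, x); Z <- cbind(1, z)

# Negative log-likelihood; first argument is the packed parameter vector.
negLogLik <- function(theta, X, Z, y) {
  beta <- theta[1:ncol(X)]; gamma <- theta[-(1:ncol(X))]
  mu <- exp(X %*% beta); pi <- plogis(Z %*% gamma); zero <- y == 0
  -(sum(log(pi[zero] + (1 - pi[zero]) * exp(-mu[zero]))) +
      sum(log(1 - pi[!zero]) + dpois(y[!zero], mu[!zero], log = TRUE)))
}

fit <- optim(par = rep(0, 4), fn = negLogLik, method = "BFGS",
             hessian = TRUE, X = X, Z = Z, y = y)
fit$convergence                 # code 0 indicates convergence
fit$par                         # estimates: beta0, beta1, gamma0, gamma1
sqrt(diag(solve(fit$hessian)))  # coefficient standard errors
```

Because fn is the negative log-likelihood, the returned hessian is already the curvature of the negative log-likelihood, so its inverse estimates the coefficient covariance matrix directly.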

Beyond the Basics: Object-Oriented Programming
The S3 Object System

- There are three standard object-oriented programming systems in R: S3, S4, and reference classes.
- How the S3 object system works: method dispatch, for an object of class "class":
  generic(object) => generic.class(object) => generic.default(object)
  For example, summarizing an object mod of class "lm": summary(mod) => summary.lm(mod)
- Objects can have more than one class, in which case the first applicable method is used. For example, objects produced by glm() are of class c("glm", "lm") and therefore can inherit methods from class "lm".
- Generic functions:
  generic <- function(object, other-arguments, ...) UseMethod("generic")
  For example, summary <- function(object, ...) UseMethod("summary")

Beyond the Basics: Debugging and Profiling R Code

Tools integrated with the RStudio IDE:

- Locating an error: traceback()
- Setting a breakpoint and examining the local environment of an executing function: browser()
- A simple interactive debugger: debug()
- A post-mortem debugger: debugger()
- Measuring time and memory usage with system.time() (or, often better, microbenchmark() in the microbenchmark package) and Rprof()
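The S3 dispatch pattern described above can be demonstrated in a few lines. The generic name describe and the class name "zip" are invented for this illustration; the mechanics (UseMethod, method naming, the default fallback) are exactly those the slides outline.

```r
# A generic function, a class-specific method, and a default method.
describe <- function(object, ...) UseMethod("describe")
describe.zip     <- function(object, ...) "a zero-inflated Poisson fit"
describe.default <- function(object, ...) "some other object"

m <- structure(list(), class = "zip")  # a pretend model object
describe(m)      # dispatches to describe.zip()
describe(1:10)   # no describe.integer(), so falls through to describe.default()
```

If m instead had class c("zip", "glm", "lm"), dispatch would try describe.zip(), then describe.glm(), then describe.lm(), before reaching describe.default(), which is how glm objects inherit lm methods.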