School on Modelling Tools and Capacity Building in Climate and Public Health April 2013


Expectile and Quantile Regression and Other Extensions
KAZEMBE Lawrence
University of Malawi, Chancellor College, Faculty of Science, Department of Mathematical Sciences, 18 Chirunga Road, 0000 Zomba, MALAWI

Lawrence Kazembe
University of Namibia, Windhoek, Namibia
A presentation at the School on Modelling Tools and Capacity Building in Climate and Public Health, ICTP, Trieste, Italy

Objectives and Questions
Objective: introduce regression beyond the mean.
- Are there median regression models?
- What about regression at the mode?
- Can we fit regression models at other locations of the distribution too?
- What about the variance, skewness and kurtosis?

Preliminaries
Given a random variable $y \sim f(\cdot)$:
- $E(Y) = \mu$ defines the mean;
- $\mathrm{Var}(Y) = \sigma^2$ is the variance.
General assumption: these two measures completely define the distribution (stationarity concept).
Classical regression is usually characterized by a model for the mean:
- $E(y \mid x) = \beta x$, where $x$ is the set of covariates.
- $y \sim N(\mu, \sigma^2)$, then $\mu = \beta x$.
- $y \sim \mathrm{Bern}(\pi)$, then $\log\left(\frac{\pi}{1-\pi}\right) = \beta x$.
- $y \sim \mathrm{Pois}(\lambda)$, then $\log(\lambda) = \beta x$.
Statisticians are mean lovers (Friedman, Friedman & Amoo). It goes like:
- We are mean lovers.
- Deviation is considered normal.
- We are right 95% of the time.
- Statisticians do it discretely and continuously.
- We can legally comment on someone's posterior distribution.
Why the mean?
- Mean regression reduces complexity.
- However, the mean is not sufficient to describe a distribution.

Examples: Plot (Rent in Munich, Kneib et al.)

Examples: Quantile plot (Rent in Munich, Kneib et al.)

Example (Malaria vs altitude, Kazembe et al.)

Example (Botswana data)

Motivating Examples
Epidemiology and Public Health:
- Investigating height, weight and body mass index as a function of different covariates, e.g. age (Wei et al. 2006)
- Exploring stunting curves in African and Indian children (Yue et al. 2012)
- Generating age-specific centile charts (Chitty et al. 1994)
Economics:
- Study of determinants of wages (Koenker, 2005)
Education:
- The performance of students in public schools (Koenker and Hallock, 2001)
Climate data:
- Spatiotemporal analysis of Boston temperature (Reich 2012)

Double GLM
Classical regression assumes a homogeneous variance:
$$y \mid x \sim N(\beta x, \sigma^2), \qquad \varepsilon \sim N(0, \sigma^2).$$
However, variance heterogeneity (heteroscedasticity) is the norm in real-life problems: the variance of the response may depend on the covariates.

Normal regression example:
- Regression for the mean and variance of a normal distribution:
$$y_i = \eta_{i1} + \exp(\eta_{i2})\,\varepsilon_i, \qquad \varepsilon_i \sim N(0, 1),$$
such that
$$E(y_i \mid x_i) = \eta_{i1}, \qquad \mathrm{Var}(y_i \mid x_i) = \exp(\eta_{i2})^2.$$
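The normal location-scale model above can be checked by simulation. The following is a minimal sketch in Python, not code from the presentation: the linear predictors $\eta_{i1} = 1 + 2x$ and $\eta_{i2} = -1 + 0.5x$ and the function name are hypothetical choices made only for illustration. It draws $y = \eta_1 + \exp(\eta_2)\varepsilon$ at two covariate values and compares the sample mean and variance with $\eta_1$ and $\exp(\eta_2)^2$.

```python
import math
import random
import statistics


def simulate_location_scale(x, n, rng):
    """Draw n values of y = eta1 + exp(eta2) * eps with eps ~ N(0, 1).

    Both linear predictors are assumed for illustration only.
    """
    eta1 = 1.0 + 2.0 * x    # predictor for the mean (hypothetical coefficients)
    eta2 = -1.0 + 0.5 * x   # predictor for the log standard deviation (hypothetical)
    scale = math.exp(eta2)
    draws = [eta1 + scale * rng.gauss(0.0, 1.0) for _ in range(n)]
    return eta1, scale ** 2, draws


rng = random.Random(2013)
results = {}
for x in (0.0, 2.0):
    mean_theory, var_theory, draws = simulate_location_scale(x, 20000, rng)
    results[x] = (
        mean_theory,
        var_theory,
        statistics.mean(draws),
        statistics.variance(draws),
    )
    print(x, results[x])
```

The sample variance changes with x while the model remains normal, which is the kind of heteroscedasticity a mean-only regression cannot represent.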

Regression for location, scale and shape
In general: specify a distribution for the response and relate all of its parameters to the predictors.
The generalized additive model for location, scale and shape (GAMLSS) is a statistical model developed by Rigby and Stasinopoulos (2005).
Consider a probability (density) function $f(y_i \mid \mu_i, \sigma_i, \nu_i, \tau_i)$ conditional on $(\mu_i, \sigma_i, \nu_i, \tau_i)$, a vector of four distribution parameters, each of which can be a function of the explanatory variables $(X)$:
$$g_1(\mu) = \eta_1 = \beta_1 X_1 + \sum_{j=1}^{J_1} h_{j1}(x_{j1}) \qquad (1)$$
$$g_2(\sigma) = \eta_2 = \beta_2 X_2 + \sum_{j=1}^{J_2} h_{j2}(x_{j2}) \qquad (2)$$
$$g_3(\nu) = \eta_3 = \beta_3 X_3 + \sum_{j=1}^{J_3} h_{j3}(x_{j3}) \qquad (3)$$
$$g_4(\tau) = \eta_4 = \beta_4 X_4 + \sum_{j=1}^{J_4} h_{j4}(x_{j4}) \qquad (4)$$

GAMLSS assumes a different model for each parameter:
- Model 1: Mean regression
- Model 2: Dispersion regression
- Model 3: Skewness regression
- Model 4: Kurtosis regression
Other summary measures are also permissible.

Quantile Regression
Quantile, centile and percentile are terms that can be used interchangeably.
- The 0.5 quantile is the 50th percentile, which is the median.
Related terms are quartiles, quintiles and deciles, which divide the distribution into 4, 5 and 10 parts.
Quantiles are related to the median.
Suppose $Y$ has cumulative distribution function $F(y) = P(Y \le y)$. Then the $\tau$th quantile of $Y$ is defined to be
$$Q(\tau) = F^{-1}(\tau) = \inf\{y : \tau \le F(y)\}$$
for $0 < \tau < 1$, so that for a continuous response $\int_{-\infty}^{Q(\tau)} f(y)\,dy = \tau$.
Quantile regression drops the parametric assumption for the error/response distribution.
Fit separate models for different asymmetries $\tau \in (0, 1)$:
$$y_i = \eta_{i\tau} + \varepsilon_{i\tau}.$$
Instead of assuming $E(y_i - \eta_{i\tau}) = 0$, one assumes
$$P(\varepsilon_{i\tau} \le 0) = \tau$$
or
$$F_{y_i}(\eta_{i\tau}) = P(\varepsilon_{i\tau} \le 0) = \tau.$$

This gives a set of regression functions, one at each chosen quantile level.
Assumptions:
- it is distribution-free, since it makes no specific assumption on the type of errors
- it does not even require i.i.d. errors
- it allows for heterogeneity.
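The quantile definition $Q(\tau) = \inf\{y : \tau \le F(y)\}$ and the condition $P(\varepsilon_{\tau} \le 0) = \tau$ can be illustrated on a small sample. This is a sketch under stated assumptions: the data values and function names are made up for illustration, there are no covariates (so $\eta_\tau$ is a single constant), and $F$ is replaced by the empirical distribution function.

```python
def empirical_cdf(sample, y):
    """Fraction of observations less than or equal to y."""
    return sum(1 for v in sample if v <= y) / len(sample)


def empirical_quantile(sample, tau):
    """Smallest observed y with tau <= F(y), where F is the empirical CDF."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    ordered = sorted(sample)
    n = len(ordered)
    # The k-th smallest value has empirical CDF of at least k / n,
    # so the first k with k / n >= tau gives the infimum.
    for k, value in enumerate(ordered, start=1):
        if k / n >= tau:
            return value
    return ordered[-1]


# Hypothetical sample used only for illustration.
data = [7, 1, 5, 3, 9, 2, 8, 4, 6, 10]

for tau in (0.25, 0.5, 0.9):
    q = empirical_quantile(data, tau)
    residuals = [y - q for y in data]
    share_nonpositive = sum(1 for r in residuals if r <= 0) / len(residuals)
    print(tau, q, share_nonpositive)
```

In a finite sample the share of non-positive residuals can exceed $\tau$ slightly because the empirical distribution function jumps in steps of $1/n$.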

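Expectile Regression
Expectiles are related to the mean in the same way that quantiles are related to the median.
- Median regression: $\sum_{i=1}^{n} |y_i - \eta_i| \to \min$
- Quantile regression: $\sum_{i=1}^{n} w_\tau(y_i, \eta_{i\tau})\,|y_i - \eta_{i\tau}| \to \min$
- Mean regression: $\sum_{i=1}^{n} |y_i - \eta_i|^2 \to \min$
- Expectile regression: $\sum_{i=1}^{n} w_\tau(y_i, \eta_{i\tau})\,|y_i - \eta_{i\tau}|^2 \to \min$
where $w_\tau$ is the check function defined by
$$w_\tau = \begin{cases} \tau & \text{if } y_i > \mu(\tau) \\ 1 - \tau & \text{if } y_i \le \mu(\tau) \end{cases}$$
for some population expectile $\mu(\tau)$ and for different values of an asymmetry parameter $0 < \tau < 1$.
Writing $e_\tau$ for the $\tau$-expectile, expectiles are obtained by solving
$$\tau = \frac{\int_{-\infty}^{e_\tau} |y - e_\tau|\, f_y(y)\,dy}{\int_{-\infty}^{\infty} |y - e_\tau|\, f_y(y)\,dy} = \frac{G_y(e_\tau) - e_\tau F_y(e_\tau)}{2\left(G_y(e_\tau) - e_\tau F_y(e_\tau)\right) + (e_\tau - \mu)}$$
where
- $f_y(\cdot)$ and $F_y(\cdot)$ denote the density and cumulative distribution function of $y$,
- $G_y(e) = \int_{-\infty}^{e} y\, f_y(y)\,dy$ is the partial moment function of $y$, and
- $G_y(\infty) = \mu$ is the expectation of $y$.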
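The four criteria above differ only in the weight $w_\tau$ and in the power applied to the residual. The sketch below, for the covariate-free case where $\eta$ is a single constant, minimises the weighted absolute loss to obtain a sample quantile and the weighted squared loss to obtain a sample expectile. The function names and data are hypothetical, and the expectile is found by repeatedly recomputing a weighted mean, which is one common way to solve this problem rather than necessarily the method used in the presentation.

```python
def asymmetric_weight(y, centre, tau):
    """Weight tau above the centre and 1 - tau at or below it."""
    return tau if y > centre else 1.0 - tau


def asymmetric_loss(sample, centre, tau, power):
    """Sum of w_tau * |y - centre| ** power over the sample."""
    return sum(
        asymmetric_weight(y, centre, tau) * abs(y - centre) ** power
        for y in sample
    )


def quantile_by_loss(sample, tau):
    """Minimise the weighted absolute loss (power 1).

    The loss is convex and piecewise linear in the centre, so a minimiser
    can always be found among the observed values.
    """
    candidates = sorted(set(sample))
    return min(candidates, key=lambda c: asymmetric_loss(sample, c, tau, 1))


def expectile_by_loss(sample, tau, tol=1e-12, max_iter=1000):
    """Minimise the weighted squared loss (power 2) by iterated weighted means."""
    centre = sum(sample) / len(sample)  # tau = 0.5 gives the ordinary mean
    for _ in range(max_iter):
        weights = [asymmetric_weight(y, centre, tau) for y in sample]
        updated = sum(w * y for w, y in zip(weights, sample)) / sum(weights)
        if abs(updated - centre) < tol:
            return updated
        centre = updated
    return centre


# Hypothetical right-skewed sample used only for illustration.
data = [1.0, 2.0, 3.0, 4.0, 10.0]

print("median   :", quantile_by_loss(data, 0.5))
print("mean     :", expectile_by_loss(data, 0.5))
print("quantile :", quantile_by_loss(data, 0.7))
print("expectile:", expectile_by_loss(data, 0.7))
```

With $\tau = 0.5$ the two functions return the median and the mean respectively, matching the first and third criteria in the list.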

Worked example - binomial data

[Figure: fitted smooth terms s(alt, df = c(2, 4, 2)) plotted against ALT]

[Figure: fitted smooth terms s(tmin, df = c(2, 4, 2)) plotted against TMIN]

[Figure: fitted smooth terms s(tmax, df = c(2, 4, 2)) plotted against TMAX]

References
Hudson, I. L., Rea, A., Dalrymple, M. L., Eilers, P. H. C. (2008). Climate impacts on sudden infant death syndrome: a GAMLSS approach. Proceedings of the 23rd International Workshop on Statistical Modelling, pp
Beyerlein, A., Fahrmeir, L., Mansmann, U., Toschke, A. M. (2008). Alternative regression models to assess increase in childhood BMI. BMC Medical Research Methodology, 8(59).
Smyth, G. K. (1989). Generalized linear models with varying dispersion. J. R. Statist. Soc. B, 51,
Sobotka, F., and Kneib, T. Geoadditive Expectile Regression. Computational Statistics and Data Analysis, doi: /j.csda


TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Economics 326 Methods of Empirical Research in Economics. Lecture 7: Con dence intervals

Economics 326 Methods of Empirical Research in Economics. Lecture 7: Con dence intervals Economics 326 Methods of Empirical Research in Economics Lecture 7: Con dence intervals Hiro Kasahara University of British Columbia December 24, 2014 Point estimation I Our model: 1 Y i = β 0 + β 1 X

More information

A Complete Spatial Downscaler

A Complete Spatial Downscaler A Complete Spatial Downscaler Yen-Ning Huang, Brian J Reich, Montserrat Fuentes 1 Sankar Arumugam 2 1 Department of Statistics, NC State University 2 Department of Civil, Construction, and Environmental

More information

Chapter 2. Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables

Chapter 2. Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables Chapter 2 Some Basic Probability Concepts 2.1 Experiments, Outcomes and Random Variables A random variable is a variable whose value is unknown until it is observed. The value of a random variable results

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression EdPsych 580 C.J. Anderson Fall 2005 Simple Linear Regression p. 1/80 Outline 1. What it is and why it s useful 2. How 3. Statistical Inference 4. Examining assumptions (diagnostics)

More information

IV Quantile Regression for Group-level Treatments, with an Application to the Distributional Effects of Trade

IV Quantile Regression for Group-level Treatments, with an Application to the Distributional Effects of Trade IV Quantile Regression for Group-level Treatments, with an Application to the Distributional Effects of Trade Denis Chetverikov Brad Larsen Christopher Palmer UCLA, Stanford and NBER, UC Berkeley September

More information

Model-free prediction intervals for regression and autoregression. Dimitris N. Politis University of California, San Diego

Model-free prediction intervals for regression and autoregression. Dimitris N. Politis University of California, San Diego Model-free prediction intervals for regression and autoregression Dimitris N. Politis University of California, San Diego To explain or to predict? Models are indispensable for exploring/utilizing relationships

More information

An Introduction to Bayesian Linear Regression

An Introduction to Bayesian Linear Regression An Introduction to Bayesian Linear Regression APPM 5720: Bayesian Computation Fall 2018 A SIMPLE LINEAR MODEL Suppose that we observe explanatory variables x 1, x 2,..., x n and dependent variables y 1,

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation

More information

Lec 1: An Introduction to ANOVA

Lec 1: An Introduction to ANOVA Ying Li Stockholm University October 31, 2011 Three end-aisle displays Which is the best? Design of the Experiment Identify the stores of the similar size and type. The displays are randomly assigned to

More information

Statistics II. Management Degree Management Statistics IIDegree. Statistics II. 2 nd Sem. 2013/2014. Management Degree. Simple Linear Regression

Statistics II. Management Degree Management Statistics IIDegree. Statistics II. 2 nd Sem. 2013/2014. Management Degree. Simple Linear Regression Model 1 2 Ordinary Least Squares 3 4 Non-linearities 5 of the coefficients and their to the model We saw that econometrics studies E (Y x). More generally, we shall study regression analysis. : The regression

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,

More information

Final Exam STAT On a Pareto chart, the frequency should be represented on the A) X-axis B) regression C) Y-axis D) none of the above

Final Exam STAT On a Pareto chart, the frequency should be represented on the A) X-axis B) regression C) Y-axis D) none of the above King Abdul Aziz University Faculty of Sciences Statistics Department Final Exam STAT 0 First Term 49-430 A 40 Name No ID: Section: You have 40 questions in 9 pages. You have 90 minutes to solve the exam.

More information

ECON 5350 Class Notes Review of Probability and Distribution Theory

ECON 5350 Class Notes Review of Probability and Distribution Theory ECON 535 Class Notes Review of Probability and Distribution Theory 1 Random Variables Definition. Let c represent an element of the sample space C of a random eperiment, c C. A random variable is a one-to-one

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Introduction. The Linear Regression Model One popular model is the linear regression model. It writes as :

Introduction. The Linear Regression Model One popular model is the linear regression model. It writes as : Introduction Definition From Wikipedia : The two main purposes of econometrics are to give empirical content to economic theory and to subject economic theory to potentially falsifying tests. Another popular

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Simple Linear Regression Analysis

Simple Linear Regression Analysis LINEAR REGRESSION ANALYSIS MODULE II Lecture - 6 Simple Linear Regression Analysis Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Prediction of values of study

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

Problem Set # 1. Master in Business and Quantitative Methods

Problem Set # 1. Master in Business and Quantitative Methods Problem Set # 1 Master in Business and Quantitative Methods Contents 0.1 Problems on endogeneity of the regressors........... 2 0.2 Lab exercises on endogeneity of the regressors......... 4 1 0.1 Problems

More information

Z score indicates how far a raw score deviates from the sample mean in SD units. score Mean % Lower Bound

Z score indicates how far a raw score deviates from the sample mean in SD units. score Mean % Lower Bound 1 EDUR 8131 Chat 3 Notes 2 Normal Distribution and Standard Scores Questions Standard Scores: Z score Z = (X M) / SD Z = deviation score divided by standard deviation Z score indicates how far a raw score

More information

MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability

MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability STA301- Statistics and Probability Solved MCQS From Midterm Papers March 19,2012 MC100401285 Moaaz.pk@gmail.com Mc100401285@gmail.com PSMD01 MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability

More information

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different

More information