Using R in Undergraduate Probability and Mathematical Statistics Courses. Amy G. Froelich Department of Statistics Iowa State University

Similar documents
Using R in Undergraduate and Graduate Probability and Mathematical Statistics Courses*

STAT 135 Lab 3 Asymptotic MLE and the Method of Moments

Probability and Estimation. Alan Moses

Practice Problems Section Problems

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood

Chapter 6: Functions of Random Variables

A General Overview of Parametric Estimation and Inference Techniques.

Regression Models - Introduction

ACTEX CAS EXAM 3 STUDY GUIDE FOR MATHEMATICAL STATISTICS

Regression Estimation Least Squares and Maximum Likelihood

Module 6: Methods of Point Estimation Statistics (OA3102)

Multivariate Regression: Part I

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables

Stat 5101 Lecture Notes

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45

ECE531 Lecture 10b: Maximum Likelihood Estimation

Introduction to Maximum Likelihood Estimation

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Section 5.4. Ken Ueda

5601 Notes: The Sandwich Estimator

Solution-Midterm Examination - I STAT 421, Spring 2012

A Primer on Statistical Inference using Maximum Likelihood

Sampling Distributions

Statistics 511 Additional Materials

STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method

Mathematical statistics

CS 361: Probability & Statistics

STATISTICS/ECONOMETRICS PREP COURSE PROF. MASSIMO GUIDOLIN

GARCH Models Estimation and Inference

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1

MATH4427 Notebook 2 Fall Semester 2017/2018

Financial Econometrics and Volatility Models Extreme Value Theory

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

From Histograms to Multivariate Polynomial Histograms and Shape Estimation. Assoc Prof Inge Koch

Sampling Distributions of Statistics Corresponds to Chapter 5 of Tamhane and Dunlop

Bayesian Regression Linear and Logistic Regression

Exponential Families

The Fundamentals of Heavy Tails Properties, Emergence, & Identification. Jayakrishnan Nair, Adam Wierman, Bert Zwart

Subject CS1 Actuarial Statistics 1 Core Principles

Multiple Linear Regression for the Supervisor Data

Why Bayesian? Rigorous approach to address statistical estimation problems. The Bayesian philosophy is mature and powerful.

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

Statistical Inference with Regression Analysis

Problem set 1: answers. April 6, 2018

A Course in Statistical Theory

Department of Statistical Science FIRST YEAR EXAM - SPRING 2017

STAB57: Quiz-1 Tutorial 1 (Show your work clearly) 1. random variable X has a continuous distribution for which the p.d.f.

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology Kharagpur

Estimation of Truncated Data Samples in Operational Risk Modeling

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems


review session gov 2000 gov 2000 () review session 1 / 38

Basic Concepts of Inference

CS 361: Probability & Statistics

Breaking Free from Traditional Calculus Textbooks with Mathematica

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Supporting Australian Mathematics Project. A guide for teachers Years 11 and 12. Probability and statistics: Module 25. Inference for means

STAT 512 sp 2018 Summary Sheet

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, 2013

Review. December 4 th, Review

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University

Institute of Actuaries of India

New Statistical Methods That Improve on MLE and GLM Including for Reserve Modeling GARY G VENTER

Random and mixed effects models

Investigation of Possible Biases in Tau Neutrino Mass Limits

HANDBOOK OF APPLICABLE MATHEMATICS

5.2 Fisher information and the Cramer-Rao bound

Linear Models in Econometrics

Aim #92: How do we interpret and calculate deviations from the mean? How do we calculate the standard deviation of a data set?

Estimation MLE-Pandemic data MLE-Financial crisis data Evaluating estimators. Estimation. September 24, STAT 151 Class 6 Slide 1

Stat 5102 Final Exam May 14, 2015

Statistical inference

Lecture 17: The Exponential and Some Related Distributions

Regression Models - Introduction

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf

Inferences for Regression

INFERENCE FOR MULTIPLE LINEAR REGRESSION MODEL WITH EXTENDED SKEW NORMAL ERRORS

1. Simple Linear Regression

Reminders. Thought questions should be submitted on eclass. Please list the section related to the thought question

Copula Regression RAHUL A. PARSA DRAKE UNIVERSITY & STUART A. KLUGMAN SOCIETY OF ACTUARIES CASUALTY ACTUARIAL SOCIETY MAY 18,2011

Stat 139 Homework 2 Solutions, Spring 2015

Part 4: Multi-parameter and normal models

Lecture 6. Probability events. Definition 1. The sample space, S, of a. probability experiment is the collection of all

STATISTICS 4, S4 (4769) A2

Design of Engineering Experiments

Theory of Maximum Likelihood Estimation. Konstantin Kashin

ST495: Survival Analysis: Hypothesis testing and confidence intervals

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk

Statistics II Lesson 1. Inference on one population. Year 2009/10

COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION

Lecture 4: Multivariate Regression, Part 2

Primer on statistics:

Integration by Undetermined Coefficients

Statistics 3858 : Maximum Likelihood Estimators

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ).

Robustness and Distribution Assumptions

Spring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n =

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

Introduction to Probabilistic Machine Learning

Transcription:

Using R in Undergraduate Probability and Mathematical Statistics Courses Amy G. Froelich Department of Statistics Iowa State University

Undergraduate Probability and Mathematical Statistics at Iowa State Introduction to Probability (Calculus-Based) Probability Discrete Random Variables Continuous Random Variables Multivariate Distributions Introductory to Mathematical Statistics (Calculus-Based) Transformations of Random Variables and Sampling Distributions Mathematical Statistics and Inference From Theory to Practice and Back Textbook: Mathematical Statistics with Applications by Wackerly,, Mendenhall & Scheaffer

Why Use R in Prob/Math Stat? Curriculum: Exposure to R Content: Model nature of statistics Pedagogy: Ground theoretical concepts in understanding obtained from earlier applied statistics courses.

Gamma Distribution Distribution Characteristics Distribution of Sample Mean Estimation of Parameters

Distribution Characteristics Study influence of parameters on Shape Mean Variance Percentiles Study through Observed distributions through simulation Theoretical distributions

Distribution Characteristics Simulation Study for Shape Parameters (alpha = 2, 10, 25, 50; beta = 2) 10,000 observations

Distribution of Sample Mean What is the distribution of the sample mean from a random sample of gamma random variables? Distribution of sum of gamma random variables. Introduce concept through simulation. Characteristics of observed simulation. Intuition about distribution.

Distribution of Sum of Gamma R.V.s Simulation Example Parameters (alpha = 2, beta = 2) Sample Size (n = 25)

Distribution of Sum of Gamma R.V.s Observed Distribution Very Slightly Skewed Right Observed mean = 99.77758 Observed variance = 196.796 Intuition about Distribution Gamma distribution? Parameters? alpha = 50, beta = 2

Distribution of Sample Mean Distribution of sample mean Connection to distribution of sum. Introduce concept through simulation. Characteristics of observed distribution. Intuition about distribution. Introduce Central Limit Theorem.

Distribution of Sample Mean Simulation Example Parameters (alpha = 2, beta = 2) Sample Size (n = 25)

Distribution of Sample Mean Observed Distribution Slightly Skewed Right Distribution Observed Mean = 3.991103 Observed Variance = 0.3148736 Intuition about distribution Gamma Distribution? Parameters? alpha = 50 beta = 2/25 Normal Distribution?

Parameter Estimation Two Methods Method of Moments Maximum Likelihood Properties of Estimators Distribution (Shape, Center, Spread) Bias Mean Square Error (MSE) Mean Absolute Deviation (MAD)

Method of Moments Introduce concept through previous work. Theoretically derive formulas for MOM estimators. Study properties of MOM estimators through simulation. Characteristics of observed distributions. Introduce concepts of observed bias, MSE, MAD.

Method of Moments Simulation Example Parameter Values (alpha = 2, beta = 2) Sample Size (n = 5, 15, 25, 50, 75, 100)

α β Sample Size Mean Variance Mean Variance 5 5.072479 61.87657 1.443441 1.278346 15 2.618371 1.35053 1.803246 0.7176061 25 2.365998 0.6300843 1.877318 0.474249 50 2.178479 0.2695496 1.938572 0.2601572 75 2.127446 0.1724364 1.953107 0.1791825 100 2.086805 0.1228160 1.971858 0.1325997

Maximum Likelihood Introduce concept through simulated data. Theoretically derive the ML Estimators. Closed form solution not possible. Maximization scheme needed (ex. Newton s s Method.) Study properties of ML Estimators through simulation. Characteristics of observed distribution. Introduce concepts of observed bias, MSE, MAD.

Maximum Likelihood Simulation Example Parameter Values (alpha = 2, beta = 2) Sample Size (n = 5, 15, 25, 50, 75, 100)

α β Sample Size Mean Variance Mean Variance 5 4.729084 62.63001 1.60065 1.447492 15 2.448018 1.040122 1.864778 0.5549284 25 2.238958 0.4268584 1.924708 0.3371712 50 2.112406 0.1702163 1.959108 0.1746927 75 2.079821 0.1055121 1.968891 0.1168963 100 2.056419 0.07579958 1.979734 0.08670047

Comparison of MOM and ML Simulation Example Estimators Parameter Values (alpha = 2, beta = 2) Sample Sizes (n = 10, 50, 100) Characteristics of observed distribution. Mean (bias), Var,, MSE, MAD.

α Obs. Mean Obs. Var. Sample Size MOM MLE MOM MLE n = 10 3.029221 2.786177 3.266518 2.795095 n = 50 2.172688 2.105872 0.257089 0.1637300 n = 100 2.097151 2.060918 0.129137 0.0784782

α Obs. MSE Obs. MAD Sample Size MOM MLE MOM MLE n = 10 4.32549 3.41289 1.295606 1.087239 n = 50 0.286885 0.174923 0.409188 0.3190955 n = 100 0.138562 0.082181 0.289018 0.221859

β Obs. Mean Obs. Var. Sample Size MOM MLE MOM MLE n = 10 1.695161 1.791985 0.893835 0.786961 n = 50 1.94285 1.965812 0.260478 0.173177 n = 100 1.962924 1.974595 0.137522 0.089519

β Obs. MSE Obs. MAD Sample Size MOM MLE MOM MLE n = 10 0.986672 0.830152 0.801810 0.739329 n = 50 0.263718 0.174328 0.402426 0.333662 n = 100 0.138883 0.090155 0.295451 0.2393529

Comparison of MOM and ML Estimators Emphasizing observed nature of comparison. Emphasizing limitations due to Sample size Parameter Values Emphasizing asymptotic properties of MOM and ML Estimators.

Possible Implementation Issues Textbook No textbook available matching this approach, yet! Using R Need lots of supplements for students. Student expectations from course Ease them in early and often Departmental Support Not a problem at Iowa State.

Student Reactions to R Course evaluations Fall 2005. Did you like using R? Was it helpful to you? Did you find it easy or difficult to use? 34 responses 32 positive; 1 indifferent; 1 negative comments to using R.

Selected Student Comments I I liked using R. I found it easy with the examples given in class and the handouts clearly explaining how to use it. I I liked using R, it was helpful in doing the difficult distribution problems. I also liked it because we could construct histograms to visual(ize) ) the distributions. I I liked using R, easy to use, somewhat difficult to know what to type in. Continue using it in 341 since it s s used in the real world. I I thought R was very helpful and easy to use In most cases, the professor s s instructions made the program very understandable. It was nice to use R because it seems really versatile and I like the easy to understand interface. I I think R was very helpful, because we didn t t have to calculate everything by hand. It was easy to use too, because we got a good explanation about it.