Monte Carlo Studies
1 Monte Carlo Studies

The response in a Monte Carlo study is a random variable. Its variance comes from the variance of the stochastic elements in the data-generating process.

At each scenario, a Monte Carlo study generates multiple realizations of the response, and we use aggregations (means, etc.) of those realizations. How many realizations? That is the Monte Carlo sample size, denoted m.

Because a Monte Carlo study of a statistical method involves sample sizes at two levels (the sample sizes used by the method under study and the Monte Carlo sample size m), we must be clear on what affects the variances of the aggregates (means, etc.).
2 Example: Variances in a Monte Carlo Study of a Statistical Hypothesis Test

Consider the two-sample t test for equality of means,
H_0: μ_1 = μ_2 vs. H_1: μ_1 ≠ μ_2,
with sample sizes n_1 and n_2.

When is the test valid, most powerful, etc. (all the good things you can say about a test)?

What about other cases:
1) N(μ_1, σ_1^2) vs. N(μ_2, σ_2^2)
2) N(μ, σ^2) vs. Distribution 2
3) Distribution 1 vs. Distribution 2?

Lots of scenarios. What are they?
3 Variances in the Monte Carlo Study of a Statistical Hypothesis Test

For a given scenario s, we generate m_s pairs of samples, perform the test, and add to the count r of rejections.

Monte Carlo estimate of the rejection probability β(s):
β̂(s) = r / m_s.

In statistics, when we give an estimate, we also give an estimate of the variance of the estimator. What is an estimate of the variance of β̂(s)?
4 Estimates of Variances in the Monte Carlo Study

We can use the sample variance or, because r is binomial, the estimate
β̂(s)(1 − β̂(s)) / m_s.

What is the point here? The standard deviation of β̂(s) is O(m_s^(−1/2)). We choose m_s. How?
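As a concrete sketch of the rejection counting and the binomial variance estimate (the scenario, sample sizes, and m_s below are illustrative choices, not fixed by the notes), in R:

```r
# Monte Carlo estimate of the rejection probability beta(s) of the
# two-sample t test at one scenario s: N(0,1) vs. N(1,1), n1 = n2 = 10.
set.seed(42)
ms <- 1000                     # Monte Carlo sample size m_s
r  <- 0                        # count of rejections
for (k in 1:ms) {
  y1 <- rnorm(10, mean = 0)
  y2 <- rnorm(10, mean = 1)
  if (t.test(y1, y2, var.equal = TRUE)$p.value < 0.05) r <- r + 1
}
betahat <- r / ms                              # estimate of beta(s)
se      <- sqrt(betahat * (1 - betahat) / ms)  # binomial standard error
c(betahat = betahat, se = se)
```

With m_s = 1000 the standard error is on the order of 0.015, which is what the O(m_s^(−1/2)) rate predicts; quadrupling m_s only halves it.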
5 Preliminaries (Mostly from Chapter 1)

- Data structures and structure in data
- Multiple analyses and multiple views
- Modeling and computational inference
- Probability models
- The role of the empirical cumulative distribution function
- Statistical functions of the CDF and the ECDF
- Plug-in estimators
- Order statistics, quantiles, and empirical quantiles
- The role of optimization in inference *** carry over to next lecture
- Estimation by minimizing residuals
- Estimation by maximum likelihood
- Inference about functions
- Probability statements in statistical inference
- Arithmetic on the computer
6 As we go along, we will encounter important concepts in statistical inference: sufficiency, unbiasedness, mean-squared error, etc.
7 Data-Generating Processes and Statistical Models

Our understanding of phenomena is facilitated by means of a model. A model is a description of the phenomenon of interest. We can formulate a model either as a description of a data-generating process or as a prescription for processing data.

The model is often expressed as a set of equations that relate data elements to each other. It may include probability distributions for the data elements. If any of the data elements are considered to be realizations of random variables, the model is a stochastic model.
8 Models

A class of models may have a common form within which the members of the class are distinguished by values of parameters.

In models that are not mathematically tractable, computationally intensive methods involving simulations, resamplings, and multiple views may be used to make inferences about the parameters of a model.
9 Structure in Data

The components of statistical datasets are observations and variables. In general, data structures are ways of organizing data to take advantage of the relationships among the variables constituting the dataset. Data structures may express hierarchical relationships, crossed relationships (as in relational databases), or more complicated aspects of the data (as in object-oriented databases).

In data analysis, structure in the data is of interest.
10 Structure in Data

Structure in the data includes such nonparametric features as modes, gaps, or clusters in the data, the symmetry of the data, and other general aspects of the shape of the data. Because many classical techniques of statistical analysis rely on an assumption of normality of the data, the most interesting structure in the data may be those aspects of the data that deviate most from normality.

Graphical displays may be used to discover qualitative structure in the data.
11 Model Building

The process of building models involves successive refinements. A model evolves from a vague, tentative form to a more complete one, and our understanding of the process being modeled grows along the way.

The usual statements about statistical methods regarding bias, variance, and so on are made in the context of a model.
12 Model Building

It is not possible to measure the bias or variance of a procedure for selecting a model, except in the relatively simple case of selection from some well-defined and simple set of possible models. Only within the context of rigid assumptions (a "metamodel") can we do a precise statistical analysis of model selection.

Even the simple case of selection of variables in linear regression analysis under the usual assumptions about the distribution of residuals (and this is a highly idealized situation) presents more problems to the analyst than are generally recognized.
13 Descriptive Statistics, Inferential Statistics, and Model Building

We can distinguish statistical activities that involve:
- data collection;
- descriptions of a given dataset;
- inference within the context of a model or family of models; and
- model selection.
14 Once data are available, whether from a survey, a designed experiment, or simply observation, a statistical analysis begins by considering general descriptions of the dataset. These descriptions include ensemble characteristics, such as averages and spreads, and the identification of extreme points. The descriptions take the form of various summary statistics and graphical displays. The descriptive analyses may be computationally intensive for large datasets, especially if there are a large number of variables.
15 Computational Statistics

The computationally intensive approach also involves multiple views of the data, including consideration of various transformations of the data.

A stochastic model is often expressed as a probability density function or as a cumulative distribution function of a random variable. In a simple linear regression model with normal errors,
Y = β_0 + β_1 x + E,
for example, the model may be expressed by use of the probability density function for the random variable E. The probability density function for Y is
p(y) = (1 / (√(2π) σ)) e^(−(y − β_0 − β_1 x)^2 / (2σ^2)).

The elements of a stochastic model include observable random variables, observable covariates, unobservable parameters, and constants.
16 Statistical Models

The parameters may be considered to be unobservable random variables, and in that sense, a specific data model is defined by a realization of the parameter random variable.

In the model, written as
Y = f(x; β) + E,
we identify a systematic component, f(x; β), and a random component, E.

The selection of an appropriate model may be very difficult, and almost always involves not only questions of how well the model corresponds to the observed data, but also the tractability of the model. The methods of computational statistics allow a much wider range of tractability than can be contemplated in mathematical statistics.
17 Classical Statistical Inference

Formal statistical inference involves the use of a sample to make decisions about stochastic models based on the probabilities that would result if a given model were indeed the data-generating process. Estimation. Testing.

The heuristic paradigm calls for rejection of a model if the probability is small that data arising from the model would be similar to the observed sample.

In either case, classical statistical inference may use asymptotic approximations. Asymptotic inference.
18 Computational Inference

Computationally intensive methods include exploration of a range of models, many of which may be mathematically intractable.

In a different approach employing the same paradigm, the statistical methods may involve direct simulation of the hypothesized data-generating process rather than formal computations of the probabilities that would result under a given model of the data-generating process. We refer to this approach as computational inference.

In a variation of computational inference, we may not even attempt to develop a model of the data-generating process; rather, we build decision rules directly from the data.
19 The Empirical Cumulative Distribution Function

Methods of statistical inference are based on an assumption (often implicit) that a discrete uniform distribution with mass points at the observed values of a random sample is asymptotically the same as the distribution governing the data-generating process. Thus, the distribution function of this discrete uniform distribution is a model of the distribution function of the data-generating process.

For a given set of univariate data, y_1, ..., y_n, the empirical cumulative distribution function, or ECDF, is
P_n(y) = #{y_i, s.t. y_i ≤ y} / n.

The ECDF is the basic function used in many methods of computational inference.
20 The Empirical Cumulative Distribution Function

It is easy to see that the ECDF is pointwise unbiased for the CDF. That is, if the y_i are independent realizations of random variables Y_i, each with CDF P(·), then for a given y,

E(P_n(y)) = E( (1/n) Σ_{i=1}^n I_(−∞,y](Y_i) )
          = (1/n) Σ_{i=1}^n E( I_(−∞,y](Y_i) )
          = Pr(Y ≤ y)
          = P(y).
21 So E(P_n(y)) = P(y). Similarly, we find
V(P_n(y)) = P(y)(1 − P(y))/n;
indeed, at a fixed point y, nP_n(y) is a binomial random variable with parameters n and π = P(y). See Exercise 1.2.

Because P_n is a function of the order statistics, which form a complete sufficient statistic for P, there is no unbiased estimator of P(y) with smaller variance.
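Both facts are easy to check by simulation; a minimal R sketch (the choices y = 0, n = 20, and 5000 replicates are arbitrary), using that P(0) = 0.5 for the N(0, 1) distribution:

```r
# Simulate Pn(0) over many samples of size n from N(0,1):
# its mean should be near P(0) = 0.5 and its variance near
# P(0)(1 - P(0))/n = 0.25/20 = 0.0125.
set.seed(1)
n <- 20
Pn0 <- replicate(5000, mean(rnorm(n) <= 0))  # Pn(0) for each sample
mean(Pn0)   # close to 0.5
var(Pn0)    # close to 0.0125
```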
22 The Empirical Probability Density Function

We also define the empirical probability density function (EPDF) as the derivative of the ECDF:
p_n(y) = (1/n) Σ_{i=1}^n δ(y − y_i),
where δ is the Dirac delta function.

The EPDF is just a series of spikes at points corresponding to the observed values. It is not as useful as the ECDF. It is, however, unbiased at any point for the probability density function at that point.

The ECDF and the EPDF can be used as estimators of the corresponding population functions, but there are better estimators.
23 Statistical Functions of the CDF and the ECDF

In many models of interest, a parameter can be expressed as a functional of the probability density function or of the cumulative distribution function of a random variable in the model. The mean of a distribution, for example, can be expressed as a functional Θ of the CDF P:
Θ(P) = ∫_{ℝ^d} y dP(y).

A functional that defines a parameter is called a statistical function.
24 Estimation of Statistical Functions

A common task in statistics is to use a random sample to estimate the parameters of a probability distribution. If the statistic T from a random sample is used to estimate the parameter θ, we measure the performance of T by the magnitude of the bias,
‖E(T) − θ‖,
by the variance,
V(T) = E( (T − E(T)) (T − E(T))^T ),
by the mean squared error,
E( (T − θ)^T (T − θ) ),
and by other expected values of measures of the distance from T to θ.
25 Properties of Estimators

The order of the mean squared error is an important characteristic of an estimator. For good estimators of location, the order of the mean squared error is typically O(n^(−1)). Good estimators of probability densities, however, typically have mean squared errors of at least order O(n^(−4/5)).
26 Estimation Using the ECDF

There are many ways to construct an estimator and to make inferences about the population. In the univariate case especially, we often use data to make inferences about a parameter by applying the statistical function to the ECDF. An estimator of a parameter that is defined in this way is called a plug-in estimator.

A plug-in estimator for a given parameter is the same functional of the ECDF as the parameter is of the CDF.
27 Plug-In Estimators

For the mean of the model, for example, we use the estimate that is the same functional of the ECDF as the population mean:

Θ(P_n) = ∫ y dP_n(y)
       = ∫ y d( (1/n) Σ_{i=1}^n I_(−∞,y](y_i) )
       = (1/n) Σ_{i=1}^n ∫ y dI_(−∞,y](y_i)
       = (1/n) Σ_{i=1}^n y_i
       = ȳ.

The sample mean is thus a plug-in estimator of the population mean.
28 Plug-In Estimators

An estimator such as the sample mean is called a method of moments estimator. Method of moments estimators are an important type of plug-in estimator. The method of moments results in estimates of the parameters E(Y^r) that are the corresponding sample moments.

Statistical properties of plug-in estimators are generally relatively easy to determine. In some cases, the statistical properties, such as expectation and variance, are optimal in some sense.
29 Estimation Using the ECDF

In addition to estimation based on the ECDF, other methods of computational statistics make use of the ECDF. In some cases, such as in bootstrap methods, the ECDF is a surrogate for the CDF. In other cases, such as in Monte Carlo methods, an ECDF for an estimator is constructed by repeated sampling, and that ECDF is used to make inferences using the observed value of the estimator from the given sample.

Use of the ECDF in statistical inference does not require many assumptions about the distribution.
30 Estimation Using the ECDF

Viewed as a statistical function, Θ denotes a specific functional form. Any functional of the ECDF is a function of the data, so we may also use the notation Θ(Y_1, ..., Y_n). Often, however, the notation is cleaner if we use another letter to denote the function of the data; for example, T(Y_1, ..., Y_n), even if it might be the case that T(Y_1, ..., Y_n) = Θ(P_n).
31 Quantiles

A useful distributional measure for describing a univariate distribution with CDF P is a quantity y_π, for π ∈ (0, 1), such that
Pr(Y ≤ y_π) ≥ π, and Pr(Y ≥ y_π) ≥ 1 − π.
This quantity is called a π quantile.

For an absolutely continuous distribution with CDF P,
y_π = P^(−1)(π).
If P is not absolutely continuous, or in the case of a multivariate random variable, y_π in this equation may not be unique.
32 The Quantile Function

In the case of a univariate random variable, we can define a useful concept of quantile that always exists. For a probability distribution with CDF P, we define the function P^(−1) on the open interval (0, 1) as
P^(−1)(π) = inf{x, s.t. P(x) ≥ π}.
We call P^(−1) the quantile function.

Notice that if P is strictly increasing, the quantile function is the ordinary inverse of the cumulative distribution function. If P is not strictly increasing, the quantile function can be interpreted as a generalized inverse of the cumulative distribution function.

Notice that for the random variable X with CDF P, if x_(π) = P^(−1)(π), then x_(π) is the π quantile of X as above.
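For a concrete (hypothetical) example of the generalized inverse, take a fair die: the CDF is a step function, so P^(−1)(π) = inf{x, s.t. P(x) ≥ π} is not an ordinary inverse. A minimal R sketch:

```r
# CDF of a fair die at x = 1, ..., 6 is P(x) = x/6.
# The quantile function picks the smallest x with P(x) >= prob.
Pinv <- function(prob) min(which((1:6)/6 >= prob))
Pinv(0.5)    # 3: P(3) = 0.5 is the smallest x with P(x) >= 0.5
Pinv(0.51)   # 4: the CDF first reaches 0.51 at x = 4
```

Note how every prob in the interval (1/3, 1/2] maps to the same point x = 3; the step CDF is what makes the inverse "generalized."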
33 Quantiles

For a univariate distribution, we can define a unique π quantile as a weighted average of values around y_π̃, where π̃ ≈ π and P(y_π̃) = π̃.

It is clear that y_π is a functional of the CDF, say Ξ_π(P). The functional is very simple. It is
Ξ_π(P) = P^(−1)(π),
where P^(−1) is the quantile function.

For a univariate random variable, the π quantile is a single point. For a d-variate random variable, a similar definition leads to a (d−1)-dimensional object that is generally nonunique. (Quantiles are not so useful in the case of multivariate distributions.)
34 Empirical Quantiles

For a given sample of size n, the order statistics y_(1), ..., y_(n) constitute an obvious set of empirical quantiles. The probabilities from the ECDF that are associated with the order statistic y_(i) are i/n. But these lead to a probability of 1 for the largest sample value, y_(n), and a probability of 1/n for the smallest sample value, y_(1).
35 Distribution of Order Statistics

If Y_(1), ..., Y_(n) are the order statistics in a random sample of size n from a distribution with PDF p_Y(·) and CDF P_Y(·), then the PDF of the i-th order statistic is
p_{Y_(i)}(y_(i)) = ( n! / ((i−1)! (n−i)!) ) (P_Y(y_(i)))^(i−1) p_Y(y_(i)) (1 − P_Y(y_(i)))^(n−i).

Interestingly, the order statistics from a U(0, 1) distribution have beta distributions.
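The beta-distribution fact can be checked by simulation; a minimal R sketch (n = 10, i = 3, and the number of replicates are arbitrary choices), using that the i-th order statistic from U(0, 1) has a Beta(i, n − i + 1) distribution with mean i/(n + 1):

```r
# U_(3) from samples of n = 10 uniforms should follow Beta(3, 8),
# whose mean is 3/11 and variance 3*8/(11^2 * 12).
set.seed(7)
u3 <- replicate(10000, sort(runif(10))[3])
mean(u3)   # close to 3/11 = 0.2727...
var(u3)    # close to 24/1452 = 0.0165...
```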
36 Estimation of Quantiles

Empirical quantiles can be used as estimators of the population quantiles, but there are generally other estimators that are better, as we can deduce from basic properties of statistical inference.

The first thing that we note is that the extreme order statistics have very large variances if the support of the underlying distribution is infinite. We would therefore not expect them alone to be the best estimators of an extreme quantile unless the support is finite.

A fundamental principle of statistical inference is that a sufficient statistic should be used, if one is available.
37 Estimation of Quantiles

No order statistic alone is sufficient, except for the minimum or maximum order statistic in the case of a distribution with finite support. The set of all order statistics, however, is always sufficient. Because of the Rao-Blackwell theorem, this leads us to expect that some combination of order statistics would be a better estimator of any population quantile than a single order statistic.
38 The Harrell-Davis Estimator

The Harrell-Davis estimator uses a weighted combination of all order statistics, where the weights are from a beta distribution. This comes from the fact that for any continuous CDF P, if Y is a random variable from the distribution with CDF P, then U = P(Y) has a U(0, 1) distribution, and the order statistics from a uniform have beta distributions. See Exercise 1.7 in revised Chapter 1.
39 The Harrell-Davis Estimator

The Harrell-Davis estimator for the π quantile uses the beta distribution with parameters π(n + 1) and (1 − π)(n + 1). Let P_βπ(·) be the CDF of the beta distribution with those parameters. The Harrell-Davis estimator for the π quantile is
ŷ_π = Σ_{i=1}^n w_i y_(i),
where
w_i = P_βπ(i/n) − P_βπ((i − 1)/n).
40 Monte Carlo Study of the Harrell-Davis Estimator

Let's conduct an empirical study of the relative performance of the sample median and the Harrell-Davis estimator as estimators of the population median.

First, write a function to compute this estimator for any given sample size and given probability. For example, in R:

hd <- function(y, p) {
  n <- length(y)
  a <- p * (n + 1)
  b <- (1 - p) * (n + 1)
  q <- sum(sort(y) * (pbeta((1:n)/n, a, b) - pbeta((0:(n-1))/n, a, b)))
  q
}
41 Monte Carlo Study of the Harrell-Davis Estimator

Use samples of size 25, and use 1000 Monte Carlo replicates. For each replicate, generate a pseudorandom sample of size 25, compute the two estimators of the median, and obtain the squared error, using the known population value of the median. Use normal, Cauchy, and gamma distributions. The average of the squared errors over the 1000 replicates is your Monte Carlo estimate of the MSE.

Summarize your findings in a clearly written report. What are the differences in relative performance of the sample median and the Harrell-Davis quantile estimator as estimators of the population median? What characteristics of the population seem to have an effect on the relative performance?
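One way to set up the study in R, shown here for the normal case only (the seed is an arbitrary choice, and the hd() function from the previous slide is repeated so the sketch is self-contained):

```r
# Harrell-Davis estimator of the p quantile.
hd <- function(y, p) {
  n <- length(y)
  a <- p * (n + 1)
  b <- (1 - p) * (n + 1)
  sum(sort(y) * (pbeta((1:n)/n, a, b) - pbeta((0:(n-1))/n, a, b)))
}

# Monte Carlo comparison for N(0, 1), whose median is 0.
set.seed(123)
m <- 1000; n <- 25
se.med <- se.hd <- numeric(m)
for (k in 1:m) {
  y <- rnorm(n)
  se.med[k] <- median(y)^2   # squared error of the sample median
  se.hd[k]  <- hd(y, 0.5)^2  # squared error of the Harrell-Davis estimator
}
mean(se.med)   # Monte Carlo estimate of the MSE of the sample median
mean(se.hd)    # Monte Carlo estimate of the MSE of the HD estimator
```

The Cauchy and gamma cases follow the same pattern with rcauchy() and rgamma() in place of rnorm(), using the known population median of each distribution in the squared-error computation.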
42 Statistical Estimation

Use data and a model. The plug-in estimators are based on the simple principle of applying the defining functional to the ECDF.

Other methods of estimation:
- minimize residuals from a fitted model, e.g., least squares
- maximize likelihood (what is likelihood?)

These involve optimization.

**** Next week we'll pick up here...
More informationSTA 2201/442 Assignment 2
STA 2201/442 Assignment 2 1. This is about how to simulate from a continuous univariate distribution. Let the random variable X have a continuous distribution with density f X (x) and cumulative distribution
More information1. Point Estimators, Review
AMS571 Prof. Wei Zhu 1. Point Estimators, Review Example 1. Let be a random sample from. Please find a good point estimator for Solutions. There are the typical estimators for and. Both are unbiased estimators.
More informationIntroduction to Bayesian Methods
Introduction to Bayesian Methods Jessi Cisewski Department of Statistics Yale University Sagan Summer Workshop 2016 Our goal: introduction to Bayesian methods Likelihoods Priors: conjugate priors, non-informative
More informationPreliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference
1 / 172 Bootstrap inference Francisco Cribari-Neto Departamento de Estatística Universidade Federal de Pernambuco Recife / PE, Brazil email: cribari@gmail.com October 2014 2 / 172 Unpaid advertisement
More informationInformation in Data. Sufficiency, Ancillarity, Minimality, and Completeness
Information in Data Sufficiency, Ancillarity, Minimality, and Completeness Important properties of statistics that determine the usefulness of those statistics in statistical inference. These general properties
More information3. Linear Regression With a Single Regressor
3. Linear Regression With a Single Regressor Econometrics: (I) Application of statistical methods in empirical research Testing economic theory with real-world data (data analysis) 56 Econometrics: (II)
More informationRegression Models - Introduction
Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,
More informationMachine Learning CSE546 Carlos Guestrin University of Washington. September 30, 2013
Bayesian Methods Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2013 1 What about prior n Billionaire says: Wait, I know that the thumbtack is close to 50-50. What can you
More informationNotes on the Multivariate Normal and Related Topics
Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationAP Statistics Cumulative AP Exam Study Guide
AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationPart IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More informationChapter 2: Fundamentals of Statistics Lecture 15: Models and statistics
Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics Data from one or a series of random experiments are collected. Planning experiments and collecting data (not discussed here). Analysis:
More informationMathematical Statistics
Mathematical Statistics Chapter Three. Point Estimation 3.4 Uniformly Minimum Variance Unbiased Estimator(UMVUE) Criteria for Best Estimators MSE Criterion Let F = {p(x; θ) : θ Θ} be a parametric distribution
More informationLeast Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions
Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error
More informationMonte Carlo Simulations
Monte Carlo Simulations What are Monte Carlo Simulations and why ones them? Pseudo Random Number generators Creating a realization of a general PDF The Bootstrap approach A real life example: LOFAR simulations
More informationLectures on Simple Linear Regression Stat 431, Summer 2012
Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population
More informationNon-parametric Inference and Resampling
Non-parametric Inference and Resampling Exercises by David Wozabal (Last update. Juni 010) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend surfing
More informationA union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling
A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling Min-ge Xie Department of Statistics, Rutgers University Workshop on Higher-Order Asymptotics
More informationConfidence intervals for kernel density estimation
Stata User Group - 9th UK meeting - 19/20 May 2003 Confidence intervals for kernel density estimation Carlo Fiorio c.fiorio@lse.ac.uk London School of Economics and STICERD Stata User Group - 9th UK meeting
More informationSequential Importance Sampling for Rare Event Estimation with Computer Experiments
Sequential Importance Sampling for Rare Event Estimation with Computer Experiments Brian Williams and Rick Picard LA-UR-12-22467 Statistical Sciences Group, Los Alamos National Laboratory Abstract Importance
More informationMathematical statistics
October 4 th, 2018 Lecture 12: Information Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter
More informationFirst Year Examination Department of Statistics, University of Florida
First Year Examination Department of Statistics, University of Florida August 20, 2009, 8:00 am - 2:00 noon Instructions:. You have four hours to answer questions in this examination. 2. You must show
More informationLecture 3: Statistical Decision Theory (Part II)
Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Hao Helen Zhang Lecture 3: Statistical Decision Theory (Part II) 1 / 27 Outline of This Note Part I: Statistics Decision Theory (Classical
More informationUniversity of California San Diego and Stanford University and
First International Workshop on Functional and Operatorial Statistics. Toulouse, June 19-21, 2008 K-sample Subsampling Dimitris N. olitis andjoseph.romano University of California San Diego and Stanford
More informationLecture 12 November 3
STATS 300A: Theory of Statistics Fall 2015 Lecture 12 November 3 Lecturer: Lester Mackey Scribe: Jae Hyuck Park, Christian Fong Warning: These notes may contain factual and/or typographic errors. 12.1
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationMath 494: Mathematical Statistics
Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/
More informationBrandon C. Kelly (Harvard Smithsonian Center for Astrophysics)
Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming
More informationReview of Statistics
Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and
More informationISyE 6644 Fall 2014 Test 3 Solutions
1 NAME ISyE 6644 Fall 14 Test 3 Solutions revised 8/4/18 You have 1 minutes for this test. You are allowed three cheat sheets. Circle all final answers. Good luck! 1. [4 points] Suppose that the joint
More informationAnswers and expectations
Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E
More informationDefect Detection using Nonparametric Regression
Defect Detection using Nonparametric Regression Siana Halim Industrial Engineering Department-Petra Christian University Siwalankerto 121-131 Surabaya- Indonesia halim@petra.ac.id Abstract: To compare
More informationStatistics 3858 : Maximum Likelihood Estimators
Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,
More informationEco517 Fall 2004 C. Sims MIDTERM EXAM
Eco517 Fall 2004 C. Sims MIDTERM EXAM Answer all four questions. Each is worth 23 points. Do not devote disproportionate time to any one question unless you have answered all the others. (1) We are considering
More informationRobustness and Distribution Assumptions
Chapter 1 Robustness and Distribution Assumptions 1.1 Introduction In statistics, one often works with model assumptions, i.e., one assumes that data follow a certain model. Then one makes use of methodology
More informationSpace Telescope Science Institute statistics mini-course. October Inference I: Estimation, Confidence Intervals, and Tests of Hypotheses
Space Telescope Science Institute statistics mini-course October 2011 Inference I: Estimation, Confidence Intervals, and Tests of Hypotheses James L Rosenberger Acknowledgements: Donald Richards, William
More informationParameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!
Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Questions?! C. Porciani! Estimation & forecasting! 2! Cosmological parameters! A branch of modern cosmological research focuses
More informationCOS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION
COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION SEAN GERRISH AND CHONG WANG 1. WAYS OF ORGANIZING MODELS In probabilistic modeling, there are several ways of organizing models:
More informationHANDBOOK OF APPLICABLE MATHEMATICS
HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester
More informationTutorial on Approximate Bayesian Computation
Tutorial on Approximate Bayesian Computation Michael Gutmann https://sites.google.com/site/michaelgutmann University of Helsinki Aalto University Helsinki Institute for Information Technology 16 May 2016
More information14.30 Introduction to Statistical Methods in Economics Spring 2009
MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationMachine Learning CSE546 Carlos Guestrin University of Washington. September 30, What about continuous variables?
Linear Regression Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2014 1 What about continuous variables? n Billionaire says: If I am measuring a continuous variable, what
More informationGov 2002: 3. Randomization Inference
Gov 2002: 3. Randomization Inference Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last week: This week: What can we identify using randomization? Estimators were justified via
More informationFinite Population Sampling and Inference
Finite Population Sampling and Inference A Prediction Approach RICHARD VALLIANT ALAN H. DORFMAN RICHARD M. ROYALL A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane
More informationLecture 2 Machine Learning Review
Lecture 2 Machine Learning Review CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago March 29, 2017 Things we will look at today Formal Setup for Supervised Learning Things
More informationMachine Learning CSE546 Sham Kakade University of Washington. Oct 4, What about continuous variables?
Linear Regression Machine Learning CSE546 Sham Kakade University of Washington Oct 4, 2016 1 What about continuous variables? Billionaire says: If I am measuring a continuous variable, what can you do
More information