A Short Course in Basic Statistics
|
|
- Hugh Sanders
- 6 years ago
- Views:
Transcription
1 A Short Course in Basic Statistics Ian Schindler November 5, 2017 Creative commons license share and share alike BY: C 1 Descriptive Statistics 1.1 Presenting statistical data Definition 1 A statistical population is a set Ω of objects of study. We may assume that card(ω) <. Therefore there is a bijection i : Ω 1, n for some n N. This leads to using Ω and 1, n interchangeably. A random variable is a function X : Ω ω. The goal of descriptive statistics is to summarise the range of random variables Raw data Let X : Ω = 1, n ω. If ω R, the value of X at i 1, n, X(i) will be denoted x i. Continuous: ω = I R. Discrete: ω R and card(ω) is small. Categorical : ω is nominal A table of a simple statistical series For a discrete variable with K categories, {x i } n, we define: The class x k where x k < x k+1 for k 1, K 1. The number of elements in the class k: n k with n = K k=1 n k. The frequency: f k = n k /n with K k=1 f k = 1. 1
2 A discrete variable can be represented in a table: (k, x k, n k ) K k=1. (1) More completely: (k, x k, n k, f k ) K k=1 (2) The cumulative number of elements is s k = k l=1 n k with s K = n. The cumulative frequency Φ k = k l=1 f k. Below is a table representing a simple statistical series where Ω is 40 children and X : Ω R is their age in years. k x k n k s k f k Φ k ,1 0, ,2 0, ,35 0, ,30 0, ,05 1 The table can be represented by a bar graph: The empirical distribution function Φ(x) : R [0, 1] is defined by Φ(x) := the proportion of the range of X x. Thus Φ is monotone increasing with Φ( ) = 0, Φ(+ ) = 1, Φ(x k ) = Φ k between the values of the range of X Φ(x) is constant. The function Φ(x) can be deduced from the above bar graph of X. 2
3 The Stem and Leaf diagram, introduced by J.W. Tukey (1977), gives more information than a barplot. The decimal point is at the If the variable X is continuous rather than discrete, classes are prescribed so that it can be treated as if it where discrete. The number of classes, and the bounds on the classes are left to the software package, or chosen to make the graphics look good. A rule of thumb is to have n classes. The data can then be presented in a table of the form: [k, [b k 1; b k [, n k ] k=1,k such that K n k = n k=1 We define the amplitude of the class as a k := b k b k 1. To simplify, we will only consider constant amplitudes. Rather than a barplot representing the frequencies of X, we use a histogram. The height of the classe in the histogram usually corresponds to the number in the class. The frequency of the class can also be used. 3
4 The definition of the empirical distribution function does not change. One can think of the empirical distribution function as the integral of the histogram of X (using frequencies) just as a distribution function is the integral of the probability density function in probability. 1.2 Two variables Suppose that X : Ω ω 1 and Y : Ω ω 2, that is, using the equivalence Ω 1, n we have couples (x i, y i ) n. Let [X 1,..., X j,..., X J ] and [Y 1,..., Y j,..., Y J ] designate the states of X and Y respectively (quantitative or nominal). Let n jk be the number of elements with state X j and state Y k. 4
5 X Y Y 1 Y k Y K X 1 n 11 n 1k n 1K n x j n j1 n jk n jk n j x J n J1 n Jk n JK n J. n.1 n.k n.k n Table 1: Contingency table Where n j. = k n jk, n.k = j n jk, and n = j n j. = k n.k. The contingency table can be normalised to a frequency table by dividing each number by n to obtain: f jk = n jk n. The conditional frequencies are used to establish the independence of the variable X and Y. Conditional row profile for row j : n jk n j. Conditional column profile for column k : n jk n.k. 1.3 Summary statistics of a quantitative variable X Measures of the center Definition 2 The mode:= the class or center of the class with the highest frequency. The median : Q 2 is defined by Φ(Q 2 ).5 and for all x < Q 2, Φ(x) <.5. The arithmetic mean or average: x def = 1 n n x i. The geometric mean: GM def = ( n x i) 1/n. Note that if y i = x i + a, then ȳ = x + a Measures of dispersion The range: R = x max x min. The interquartile range: IQR def = Q 3 Q 1. The variance: VarX = 1 n (x i x) 2 n ( n ) = 1 x 2 i x 2. n 5
6 To estimate the variance an unbiased estimator is used: s 2 X = 1 n 1 n (x i x) 2. The kth moment: µ k := 1 n n (x i x) k. Standard deviation: σ X = Var X = 1 n n (x i x) 2. The coefficient of variation: c X = σ X x. etc. Note that Var(aX + b) = a 2 Var X. 2 Graphics Graphics should clarify thus be readable. This means a graph should be titled and scales should be clear Nominative variables Column graph Bar graph Pie chart Discrete quantitative variable Bar graph. Empirical distribution function. 2.1 Continuous quantitative variable Histogramme Empirical distribution function Two quantitative variables Scatterplot. 6
7 2.2 Link between two variables The scalar product of two vectors X and Y in R n is defined as (X, Y ) = n x iy i. Note that X 2 = (X, X). We also have (X, Y ) = X Y cos θ where θ is the angle between X, and Y Two quantitative variables X et Y The data (i, x(i), y(i)) n can be represented as vectors X and Y Rn. def We also define X = x(1,..., 1). We have Var X = 1 n (X X, X X) = 1 n X X 2. We compute The covariance: Cov (X, Y ) = 1 n n (x i x)(y i ȳ) = [ 1 n n x ] iy i xȳ = 1 n (X X, Y Ȳ ). Property: Var (X + Y ) = Var X+Var Y + 2Cov(X, Y ). The Bravais Pearson correlation coefficient. r(x, Y ) = Cov(X, Y ) σ X σ Y = cos θ. Where θ is the angle between Y and Ŷ (see below). R 2 = Ŷ Ȳ 2 / Y Ȳ 2 = cos 2 θ, Where θ is as above. Linear regression or least squares regression consists of projecting Y Ȳ onto the span of X X. In other words we solve the problem: min Y Ȳ a(x X) 2. (3) a R to obtain the projection of Y Ȳ onto the space spanned by X X. The projection â(x X) is called Ŷ Ȳ. Equivalently we suppose y i = f(x i ) = ax i + b + ɛ i and compute â and ˆb such that n ɛ2 i = n (âx i + ˆb) 2 is minimal. One obtains Remark 1 The equation Cov (X,Y ) Var X â = b = ȳ â x. Ŷ = â(x x) + ȳ Defines the linear regression line which goes through the point ( x, ȳ). From the contingency table one can calculate: 7
8 The mean: x := 1 n I n i.x i. La variance: σ 2 (X) := 1 n I n i.(x i x) 2. The conditional means : The conditional variances: x j := 1 n.j σ 2 j (X) := 1 n.j I n ij x i. I n ij (x i x j ) 2. The variance of the conditional means of X or explained variance: σ 2 ( x j ) := 1 n J n.j ( x j x) 2. The average conditional variance of X or residual variance: σ 2 j (x) := 1 n j=1 J n.j σj 2 (X). j=1 The correlation quotients: η 2 X/Y and η Y/X: η 2 X/Y : = σ2 ( x j ) σ 2 (X) η 2 Y/X : = σ2 (ȳ i ) σ 2 (Y ) The correlation quotients are used to test for a functional relationship (not necessarily linear) between X and Y. 8
9 2.2.2 Time Series Indices: Elementary indices: I n 0 = P n/p 0. Properties: circularity: I n 0 = I n n I n 0 reversibility: I 0 n = 1/I n 0. Chains: I n 0 = I n n 1 I n 1 n 2 I 1. 0 Synthetic indices: Let {B i } N, be N goods, with respective prices: {p i 0 }N at t = t 0 and {p i n} N at t = t n bought with respective quantities {q0 i } at t = t 0 and {qn} i at t = t n. Simple average index: I n 0 = N pi n N pi 0 Average of indices: The Laspeyre index: I n 0 = 1 N N I i n 0 L n 0 := N pi nq i 0 N pi 0 qi 0 = i p i n ρ i p i. 0 The Paasche index: P n 0 := N pi nq i n N pi 0 qi n = 1. i ρ i pi 0 p i n A simple time series is a data set of the form: (t, x t ) T t=1. Let T t := the trend (4) s t := seasonal variations (5) c t := cyclical variations (6) ɛ t := random variations. (7) We consider the model: x t = T t + s t + c t + ɛ t. 9
10 In certain cases x t = T t s t c t ɛ t In which case we consider Log x t to reduce to the first case. We suppose s t to be periodic with 0 average, that is, there is a k such that The moving averages k : s t = s t+k t k s t+i = 0 t. i=0 for k = 2p + 1, for k = 2p. Forecasting: y t := x t p + x t p x t + + x t+p 2p + 1 y t := x t p/2 + x t p x t+p 1 + x t+p /2 2p Step 1: linear regression of y t as a function of t, ŷ t = ât + ˆb. (8) (9) Step 2: estimate seasonal values, ŝ t = x t ˆT t = x t ŷ t Step 3: because we have estimated s t several times, we take the average Thus ŝ t := 1 I I ŝ t+ik. ˆx t = ŷ t + ŝ t Two nominal variables From the contingency table (cf 1.4) we may test whether two variables X and Y are independent or not with the χ 2 test of Independence s: χ 2 = L l=1 c=1 C (n lc E lc ) 2 where n lc := the observed number E lc := n l.n. c n. Note that E lc is the expected number if the variables are independent. 10 E lc
11 Remark 2 1. If X and Y have no interdependence, then one expects that χ 2 = χ 2 depends on n, L, and C and lim n χ 2 is a known distribution. If C + L 2, we have the following bounds on χ 2 : 0 χ 2 n[min(l, C) 1]. For large n, we approximate χ 2 with the limit distribution to test the hypothesis: H 0 : X and Y are independent. H 0 is rejected if the χ 2 statistic is sufficiently large. More precisely, we fix our tolerance of error, usually at either 5% or 1%. A bound is computed such that the probability of χ 2 exceeding the bound under H 0 is our tolerated error. If this bound is exceeded by our computed statistic H 0 is rejected. 11
Study Sheet. December 10, The course PDF has been updated (6/11). Read the new one.
Study Sheet December 10, 2017 The course PDF has been updated (6/11). Read the new one. 1 Definitions to know The mode:= the class or center of the class with the highest frequency. The median : Q 2 is
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2018 Examinations Subject CT3 Probability and Mathematical Statistics Core Technical Syllabus 1 June 2017 Aim The
More informationClass 11 Maths Chapter 15. Statistics
1 P a g e Class 11 Maths Chapter 15. Statistics Statistics is the Science of collection, organization, presentation, analysis and interpretation of the numerical data. Useful Terms 1. Limit of the Class
More informationDETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics
DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and
More informationAsymptotic Statistics-III. Changliang Zou
Asymptotic Statistics-III Changliang Zou The multivariate central limit theorem Theorem (Multivariate CLT for iid case) Let X i be iid random p-vectors with mean µ and and covariance matrix Σ. Then n (
More informationChapter 3. Data Description
Chapter 3. Data Description Graphical Methods Pie chart It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partition a circle.
More informationSubject CS1 Actuarial Statistics 1 Core Principles
Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and
More informationGlossary. The ISI glossary of statistical terms provides definitions in a number of different languages:
Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the
More informationBivariate Relationships Between Variables
Bivariate Relationships Between Variables BUS 735: Business Decision Making and Research 1 Goals Specific goals: Detect relationships between variables. Be able to prescribe appropriate statistical methods
More informationContents. Acknowledgments. xix
Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables
More informationSTAT 4385 Topic 03: Simple Linear Regression
STAT 4385 Topic 03: Simple Linear Regression Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2017 Outline The Set-Up Exploratory Data Analysis
More informationChapter 1:Descriptive statistics
Slide 1.1 Chapter 1:Descriptive statistics Descriptive statistics summarises a mass of information. We may use graphical and/or numerical methods Examples of the former are the bar chart and XY chart,
More informationSociology 6Z03 Review I
Sociology 6Z03 Review I John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review I Fall 2016 1 / 19 Outline: Review I Introduction Displaying Distributions Describing
More informationSTP 420 INTRODUCTION TO APPLIED STATISTICS NOTES
INTRODUCTION TO APPLIED STATISTICS NOTES PART - DATA CHAPTER LOOKING AT DATA - DISTRIBUTIONS Individuals objects described by a set of data (people, animals, things) - all the data for one individual make
More informationReview for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling
Review for Final For a detailed review of Chapters 1 7, please see the review sheets for exam 1 and. The following only briefly covers these sections. The final exam could contain problems that are included
More informationReview of Statistics
Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and
More information3 Joint Distributions 71
2.2.3 The Normal Distribution 54 2.2.4 The Beta Density 58 2.3 Functions of a Random Variable 58 2.4 Concluding Remarks 64 2.5 Problems 64 3 Joint Distributions 71 3.1 Introduction 71 3.2 Discrete Random
More informationSummer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.
Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall
More informationM & M Project. Think! Crunch those numbers! Answer!
M & M Project Think! Crunch those numbers! Answer! Chapters 1-2 Exploring Data and Describing Location in a Distribution Univariate Data: Length Stemplot and Frequency Table Stem (Units Digit) 0 1 1 Leaf
More informationMa 3/103: Lecture 25 Linear Regression II: Hypothesis Testing and ANOVA
Ma 3/103: Lecture 25 Linear Regression II: Hypothesis Testing and ANOVA March 6, 2017 KC Border Linear Regression II March 6, 2017 1 / 44 1 OLS estimator 2 Restricted regression 3 Errors in variables 4
More informationUnit 2. Describing Data: Numerical
Unit 2 Describing Data: Numerical Describing Data Numerically Describing Data Numerically Central Tendency Arithmetic Mean Median Mode Variation Range Interquartile Range Variance Standard Deviation Coefficient
More informationVariance reduction. Michel Bierlaire. Transport and Mobility Laboratory. Variance reduction p. 1/18
Variance reduction p. 1/18 Variance reduction Michel Bierlaire michel.bierlaire@epfl.ch Transport and Mobility Laboratory Variance reduction p. 2/18 Example Use simulation to compute I = 1 0 e x dx We
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationDr. Babasaheb Ambedkar Marathwada University, Aurangabad. Syllabus at the F.Y. B.Sc. / B.A. In Statistics
Dr. Babasaheb Ambedkar Marathwada University, Aurangabad Syllabus at the F.Y. B.Sc. / B.A. In Statistics With effect from the academic year 2009-2010 Class Semester Title of Paper Paper Per week Total
More informationSTAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression
STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression Rebecca Barter April 20, 2015 Fisher s Exact Test Fisher s Exact Test
More informationCIVL 7012/8012. Collection and Analysis of Information
CIVL 7012/8012 Collection and Analysis of Information Uncertainty in Engineering Statistics deals with the collection and analysis of data to solve real-world problems. Uncertainty is inherent in all real
More informationSTATISTICS; An Introductory Analysis. 2nd hidition TARO YAMANE NEW YORK UNIVERSITY A HARPER INTERNATIONAL EDITION
2nd hidition TARO YAMANE NEW YORK UNIVERSITY STATISTICS; An Introductory Analysis A HARPER INTERNATIONAL EDITION jointly published by HARPER & ROW, NEW YORK, EVANSTON & LONDON AND JOHN WEATHERHILL, INC.,
More informationSimple Linear Regression for the MPG Data
Simple Linear Regression for the MPG Data 2000 2500 3000 3500 15 20 25 30 35 40 45 Wgt MPG What do we do with the data? y i = MPG of i th car x i = Weight of i th car i =1,...,n n = Sample Size Exploratory
More informationBusiness Statistics. Lecture 9: Simple Regression
Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationLinear Models and Estimation by Least Squares
Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:
More informationApplied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013
Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis
More informationMarquette University MATH 1700 Class 5 Copyright 2017 by D.B. Rowe
Class 5 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 2017 by D.B. Rowe 1 Agenda: Recap Chapter 3.2-3.3 Lecture Chapter 4.1-4.2 Review Chapter 1 3.1 (Exam
More informationApplication of Variance Homogeneity Tests Under Violation of Normality Assumption
Application of Variance Homogeneity Tests Under Violation of Normality Assumption Alisa A. Gorbunova, Boris Yu. Lemeshko Novosibirsk State Technical University Novosibirsk, Russia e-mail: gorbunova.alisa@gmail.com
More informationFor more information about how to cite these materials visit
Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/
More informationEC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)
1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For
More informationPreliminary Statistics course. Lecture 1: Descriptive Statistics
Preliminary Statistics course Lecture 1: Descriptive Statistics Rory Macqueen (rm43@soas.ac.uk), September 2015 Organisational Sessions: 16-21 Sep. 10.00-13.00, V111 22-23 Sep. 15.00-18.00, V111 24 Sep.
More informationBivariate Paired Numerical Data
Bivariate Paired Numerical Data Pearson s correlation, Spearman s ρ and Kendall s τ, tests of independence University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html
More informationTransition Passage to Descriptive Statistics 28
viii Preface xiv chapter 1 Introduction 1 Disciplines That Use Quantitative Data 5 What Do You Mean, Statistics? 6 Statistics: A Dynamic Discipline 8 Some Terminology 9 Problems and Answers 12 Scales of
More informationDescriptive Data Summarization
Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning
More informationRegression diagnostics
Regression diagnostics Kerby Shedden Department of Statistics, University of Michigan November 5, 018 1 / 6 Motivation When working with a linear model with design matrix X, the conventional linear model
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More informationBinary choice 3.3 Maximum likelihood estimation
Binary choice 3.3 Maximum likelihood estimation Michel Bierlaire Output of the estimation We explain here the various outputs from the maximum likelihood estimation procedure. Solution of the maximum likelihood
More informationEconometrics Summary Algebraic and Statistical Preliminaries
Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L
More informationEmpirical Research Methods
Institut für Politikwissenschaft Schwerpunkt qualitative empirische Sozialforschung Goethe-Universität Frankfurt Fachbereich 03 Prof. Dr. Claudius Wagemann Empirical Research Methods 25 June 2013 Quantitative
More informationFinding Relationships Among Variables
Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis
More informationVectors and Matrices Statistics with Vectors and Matrices
Vectors and Matrices Statistics with Vectors and Matrices Lecture 3 September 7, 005 Analysis Lecture #3-9/7/005 Slide 1 of 55 Today s Lecture Vectors and Matrices (Supplement A - augmented with SAS proc
More informationQuantitative Methods Chapter 0: Review of Basic Concepts 0.1 Business Applications (II) 0.2 Business Applications (III)
Quantitative Methods Chapter 0: Review of Basic Concepts 0.1 Business Applications (II) 0.1.1 Simple Interest 0.2 Business Applications (III) 0.2.1 Expenses Involved in Buying a Car 0.2.2 Expenses Involved
More informationMIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability
STA301- Statistics and Probability Solved MCQS From Midterm Papers March 19,2012 MC100401285 Moaaz.pk@gmail.com Mc100401285@gmail.com PSMD01 MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability
More informationStatistics II. Management Degree Management Statistics IIDegree. Statistics II. 2 nd Sem. 2013/2014. Management Degree. Simple Linear Regression
Model 1 2 Ordinary Least Squares 3 4 Non-linearities 5 of the coefficients and their to the model We saw that econometrics studies E (Y x). More generally, we shall study regression analysis. : The regression
More informationChapter 3: Regression Methods for Trends
Chapter 3: Regression Methods for Trends Time series exhibiting trends over time have a mean function that is some simple function (not necessarily constant) of time. The example random walk graph from
More informationReview for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data
Review for Exam #1 1 Chapter 1 Population the complete collection of elements (scores, people, measurements, etc.) to be studied Sample a subcollection of elements drawn from a population 11 The Nature
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationTopic 21 Goodness of Fit
Topic 21 Goodness of Fit Contingency Tables 1 / 11 Introduction Two-way Table Smoking Habits The Hypothesis The Test Statistic Degrees of Freedom Outline 2 / 11 Introduction Contingency tables, also known
More information11 Hypothesis Testing
28 11 Hypothesis Testing 111 Introduction Suppose we want to test the hypothesis: H : A q p β p 1 q 1 In terms of the rows of A this can be written as a 1 a q β, ie a i β for each row of A (here a i denotes
More informationFinal Exam. Economics 835: Econometrics. Fall 2010
Final Exam Economics 835: Econometrics Fall 2010 Please answer the question I ask - no more and no less - and remember that the correct answer is often short and simple. 1 Some short questions a) For each
More informationMATH4427 Notebook 4 Fall Semester 2017/2018
MATH4427 Notebook 4 Fall Semester 2017/2018 prepared by Professor Jenny Baglivo c Copyright 2009-2018 by Jenny A. Baglivo. All Rights Reserved. 4 MATH4427 Notebook 4 3 4.1 K th Order Statistics and Their
More informationSTATISTICS 479 Exam II (100 points)
Name STATISTICS 79 Exam II (1 points) 1. A SAS data set was created using the following input statement: Answer parts(a) to (e) below. input State $ City $ Pop199 Income Housing Electric; (a) () Give the
More informationMATH 1150 Chapter 2 Notation and Terminology
MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the
More informationLearning Objectives for Stat 225
Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:
More information7.0 Lesson Plan. Regression. Residuals
7.0 Lesson Plan Regression Residuals 1 7.1 More About Regression Recall the regression assumptions: 1. Each point (X i, Y i ) in the scatterplot satisfies: Y i = ax i + b + ɛ i where the ɛ i have a normal
More informationFrom Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...
From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. Contents About This Book... xiii About The Author... xxiii Chapter 1 Getting Started: Data Analysis with JMP...
More informationCorrelation analysis. Contents
Correlation analysis Contents 1 Correlation analysis 2 1.1 Distribution function and independence of random variables.......... 2 1.2 Measures of statistical links between two random variables...........
More informationSTA 302f16 Assignment Five 1
STA 30f16 Assignment Five 1 Except for Problem??, these problems are preparation for the quiz in tutorial on Thursday October 0th, and are not to be handed in As usual, at times you may be asked to prove
More informationStatement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationAfter completing this chapter, you should be able to:
Chapter 2 Descriptive Statistics Chapter Goals After completing this chapter, you should be able to: Compute and interpret the mean, median, and mode for a set of data Find the range, variance, standard
More information11. Regression and Least Squares
11. Regression and Least Squares Prof. Tesler Math 186 Winter 2016 Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter 2016 1 / 23 Regression Given n points ( 1, 1 ), ( 2, 2 ),..., we want to determine
More informationLast Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics
Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different
More informationMAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik
MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,
More informationStatistics. Statistics
The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,
More informationData presentation and descriptive statistics
Data presentation and descriptive statistics Dr. Paola Grosso SNE research group p.grosso@uva.nl Instructions for use I do talk fast, so ask me to repeat if something is not clear; I made an effort to
More informationRandom Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R
In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample
More informationECN221 Exam 1 VERSION B Fall 2017 (Modules 1-4), ASU-COX VERSION B
ECN221 Exam 1 VERSION B Fall 2017 (Modules 1-4), ASU-COX VERSION B Choose the best answer. Do not write letters in the margin or communicate with other students in any way; if you do you will receive a
More informationDirection: This test is worth 250 points and each problem worth points. DO ANY SIX
Term Test 3 December 5, 2003 Name Math 52 Student Number Direction: This test is worth 250 points and each problem worth 4 points DO ANY SIX PROBLEMS You are required to complete this test within 50 minutes
More informationDescriptive statistics Maarten Jansen
Descriptive statistics Maarten Jansen Overview 1. Classification of observational values 2. Visualisation methods 3. Guidelines for good visualisation c Maarten Jansen STAT-F-413 Descriptive statistics
More informationFrom last time... The equations
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationAIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248)
AIM HIGH SCHOOL Curriculum Map 2923 W. 12 Mile Road Farmington Hills, MI 48334 (248) 702-6922 www.aimhighschool.com COURSE TITLE: Statistics DESCRIPTION OF COURSE: PREREQUISITES: Algebra 2 Students will
More informationStatistical Data Analysis
DS-GA 0 Lecture notes 8 Fall 016 1 Descriptive statistics Statistical Data Analysis In this section we consider the problem of analyzing a set of data. We describe several techniques for visualizing the
More informationDr. Allen Back. Sep. 23, 2016
Dr. Allen Back Sep. 23, 2016 Look at All the Data Graphically A Famous Example: The Challenger Tragedy Look at All the Data Graphically A Famous Example: The Challenger Tragedy Type of Data Looked at the
More informationPeter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8
Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall
More informationGlossary for the Triola Statistics Series
Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling
More informationFirst Year Examination Department of Statistics, University of Florida
First Year Examination Department of Statistics, University of Florida August 19, 010, 8:00 am - 1:00 noon Instructions: 1. You have four hours to answer questions in this examination.. You must show your
More informationIntroduction and Single Predictor Regression. Correlation
Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation
More informationThe regression model with one fixed regressor cont d
The regression model with one fixed regressor cont d 3150/4150 Lecture 4 Ragnar Nymoen 27 January 2012 The model with transformed variables Regression with transformed variables I References HGL Ch 2.8
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Introduction Edps/Psych/Stat/ 584 Applied Multivariate Statistics Carolyn J Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN c Board of Trustees,
More informationStatistics I Chapter 2: Univariate data analysis
Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,
More informationSTATISTICS ( CODE NO. 08 ) PAPER I PART - I
STATISTICS ( CODE NO. 08 ) PAPER I PART - I 1. Descriptive Statistics Types of data - Concepts of a Statistical population and sample from a population ; qualitative and quantitative data ; nominal and
More informationMultivariate Random Variable
Multivariate Random Variable Author: Author: Andrés Hincapié and Linyi Cao This Version: August 7, 2016 Multivariate Random Variable 3 Now we consider models with more than one r.v. These are called multivariate
More informationZIMBABWE SCHOOL EXAMINATIONS COUNCIL (ZIMSEC) ADVANCED LEVEL SYLLABUS
ZIMBABWE SCHOOL EXAMINATIONS COUNCIL (ZIMSEC) ADVANCED LEVEL SYLLABUS Further Mathematics 9187 EXAMINATION SYLLABUS FOR 2013-2017 2 CONTENTS Page Aims.. 2 Assessment Objective.. 2 Scheme of Assessment
More informationSimple Linear Regression
Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.
More informationBIOL 4605/7220 CH 20.1 Correlation
BIOL 4605/70 CH 0. Correlation GPT Lectures Cailin Xu November 9, 0 GLM: correlation Regression ANOVA Only one dependent variable GLM ANCOVA Multivariate analysis Multiple dependent variables (Correlation)
More informationDiploma Part 2. Quantitative Methods. Examiner s Suggested Answers
Diploma Part Quantitative Methods Examiner s Suggested Answers Question 1 (a) The standard normal distribution has a symmetrical and bell-shaped graph with a mean of zero and a standard deviation equal
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationHomework 2: Simple Linear Regression
STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA
More informationAnalysing data: regression and correlation S6 and S7
Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association
More informationChapter 4. Displaying and Summarizing. Quantitative Data
STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range
More informationProbability reminders
CS246 Winter 204 Mining Massive Data Sets Probability reminders Sammy El Ghazzal selghazz@stanfordedu Disclaimer These notes may contain typos, mistakes or confusing points Please contact the author so
More informationCalculus first semester exam information and practice problems
Calculus first semester exam information and practice problems As I ve been promising for the past year, the first semester exam in this course encompasses all three semesters of Math SL thus far. It is
More information