Review of Statistics


1 Review of Statistics

2 Topics: Descriptive statistics (mean, variance); Probability (union event, joint event); Random variables (discrete and continuous distributions, moments); Two random variables (covariance and correlation); Central Limit Theorem; Hypothesis testing (z-test, p-value); Simple linear regression.

3 Statistical Methods: Descriptive Statistics and Inferential Statistics.

4 Descriptive Statistics. Involves: collecting data, presenting data, characterizing data. Purpose: describe data. [Figure: bar chart of values by quarter (1st Qtr to 4th Qtr) for East, West, North]

5 Inferential Statistics. Involves: estimation, hypothesis testing. Purpose: make decisions about population characteristics. [Figure: sample drawn from a population]

6 Descriptive Statistics

7 Mean. Measure of central tendency. Acts as balance point. Affected by extreme values (outliers). Formula: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{X_1 + X_2 + \cdots + X_n}{n}$

8 Median. Measure of central tendency. Middle value in the ordered sequence: if n is odd, the middle value of the sequence; if n is even, the average of the two middle values. The value that splits the distribution into two halves. Not affected by extreme values.

9 Median (Example) Raw Data: Ordered: Position: Median = = 16

10 Mode. Measure of central tendency. Value that occurs most often. Not affected by extreme values. There may be several modes.

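11 Sample Variance. $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}$. Note n - 1 in the denominator! (Use n for the population variance.)

12 Sample Standard Deviation. $S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}}$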
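
The descriptive measures on slides 7 to 12 can be checked numerically. The sketch below is only an illustration: the data values are hypothetical (they are not taken from the slides), and the standard-library `statistics` module is used for the median and mode.

```python
import statistics

# Hypothetical data, chosen only for illustration (not from the slides).
data = [3, 7, 7, 2, 9, 14]
n = len(data)

# Mean: sum of the values divided by n.
mean = sum(data) / n

# Median (middle of the ordered sequence) and mode(s) (most frequent values).
median = statistics.median(data)
modes = statistics.multimode(data)

# Sum of squared deviations about the mean.
sum_sq_dev = sum((x - mean) ** 2 for x in data)

# Sample variance uses n - 1; population variance uses n.
sample_var = sum_sq_dev / (n - 1)
population_var = sum_sq_dev / n

# Sample standard deviation is the square root of the sample variance.
sample_sd = sample_var ** 0.5

print(mean, median, modes, sample_var, population_var, sample_sd)
```

Dividing by n - 1 instead of n makes the sample variance slightly larger than the population formula applied to the same numbers, which is the point of the warning on slide 11.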

13 Probability

14 Event, Sample Space. Event: one possible outcome. Sample space: collection of all the possible events, denoted S. Probability of an outcome: proportion of times that the outcome occurs in the long run. The complement of event A includes all the events that are not part of event A; it is commonly written $A'$ or $\bar{A}$.

15 Properties of an Event. 1. Mutually exclusive: two outcomes that cannot occur at the same time (experiment: observe the gender of one person). 2. Collectively exhaustive: one outcome in the sample space must occur.

16 Joint Events. Joint event: event that has two or more characteristics. $\cap$ means intersection of event (set) A and event (set) B. Example: A and B, $(A \cap B)$: Female, Under age 20.

17 Compound Events. Union of event A and event B, $(A \cup B)$: total area of the two circles. $A \cup B$ contains all the outcomes which are part of event (set) A, part of event (set) B, or part of both A and B. $\cup$ means union of event A and event B.

18 Compound Probability: Addition Rule. Used to get compound probabilities for unions of events: $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \cap B)$. For mutually exclusive events: $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B)$. [Figure: two non-overlapping circles A and B, mutually exclusive events]
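
As a quick check of the addition rule, the sketch below enumerates a standard 52-card deck and compares $P(A \cup B)$ counted directly with $P(A) + P(B) - P(A \cap B)$. The events chosen here (A = ace, B = red card) and all names in the code are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suit_color = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}
deck = list(product(ranks, suit_color))


def prob(event):
    """Probability of an event (a set of cards) when every card is equally likely."""
    return Fraction(len(event), len(deck))


aces = {card for card in deck if card[0] == "A"}                # event A
reds = {card for card in deck if suit_color[card[1]] == "red"}  # event B

# Union counted directly versus the addition rule.
p_union_direct = prob(aces | reds)
p_union_rule = prob(aces) + prob(reds) - prob(aces & reds)

print(p_union_direct, p_union_rule)
```

Without the subtraction of $P(A \cap B)$ the two red aces would be counted twice.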

19 Random Variables. Random variable: numerical summary of a random outcome; a function that assigns a numerical value to each simple event in a sample space. Random variables are discrete or continuous. Discrete: only a discrete set of possible values => summarized by the probability distribution, the list of all possible values of the variable and the probability that each value will occur. Continuous: continuum of possible values => summarized by the probability density function (pdf).

20 Discrete Probability Distribution. 1. List of pairs $[X_i, P(X_i)]$, where $X_i$ = value of the random variable (outcome) and $P(X_i)$ = probability associated with that value. 2. Mutually exclusive (no overlap). 3. Collectively exhaustive (nothing left out). 4. $0 \le P(X_i) \le 1$. 5. $\sum_i P(X_i) = 1$.

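21 Joint Probability Using Contingency Table

Event    $B_1$              $B_2$              Total
$A_1$    $P(A_1 \cap B_1)$  $P(A_1 \cap B_2)$  $P(A_1)$
$A_2$    $P(A_2 \cap B_1)$  $P(A_2 \cap B_2)$  $P(A_2)$
Total    $P(B_1)$           $P(B_2)$           1

Joint distribution: the joint probabilities in the interior cells. Marginal distributions: the marginal probabilities in the Total row and Total column. Conditional probability: $P(A_1 \mid B_1) = \frac{P(A_1 \cap B_1)}{P(B_1)}$. Conditional distribution of A given $B_1$: $\frac{P(A_1 \cap B_1)}{P(B_1)}$ and $\frac{P(A_2 \cap B_1)}{P(B_1)}$, which sum to 1.

22 Contingency Table Example. Joint event: draw 1 card, note kind and color.

Type      Red      Black    Total
Ace       2/52     2/52     4/52
Non-Ace   24/52    24/52    48/52
Total     26/52    26/52    52/52

P(Ace) = 4/52 (row total), P(Red) = 26/52 (column total), P(Ace AND Red) = 2/52 (interior cell).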
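
The card table can be turned into code to show how marginal and conditional probabilities follow from the joint probabilities. This is a minimal sketch; the dictionary layout and the helper name `marginal` are assumptions, while the cell values are the ones in the table.

```python
from fractions import Fraction

# Joint probabilities from the card table: (type, color) -> probability.
joint = {
    ("Ace", "Red"): Fraction(2, 52),
    ("Ace", "Black"): Fraction(2, 52),
    ("Non-Ace", "Red"): Fraction(24, 52),
    ("Non-Ace", "Black"): Fraction(24, 52),
}


def marginal(index, value):
    """Sum the joint probabilities over the other characteristic."""
    return sum(p for key, p in joint.items() if key[index] == value)


p_ace = marginal(0, "Ace")   # row total
p_red = marginal(1, "Red")   # column total

# Conditional probability: P(Ace | Red) = P(Ace and Red) / P(Red).
p_ace_given_red = joint[("Ace", "Red")] / p_red

print(p_ace, p_red, p_ace_given_red)
```

Here the conditional probability of an ace given a red card equals the unconditional probability of an ace, so knowing the color does not change the chance of an ace.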

23 Moments: Discrete Case. Moment: summary of a certain aspect of a distribution. Mean (expected value): mean of the probability distribution, the weighted average of all possible values, $\mu = E(X) = \sum_i X_i P(X_i)$. Variance: weighted average squared deviation about the mean, $\sigma^2 = E[(X_i - \mu)^2] = \sum_i (X_i - \mu)^2 P(X_i)$.
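
A short worked example of the two formulas, assuming a fair six-sided die as the discrete distribution (the die is an illustrative choice, not part of the slides):

```python
from fractions import Fraction

# Discrete distribution of a fair die: list of pairs (X_i, P(X_i)).
distribution = [(face, Fraction(1, 6)) for face in range(1, 7)]

# The probabilities must sum to one.
total_probability = sum(p for _, p in distribution)

# Mean: probability-weighted average of the possible values.
mu = sum(x * p for x, p in distribution)

# Variance: probability-weighted average squared deviation about the mean.
sigma_sq = sum((x - mu) ** 2 * p for x, p in distribution)

print(total_probability, mu, sigma_sq)
```

Using `Fraction` keeps the results exact: the mean is 7/2 and the variance is 35/12.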

24 Statistical Independence. When the outcome of one event (B) does not affect the probability of occurrence of another event (A), the events A and B are said to be statistically independent. Example: toss a coin twice => no causality. Condition for independence: two events A and B are statistically independent if and only if (iff) $P(A \mid B) = P(A)$.

25 Bayes Theorem and Multiplication Rule. Bayes Theorem: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$. The difficult part is $P(A \cap B)$. Use the above equation to derive $P(A \cap B)$: $P(A \text{ and } B) = P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$. For independent events: $P(A \text{ and } B) = P(A \cap B) = P(A)\,P(B)$.
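
Substituting the multiplication rule into the conditional-probability formula gives the form in which Bayes' theorem is most often applied. The following short derivation uses only the two equations on this slide:

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)},
\qquad P(A \cap B) = P(A)\,P(B \mid A) \\[4pt]
\Longrightarrow\quad
P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)} .
\end{align*}
```

If A and B are independent, then $P(B \mid A) = P(B)$ and the right-hand side reduces to $P(A)$, consistent with the independence condition on slide 24.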

26 Covariance. Measures the joint variability of two random variables: $\sigma_{XY} = \sum_{i=1}^{N} (X_i - \mu_X)(Y_i - \mu_Y)\,P(X_i, Y_i)$. Can take any value in the real numbers. Depends on the units of measurement (e.g., dollars, cents, billions of dollars). Example: positive covariance means y and x are positively related; when y is above its mean, x tends to be above its mean; when y is below its mean, x tends to be below its mean.

27 Correlation. Standardized covariance; takes values in [-1, 1]. Does not depend on the unit of measurement. Correlation coefficient ($\rho$) formula: $\rho = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$. Covariance and correlation measure only linear dependence! Example: Cov(X,Y) = 0 does not necessarily imply that y and x are independent; they may be non-linearly related. But if X and Y are jointly normally distributed and Cov(X,Y) = 0, then they are independent.
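
The warning that zero covariance does not imply independence can be illustrated with a small discrete example. In this sketch X takes the values -1, 0, 1 with equal probability and Y = X^2; this particular distribution is an assumption chosen for illustration.

```python
import math
from fractions import Fraction

# Joint distribution P(X_i, Y_i) with Y = X**2 and X uniform on {-1, 0, 1}.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

mu_x = sum(x * p for (x, _), p in joint.items())
mu_y = sum(y * p for (_, y), p in joint.items())

# Covariance: sum of (X_i - mu_X)(Y_i - mu_Y) P(X_i, Y_i).
cov_xy = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

var_x = sum((x - mu_x) ** 2 * p for (x, _), p in joint.items())
var_y = sum((y - mu_y) ** 2 * p for (_, y), p in joint.items())

# Correlation: covariance divided by the product of the standard deviations.
rho = cov_xy / math.sqrt(var_x * var_y)

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
independent = joint[(0, 0)] == p_x0 * p_y0

print(cov_xy, rho, independent)
```

The covariance and correlation are both zero even though Y is completely determined by X, because the relationship is purely non-linear.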

28 Sum of Two Random Variables. Expected value of the sum of two random variables: $E(X + Y) = E(X) + E(Y)$. Variance of the sum of two random variables: $\operatorname{Var}(X + Y) = \sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y + 2\sigma_{XY}$.

29 Continuous Probability Distributions: Normal Distribution. Bell-shaped, symmetrical. Mean, median, and mode are equal. Infinite range. About 68% of the data are within 1 standard deviation of the mean; about 95% of the data are within 2 standard deviations of the mean. In the early 1800s, the German mathematician and physicist Carl Friedrich Gauss used it to analyze astronomical data; it is therefore also known as the Gaussian distribution. [Figure: bell curve f(x) against X, with mean, median, and mode at the center]

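30 Normal Distribution Probability Density Function. $f(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{X - \mu}{\sigma}\right)^2}$, where f(X) = density of random variable X at the value X; $\pi \approx 3.14159$; $e \approx 2.71828$; $\sigma$ = population standard deviation; X = value of random variable ($-\infty < X < \infty$); $\mu$ = population mean.

31 Effect of Varying Parameters ($\mu$ & $\sigma$). [Figure: three normal curves A, B, C with different means and standard deviations, f(x) against X]

32 Normal Distribution Probability. Probability is the area under the curve! $P(c \le X \le d) = \int_c^d f(x)\,dx$. [Figure: shaded area under f(x) between c and d]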
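
The statement that probability is the area under the curve can be verified numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the parameter values (mean 5, standard deviation 10) and the helper names are assumptions chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical normal distribution used for illustration.
X = NormalDist(mu=5.0, sigma=10.0)


def prob_between(dist, c, d):
    """P(c <= X <= d) as a difference of cumulative probabilities."""
    return dist.cdf(d) - dist.cdf(c)


def area_under_pdf(dist, c, d, steps=10_000):
    """Midpoint-rule approximation of the integral of the density from c to d."""
    width = (d - c) / steps
    return sum(dist.pdf(c + (k + 0.5) * width) for k in range(steps)) * width


c, d = X.mean - X.stdev, X.mean + X.stdev
within_1sd = prob_between(X, c, d)
within_1sd_numeric = area_under_pdf(X, c, d)
within_2sd = prob_between(X, X.mean - 2 * X.stdev, X.mean + 2 * X.stdev)

print(round(within_1sd, 4), round(within_1sd_numeric, 4), round(within_2sd, 4))
```

The two ways of computing the one-standard-deviation probability agree, and the results match the approximate 68% and 95% figures quoted on slide 29.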

33 Infinite Number of Normal Distribution Tables. Normal distributions differ by mean & standard deviation. Each distribution would require its own table. That's an infinite number!

34 Standardize the Normal Distribution. A normal distribution with mean $\mu$ and standard deviation $\sigma$ is converted by $Z = \frac{X - \mu}{\sigma}$ into the standardized normal distribution with $\mu_Z = 0$ and $\sigma_Z = 1$. One table!

35 Standardizing Example. Normal distribution with $\sigma = 10$: $Z = \frac{X - \mu}{\sigma} = \frac{X - \mu}{10} = 0.12$. Standardized normal distribution: $\mu_Z = 0$, $\sigma_Z = 1$, and the value X maps to $Z = 0.12$.
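
A minimal sketch of the standardization step. Only $\sigma = 10$ and $Z = 0.12$ appear on the slide; the values X = 6.2 and $\mu = 5$ used below are assumptions picked so that the arithmetic reproduces Z = 0.12.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """Convert a value x from N(mu, sigma) to a standard normal Z score."""
    return (x - mu) / sigma


# Assumed values: x and mu are hypothetical; sigma = 10 is from the slide.
x, mu, sigma = 6.2, 5.0, 10.0
z = standardize(x, mu, sigma)

# The cumulative probability is the same before and after standardizing,
# which is why a single standard normal table is enough.
p_original = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist().cdf(z)

print(round(z, 2), round(p_original, 4), round(p_standard, 4))
```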

36 Moments: Mean, Variance (Continuous Case). Mean (expected value): mean of the probability distribution, the weighted average of all possible values, $\mu = E(X) = \int_{-\infty}^{\infty} x\, f(x)\,dx$. Variance: weighted average squared deviation about the mean, $\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$.

37 Moments: Skewness, Kurtosis. Skewness: $S = \frac{E[(X - \mu)^3]}{\sigma^3}$. Measures asymmetry in a distribution. The larger the absolute size of the skewness, the more asymmetric the distribution is. A large positive value indicates a long right tail, and a large negative value indicates a long left tail. A zero value indicates symmetry around the mean. Kurtosis: $K = \frac{E[(X - \mu)^4]}{\sigma^4}$. Measures the thickness of the tails of a distribution. A kurtosis above three indicates fat tails or leptokurtosis relative to the normal, i.e. extreme events are more likely to occur.

38 Central Limit Theorem: Basic Idea. As the sample size gets large ($n \ge 30$), the sample mean $\bar{X}$ will have an approximately normal distribution.

39 Important Continuous Distributions. All derived from the normal distribution. $\chi^2$ distribution: arises from squared normal random variables. t distribution: arises from ratios of normal and $\chi^2$ variables. F distribution: arises from ratios of $\chi^2$ variables. [Figures: $\chi^2$ distribution; t distribution (red) against normal distribution (blue); F distribution]
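
The basic idea can be illustrated by simulation. The sketch below draws repeated samples of size n = 30 from a uniform distribution on [0, 1], which is clearly not normal, and looks at the distribution of the sample means. The choice of population, the number of repetitions, and the random seed are all assumptions made for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

n = 30        # sample size
reps = 5000   # number of simulated samples

# Each entry is the mean of one sample of n uniform(0, 1) draws.
sample_means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Theory: the population has mean 1/2 and variance 1/12, so the sample
# mean has standard deviation sqrt((1/12) / n).
theoretical_sd = math.sqrt((1 / 12) / n)

# For a normal distribution about 68% of values lie within one standard deviation.
share_within_1sd = sum(
    abs(m - 0.5) <= theoretical_sd for m in sample_means
) / reps

print(round(mean_of_means, 3), round(sd_of_means, 4), round(theoretical_sd, 4))
print(round(share_within_1sd, 3))
```

The simulated means cluster around the population mean with a spread close to $\sigma/\sqrt{n}$, and roughly 68% of them fall within one such standard deviation, as a normal distribution would predict.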

40 Fundamentals of Hypothesis Testing

41 Identifying Hypotheses. 1. Question, e.g. test that the population mean is equal to 3. 2. State the question statistically ($H_0: \mu = 3$). 3. State its opposite statistically ($H_1: \mu \ne 3$). Hypotheses are mutually exclusive & exhaustive. Sometimes it is easier to form the alternative hypothesis first. 4. Choose the level of significance $\alpha$. Typical values are 0.01, 0.05, 0.10. Rejection region of the sampling distribution: the unlikely values of the sample statistic if the null hypothesis is true.

42 Identifying Hypotheses: Examples. 1. Is the population average amount of TV viewing 12 hours? The question is $\mu = 12$ and its opposite is $\mu \ne 12$, so $H_0: \mu = 12$, $H_1: \mu \ne 12$. 2. Is the population average amount of TV viewing different from 12 hours? The question is $\mu \ne 12$ and its opposite is $\mu = 12$, so $H_0: \mu = 12$, $H_1: \mu \ne 12$.

43 Hypothesis Testing: Basic Idea. It is unlikely that we would get a sample mean of this value if in fact this were the population mean. Therefore, we reject the null hypothesis that $\mu = 50$. [Figure: sampling distribution centered at $\mu = 50$ under $H_0$, with the observed sample mean far out in the tail]

44 Example: Z-test statistic ($\sigma$ known). 1. Convert the sample statistic (e.g., $\bar{X}$) to a standardized Z variable: $Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$. 2. Compare to critical Z values. If the Z-test statistic falls in the critical region, reject $H_0$; otherwise do not reject $H_0$.

45 p-value. Probability of obtaining a test statistic more extreme ($\le$ or $\ge$) than the actual sample value, given $H_0$ is true. Smallest value of $\alpha$ for which $H_0$ can be rejected. Used to make the rejection decision: if p-value $\ge \alpha$, do not reject $H_0$; if p-value $< \alpha$, reject $H_0$.
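
The Z-test statistic and the p-value decision rule can be combined in a few lines. This is a sketch under stated assumptions: the sample figures (sample mean 372.5, hypothesized mean 368, $\sigma$ = 15, n = 25) are hypothetical, the test is two-sided, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - hypothesized mean) / (sigma / sqrt(n)), sigma known."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))


def two_sided_p_value(z):
    """Probability of a statistic at least as extreme as z in either tail under H0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical inputs for H0: mu = 368 against H1: mu != 368.
z = z_statistic(sample_mean=372.5, mu0=368.0, sigma=15.0, n=25)
p_value = two_sided_p_value(z)

alpha = 0.05
reject_h0 = p_value < alpha  # reject only if the p-value is below alpha

print(round(z, 2), round(p_value, 4), reject_h0)
```

With these numbers Z = 1.5 and the p-value is about 0.13, which is not below $\alpha = 0.05$, so $H_0$ is not rejected.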

46 One-Tailed Test: Rejection Region. Left-tailed test: $H_0: \mu \ge 0$, $H_1: \mu < 0$; reject $H_0$ in the lower tail of area $\alpha$ (the sample statistic must be significantly below $\mu$). Right-tailed test: $H_0: \mu \le 0$, $H_1: \mu > 0$; reject $H_0$ in the upper tail of area $\alpha$ (here small values don't contradict $H_0$).

47 One-Tailed Z Test: Finding Critical Z Values. What is Z given $\alpha = 0.025$? The standardized normal distribution has $\sigma_Z = 1$ and the tail area is .025. Looking this up in the standardized normal probability table (row 1.9, column .06) gives Z = 1.96.

48 Two-Tailed Test: Rejection Regions. $H_0: \mu = 0$, $H_1: \mu \ne 0$. The sampling distribution has a rejection region of area $\alpha/2$ in each tail, beyond the two critical values, and a nonrejection region of area $1 - \alpha$ (the level of confidence) around the $H_0$ value of the sample statistic.

49 t-test, F-test. The test statistic may not be normally distributed => z-test not applicable. Examples: variance unknown, but estimated; hypothesis that the slope of a regression line differs significantly from zero => t-test. Hypothesis that the standard deviations of two normally distributed populations are equal => F-test.

50 Jarque-Bera test. Assesses whether a given sample of data is normally distributed. Aggregates information in the data about both skewness and kurtosis. Test of the hypothesis that S = 0 and K = 3, based on $\hat{S}$ and $\hat{K}$. Test statistic: $JB = \frac{T}{6}\left[\hat{S}^2 + \frac{1}{4}(\hat{K} - 3)^2\right]$ (here T is the number of observations). Under the null hypothesis of independent normally distributed observations, the Jarque-Bera statistic is distributed in large samples as a $\chi^2$ random variable with 2 degrees of freedom.
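
Instead of a printed table, the critical value can be obtained from the inverse cumulative distribution function of the standard normal. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; the helper name `upper_critical_z` is an assumption.

```python
from statistics import NormalDist


def upper_critical_z(tail_area):
    """Z value that leaves the given area in the upper tail of N(0, 1)."""
    return NormalDist().inv_cdf(1 - tail_area)


# One-tailed test with alpha = 0.025.
z_one_tailed = upper_critical_z(0.025)

# Two-tailed test with alpha = 0.05 puts alpha / 2 = 0.025 in each tail,
# so its critical values are plus and minus the same number.
z_two_tailed = upper_critical_z(0.05 / 2)

# One-tailed test with alpha = 0.05 for comparison.
z_one_tailed_05 = upper_critical_z(0.05)

print(round(z_one_tailed, 2), round(z_two_tailed, 2), round(z_one_tailed_05, 3))
```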
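
The Jarque-Bera statistic can be computed directly from the sample skewness and kurtosis. In this sketch the moments are estimated by dividing by T (an assumption about the estimator, since the slides do not spell it out), the data are hypothetical, and the p-value uses the fact that the upper-tail probability of a $\chi^2$ variable with 2 degrees of freedom is $e^{-x/2}$. With only five observations the large-sample approximation is not reliable, so the numbers are purely illustrative.

```python
import math


def central_moment(data, order):
    """Average of (x - mean) ** order, dividing by the number of observations."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** order for x in data) / len(data)


def jarque_bera(data):
    """Return (skewness, kurtosis, JB statistic, asymptotic p-value)."""
    t = len(data)
    variance = central_moment(data, 2)
    skewness = central_moment(data, 3) / variance ** 1.5
    kurtosis = central_moment(data, 4) / variance ** 2
    jb = t / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    p_value = math.exp(-jb / 2)  # chi-square(2) upper-tail probability
    return skewness, kurtosis, jb, p_value


# Hypothetical, perfectly symmetric data.
skewness, kurtosis, jb, p_value = jarque_bera([1, 2, 3, 4, 5])

print(round(skewness, 3), round(kurtosis, 3), round(jb, 4), round(p_value, 4))
```

A large JB value (small p-value) would lead to rejecting normality; here the statistic is small, so normality is not rejected.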

51 Simple Linear Regression

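52 Simple Linear Regression Model. $y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$, with random error $\varepsilon_t \sim \text{iid}(0, \sigma^2)$. $y_t$: dependent (response) variable; $x_t$: independent (explanatory) variable; $\beta_0$: y-intercept; $\beta_1$: slope.

53 Linear Regression Assumptions. 1. x is exogenously determined. 2. $\varepsilon_t$ are iid$(0, \sigma^2)$ (iid = "independently and identically distributed"): zero mean; independence of errors (no autocorrelation); constant variance (homoscedasticity). More things to think about: normality of $\varepsilon_t$ (if not satisfied, inference procedures are only asymptotically valid); model specification (e.g. linearity, $\beta_1$ constant over time?).

54 Simple Linear Regression Model. [Figure: observed values scattered around the population regression line $E(y \mid x) = \beta_0 + \beta_1 x$; each observation satisfies $y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$, where the disturbance $\varepsilon_t$ is the vertical distance from the observed value to the line; the point $E(y \mid x^*) = \beta_0 + \beta_1 x^*$ is marked at $x = x^*$]

55 Sample Linear Regression Model. [Figure: sample regression line $\hat{y}_i = b_0 + b_1 x_i$ with $y_i = b_0 + b_1 x_i + e_i$, where $e_i$ = random error; points labelled "Observed Value" and "Unsampled Observation"]

56 Ordinary Least Squares. The model is $y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$. OLS minimizes the sum of squared residuals: $\min_{\hat\beta_0, \hat\beta_1} \sum_{t=1}^{T} (y_t - \hat\beta_0 - \hat\beta_1 x_t)^2$, where $\sum_{t=1}^{T} (y_t - \hat{y}_t)^2 = \sum_{t=1}^{T} e_t^2$. Predicted or fitted value (in-sample forecast): $\hat{y}_t = \hat\beta_0 + \hat\beta_1 x_t$. [Figure: fitted line with residuals $e_1, e_2, e_3, e_4$ shown as vertical distances from the observations]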
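
The minimization has a well-known closed-form solution for the simple regression case, $\hat\beta_1 = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sum (x_t - \bar{x})^2}$ and $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$. These formulas are not shown on the slide; they are the standard result of setting the derivatives of the sum of squared residuals to zero. The sketch below applies them to hypothetical data and checks that nearby coefficient values give a larger sum of squared residuals.

```python
def ols_fit(x, y):
    """Closed-form OLS estimates (intercept, slope) for y = b0 + b1 * x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_squared_residuals(x, y, b0, b1):
    """Sum over t of (y_t - b0 - b1 * x_t) ** 2."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(x, y)
ssr_at_optimum = sum_squared_residuals(x, y, b0, b1)

# Moving either coefficient away from the OLS values increases the criterion.
ssr_shifted_slope = sum_squared_residuals(x, y, b0, b1 + 0.05)
ssr_shifted_intercept = sum_squared_residuals(x, y, b0 + 0.05, b1)

print(round(b0, 4), round(b1, 4), round(ssr_at_optimum, 4))
```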

57 On Thursday: Evaluating the Model. 1. Examine variation measures: coefficient of determination ("goodness of fit"), standard error of the estimate. 2. Analyze residuals $e$: serial correlation. 3. Test coefficients $\beta$ for significance. Fitted model: $\hat{y}_t = \hat\beta_0 + \hat\beta_1 x_t$.

58 Random Error Variation. 1. Variation of actual Y from predicted Y. 2. Measured by the standard error of the estimate: the sample standard deviation of $e$, denoted $S_{YX}$. 3. Affects several factors: parameter significance, prediction accuracy.

59 Measures of Variation in Regression. 1. Total sum of squares (SST): measures the variation of the observed $Y_i$ around the mean $\bar{Y}$. 2. Explained variation (SSR): variation due to the relationship between X & Y. 3. Unexplained variation (SSE): variation due to other factors.

60 Variation Measures. Unexplained sum of squares: $(Y_i - \hat{Y}_i)^2$. Total sum of squares: $(Y_i - \bar{Y})^2$. Explained sum of squares: $(\hat{Y}_i - \bar{Y})^2$. [Figure: observation $Y_i$ at $X_i$, fitted line $\hat{Y}_i = b_0 + b_1 X_i$, and horizontal line at $\bar{Y}$, showing the three deviations]

61 Coefficient of Determination. Proportion of variation explained by the relationship between X & Y: $r^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{SSR}{SST} = \frac{b_0 \sum_{i=1}^{n} Y_i + b_1 \sum_{i=1}^{n} X_i Y_i - n\bar{Y}^2}{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}$, with $0 \le r^2 \le 1$.

62 Coefficients of Determination ($r^2$) and Correlation ($r$). [Figure: four scatter plots, each with fitted line $\hat{Y}_i = b_0 + b_1 X_i$: $r^2 = 1$, $r = +1$; $r^2 = 1$, $r = -1$; $r^2 = .8$, $r = +0.9$; $r^2 = 0$, $r = 0$]

63 Standard Error of Estimate. $S_{YX} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - 2}} = \sqrt{\frac{\sum_{i=1}^{n} Y_i^2 - b_0 \sum_{i=1}^{n} Y_i - b_1 \sum_{i=1}^{n} X_i Y_i}{n - 2}}$

64 Residual Analysis. 1. Graphical analysis of residuals: plot residuals vs. $X_i$ values. Residuals mean errors: the difference between actual $Y_i$ & predicted $\hat{Y}_i$. 2. Purposes: examine functional form (linear vs. non-linear model); evaluate violations of assumptions.

65 Test of Slope Coefficient for Significance. 1. Tests if there is a linear relationship between X & Y. 2. Hypotheses: $H_0: \beta_1 = 0$ (no linear relationship); $H_1: \beta_1 \ne 0$ (linear relationship). 3. Test statistic: $t_{n-2} = \frac{b_1 - \beta_1}{S_{b_1}}$, where $S_{b_1} = \frac{S_{YX}}{\sqrt{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}}$.
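
The decomposition SST = SSR + SSE and the two expressions for $r^2$ can be checked on a small data set. The data below are hypothetical, and the closed-form OLS estimates used to obtain $b_0$ and $b_1$ are the standard ones (an assumption here, since the slides do not print them).

```python
# Hypothetical data for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n

# Standard closed-form OLS estimates of slope and intercept.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
    (x - x_bar) ** 2 for x in X
)
b0 = y_bar - b1 * x_bar
Y_hat = [b0 + b1 * x for x in X]

# Total, explained, and unexplained variation.
sst = sum((y - y_bar) ** 2 for y in Y)
ssr = sum((yh - y_bar) ** 2 for yh in Y_hat)
sse = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))

# Coefficient of determination from the definition ...
r2 = ssr / sst

# ... and from the computational formula on the slide.
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_y2 = sum(y ** 2 for y in Y)
r2_computational = (b0 * sum_y + b1 * sum_xy - n * y_bar ** 2) / (
    sum_y2 - n * y_bar ** 2
)

print(round(sst, 4), round(ssr, 4), round(sse, 4), round(r2, 4))
```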
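
The slope test can be sketched end to end: fit the line, compute $S_{YX}$, then $S_{b_1}$ and the t statistic under $H_0: \beta_1 = 0$. The data are hypothetical and the OLS formulas for $b_0$ and $b_1$ are the standard closed-form ones, which is an assumption since the slides do not state them. Comparing the statistic with a critical value from the t distribution with n - 2 degrees of freedom is left to a table or a statistics package.

```python
import math

# Hypothetical data for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n

# Standard closed-form OLS estimates.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
    (x - x_bar) ** 2 for x in X
)
b0 = y_bar - b1 * x_bar

# Standard error of the estimate: sqrt(SSE / (n - 2)).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
s_yx = math.sqrt(sse / (n - 2))

# Standard error of the slope: S_YX / sqrt(sum(X_i^2) - n * X_bar^2).
s_b1 = s_yx / math.sqrt(sum(x ** 2 for x in X) - n * x_bar ** 2)

# t statistic with n - 2 degrees of freedom under H0: beta_1 = 0.
beta1_null = 0.0
t_stat = (b1 - beta1_null) / s_b1
degrees_of_freedom = n - 2

print(round(s_yx, 4), round(s_b1, 4), round(t_stat, 2), degrees_of_freedom)
```

A t statistic this far from zero would lead to rejecting $H_0$ at any conventional significance level, which is expected because the hypothetical data lie almost exactly on a line.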

Statistics for Managers Using Microsoft Excel/SPSS Chapter 4 Basic Probability And Discrete Probability Distributions

Statistics for Managers Using Microsoft Excel/SPSS Chapter 4 Basic Probability And Discrete Probability Distributions Statistics for Managers Using Microsoft Excel/SPSS Chapter 4 Basic Probability And Discrete Probability Distributions 1999 Prentice-Hall, Inc. Chap. 4-1 Chapter Topics Basic Probability Concepts: Sample

More information

Statistics for Managers Using Microsoft Excel (3 rd Edition)

Statistics for Managers Using Microsoft Excel (3 rd Edition) Statistics for Managers Using Microsoft Excel (3 rd Edition) Chapter 4 Basic Probability and Discrete Probability Distributions 2002 Prentice-Hall, Inc. Chap 4-1 Chapter Topics Basic probability concepts

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 3 Probability Contents 1. Events, Sample Spaces, and Probability 2. Unions and Intersections 3. Complementary Events 4. The Additive Rule and Mutually Exclusive

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

Correlation and Regression

Correlation and Regression Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

Semester 2, 2015/2016

Semester 2, 2015/2016 ECN 3202 APPLIED ECONOMETRICS 2. Simple linear regression B Mr. Sydney Armstrong Lecturer 1 The University of Guyana 1 Semester 2, 2015/2016 PREDICTION The true value of y when x takes some particular

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of

More information

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248)

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248) AIM HIGH SCHOOL Curriculum Map 2923 W. 12 Mile Road Farmington Hills, MI 48334 (248) 702-6922 www.aimhighschool.com COURSE TITLE: Statistics DESCRIPTION OF COURSE: PREREQUISITES: Algebra 2 Students will

More information

Bivariate distributions

Bivariate distributions Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient

More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

Basics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations

Basics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations Basics of Experimental Design Review of Statistics And Experimental Design Scientists study relation between variables In the context of experiments these variables are called independent and dependent

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Lecture 2: Review of Probability

Lecture 2: Review of Probability Lecture 2: Review of Probability Zheng Tian Contents 1 Random Variables and Probability Distributions 2 1.1 Defining probabilities and random variables..................... 2 1.2 Probability distributions................................

More information

Can you tell the relationship between students SAT scores and their college grades?

Can you tell the relationship between students SAT scores and their college grades? Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Institute of Actuaries of India

Institute of Actuaries of India Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2018 Examinations Subject CT3 Probability and Mathematical Statistics Core Technical Syllabus 1 June 2017 Aim The

More information

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define

More information

Chapter 13. Multiple Regression and Model Building

Chapter 13. Multiple Regression and Model Building Chapter 13 Multiple Regression and Model Building Multiple Regression Models The General Multiple Regression Model y x x x 0 1 1 2 2... k k y is the dependent variable x, x,..., x 1 2 k the model are the

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

What is a Hypothesis?

What is a Hypothesis? What is a Hypothesis? A hypothesis is a claim (assumption) about a population parameter: population mean Example: The mean monthly cell phone bill in this city is μ = $42 population proportion Example:

More information

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapte The McGraw-Hill Companies, Inc. All rights reserved. 12er12 Chapte Bivariate i Regression (Part 1) Bivariate Regression Visual Displays Begin the analysis of bivariate data (i.e., two variables) with a scatter plot. A scatter plot - displays each observed

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

What is Probability? Probability. Sample Spaces and Events. Simple Event

What is Probability? Probability. Sample Spaces and Events. Simple Event What is Probability? Probability Peter Lo Probability is the numerical measure of likelihood that the event will occur. Simple Event Joint Event Compound Event Lies between 0 & 1 Sum of events is 1 1.5

More information

Recitation 2: Probability

Recitation 2: Probability Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions

More information

Basic Business Statistics 6 th Edition

Basic Business Statistics 6 th Edition Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based

More information

KDF2C QUANTITATIVE TECHNIQUES FOR BUSINESSDECISION. Unit : I - V

KDF2C QUANTITATIVE TECHNIQUES FOR BUSINESSDECISION. Unit : I - V KDF2C QUANTITATIVE TECHNIQUES FOR BUSINESSDECISION Unit : I - V Unit I: Syllabus Probability and its types Theorems on Probability Law Decision Theory Decision Environment Decision Process Decision tree

More information

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 16. Simple Linear Regression and dcorrelation Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Math Review Sheet, Fall 2008

Math Review Sheet, Fall 2008 1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the

More information

Hypothesis Testing hypothesis testing approach

Hypothesis Testing hypothesis testing approach Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we

More information

5.1 Model Specification and Data 5.2 Estimating the Parameters of the Multiple Regression Model 5.3 Sampling Properties of the Least Squares

5.1 Model Specification and Data 5.2 Estimating the Parameters of the Multiple Regression Model 5.3 Sampling Properties of the Least Squares 5.1 Model Specification and Data 5. Estimating the Parameters of the Multiple Regression Model 5.3 Sampling Properties of the Least Squares Estimator 5.4 Interval Estimation 5.5 Hypothesis Testing for

More information

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments /4/008 Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University C. A Sample of Data C. An Econometric Model C.3 Estimating the Mean of a Population C.4 Estimating the Population

More information

Chapter 2. Probability

Chapter 2. Probability 2-1 Chapter 2 Probability 2-2 Section 2.1: Basic Ideas Definition: An experiment is a process that results in an outcome that cannot be predicted in advance with certainty. Examples: rolling a die tossing

More information

Sample Problems. Note: If you find the following statements true, you should briefly prove them. If you find them false, you should correct them.

Sample Problems. Note: If you find the following statements true, you should briefly prove them. If you find them false, you should correct them. Sample Problems 1. True or False Note: If you find the following statements true, you should briefly prove them. If you find them false, you should correct them. (a) The sample average of estimated residuals

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

Statistics Introductory Correlation

Statistics Introductory Correlation Statistics Introductory Correlation Session 10 oscardavid.barrerarodriguez@sciencespo.fr April 9, 2018 Outline 1 Statistics are not used only to describe central tendency and variability for a single variable.

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Regression Analysis. y t = β 1 x t1 + β 2 x t2 + β k x tk + ϵ t, t = 1,..., T,

Regression Analysis. y t = β 1 x t1 + β 2 x t2 + β k x tk + ϵ t, t = 1,..., T, Regression Analysis The multiple linear regression model with k explanatory variables assumes that the tth observation of the dependent or endogenous variable y t is described by the linear relationship

More information

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr.

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr. Topic 2: Probability & Distributions ECO220Y5Y: Quantitative Methods in Economics Dr. Nick Zammit University of Toronto Department of Economics Room KN3272 n.zammit utoronto.ca November 21, 2017 Dr. Nick

More information

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered)

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered) Test 3 Practice Test A NOTE: Ignore Q10 (not covered) MA 180/418 Midterm Test 3, Version A Fall 2010 Student Name (PRINT):............................................. Student Signature:...................................................

More information

Regression Analysis II

Regression Analysis II Regression Analysis II Measures of Goodness of fit Two measures of Goodness of fit Measure of the absolute fit of the sample points to the sample regression line Standard error of the estimate An index

More information

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X.04) =.8508. For z < 0 subtract the value from,

More information
