Math Review Sheet, Fall 2008

Math 3070-5 Review Sheet, Fall 2008

Descriptive Statistics

First we need to know about the relationship among
- Population
- Samples
- Objects

The distribution of the population can be given in one of the following forms:
- By graph: histogram, stemplot, bar graph, pie chart
- Summary numbers (condensed information):
  1. Sample mean or median (the center)
  2. Sample variance or sample standard deviation, or quartiles (the spread)
  You need to know the formulas for these quantities (pay special attention to the formula for s).
- Mathematical formula

Probability

A probability space (system) is defined through a triplet that includes a sample space, a collection of events and a probability measure. Make sure you understand the concepts of outcomes and events (corresponding to elements and subsets of a set). A single outcome, viewed as a one-element subset, is an event, but an event is not limited to a single outcome. When we talk about the probability of something, we refer to events, which contain outcomes but can be more complicated since they combine different outcomes in different ways. Some of the crucial concepts are:

1. Unions and intersections: their probabilities, in relation to the probabilities of the individual events involved, are governed by the probability axioms (you need to be familiar with them). An important and intuitive formula is

   $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

2. Counting problems: these are some of the most challenging problems in this course. However, they are not our main focus. One particular question students ask all the time is which formula to use: permutation or combination. The fact is that in many cases both would work, so there is no clear answer. A useful strategy is to break the selection procedure into several stages, where each stage is clearly defined and manageable; then one can use the product rule to count the outcomes and calculate the probability in question.
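The union formula and the staged counting strategy can be checked on small examples. The following Python sketch is an illustration only and is not part of the original sheet: the two-dice events and the committee problem (2 men and 1 woman chosen from 5 men and 4 women) are assumed examples. It verifies the union formula by enumerating outcomes, and compares the product-rule count with a brute-force count.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

# Sample space: the 36 equally likely outcomes of two fair dice (assumed example).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a True/False function of an outcome."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

def A(w):
    return w[0] == 6            # first die shows 6

def B(w):
    return w[0] + w[1] >= 10    # the sum is at least 10

# Left and right sides of P(A or B) = P(A) + P(B) - P(A and B).
p_union = prob(lambda w: A(w) or B(w))
p_formula = prob(A) + prob(B) - prob(lambda w: A(w) and B(w))

# Staged counting (assumed example): a committee of 3 from 5 men and 4 women
# that contains exactly 2 men.
# Stage 1: choose the 2 men. Stage 2: choose the 1 woman. Multiply the counts.
favorable = comb(5, 2) * comb(4, 1)
total = comb(9, 3)
p_committee = Fraction(favorable, total)

# Brute-force check: list every committee and count those with exactly 2 men.
men = [f"M{i}" for i in range(5)]
women = [f"W{i}" for i in range(4)]
brute = sum(
    1
    for c in combinations(men + women, 3)
    if sum(name.startswith("M") for name in c) == 2
)

print(p_union, p_formula)
print(favorable, brute, p_committee)
```

Because the committee is unordered, combinations are used at each stage; counting ordered selections in both the numerator and the denominator would give the same probability.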

3. Conditional probability and independence should be studied together. You can interpret the formula

   $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$

   by noting that if A and B are independent, that is, if $P(A \cap B) = P(A)P(B)$, then $P(A \mid B) = P(A)$.

4. Bayes' theorem gives a way to compute conditional probabilities from other conditional probabilities with the roles interchanged:

   $P(A_j \mid B) = \dfrac{P(B \mid A_j) P(A_j)}{\sum_{i=1}^{k} P(B \mid A_i) P(A_i)}$

Random Variables: Discrete and Continuous

Note: In formulas involving rv's, we use upper case letters for rv's and lower case letters for the values they assume. If you see a function with only lower case letter variables, you are dealing with a function of deterministic variables.

The probability distribution tells you everything there is to know about a random variable. If you opt for functions or graphs, you can choose from
- Probability mass function (pmf) for a discrete rv, or probability density function (pdf) for a continuous rv.
- Cumulative distribution function (cdf); here you need to pay attention to where the dots and circles are.

Make sure you know how to convert between these two functions for the same distribution.

The expected value, or the expectation, or the mean of a quantity that depends on the rv (with the dependence specified by a formula) is the first important quantity you can compute based on the distribution. The formulas are

   $E[h(X)] = \sum_{x} h(x)\, p(x)$ for a discrete rv, where $p(x)$ is the pmf

   $E[h(X)] = \int_{-\infty}^{\infty} h(x)\, f(x)\, dx$ for a continuous rv, where $f(x)$ is the pdf

The variance measures the variability, and the formula is

   $V[h(X)] = E[h(X)^2] - (E[h(X)])^2$

Here is a list of distribution formulas we may need in the final:

1. Discrete:
   - Binomial pmf: $b(x; n, p) = \binom{n}{x} p^x (1 - p)^{n - x}$ for $x = 0, 1, \ldots, n$
   - Poisson pmf: $p(x; \lambda) = \dfrac{e^{-\lambda} \lambda^x}{x!}$ for $x = 0, 1, 2, \ldots$

   Notice in the formulas the regions where the mass function is zero.
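To make the pmf and expectation formulas concrete, here is a minimal Python sketch. It is an illustration, not part of the original sheet: the function names and the parameter values n = 10, p = 0.3 and λ = 2 are assumptions. It evaluates the two pmfs, returns zero outside their supports, and computes a mean and variance directly from the sum formula for E[h(X)].

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """b(x; n, p); zero outside x = 0, 1, ..., n."""
    if not (isinstance(x, int) and 0 <= x <= n):
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """p(x; lambda); zero outside x = 0, 1, 2, ..."""
    if not (isinstance(x, int) and x >= 0):
        return 0.0
    return exp(-lam) * lam**x / factorial(x)

def expectation(h, pmf, support):
    """E[h(X)] = sum of h(x) * p(x) over the support of a discrete rv."""
    return sum(h(x) * pmf(x) for x in support)

# Assumed example: X ~ Binomial(n = 10, p = 0.3).
n, p = 10, 0.3
support = range(n + 1)

def pmf(x):
    return binomial_pmf(x, n, p)

mean = expectation(lambda x: x, pmf, support)
# Variance via the shortcut V(X) = E(X^2) - [E(X)]^2.
var = expectation(lambda x: x**2, pmf, support) - mean**2

# Assumed example: Poisson with lambda = 2. The pmf sums to 1; the tail
# beyond x = 59 is negligible, so a truncated sum is used here.
poisson_total = sum(poisson_pmf(x, 2.0) for x in range(60))

print(mean, var, poisson_total)
```

For the binomial distribution the results should agree, up to floating-point rounding, with the known closed forms np for the mean and np(1 - p) for the variance.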

2. Continuous:
   - Uniform distribution
   - Normal distribution: $f(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$. Notice the standardization procedure $P[X < x] = P\left[Z < \dfrac{x - \mu}{\sigma}\right]$, where X has a normal distribution with mean $\mu$ and standard deviation $\sigma$, and Z has the standard normal distribution. The cdf of the standard normal is $\Phi(z)$, which is given in Table A.3.
   - Exponential distribution: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$

Joint distribution

The joint distribution is a pmf $p(x, y)$ for discrete rv's, usually given in a table, and a pdf $f(x, y)$ for continuous rv's, often given by a formula. How do we obtain the CDF from either a pmf or a pdf?

The marginal pmf or pdf describes the distribution if we don't care about one of the variables (that is, we cannot distinguish different values of one of the variables). Here is an easy way to see it: the marginal pdf of X does NOT depend on y, therefore it is obtained by integrating the pdf $f(x, y)$ with respect to y, so that all y-values are covered and the y-dependence vanishes.

Conditional pdf: this is in the same spirit as conditional probability. For example, the conditional pdf of Y given that X = x is

   $f_{Y \mid X}(y \mid x) = \dfrac{f(x, y)}{f_X(x)}$

Expectation and covariance: the complication is the covariance

   $\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$

and the correlation is

   $\rho_{X,Y} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$

and we see that the scaling factors are there so that we will have $-1 \le \rho_{X,Y} \le 1$.

When we have several rv's, we are interested in their combined behavior, and one of the simple cases is a linear combination of several rv's. If we look at a particular average of a very large number n of independent rv's with the same distribution, scaled using the square root of n, it will behave like a normal rv: for large n, the standardized sample mean $(\bar{X} - \mu)/(\sigma/\sqrt{n})$ is approximately standard normal. This is one of the most important theorems in probability theory: the Central Limit Theorem.

For other linear combinations, the expectation of the linear combination is the linear combination of the expectations, but when we look at the variability, we need to look at the variance rather than the standard deviation, and we need to be careful about the signs:

   $V(a_1 X_1 + \cdots + a_n X_n) = a_1^2 V(X_1) + \cdots + a_n^2 V(X_n)$

Remember that in order to use this we need independence of the rv's.

Point Estimation

The general notation is $\hat{\theta}$ for an estimator of the population parameter $\theta$. There are different estimators for the same parameter, and we would like to choose one for a particular application. The features you need to look for are: unbiasedness and minimum variance. Sometimes you just cannot find one with all the desired features.

A statistic is a quantity computed from the sample data. Different random samples will result in different values of the statistic. The main focus of the latter part of the course is to use statistics from sample data to infer properties of the population under study.

Statistical Inference

There are two approaches here: confidence intervals and tests of significance. Each can be applied to the following problems: inference about the mean, the proportion and the variance of a population, and inference about the difference between two population means and the difference between two population proportions.

1. Confidence Intervals for One Sample: The basic form of a confidence interval is

   $\hat{\theta} \pm z^* \cdot SE$ or $\hat{\theta} \pm t^* \cdot SE$

   where $z^*$ and $t^*$ are the critical z-value and t-value based on the confidence level and the type of interval, for the z-test and t-test respectively, and SE is the standard error.

   | Inference about         | Large sample                                                              | Sample not large                                                          |
   |-------------------------|---------------------------------------------------------------------------|---------------------------------------------------------------------------|
   | µ, normal distribution  | z-test. $z_\alpha$ or $z_{\alpha/2}$, $SE = s/\sqrt{n}$                   | t-test. $t_{\alpha, n-1}$ or $t_{\alpha/2, n-1}$, $SE = s/\sqrt{n}$       |
   | µ, distribution unknown | Same as above                                                             | N/A                                                                       |
   | p (proportion)          | Similar to the above, except $SE = \sqrt{\hat{p}(1 - \hat{p})/n}$         | Use binomial distribution                                                 |

2. Test of Hypothesis for One Sample: The statistic is

   $\dfrac{\hat{\theta} - \theta_0}{SE}$

   where $\hat{\theta}$ is the value of the statistic from a sample and $\theta_0$ is the null value. SE is the standard error, with its formula in the above table. The test to be used is also specified in the above table. The first step is always to state the hypotheses. There is a decision to be made about the form of the alternative hypothesis, which is usually determined from the context of the problem. There are two equivalent approaches to testing the hypotheses if you want to use your data to question the null hypothesis at the given level $\alpha$: (a) compute the statistic and compare it with the corresponding critical value; (b) compute the P-value and then compare it with $\alpha$.

3. Confidence Interval and Test of Hypothesis Involving Two Samples: First you have to decide whether you should use a two-sample test or a paired data test. In the latter case the problem is similar to the one-sample test as long as you keep the pairs. In the true two-sample test, the major change is the definition of the standard error. For the difference between the means:

   $SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

   This is used in both the confidence interval and the two-sample t-procedure. For differences in proportions, we use different formulas for standard errors. In confidence interval computations, we use

   $SE = \sqrt{\dfrac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$

   and in significance tests for comparing two proportions, with the null hypothesis assuming that the proportions are the same, we use

   $SE = \sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

   where $\hat{p}$ is the pooled proportion.
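The two standard errors for a difference in proportions are easy to mix up, so a small numerical sketch may help. The Python code below is an illustration only and is not part of the original sheet: the counts (60 successes out of 100 versus 45 out of 100) are hypothetical, the function names are made up, and the pooled proportion is taken to be the usual (x1 + x2)/(n1 + n2). The critical value 1.96 corresponds to a 95% two-sided interval.

```python
from math import sqrt

def se_unpooled(x1, n1, x2, n2):
    """Standard error used in the confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def se_pooled(x1, n1, x2, n2):
    """Standard error used in the test of H0: p1 = p2 (pooled proportion)."""
    p_hat = (x1 + x2) / (n1 + n2)   # assumed definition of the pooled proportion
    return sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))

# Hypothetical data: successes and sample sizes for two independent samples.
x1, n1, x2, n2 = 60, 100, 45, 100
diff = x1 / n1 - x2 / n2

# 95% confidence interval for p1 - p2, using the unpooled standard error.
z_star = 1.96
se_ci = se_unpooled(x1, n1, x2, n2)
ci = (diff - z_star * se_ci, diff + z_star * se_ci)

# Test statistic for H0: p1 = p2, using the pooled standard error.
se_test = se_pooled(x1, n1, x2, n2)
z = diff / se_test

print(ci)
print(z)
```

With these hypothetical counts the two standard errors are close but not equal, which is typical when the two sample proportions are not far apart.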

4. Computation of the probability of Type II errors: the formulas will be supplied, but you need to understand the definition of a Type II error and why we care about it.

Good luck and have a nice winter break!