# Probability and Statistics


## Transcription

ICME Refresher Course: Probability and Statistics, Stanford University

Luyang Chen, September 20

### 1 Basic Probability Theory

#### 1.1 Probability Spaces

A probability space is a triple $(\Omega, \mathcal{F}, P)$, where $\Omega$ is a set of outcomes, $\mathcal{F}$ is a set of events, and $P : \mathcal{F} \to [0, 1]$ is a function that assigns probabilities to events.

A σ-algebra (or σ-field) $\mathcal{F}$ is a collection of subsets of $\Omega$ that satisfies

1. $\emptyset, \Omega \in \mathcal{F}$;
2. if $A \in \mathcal{F}$, then $A^c \in \mathcal{F}$;
3. if $A_i \in \mathcal{F}$ is a countable sequence of sets, then $\bigcup_i A_i \in \mathcal{F}$.

A measurable space $(\Omega, \mathcal{F})$ is a space on which we can put a measure. A measure $\mu : \mathcal{F} \to \mathbb{R}$ is a nonnegative countably additive set function that satisfies

1. $\mu(A) \ge \mu(\emptyset) = 0$ for all $A \in \mathcal{F}$;
2. if $A_i \in \mathcal{F}$ is a countable sequence of disjoint sets, then $\mu\left(\bigcup_i A_i\right) = \sum_i \mu(A_i)$.

If $\mu(\Omega) = 1$, we call $\mu$ a probability measure.

Let $\mu$ be a measure on $(\Omega, \mathcal{F})$.

1. Monotonicity. If $A \subset B$, then $\mu(A) \le \mu(B)$.
2. Subadditivity. If $A \subset \bigcup_{m=1}^{\infty} A_m$, then $\mu(A) \le \sum_{m=1}^{\infty} \mu(A_m)$.
3. Continuity from below. If $A_i \uparrow A$ (i.e., $A_1 \subset A_2 \subset \cdots$ and $\bigcup_i A_i = A$), then $\mu(A_i) \uparrow \mu(A)$.
4. Continuity from above. If $A_i \downarrow A$ (i.e., $A_1 \supset A_2 \supset \cdots$ and $\bigcap_i A_i = A$), with $\mu(A_1) < \infty$, then $\mu(A_i) \downarrow \mu(A)$.

#### 1.2 Distributions

A random variable $X$ is a real-valued function defined on $\Omega$ such that for every Borel set $B \subset \mathbb{R}$ we have $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}$.

A random variable $X$ is discrete if its possible values are finite or countably infinite. A random variable $X$ is continuous if its possible values form an uncountable set and the probability that $X$ equals any such value exactly is zero.

A trivial, but useful, example of a random variable is the indicator function of a set $A \in \mathcal{F}$:

$$1_A(\omega) = \begin{cases} 1 & \omega \in A \\ 0 & \omega \notin A \end{cases}$$

If $X$ is a random variable, then $X$ induces a probability measure on $\mathbb{R}$ called its distribution, by setting $\mu(A) = P(X \in A)$ for Borel sets $A$. The distribution of a random variable $X$ is described by giving its distribution function $F(x) = P(X \le x)$. Any distribution function $F$ has the following properties:

1. $F$ is nondecreasing.
2. $\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$.
3. $F$ is right continuous, that is, $\lim_{y \downarrow x} F(y) = F(x)$.
4. $\lim_{y \uparrow x} F(y) = F(x-) = P(X < x)$.

Any function $F$ satisfying 1-3 above is the distribution function of some random variable.

When the distribution function $F(x)$ has the form

$$F(x) = \int_{-\infty}^{x} f(y)\,dy$$

we say that $X$ has density function $f$.

#### 1.3 Integration & Expected Value

Suppose $f$ and $g$ are integrable functions on $(\Omega, \mathcal{F}, \mu)$.

1. If $f \ge 0$ a.e., then $\int f\,d\mu \ge 0$.
2. For all $a \in \mathbb{R}$, $\int a f\,d\mu = a \int f\,d\mu$.
3. $\int (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu$.
4. If $g \le f$ a.e., then $\int g\,d\mu \le \int f\,d\mu$.
5. If $g = f$ a.e., then $\int g\,d\mu = \int f\,d\mu$.
6. $\left|\int f\,d\mu\right| \le \int |f|\,d\mu$.

If $X$ is a random variable on $(\Omega, \mathcal{F}, P)$, then we define its expected value to be $E[X] = \int X\,dP$. $E[X]$ does not always exist.

Jensen's inequality. Suppose $\varphi$ is convex, and $X$ and $\varphi(X)$ are both integrable. Then $\varphi(E[X]) \le E[\varphi(X)]$.

Hölder's inequality. If $p, q \in (1, \infty)$ with $1/p + 1/q = 1$, then

$$E[|XY|] \le \left(E[|X|^p]\right)^{1/p} \left(E[|Y|^q]\right)^{1/q}.$$

The special case $p = q = 2$ is called the Cauchy-Schwarz inequality.

Markov's inequality. For $a > 0$, $P(|X| \ge a) \le a^{-1} E[|X|]$.

Chebyshev's inequality. For $a > 0$, $P(|X| \ge a) \le a^{-2} E[|X|^2]$.

If $k$ is a positive integer, then $E[X^k]$ is called the $k$th moment of $X$. The first moment $E[X]$ is usually called the mean and denoted by $\mu$. If $E[X^2] < \infty$, then the variance of $X$ is defined to be

$$\mathrm{var}(X) = E[(X - \mu)^2] = E[X^2] - \mu^2.$$

The covariance of two random variables $X$ and $Y$ is defined as

$$\mathrm{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y.$$
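As a quick numerical sanity check of Markov's and Chebyshev's inequalities, the sketch below compares empirical tail frequencies with the two bounds. The exponential distribution, the sample size, and the helper name `tail_bounds` are illustrative assumptions, not part of the notes.

```python
import random

def tail_bounds(samples, a):
    """Return (empirical P(|X| >= a), Markov bound, Chebyshev bound)."""
    n = len(samples)
    tail = sum(1 for x in samples if abs(x) >= a) / n
    markov = sum(abs(x) for x in samples) / n / a          # E|X| / a
    chebyshev = sum(x * x for x in samples) / n / a ** 2   # E|X|^2 / a^2
    return tail, markov, chebyshev

# Assumed test distribution: Exponential(1), chosen only for illustration.
rng = random.Random(0)
samples = [rng.expovariate(1.0) for _ in range(100_000)]
for a in (1.0, 2.0, 4.0):
    tail, markov, chebyshev = tail_bounds(samples, a)
    print(f"a={a}: tail={tail:.4f}  Markov<={markov:.4f}  Chebyshev<={chebyshev:.4f}")
```

Because the empirical distribution of the sample is itself a probability measure, both bounds hold exactly for the printed quantities, not just approximately.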

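#### 1.4 Integration to the Limit

Dominated Convergence Theorem. If $X_n \to X$ a.s., $|X_n| \le Y$ for all $n$ and $E[Y] < \infty$, then $E[X_n] \to E[X]$.

Monotone Convergence Theorem. If $0 \le X_n \uparrow X$, then $E[X_n] \uparrow E[X]$.

Fatou's Lemma. If $X_n \ge 0$, then

$$\liminf_{n \to \infty} E[X_n] \ge E\left[\liminf_{n \to \infty} X_n\right].$$

#### 1.5 Fubini's Theorem

Fubini's theorem. If $f \ge 0$ or $\int |f|\,d\mu < \infty$, then

$$\int_X \int_Y f(x, y)\,\mu_2(dy)\,\mu_1(dx) = \int_{X \times Y} f\,d\mu = \int_Y \int_X f(x, y)\,\mu_1(dx)\,\mu_2(dy).$$

Exercise. Let $X$ be a nonnegative random variable. Show that

$$E[X] = \int_0^{\infty} P(X > t)\,dt.$$

### 2 Convergence

#### 2.1 Convergence Concepts

Convergence in probability. We say that $X_n \to X$ in probability if for any $\varepsilon > 0$, $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0$.

Convergence in $L^p$. We say that $X_n \to X$ in $L^p$ if $\lim_{n \to \infty} E[|X_n - X|^p] = 0$.

Almost sure convergence. We say that $X_n \to X$ a.s. if $P(\lim_{n \to \infty} X_n = X) = 1$.

Convergence in distribution. We say that $X_n \to X$ in distribution if their distribution functions converge, i.e. $F_n(x) \to F(x)$ at every continuity point $x$ of $F$.

Note. The following three statements are equivalent:

1. $\lim_{n \to \infty} E[g(X_n)] = E[g(X)]$ for all bounded and continuous $g$.
2. $\lim_{n \to \infty} E[e^{i\alpha X_n}] = E[e^{i\alpha X}]$ pointwise for all $\alpha \in \mathbb{R}$.
3. $\lim_{n \to \infty} F_n(x) = F(x)$ at every continuity point $x$ of $F$.

#### 2.2 Relationship between Different Convergences

If $X_n \xrightarrow{\text{a.s.}} X$, then $X_n \xrightarrow{P} X$.

Proof.

$$P\left(\bigcap_{\varepsilon > 0} \bigcup_{N > 0} \bigcap_{n \ge N} \{|X_n - X| < \varepsilon\}\right) = 1$$

$$\Longrightarrow\; P\left(\bigcup_{\varepsilon > 0} \bigcap_{N > 0} \bigcup_{n \ge N} \{|X_n - X| \ge \varepsilon\}\right) = 0$$

$$\Longrightarrow\; P\left(\bigcap_{N > 0} \bigcup_{n \ge N} \{|X_n - X| \ge \varepsilon\}\right) = 0 \quad \forall \varepsilon > 0$$

$$\Longrightarrow\; \lim_{n \to \infty} P(|X_n - X| \ge \varepsilon) = 0.$$

The last step uses continuity from above applied to the decreasing sets $\bigcup_{n \ge N} \{|X_n - X| \ge \varepsilon\}$.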

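$X_n \xrightarrow{P} X$ does not imply $X_n \xrightarrow{\text{a.s.}} X$.

Counterexample. Let

$$f_{2^k + i}(t) = \begin{cases} 1 & \frac{i}{2^k} \le t < \frac{i+1}{2^k} \\ 0 & \text{otherwise} \end{cases} \qquad i = 0, 1, \ldots, 2^k - 1, \quad k = 0, 1, \ldots$$

and $X_n = f_n(U)$, where $U$ is uniformly distributed on $[0, 1]$. Then $X_n$ converges to 0 in probability, but not a.s.

If $X_n \xrightarrow{L^p} X$, then $X_n \xrightarrow{P} X$.

Proof. By Markov's inequality,

$$P(|X_n - X| \ge \varepsilon) \le \frac{E[|X_n - X|^p]}{\varepsilon^p} \to 0.$$

$X_n \xrightarrow{P} X$ does not imply $X_n \xrightarrow{L^p} X$.

Counterexample. Let

$$f_n(t) = \begin{cases} n^{1/p} & 0 \le t < \frac{1}{n} \\ 0 & \text{otherwise} \end{cases}$$

and $X_n = f_n(U)$, where $U$ is uniformly distributed on $[0, 1]$. Then $X_n$ converges to 0 in probability, but not in $L^p$, since $E[|X_n|^p] = 1$ for all $n$.

If $X_n \xrightarrow{P} X$, then $X_n \xrightarrow{d} X$.

If $X_n \xrightarrow{d} a$ (constant), then $X_n \xrightarrow{P} a$.

#### 2.3 Continuous Mapping Theorem and Slutsky's Theorem

Continuous Mapping Theorem. Suppose $g : \mathbb{R} \to \mathbb{R}$ is a continuous function.

1. If $X_n \xrightarrow{d} X$, then $g(X_n) \xrightarrow{d} g(X)$.
2. If $X_n \xrightarrow{P} X$, then $g(X_n) \xrightarrow{P} g(X)$.
3. If $X_n \xrightarrow{\text{a.s.}} X$, then $g(X_n) \xrightarrow{\text{a.s.}} g(X)$.

Slutsky's Theorem. If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{P} a$ (constant), then $X_n + Y_n \xrightarrow{d} X + a$ and $X_n Y_n \xrightarrow{d} aX$.

#### 2.4 Delta Method

Theorem. Let $X_1, X_2, \ldots$ be a sequence of random variables such that $\sqrt{n}(X_n - a) \xrightarrow{d} Z$ for some random variable $Z$ and constant $a$. Let $g : \mathbb{R} \to \mathbb{R}$ be continuously differentiable at $a$. Then

$$\sqrt{n}\left(g(X_n) - g(a)\right) \xrightarrow{d} g'(a) Z.$$

Proof. By the mean value theorem,

$$\sqrt{n}\left(g(X_n) - g(a)\right) = g'(\tilde{X}_n)\,\sqrt{n}(X_n - a), \qquad \text{where } |\tilde{X}_n - a| \le |X_n - a|.$$

Since $\sqrt{n}(X_n - a) \xrightarrow{d} Z$, we have $X_n - a \xrightarrow{P} 0$, hence $\tilde{X}_n - a \xrightarrow{P} 0$ and $g'(\tilde{X}_n) \xrightarrow{P} g'(a)$. Then use Slutsky's Theorem.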
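The delta method can be illustrated with a small simulation. The sketch below takes $X_n$ to be the sample mean of $n$ Exponential(1) variables, so $a = 1$ and $\sqrt{n}(X_n - 1)$ is approximately $N(0, 1)$, and uses $g(x) = x^2$, for which $g'(a) = 2$ and the limiting variance is 4. The distribution, the function $g$, and the name `delta_method_draws` are assumptions made for this example only.

```python
import math
import random
import statistics

def delta_method_draws(n, reps, seed=0):
    """Simulate sqrt(n) * (g(Xbar_n) - g(a)) with g(x) = x**2 and a = 1."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        # Xbar_n: mean of n Exponential(1) draws (mean 1, variance 1).
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        draws.append(math.sqrt(n) * (xbar ** 2 - 1.0))
    return draws

draws = delta_method_draws(n=400, reps=2000)
# Delta method: the limit is g'(1) * Z with Z ~ N(0, 1), so its variance is 4.
print("sample mean    :", statistics.fmean(draws))
print("sample variance:", statistics.variance(draws), "(limit: 4)")
```

For finite $n$ the simulated variance is only close to 4, not equal to it; the agreement should improve as `n` and `reps` grow.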

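#### 2.5 Weak Laws of Large Numbers (WLLN)

Theorem. Let $X_1, X_2, \ldots$ be uncorrelated random variables with $E[X_i] = \mu$ and $\mathrm{var}(X_i) \le C < \infty$. If $S_n = X_1 + \cdots + X_n$, then as $n \to \infty$, $S_n/n \to \mu$ in $L^2$ and also in probability.

Proof. $E[S_n/n] = \mu$ and

$$E\left[\left|\frac{S_n}{n} - \mu\right|^2\right] = \mathrm{var}\left(\frac{S_n}{n}\right) = \frac{1}{n^2}\,\mathrm{var}(S_n) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{var}(X_i) \le \frac{C}{n} \to 0.$$

Theorem. Let $X_1, X_2, \ldots$ be iid random variables with $E[X_i] = \mu$ and $E[|X_i|] < \infty$. If $S_n = X_1 + \cdots + X_n$, then as $n \to \infty$, $S_n/n \to \mu$ in probability.

Proof. Truncate at level $n$ and split:

$$\frac{S_n}{n} - \mu = \frac{1}{n} \sum_{i=1}^{n} \left(X_i 1_{\{|X_i| \le n\}} + X_i 1_{\{|X_i| > n\}}\right) - E[X_1 1_{\{|X_1| \le n\}}] + E[X_1 1_{\{|X_1| \le n\}}] - E[X_1] = I + II + III,$$

where

$$I = \frac{1}{n} \sum_{i=1}^{n} \left(X_i 1_{\{|X_i| \le n\}} - E[X_1 1_{\{|X_1| \le n\}}]\right), \quad II = \frac{1}{n} \sum_{i=1}^{n} X_i 1_{\{|X_i| > n\}}, \quad III = E[X_1 1_{\{|X_1| \le n\}}] - E[X_1].$$

For the first term, for any $\varepsilon > 0$,

$$E[|I|^2] = \frac{1}{n} E\left[\left|X_1 1_{\{|X_1| \le n\}} - E[X_1 1_{\{|X_1| \le n\}}]\right|^2\right] \le \frac{1}{n} E\left[|X_1|^2 1_{\{|X_1| \le n\}}\right]$$

$$= \frac{1}{n} E\left[|X_1|^2 1_{\{|X_1| \le \varepsilon\sqrt{n}\}}\right] + \frac{1}{n} E\left[|X_1|^2 1_{\{\varepsilon\sqrt{n} < |X_1| \le n\}}\right] \le \varepsilon^2 + E\left[|X_1| 1_{\{|X_1| > \varepsilon\sqrt{n}\}}\right].$$

The last expectation tends to 0 as $n \to \infty$ because $E[|X_1|] < \infty$, and $\varepsilon$ is arbitrary, so $E[|I|^2] \to 0$. For the other two terms,

$$E[|II|] = E\left[\left|\frac{1}{n} \sum_{i=1}^{n} X_i 1_{\{|X_i| > n\}}\right|\right] \le \frac{1}{n} \sum_{i=1}^{n} E\left[|X_i| 1_{\{|X_i| > n\}}\right] = E\left[|X_1| 1_{\{|X_1| > n\}}\right] \to 0,$$

$$|III| = \left|E[X_1 1_{\{|X_1| \le n\}}] - E[X_1]\right| \le E\left[|X_1| 1_{\{|X_1| > n\}}\right] \to 0.$$

Note. Neither independence of the $X_i$ nor finite variance is needed for the validity of the WLLN.

#### 2.6 Strong Laws of Large Numbers (SLLN)

Theorem. Let $X_1, X_2, \ldots$ be iid random variables with $E[X_i] = \mu$ and $E[|X_i|] < \infty$. If $S_n = X_1 + \cdots + X_n$, then as $n \to \infty$, $S_n/n \to \mu$ a.s.

If the iid random variables $\{X_i\}$ have finite fourth-order moments, $E[|X_i|^4] < \infty$ or $E[|X_i - \mu|^4] < \infty$, then an application of the Chebyshev inequality with $p = 4$ gives the needed estimate and we have the SLLN in this case. Of course, this is only a sufficient condition for its validity. As with the WLLN, it is enough that $E[|X_i|] < \infty$.

#### 2.7 Central Limit Theorem

Theorem. Let $X_1, X_2, \ldots$ be iid random variables with $E[X_i] = \mu$ and $\mathrm{var}(X_i) = \sigma^2 < \infty$. If $S_n = X_1 + \cdots + X_n$, then

$$\sqrt{n}\left(\frac{S_n}{n} - \mu\right) \xrightarrow{d} N(0, \sigma^2).$$

Proof.

$$E\left[e^{i\alpha\sqrt{n}(S_n/n - \mu)}\right] = E\left[e^{i\frac{\alpha}{\sqrt{n}} \sum_{j=1}^{n} (X_j - \mu)}\right] = \varphi\left(\frac{\alpha}{\sqrt{n}}\right)^n,$$

where $\varphi(\alpha) = E[e^{i\alpha(X_1 - \mu)}]$. Then $\varphi(0) = 1$, $\varphi'(0) = 0$, $\varphi''(0) = -\sigma^2$. By Taylor's theorem, we have

$$\varphi\left(\frac{\alpha}{\sqrt{n}}\right) = 1 + \frac{1}{2}\varphi''(\alpha_n)\frac{\alpha^2}{n}, \qquad \text{where } 0 < |\alpha_n| < \frac{|\alpha|}{\sqrt{n}},$$

and therefore

$$\varphi\left(\frac{\alpha}{\sqrt{n}}\right)^n \to e^{-\frac{\alpha^2 \sigma^2}{2}}.$$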
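The central limit theorem can be checked empirically. In the sketch below the summands are Uniform(0, 1), so $\mu = 1/2$ and $\sigma^2 = 1/12$; the code measures how often $\sqrt{n}(S_n/n - \mu)$ lands within $1.96\sigma$ of zero, which should happen roughly 95% of the time if the normal approximation is good. The choice of distribution, the sample sizes, and the name `standardized_means` are assumptions for illustration.

```python
import math
import random

def standardized_means(n, reps, seed=0):
    """Draw sqrt(n) * (S_n / n - mu) for Uniform(0, 1) summands (mu = 1/2)."""
    rng = random.Random(seed)
    return [math.sqrt(n) * (sum(rng.random() for _ in range(n)) / n - 0.5)
            for _ in range(reps)]

sigma = math.sqrt(1.0 / 12.0)          # standard deviation of Uniform(0, 1)
z = standardized_means(n=200, reps=5000)
coverage = sum(1 for v in z if abs(v) <= 1.96 * sigma) / len(z)
print("fraction within 1.96 sigma:", coverage, "(normal limit: about 0.95)")
```

The same draws also illustrate the law of large numbers: dividing each value by $\sqrt{n}$ gives $S_n/n - \mu$, which concentrates around zero as `n` increases.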

### 3 Statistics

#### 3.1 Probability and Statistics

The basic problem of probability is: given the distribution of the data, what are the properties (e.g. expectation, variance, etc.) of the outcomes?

The basic problem of statistics is: given the outcomes, what can we say about the distribution of the data? (Given $X_1, \ldots, X_n \sim F$, what can we say about $F$?)

#### 3.2 Fundamental Concepts

Point estimation involves the use of sample data to calculate a single value (known as a statistic) which is to serve as a best guess or best estimate of an unknown (fixed or random) population parameter. Let $X_1, \ldots, X_n$ be iid data points from some distribution $F(x; \theta)$. A point estimator $\hat\theta_n$ of the parameter $\theta$ is some function of $X_1, \ldots, X_n$:

$$\hat\theta_n = g(X_1, \ldots, X_n).$$

We introduce the following two methods: the Method of Moments and Maximum Likelihood.

In statistics, the bias of an estimator is the difference between the estimator's expected value and the true value of the parameter being estimated. An estimator with zero bias is called unbiased. Otherwise the estimator is said to be biased.

Let $\hat\theta_n$ be an estimate of a parameter $\theta$ based on a sample of size $n$. Then $\hat\theta_n$ is said to be consistent in probability if $\hat\theta_n$ converges in probability to $\theta$ as $n$ approaches infinity.

A $1 - \alpha$ confidence interval for a parameter $\theta$ is an interval $C_n = (a, b)$, where $a = a(X_1, \ldots, X_n)$ and $b = b(X_1, \ldots, X_n)$ are functions of the data such that $P(\theta \in C_n) \ge 1 - \alpha$.

#### 3.3 The Method of Moments

The $k$th moment of a probability law is defined as $\mu_k = E[X^k]$, where $X$ is a random variable following that probability law. If $X_1, \ldots, X_n$ are iid random variables from that distribution, the $k$th sample moment is defined as

$$\hat\mu_k = \frac{1}{n} \sum_{i=1}^{n} X_i^k.$$

We can view $\hat\mu_k$ as an estimate of $\mu_k$. The method of moments estimates parameters by finding expressions for them in terms of the lowest possible order moments and then substituting sample moments into the expressions.

Example. The first and second moments of the normal distribution $N(\mu, \sigma^2)$ are

$$\mu_1 = E[X] = \mu, \qquad \mu_2 = E[X^2] = \mu^2 + \sigma^2.$$

Therefore $\mu = \mu_1$ and $\sigma^2 = \mu_2 - \mu_1^2$. The corresponding estimates of $\mu$ and $\sigma^2$ from the sample moments are

$$\hat\mu = \frac{1}{n} \sum_{i=1}^{n} X_i = \bar{X}, \qquad \hat\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \left(\frac{1}{n} \sum_{i=1}^{n} X_i\right)^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
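The method-of-moments recipe for the normal example can be sketched in a few lines: compute the first two sample moments and substitute them into $\mu = \mu_1$ and $\sigma^2 = \mu_2 - \mu_1^2$. The function name `normal_moment_estimates` and the simulated data (true $\mu = 2$, $\sigma^2 = 9$) are assumptions for illustration.

```python
import random

def normal_moment_estimates(xs):
    """Method-of-moments estimates (mu_hat, sigma2_hat) for N(mu, sigma^2)."""
    n = len(xs)
    m1 = sum(xs) / n                   # first sample moment
    m2 = sum(x * x for x in xs) / n    # second sample moment
    return m1, m2 - m1 ** 2

# Simulated data with assumed true parameters mu = 2 and sigma^2 = 9.
rng = random.Random(0)
data = [rng.gauss(2.0, 3.0) for _ in range(50_000)]
mu_hat, sigma2_hat = normal_moment_estimates(data)
print("mu_hat =", mu_hat, " sigma2_hat =", sigma2_hat)
```

Note that `sigma2_hat` uses the divisor $n$, matching the formula above, rather than the $n - 1$ divisor of the usual sample variance.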

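Question. Are the two estimators above unbiased? Are they consistent? What are the confidence intervals?

$$E[\hat\mu] = \mu$$

$$\hat\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 - (\bar{X} - \mu)^2$$

$$E[\hat\sigma^2] = \sigma^2 - \frac{1}{n}\sigma^2 = \frac{n-1}{n}\sigma^2$$

So $\hat\mu$ is unbiased and $\hat\sigma^2$ is biased. Both $\hat\mu$ and $\hat\sigma^2$ are consistent estimators.

With $\sigma$ known, $\hat\mu$ lies in

$$\left[\mu - \frac{\sigma}{\sqrt{n}}\Phi^{-1}(1 - \alpha/2),\; \mu + \frac{\sigma}{\sqrt{n}}\Phi^{-1}(1 - \alpha/2)\right]$$

with probability $1 - \alpha$; equivalently, a $1 - \alpha$ confidence interval for $\mu$ is $\hat\mu \pm \frac{\sigma}{\sqrt{n}}\Phi^{-1}(1 - \alpha/2)$. For the variance, $n\hat\sigma^2/\sigma^2 \sim \chi^2(n - 1)$.

#### 3.4 The Method of Maximum Likelihood

Suppose that random variables $X_1, \ldots, X_n$ have a joint density $f(x_1, \ldots, x_n \mid \theta)$. Given observed values $X_i = x_i$, $i = 1, \ldots, n$, the likelihood of $\theta$ as a function of $x_1, \ldots, x_n$ is defined as

$$L(\theta) = f(x_1, \ldots, x_n \mid \theta).$$

If the $X_i$ are assumed to be iid, the likelihood is

$$L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta).$$

The log likelihood is

$$l(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log f(x_i \mid \theta).$$

The maximum likelihood estimate (MLE) of $\theta$ is the value of $\theta$ that maximizes the likelihood, that is, makes the observed data most probable or most likely. The estimates obtained by the method of maximum likelihood are not always the same as those obtained by the method of moments.

Example. If $X_1, \ldots, X_n$ are iid $N(\mu, \sigma^2)$, their joint density is the product of their marginal densities:

$$f(x_1, \ldots, x_n \mid \mu, \sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{1}{2}\left[\frac{x_i - \mu}{\sigma}\right]^2\right).$$

The log likelihood is thus

$$l(\mu, \sigma) = -n \log \sigma - \frac{n}{2} \log 2\pi - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2.$$

The partials with respect to $\mu$ and $\sigma$ are

$$\frac{\partial l}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \mu), \qquad \frac{\partial l}{\partial \sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^{n} (X_i - \mu)^2.$$

Setting both to zero gives

$$\hat\mu_{MLE} = \bar{X}, \qquad \hat\sigma_{MLE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}.$$

The following are the good properties of the MLE:

1. Under appropriate smoothness conditions on $f$, the MLE from an iid sample is consistent.
2. Under appropriate smoothness conditions on $f$, $\sqrt{n}(\hat\theta - \theta) \xrightarrow{d} N(0, 1/I(\theta))$.
3. The MLE asymptotically achieves the Cramér-Rao lower bound.

Fisher information:

$$I(\theta) = E\left[\left(\frac{\partial}{\partial\theta} \log f(X \mid \theta)\right)^2\right] = -E\left[\frac{\partial^2}{\partial\theta^2} \log f(X \mid \theta)\right].$$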
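Returning to the normal example, the closed-form MLE can be checked against the log likelihood directly: evaluating $l(\mu, \sigma)$ at the closed-form estimates and at nearby parameter values should never favour the nearby values. The sketch below does this on simulated data; the function names, the true parameters ($\mu = 0$, $\sigma = 2$), and the perturbation size are assumptions for illustration.

```python
import math
import random

def normal_log_likelihood(xs, mu, sigma):
    """l(mu, sigma) = -n log(sigma) - (n/2) log(2 pi) - sum (x_i - mu)^2 / (2 sigma^2)."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -n * math.log(sigma) - 0.5 * n * math.log(2 * math.pi) - ss / (2 * sigma ** 2)

def normal_mle(xs):
    """Closed-form MLE: the sample mean and the root of the 1/n sample variance."""
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xs) / n)
    return mu_hat, sigma_hat

# Simulated data with assumed true parameters mu = 0 and sigma = 2.
rng = random.Random(1)
xs = [rng.gauss(0.0, 2.0) for _ in range(1000)]
mu_hat, sigma_hat = normal_mle(xs)
best = normal_log_likelihood(xs, mu_hat, sigma_hat)

# Nearby parameter values should never give a larger log likelihood.
for dmu, dsigma in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert normal_log_likelihood(xs, mu_hat + dmu, sigma_hat + dsigma) <= best
print("mu_hat =", mu_hat, " sigma_hat =", sigma_hat)
```

For this model the MLE coincides with the method-of-moments estimates; for other families the two can differ, in which case a numerical optimizer would replace the closed form.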

#### 3.5 Hypothesis Testing

$H_0$: the null hypothesis. $H_1$ (or $H_A$): the alternative hypothesis.

Rejecting $H_0$ when it is true is called a type I error. The probability of a type I error is called the significance level of the test and is usually denoted by $\alpha$.

Accepting the null hypothesis when it is false is called a type II error. Its probability is usually denoted by $\beta$.

The set of values of the test statistic that leads to rejection of the null hypothesis is called the rejection region, and the set of values that leads to acceptance is called the acceptance region.

The probability distribution of the test statistic when the null hypothesis is true is called the null distribution.

The p-value is the probability of a result as or more extreme than that actually observed if the null hypothesis were true.

Some familiar hypothesis tests: z-test, Student's t-test.

Generalized Likelihood Ratio Test. Suppose that the observations $X = (X_1, \ldots, X_n)$ have a joint density function $f(x_1, \ldots, x_n \mid \theta)$. $H_0$ specifies that $\theta \in \omega_0$ and $H_1$ specifies that $\theta \in \omega_1$, where $\omega_0 \cap \omega_1 = \emptyset$ and $\Omega = \omega_0 \cup \omega_1$. The test statistic is

$$\Lambda = \frac{\max_{\theta \in \omega_0} L(\theta)}{\max_{\theta \in \Omega} L(\theta)}.$$

Under smoothness conditions on the probability density, the null distribution of $-2 \log \Lambda$ tends to a chi-square distribution with degrees of freedom equal to $\dim \Omega - \dim \omega_0$ as the sample size tends to infinity.

#### 3.6 Linear Regression

Consider the following regression model:

$$Y = X\beta + \varepsilon,$$

where

$$Y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}, \quad X = \begin{pmatrix} x_{11} & \cdots & x_{1p} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{np} \end{pmatrix}.$$

The least squares estimator is

$$\hat\beta_{LS} = \arg\min_{\beta} \|Y - X\beta\|_2^2.$$

Consider the model above with the following assumptions:

1. $X$ is a non-random matrix with full column rank.
2. $E[\varepsilon] = 0$.
3. $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = \sigma^2 \delta_{ij}$.
4. $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$.

Then

$$\hat\beta_{LS} = (X^T X)^{-1} X^T Y.$$

Under assumptions 1-2, $\hat\beta_{LS}$ is an unbiased estimator.

Under assumptions 1-3, $\mathrm{Cov}(\hat\beta_{LS}) = \sigma^2 (X^T X)^{-1}$. An unbiased estimator of $\sigma^2$ is

$$s^2 = \frac{1}{n - p}\,\mathrm{RSS} = \frac{1}{n - p} (Y - X\hat\beta_{LS})^T (Y - X\hat\beta_{LS}).$$

Under assumptions 1 and 4,

$$\hat\beta_{LS} \sim N\left(\beta, \sigma^2 (X^T X)^{-1}\right), \qquad \frac{\mathrm{RSS}}{\sigma^2} \sim \chi^2_{n-p}, \qquad \frac{\hat\beta_{LS,j} - \beta_j}{s\sqrt{c_{jj}}} \sim t_{n-p},$$

where $c_{jj}$ is the $j$th element on the diagonal of $(X^T X)^{-1}$.
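The regression formulas above can be made concrete in the smallest non-trivial case, $p = 2$, with an intercept column and one covariate, where $(X^T X)^{-1}$ has a closed form and no linear algebra library is needed. The sketch below computes $\hat\beta_{LS}$, $s^2$, and the standard errors $s\sqrt{c_{jj}}$; the function name `ols_two_column` and the simulated data (true $\beta = (1, 2)$, $\sigma = 0.5$) are assumptions for illustration.

```python
import math
import random

def ols_two_column(x, y):
    """Least squares for y = b1 + b2 * x via (X^T X)^{-1} X^T Y with p = 2."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx                      # det(X^T X)
    c11, c22 = sxx / det, n / det                # diagonal of (X^T X)^{-1}
    b1 = (sxx * sy - sx * sxy) / det
    b2 = (n * sxy - sx * sy) / det
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)                           # unbiased estimate of sigma^2
    se = (math.sqrt(s2 * c11), math.sqrt(s2 * c22))
    return (b1, b2), s2, se

# Simulated data with assumed true coefficients (1, 2) and noise sd 0.5.
rng = random.Random(0)
x = [rng.uniform(0.0, 10.0) for _ in range(500)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
beta_hat, s2, se = ols_two_column(x, y)
print("beta_hat =", beta_hat, " s^2 =", s2, " standard errors =", se)
```

Dividing `beta_hat[j] - beta[j]` by `se[j]` gives the $t_{n-p}$ statistic from the last display. For general $p$ one would solve the normal equations with a numerical linear algebra routine rather than inverting by hand.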

### MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information

### Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

### IIT JAM Mathematical Statistics (MS) 2006 SECTION A

IIT JAM Mathematical Statistics (MS) 6 SECTION A. If a > for ad lim a / L >, the which of the followig series is ot coverget? (a) (b) (c) (d) (d) = = a = a = a a + / a lim a a / + = lim a / a / + = lim

More information

### The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.

SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample

More information

### Unbiased Estimation. February 7-12, 2008

Ubiased Estimatio February 7-2, 2008 We begi with a sample X = (X,..., X ) of radom variables chose accordig to oe of a family of probabilities P θ where θ is elemet from the parameter space Θ. For radom

More information

### Lecture 3 The Lebesgue Integral

Lecture 3: The Lebesgue Itegral 1 of 14 Course: Theory of Probability I Term: Fall 2013 Istructor: Gorda Zitkovic Lecture 3 The Lebesgue Itegral The costructio of the itegral Uless expressly specified

More information

### Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

### STATISTICAL INFERENCE

STATISTICAL INFERENCE POPULATION AND SAMPLE Populatio = all elemets of iterest Characterized by a distributio F with some parameter θ Sample = the data X 1,..., X, selected subset of the populatio = sample

More information

### Asymptotic Results for the Linear Regression Model

Asymptotic Results for the Liear Regressio Model C. Fli November 29, 2000 1. Asymptotic Results uder Classical Assumptios The followig results apply to the liear regressio model y = Xβ + ε, where X is

More information

### First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise

First Year Quatitative Comp Exam Sprig, 2012 Istructio: There are three parts. Aswer every questio i every part. Questio I-1 Part I - 203A A radom variable X is distributed with the margial desity: >

More information

### Singular Continuous Measures by Michael Pejic 5/14/10

Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable

More information

### Statistical Theory MT 2009 Problems 1: Solution sketches

Statistical Theory MT 009 Problems : Solutio sketches. Which of the followig desities are withi a expoetial family? Explai your reasoig. (a) Let 0 < θ < ad put f(x, θ) = ( θ)θ x ; x = 0,,,... (b) (c) where

More information

### Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables

Some Basic Probability Cocepts 2. Experimets, Outcomes ad Radom Variables A radom variable is a variable whose value is ukow util it is observed. The value of a radom variable results from a experimet;

More information

### Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number

MATH 532 Itegrable Fuctios Dr. Neal, WKU We ow shall defie what it meas for a measurable fuctio to be itegrable, show that all itegral properties of simple fuctios still hold, ad the give some coditios

More information

### Introduction to Probability. Ariel Yadin

Itroductio to robability Ariel Yadi Lecture 2 *** Ja. 7 ***. Covergece of Radom Variables As i the case of sequeces of umbers, we would like to talk about covergece of radom variables. There are may ways

More information

### Mathematics 170B Selected HW Solutions.

Mathematics 17B Selected HW Solutios. F 4. Suppose X is B(,p). (a)fidthemometgeeratigfuctiom (s)of(x p)/ p(1 p). Write q = 1 p. The MGF of X is (pe s + q), sice X ca be writte as the sum of idepedet Beroulli

More information

### Solutions to HW Assignment 1

Solutios to HW: 1 Course: Theory of Probability II Page: 1 of 6 Uiversity of Texas at Austi Solutios to HW Assigmet 1 Problem 1.1. Let Ω, F, {F } 0, P) be a filtered probability space ad T a stoppig time.

More information

### Properties and Hypothesis Testing

Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

### ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

### Point Estimation: properties of estimators 1 FINITE-SAMPLE PROPERTIES. finite-sample properties (CB 7.3) large-sample properties (CB 10.

Poit Estimatio: properties of estimators fiite-sample properties CB 7.3) large-sample properties CB 10.1) 1 FINITE-SAMPLE PROPERTIES How a estimator performs for fiite umber of observatios. Estimator:

More information

### MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 6 9/23/2013. Brownian motion. Introduction

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 6 9/23/203 Browia motio. Itroductio Cotet.. A heuristic costructio of a Browia motio from a radom walk. 2. Defiitio ad basic properties

More information

### Probability Theory. Muhammad Waliji. August 11, 2006

Probability Theory Muhammad Waliji August 11, 2006 Abstract This paper itroduces some elemetary otios i Measure-Theoretic Probability Theory. Several probabalistic otios of the covergece of a sequece of

More information

### A Proof of Birkhoff s Ergodic Theorem

A Proof of Birkhoff s Ergodic Theorem Joseph Hora September 2, 205 Itroductio I Fall 203, I was learig the basics of ergodic theory, ad I came across this theorem. Oe of my supervisors, Athoy Quas, showed

More information

### Review Questions, Chapters 8, 9. f(y) = 0, elsewhere. F (y) = f Y(1) = n ( e y/θ) n 1 1 θ e y/θ = n θ e yn

Stat 366 Lab 2 Solutios (September 2, 2006) page TA: Yury Petracheko, CAB 484, yuryp@ualberta.ca, http://www.ualberta.ca/ yuryp/ Review Questios, Chapters 8, 9 8.5 Suppose that Y, Y 2,..., Y deote a radom

More information

### Exam II Review. CEE 3710 November 15, /16/2017. EXAM II Friday, November 17, in class. Open book and open notes.

Exam II Review CEE 3710 November 15, 017 EXAM II Friday, November 17, i class. Ope book ad ope otes. Focus o material covered i Homeworks #5 #8, Note Packets #10 19 1 Exam II Topics **Will emphasize material

More information

### STA 4032 Final Exam Formula Sheet

Chapter 2. Probability STA 4032 Fial Eam Formula Sheet Some Baic Probability Formula: (1) P (A B) = P (A) + P (B) P (A B). (2) P (A ) = 1 P (A) ( A i the complemet of A). (3) If S i a fiite ample pace

More information

### Statistical inference: example 1. Inferential Statistics

Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

### Lecture 6 Ecient estimators. Rao-Cramer bound.

Lecture 6 Eciet estimators. Rao-Cramer boud. 1 MSE ad Suciecy Let X (X 1,..., X) be a radom sample from distributio f θ. Let θ ˆ δ(x) be a estimator of θ. Let T (X) be a suciet statistic for θ. As we have

More information

### MAT1026 Calculus II Basic Convergence Tests for Series

MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real

More information

### DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

### Lecture 5: Linear Regressions

Lecture 5: Liear Regressios I lecture 2, we itroduced statioary liear time series models. I that lecture, we discussed the data geeratig processes ad their characteristics, assumig that we kow all parameters

More information

### 5. Likelihood Ratio Tests

1 of 5 7/29/2009 3:16 PM Virtual Laboratories > 9. Hy pothesis Testig > 1 2 3 4 5 6 7 5. Likelihood Ratio Tests Prelimiaries As usual, our startig poit is a radom experimet with a uderlyig sample space,

More information

### Chapter 6 Principles of Data Reduction

Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

### Kernel density estimator

Jauary, 07 NONPARAMETRIC ERNEL DENSITY ESTIMATION I this lecture, we discuss kerel estimatio of probability desity fuctios PDF Noparametric desity estimatio is oe of the cetral problems i statistics I

More information

### s = and t = with C ij = A i B j F. (i) Note that cs = M and so ca i µ(a i ) I E (cs) = = c a i µ(a i ) = ci E (s). (ii) Note that s + t = M and so

3 From the otes we see that the parts of Theorem 4. that cocer us are: Let s ad t be two simple o-egative F-measurable fuctios o X, F, µ ad E, F F. The i I E cs ci E s for all c R, ii I E s + t I E s +

More information

### 1 Probability Generating Function

Chater 5 Characteristic Fuctio ad Probablity Covergece Theorem Probability Geeratig Fuctio Defiitio. For a discrete radom variable X which ca oly achieve o-egative itegers, we defie the robability geeratig

More information

### MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.

XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced

More information

### KLMED8004 Medical statistics. Part I, autumn Estimation. We have previously learned: Population and sample. New questions

We have previously leared: KLMED8004 Medical statistics Part I, autum 00 How kow probability distributios (e.g. biomial distributio, ormal distributio) with kow populatio parameters (mea, variace) ca give

More information

### Law of the sum of Bernoulli random variables

Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible

More information

### Basis for simulation techniques

Basis for simulatio techiques M. Veeraraghava, March 7, 004 Estimatio is based o a collectio of experimetal outcomes, x, x,, x, where each experimetal outcome is a value of a radom variable. x i. Defiitios

More information

### B Supplemental Notes 2 Hypergeometric, Binomial, Poisson and Multinomial Random Variables and Borel Sets

B671-672 Supplemetal otes 2 Hypergeometric, Biomial, Poisso ad Multiomial Radom Variables ad Borel Sets 1 Biomial Approximatio to the Hypergeometric Recall that the Hypergeometric istributio is fx = x

More information

### NYU Center for Data Science: DS-GA 1003 Machine Learning and Computational Statistics (Spring 2018)

NYU Ceter for Data Sciece: DS-GA 003 Machie Learig ad Computatioal Statistics (Sprig 208) Brett Berstei, David Roseberg, Be Jakubowski Jauary 20, 208 Istructios: Followig most lab ad lecture sectios, we

More information

### Section 14. Simple linear regression.

Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo

More information

### It should be unbiased, or approximately unbiased. Variance of the variance estimator should be small. That is, the variance estimator is stable.

Chapter 10 Variace Estimatio 10.1 Itroductio Variace estimatio is a importat practical problem i survey samplig. Variace estimates are used i two purposes. Oe is the aalytic purpose such as costructig

More information

### Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12

Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig

More information

### Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1

Solutio Sagchul Lee October 7, 017 1 Solutios of Homework 1 Problem 1.1 Let Ω,F,P) be a probability space. Show that if {A : N} F such that A := lim A exists, the PA) = lim PA ). Proof. Usig the cotiuity

More information

### 2.2. Central limit theorem.

36.. Cetral limit theorem. The most ideal case of the CLT is that the radom variables are iid with fiite variace. Although it is a special case of the more geeral Lideberg-Feller CLT, it is most stadard

More information

### Algorithms for Clustering

CR2: Statistical Learig & Applicatios Algorithms for Clusterig Lecturer: J. Salmo Scribe: A. Alcolei Settig: give a data set X R p where is the umber of observatio ad p is the umber of features, we wat

More information

### MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 6 9/24/2008 DISCRETE RANDOM VARIABLES AND THEIR EXPECTATIONS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 6 9/24/2008 DISCRETE RANDOM VARIABLES AND THEIR EXPECTATIONS Cotets 1. A few useful discrete radom variables 2. Joit, margial, ad

More information

### Probability and statistics: basic terms

Probability ad statistics: basic terms M. Veeraraghava August 203 A radom variable is a rule that assigs a umerical value to each possible outcome of a experimet. Outcomes of a experimet form the sample

More information

### Infinite Sequences and Series

Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

### Solutions: Homework 3

Solutios: Homework 3 Suppose that the radom variables Y,...,Y satisfy Y i = x i + " i : i =,..., IID where x,...,x R are fixed values ad ",...," Normal(0, )with R + kow. Fid ˆ = MLE( ). IND Solutio: Observe

More information

### MAS111 Convergence and Continuity

MAS Covergece ad Cotiuity Key Objectives At the ed of the course, studets should kow the followig topics ad be able to apply the basic priciples ad theorems therei to solvig various problems cocerig covergece

More information

### Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

### A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

### 4.1 Data processing inequality

ECE598: Iformatio-theoretic methods i high-dimesioal statistics Sprig 206 Lecture 4: Total variatio/iequalities betwee f-divergeces Lecturer: Yihog Wu Scribe: Matthew Tsao, Feb 8, 206 [Ed. Mar 22] Recall

More information

### Table 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab

Sectio 12 Tests of idepedece ad homogeeity I this lecture we will cosider a situatio whe our observatios are classified by two differet features ad we would like to test if these features are idepedet

More information

### Joint Probability Distributions and Random Samples. Jointly Distributed Random Variables. Chapter { }

UCLA STAT A Applied Probability & Statistics for Egieers Istructor: Ivo Diov, Asst. Prof. I Statistics ad Neurology Teachig Assistat: Neda Farziia, UCLA Statistics Uiversity of Califoria, Los Ageles, Sprig

More information

### Ada Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities

CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We

More information

### Statistics 20: Final Exam Solutions Summer Session 2007

1. 20 poits Testig for Diabetes. Statistics 20: Fial Exam Solutios Summer Sessio 2007 (a) 3 poits Give estimates for the sesitivity of Test I ad of Test II. Solutio: 156 patiets out of total 223 patiets

More information

### Stat 200 -Testing Summary Page 1

Stat 00 -Testig Summary Page 1 Mathematicias are like Frechme; whatever you say to them, they traslate it ito their ow laguage ad forthwith it is somethig etirely differet Goethe 1 Large Sample Cofidece

More information

### Solutions to home assignments (sketches)

Matematiska Istitutioe Peter Kumli 26th May 2004 TMA401 Fuctioal Aalysis MAN670 Applied Fuctioal Aalysis 4th quarter 2003/2004 All documet cocerig the course ca be foud o the course home page: http://www.math.chalmers.se/math/grudutb/cth/tma401/

More information

### Regression with an Evaporating Logarithmic Trend

Regressio with a Evaporatig Logarithmic Tred Peter C. B. Phillips Cowles Foudatio, Yale Uiversity, Uiversity of Aucklad & Uiversity of York ad Yixiao Su Departmet of Ecoomics Yale Uiversity October 5,

### Glivenko-Cantelli Classes

CS28B/Stat24B (Spring 2008) Statistical Learning Theory. Lecture: 4, Glivenko-Cantelli Classes. Lecturer: Peter Bartlett. Scribe: Michelle Besi. Introduction: This lecture will cover Glivenko-Cantelli (GC) classes and introduce

### A) is empty. B) is a finite set. C) can be a countably infinite set. D) can be an uncountable set.

M.A./M.Sc. (Mathematics) Entrance Examination 016-17. Max Time: hours. Max Marks: 150. Instructions: There are 50 questions. Every question has four choices of which exactly one is correct. For correct answer, 3 marks

### Monte Carlo Integration

Monte Carlo Integration. In these notes we first review basic numerical integration methods (using Riemann approximation and the trapezoidal rule) and their limitations for evaluating multidimensional integrals. Next we introduce

### The Choquet Integral with Respect to Fuzzy-Valued Set Functions

The Choquet Integral with Respect to Fuzzy-Valued Set Functions. Weiwei Zhang. Abstract: The Choquet integral with respect to real-valued nonadditive set functions, such as signed efficiency measures, has been used in

### 62. Power series. Definition 16. (Power series) Given a sequence $\{c_n\}$, the series $\sum_{n=0}^{\infty} c_n x^n = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots$

62. Power series. Definition 16. (Power series) Given a sequence $\{c_n\}$, the series $\sum_{n=0}^{\infty} c_n x^n = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots$ is called a power series in the variable $x$. The numbers $c_n$ are called the coefficients of

### Lecture 10 October Minimaxity and least favorable prior sequences

STATS 300A: Theory of Statistics, Fall 205. Lecture 10, October 22. Lecturer: Lester Mackey. Scribe: Bryan He, Rahul Makhijani. Warning: These notes may contain factual and/or typographic errors. 0. Minimaxity and least

### Statisticians use the word population to refer the total number of (potential) observations under consideration

6 Sampling Distributions. Statisticians use the word population to refer the total number of (potential) observations under consideration. The population is just the set of all possible outcomes in our sample space

### Lecture 6 Chi Square Distribution ($\chi^2$) and Least Squares Fitting

Lecture 6: Chi Square Distribution ($\chi^2$) and Least Squares Fitting. Chi Square Distribution ($\chi^2$). Suppose: We have a set of measurements $\{x_1, x_2, \ldots, x_n\}$. We know the true value of each $x_i$ $(x_{t1}, x_{t2}, \ldots, x_{tn})$. We would

### Lecture 9: September 19

36-700: Probability and Mathematical Statistics I, Fall 206. Lecturer: Siva Balakrishnan. Lecture 9: September 19. 9. Review and Outline. Last class we discussed: Statistical estimation broadly; Point estimation; Bias-Variance

### Lecture 11 October 27

STATS 300A: Theory of Statistics, Fall 205. Lecture 11, October 27. Lecturer: Lester Mackey. Scribe: Viswajith Venugopal, Vivek Bagaria, Steve Yadlowsky. Warning: These notes may contain factual and/or typographic errors.

### G. R. Pasha Department of Statistics Bahauddin Zakariya University Multan, Pakistan

Deviation of the Variances of Classical Estimators and Negative Integer Moment Estimator from Minimum Variance Bound with Reference to Maxwell Distribution. G. R. Pasha, Department of Statistics, Bahauddin Zakariya University

### Final Solutions. 1. (25pts) Define the following terms. Be as precise as you can.

Mathematics H104, A. Ogus, Fall, 004. Final Solutions. 1. (25pts) Define the following terms. Be as precise as you can. (a) (3pts) An uncountable set. An uncountable set is a set which can not be put into bijection with a finite

### REGRESSION WITH QUADRATIC LOSS

REGRESSION WITH QUADRATIC LOSS. MAXIM RAGINSKY. Regression with quadratic loss is another basic problem studied in statistical learning theory. We have a random couple $Z = (X, Y)$, where, as before, $X$ is an $\mathbb{R}^d$

### An Empirical Likelihood Approach To Goodness of Fit Testing

Submitted to the Bernoulli. An Empirical Likelihood Approach To Goodness of Fit Testing. HANXIANG PENG and ANTON SCHICK. Indiana University Purdue University at Indianapolis, Department of Mathematical Sciences, Indianapolis,

### FUNDAMENTALS OF REAL ANALYSIS by

FUNDAMENTALS OF REAL ANALYSIS by Doğan Çömez. Background: All of Math 450/1 material. Namely: basic set theory, relations and PMI, structure of N, Z, Q and R, basic properties of (continuous and differentiable)

### STAT331. Example of Martingale CLT with Cox's Model

STAT331: Example of Martingale CLT with Cox's Model. In this unit we illustrate the Martingale Central Limit Theorem by applying it to the partial likelihood score function from Cox's model. For simplicity of presentation

### UCLA STAT 110B Applied Statistics for Engineering and the Sciences

UCLA STAT 110B Applied Statistics for Engineering and the Sciences. Instructor: Ivo Dinov, Asst. Prof. in Statistics and Neurology. Teaching Assistants: Brian Ng, UCLA Statistics. University of California, Los Angeles,

### Read carefully the instructions on the answer book and make sure that the particulars required are entered on each answer book.

THE UNIVERSITY OF WARWICK. FIRST YEAR EXAMINATION: January 2009. Analysis I. Time Allowed: .5 hours. Read carefully the instructions on the answer book and make sure that the particulars required are entered on each

### Introduction to Optimization Techniques

Introduction to Optimization Techniques. Basic Concepts of Analysis - Real Analysis, Functional Analysis. 1 Basic Concepts of Analysis. Linear Vector Spaces. Definition: A vector space X is a set of elements called vectors

### Complex Analysis Spring 2001 Homework I Solution

Complex Analysis, Spring 2001, Homework I Solution. 1. Conway, Chapter 1, section 3, problem 3. Describe the set of points satisfying the equation $|z - a| - |z + a| = 2c$, where $c > 0$ and $a \in \mathbb{R}$. To begin, we see from the triangle

### Homework for 4/9 Due 4/16

Name: ID: Homework for 4/9, Due 4/16. 1. [13-6] It is conventional wisdom in military squadrons that pilots tend to father more girls than boys. Snyder 1961 gathered data for military fighter pilots. The sex of

### Topic 15: Maximum Likelihood Estimation

Topic 15: Maximum Likelihood Estimation. November and 3, 20. Introduction: The principle of maximum likelihood is relatively straightforward. As before, we begin with a sample $X = (X_1, \ldots, X_n)$ of random variables chosen

### Week 10. $\{2^{j/2}\psi(2^j x - k);\ j, k \in \mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathbb{R})$. This function is called mother wavelet, which can be often constructed

Week 10: An Introduction to Wavelet regression. Definition: A wavelet is a function $\psi$ such that $\{2^{j/2}\psi(2^j x - k);\ j, k \in \mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathbb{R})$. This function is called mother wavelet, which can be often constructed from father

### Element sampling: Part 2

Chapter 4. Element sampling: Part 2. 4.1 Introduction. We now consider unequal probability sampling designs, which are very popular in practice. In unequal probability sampling, we can improve the efficiency of the resulting

### Maximum Likelihood Methods (Hogg Chapter Six)

Maximum Likelihood Methods (Hogg Chapter Six). STAT 406-0: Mathematical Statistics II, Spring Semester 06. Contents: 0 Administrata; Maximum Likelihood Estimation; Maximum Likelihood Estimates; Motivation

### DEEPAK SERIES DEEPAK SERIES DEEPAK SERIES FREE BOOKLET CSIR-UGC/NET MATHEMATICAL SCIENCES

DEEPAK SERIES FREE BOOKLET. CSIR-UGC/NET MATHEMATICAL SCIENCES, SOLVED PAPER DEC-. Note: This material is issued as complimentary

### Regression with quadratic loss

Regression with quadratic loss. Maxim Raginsky, October 13, 2015. Regression with quadratic loss is another basic problem studied in statistical learning theory. We have a random couple $Z = (X, Y)$, where, as before,

### Homework for 2/3. 1. Determine the values of the following quantities: a. t 0.1,15 b. t 0.05,15 c. t 0.1,25 d. t 0.05,40 e. t 0.

Name: ID: Homework for 2/3. 1. Determine the values of the following quantities: a. t 0.5 b. t 0.055 c. t 0.5 d. t 0.0540 e. t 0.00540 f. χ 0.0 g. χ 0.0 h. χ 0.00 i. χ 0.0050 j. χ 0.990 a. t 0.5.34 b. t 0.055.753

### INFINITE SEQUENCES AND SERIES

11 INFINITE SEQUENCES AND SERIES. 11.4 The Comparison Tests. In this section, we will learn: How to find the value of a series by comparing it with a known series. COMPARISON TESTS

### Time series models 2007

Norwegian University of Science and Technology, Department of Mathematical Sciences. Solutions to problem sheet 1, 2007. Exercise 1.1 a) Let $S(c) = E[(Y - c)^2]$. Then $S(c) = E[Y^2] - 2cE[Y] + c^2$. This gives $\frac{dS}{dc} = -2E[Y] + 2c = 0$

### Matrix Representation of Data in Experiment

Matrix Representation of Data in Experiment. Consider a very simple model for responses $y_{ij}$: $y_{ij} = \mu_i + \varepsilon_{ij}$, $i = 1, 2$; $j = 1, 2, \ldots, n$ (note that for simplicity we are assuming the two (2) groups are of equal sample size $n$) Y

### Properties of Fuzzy Length on Fuzzy Set

Open Access Library Journal 206, Volume 3, e3068. ISSN Online: 2333-972, ISSN Print: 2333-9705. Properties of Fuzzy Length on Fuzzy Set. Jehad R Kider, Jaafar Imran Mousa, Department of Mathematics and Computer Applications,

### Lecture 01: the Central Limit Theorem. 1 Central Limit Theorem for i.i.d. random variables

CSCI-B609: A Theorist's Toolkit, Fall 06, Aug 3. Lecture 01: the Central Limit Theorem. Lecturer: Yuan Zhou. Scribe: Yuan Xie & Yuan Zhou. Central Limit Theorem for i.i.d. random variables. Let us say that we want to analyze

### Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples. A parameter is a number that describes the population. It is a fixed number, but in practice we do not know its value. A statistic is a function of the sample data, i.e.,

### Seunghee Ye Ma 8: Week 5 Oct 28

Week 5 Summary. In Section 1, we go over the Mean Value Theorem and its applications. In Section 2, we will recap what we have covered so far this term. Topics: Mean Value Theorem. Applications of the Mean Value

### Sample questions. 8. Let X denote a continuous random variable with probability density function $f(x) = 4x^3/15$ for

Sample questions. Suppose that humans can have one of three blood types: A, B, O. Assume that 40% of the population has Type A, 50% has Type B, and 10% has Type O. If a person has Type A, the probability that they

### Economics 102C: Advanced Topics in Econometrics 4 - Asymptotics & Large Sample Properties of OLS

Economics 102C: Advanced Topics in Econometrics. 4 - Asymptotics & Large Sample Properties of OLS. Michael Best, Spring 2015. Asymptotics: So far we have looked at the finite sample properties of OLS. Relied heavily
