4 Conditional Distribution Estimation
4.1 Estimators

The conditional distribution function (CDF) of $y_i$ given $X_i = x$ is
$$F(y \mid x) = P(y_i \le y \mid X_i = x) = E\left(1(y_i \le y) \mid X_i = x\right).$$
This is the conditional mean of the random variable $1(y_i \le y)$. Thus the CDF is a regression, and can be estimated using regression methods. One difference is that $1(y_i \le y)$ is a function of the argument $y$, so CDF estimation is a set of regressions, one for each value of $y$.

Standard CDF estimators include the NW, LL, and WNW. The NW can be written as
$$\hat F(y \mid x) = \frac{\sum_{i=1}^n K\left(H^{-1}(X_i - x)\right) 1(y_i \le y)}{\sum_{i=1}^n K\left(H^{-1}(X_i - x)\right)}.$$
The NW and WNW estimators have the advantage that they are non-negative and non-decreasing in $y$, and are thus valid CDFs. The LL estimator does not necessarily satisfy these properties: it can be negative, and need not be monotonic in $y$. As we learned for regression estimation, the LL and WNW estimators both have better bias and boundary properties. Putting these two observations together, it seems reasonable to consider using the WNW estimator.

The estimator $\hat F(y \mid x)$ is smooth in $x$, but a step function in $y$. We discuss later estimators which are smooth in $y$.

4.2 Asymptotic Distribution

Recall that in the case of kernel regression, we had
$$\sqrt{n|H|}\left(\hat g(x) - g(x) - \sum_{j=1}^q h_j^2 B_j(x)\right) \xrightarrow{d} N\left(0, \frac{R(k)^q \sigma^2(x)}{f(x)}\right)$$
where $\sigma^2(x)$ was the conditional variance of the regression, and the $B_j(x)$ equals (for NW)
$$B_j(x) = \frac{\sigma_k^2}{2}\frac{\partial^2}{\partial x_j^2} g(x) + \sigma_k^2 f(x)^{-1}\frac{\partial}{\partial x_j} f(x) \frac{\partial}{\partial x_j} g(x)$$
with $\sigma_k^2 = \int u^2 k(u)\, du$, while for LL and WNW the bias term is just the first part. Clearly, for any fixed $y$, the same theory applies.
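The NW estimator of Section 4.1 can be sketched in a few lines of code. This is a minimal sketch: the Gaussian kernel, scalar $X$, and toy data are my own illustrative choices, not from the notes.

```python
import math

def nw_cdf(y, x, Y, X, h):
    """Nadaraya-Watson estimate of F(y|x): a kernel-weighted average of the
    indicators 1(Y_i <= y), with weights centered at the evaluation point x."""
    num = den = 0.0
    for Xi, Yi in zip(X, Y):
        w = math.exp(-0.5 * ((Xi - x) / h) ** 2)  # Gaussian kernel weight
        num += w * (1.0 if Yi <= y else 0.0)      # indicator 1(Y_i <= y)
        den += w
    return num / den

# Toy data for illustration.
X = [0.0, 0.5, 1.0, 1.5, 2.0]
Y = [0.0, 0.5, 1.0, 1.5, 2.0]
```

Because the weights are non-negative and the indicators lie in $[0, 1]$, the estimate is automatically non-negative and non-decreasing in $y$ (and equals $1$ beyond the largest $Y_i$), exactly the validity property claimed for NW above.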
In the case of CDF estimation, the regression equation is
$$1(y_i \le y) = F(y \mid X_i) + e_i(y)$$
where $e_i(y)$ is conditionally mean zero and has conditional variance function
$$\sigma^2(x) = F(y \mid x)\left(1 - F(y \mid x)\right).$$
(We know the conditional variance takes this form because the dependent variable is binary.) I write the error as a function of $y$ to emphasize that it is different for each $y$. In the case of LL or WNW, the bias terms are
$$B_j(y \mid x) = \frac{\sigma_k^2}{2}\frac{\partial^2}{\partial x_j^2} F(y \mid x),$$
the curvature in the CDF with respect to the conditioning variables. We thus find, for all $(y, x)$,
$$\sqrt{n|H|}\left(\hat F(y \mid x) - F(y \mid x) - \sum_{j=1}^q h_j^2 B_j(y \mid x)\right) \xrightarrow{d} N\left(0, \frac{R(k)^q F(y \mid x)(1 - F(y \mid x))}{f(x)}\right)$$
and
$$AMSE\ \hat F(y \mid x) = \left(\sum_{j=1}^q h_j^2 B_j(y \mid x)\right)^2 + \frac{R(k)^q F(y \mid x)(1 - F(y \mid x))}{n|H| f(x)}.$$
In the $q = 1$ case,
$$AMSE\ \hat F(y \mid x) = h^4 B(y \mid x)^2 + \frac{R(k) F(y \mid x)(1 - F(y \mid x))}{n h f(x)}.$$
In the regression case we defined the WIMSE as the integral of the AMSE, weighting by $f(x) M(x)$. Here we also integrate over $y$. For $q = 1$,
$$WIMSE = \int\!\!\int AMSE\ \hat F(y \mid x)\, f(x) M(x)\, dx\, dy = h^4 \int\!\!\int B(y \mid x)^2\, dy\, f(x) M(x)\, dx + \frac{R(k)}{n h} \int\!\!\int F(y \mid x)(1 - F(y \mid x))\, dy\, M(x)\, dx.$$
The integral over $y$ does not need weighting since $F(y \mid x)(1 - F(y \mid x))$ declines to zero as $y$ tends to either limit.

Observe that the convergence rate is the same as in regression, and the optimal bandwidths have the same rates as in regression.
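To substantiate that last claim, minimize the $q = 1$ AMSE over $h$ (a standard calculation written out here for completeness; $C = R(k) F(y \mid x)(1 - F(y \mid x))/f(x)$ is shorthand I introduce for the variance constant):

```latex
\frac{d}{dh}\left( h^4 B(y \mid x)^2 + \frac{C}{n h} \right)
  = 4 h^3 B(y \mid x)^2 - \frac{C}{n h^2} = 0
\quad\Longrightarrow\quad
h_0 = \left( \frac{C}{4 B(y \mid x)^2} \right)^{1/5} n^{-1/5},
```

the same $n^{-1/5}$ rate as kernel regression with a single regressor.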
4.3 Bandwidth Selection

I do not believe that bandwidth choice for nonparametric CDF estimation is widely studied. Li-Racine suggest using a CV method based on conditional density estimation. It should also be possible to directly apply CV methods to CDF estimation. The leave-one-out residuals are
$$\hat e_{i,-i}(y) = 1(y_i \le y) - \hat F_{-i}(y \mid X_i)$$
so the CV criterion for any fixed $y$ is
$$CV(y, h) = \frac{1}{n}\sum_{i=1}^n \hat e_{i,-i}(y)^2 M(X_i) = \frac{1}{n}\sum_{i=1}^n \left(1(y_i \le y) - \hat F_{-i}(y \mid X_i)\right)^2 M(X_i).$$
If you wanted to estimate the CDF at a single value of $y$ you could pick $h$ to minimize this criterion. For estimation of the entire function, we want to integrate over the values of $y$. One method is
$$CV(h) = \int CV(y, h)\, dy \simeq \sum_{j=1}^N CV(y_j^*, h)\, \delta$$
where $y_j^*$ is a grid of values over the support of $y_i$ such that $y_{j+1}^* - y_j^* = \delta$. Calculating this quantity involves $N$ times the number of computations as for regression, as the leave-one-out computations are done for each $y_j^*$ on the grid. My guess is that the grid over the $y$ values could be coarse, e.g. one could set $N$ small.

4.4 Smoothed Distribution Estimators - Unconditional Case

The CDF estimators introduced above are not smooth, but are discontinuous step functions. For some applications this may be inconvenient. It may be desirable to have a smooth CDF estimate as an input for a semiparametric estimator. It is also the case that smoothing will improve high-order estimation efficiency. To see this, we need to return to the case of univariate data.

Recall that the univariate EDF estimator for iid data $y_i$ is
$$\hat F(y) = \frac{1}{n}\sum_{i=1}^n 1(y_i \le y).$$
It is easy to see that this estimator is unbiased and has variance $F(y)(1 - F(y))/n$.
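The unbiasedness and the $F(y)(1 - F(y))/n$ variance are easy to check by simulation. A quick sketch; the uniform design, $n$, and replication count are arbitrary choices of mine:

```python
import random

def edf(y, data):
    """Empirical distribution function at y."""
    return sum(1.0 for v in data if v <= y) / len(data)

random.seed(0)
n, reps = 50, 4000
# y_i ~ Uniform(-1, 1), so F(0) = 0.5 and var(EDF at 0) = 0.25 / n = 0.005.
draws = [edf(0.0, [random.uniform(-1.0, 1.0) for _ in range(n)])
         for _ in range(reps)]
mean = sum(draws) / reps
var = sum((d - mean) ** 2 for d in draws) / (reps - 1)
```

With $y_i \sim U(-1, 1)$ and evaluation point $y = 0$, the simulated mean and variance of $\hat F(0)$ come out near $0.5$ and $0.25/n$, as the formulas predict.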
Now consider a smoothed estimator
$$\tilde F(y) = \frac{1}{n}\sum_{i=1}^n G\left(\frac{y - y_i}{h}\right)$$
where $G(x) = \int_{-\infty}^x k(u)\, du$ is a kernel distribution function (the integral of a univariate kernel function). Thus $\tilde F(y) = \int_{-\infty}^y \hat f(x)\, dx$ where $\hat f(x)$ is the kernel density estimate. To calculate its expectation,
$$E\, \tilde F(y) = E\, G\left(\frac{y - y_i}{h}\right) = \int G\left(\frac{y - x}{h}\right) f(x)\, dx = \int G(u) f(y - hu)\, h\, du,$$
the last using the change of variables $u = (y - x)/h$, or $x = y - hu$ with Jacobian $h$. Next, do not expand $f(y - hu)$ in a Taylor expansion, because the moments of $G$ do not exist. Instead, first use integration by parts. The integral of $f$ is $F$, so the antiderivative of $h f(y - hu)$ in $u$ is $-F(y - hu)$, and the derivative of $G(u)$ is $k(u)$; the boundary terms vanish since $G(-\infty) = 0$ and $F(-\infty) = 0$. Thus the above equals
$$\int k(u) F(y - hu)\, du$$
which can now be expanded using Taylor's expansion, yielding
$$E\, \tilde F(y) = F(y) + \frac{h^2 \sigma_k^2}{2} f^{(1)}(y) + o(h^2).$$
Just as in other estimation contexts, we see that the bias of $\tilde F(y)$ is of order $h^2$, and is proportional to the second derivative of what we are estimating, as $F^{(2)}(y) = f^{(1)}(y)$. Thus smoothing introduces estimation bias.

The interesting part comes in the analysis of variance:
$$var\, \tilde F(y) = \frac{1}{n}\, var\, G\left(\frac{y - y_i}{h}\right) = \frac{1}{n}\left(E\, G\left(\frac{y - y_i}{h}\right)^2 - \left(E\, G\left(\frac{y - y_i}{h}\right)\right)^2\right) \simeq \frac{1}{n}\left(\int G\left(\frac{y - x}{h}\right)^2 f(x)\, dx - F(y)^2\right).$$
Let's calculate this integral. By a change of variables,
$$\int G\left(\frac{y - x}{h}\right)^2 f(x)\, dx = h \int G(u)^2 f(y - hu)\, du.$$
Once again we cannot directly apply a Taylor expansion, but need to first do integration-by-parts. Again the antiderivative of $h f(y - hu)$ in $u$ is $-F(y - hu)$, and the derivative of $G(u)^2$ is $2 G(u) k(u)$. So the above is
$$\int 2 G(u) k(u) F(y - hu)\, du,$$
and then applying a Taylor expansion we obtain
$$F(y) \int 2 G(u) k(u)\, du - h f(y) \int 2 G(u) k(u) u\, du + o(h)$$
since $F^{(1)}(y) = f(y)$. Now since the derivative of $G(u)^2$ is $2 G(u) k(u)$, it follows that the integral of $2 G(u) k(u)$ is $G(u)^2$, and thus the first integral over $(-\infty, \infty)$ is $G(\infty)^2 - G(-\infty)^2 = 1 - 0 = 1$, since $G(u)$ is a distribution function. Thus the first part is simply $F(y)$. Define
$$\psi(k) = 2 \int G(u) k(u) u\, du > 0.$$
For any symmetric kernel $k$, $\psi(k) > 0$. This is because for $u > 0$, $G(u) > G(-u)$; thus
$$\int_0^\infty G(u) k(u) u\, du > \int_0^\infty G(-u) k(u) u\, du = -\int_{-\infty}^0 G(u) k(u) u\, du$$
and so the integral over $(-\infty, \infty)$ is positive. Integrated kernels and the value $\psi(k)$ are given in the following table.

Kernel        Integrated Kernel                                                    $\psi(k)$
Epanechnikov  $G_1(u) = (2 + 3u - u^3)/4$ for $|u| \le 1$                          $9/35$
Biweight      $G_2(u) = (8 + 15u - 10u^3 + 3u^5)/16$ for $|u| \le 1$               $50/231$
Triweight     $G_3(u) = (16 + 35u - 35u^3 + 21u^5 - 5u^7)/32$ for $|u| \le 1$      $245/1287$
Gaussian      $G_\phi(u) = \Phi(u)$                                                $1/\sqrt{\pi}$

Together, we have
$$var\, \tilde F(y) \simeq \frac{1}{n}\left(\int G\left(\frac{y - x}{h}\right)^2 f(x)\, dx - F(y)^2\right) = \frac{F(y) - h\, \psi(k) f(y) - F(y)^2}{n} + o\left(\frac{h}{n}\right) = \frac{F(y)(1 - F(y))}{n} - \frac{h\, \psi(k) f(y)}{n} + o\left(\frac{h}{n}\right).$$
The first part is the variance of $\hat F(y)$, the unsmoothed estimator. Smoothing reduces the variance by $h\, \psi(k) f(y)/n$.
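The tabulated $\psi(k)$ values can be checked numerically. A quick midpoint-rule sketch (the grid size is an arbitrary choice of mine):

```python
def psi(k, G, a, b, m=20000):
    """psi(k) = 2 * integral over [a, b] of G(u) k(u) u du, midpoint rule."""
    du = (b - a) / m
    total = 0.0
    for j in range(m):
        u = a + (j + 0.5) * du
        total += 2.0 * G(u) * k(u) * u * du
    return total

def k_epa(u):
    """Epanechnikov kernel."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def G_epa(u):
    """Its integral: G_1(u) = (2 + 3u - u^3)/4 on [-1, 1]."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return (2.0 + 3.0 * u - u ** 3) / 4.0

def k_biw(u):
    """Biweight kernel."""
    return (15.0 / 16.0) * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0

def G_biw(u):
    """Its integral: G_2(u) = (8 + 15u - 10u^3 + 3u^5)/16 on [-1, 1]."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return (8.0 + 15.0 * u - 10.0 * u ** 3 + 3.0 * u ** 5) / 16.0
```

Running `psi(k_epa, G_epa, -1, 1)` and `psi(k_biw, G_biw, -1, 1)` reproduces the $9/35$ and $50/231$ entries of the table to numerical precision.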
Its MSE is
$$MSE\, \tilde F(y) = \frac{F(y)(1 - F(y))}{n} - \frac{h\, \psi(k) f(y)}{n} + \frac{h^4 \sigma_k^4}{4}\left(f^{(1)}(y)\right)^2.$$
The integrated MSE is
$$MISE\, \tilde F(y) = \int MSE\, \tilde F(y)\, dy = \frac{1}{n}\int F(y)(1 - F(y))\, dy - \frac{h\, \psi(k)}{n} + \frac{h^4 \sigma_k^4}{4} R\left(f^{(1)}\right)$$
where $R\left(f^{(1)}\right) = \int \left(f^{(1)}(y)\right)^2 dy$ and the middle term uses $\int f(y)\, dy = 1$. The first term is independent of the smoothing parameter (and corresponds to the integrated variance of the unsmoothed EDF estimator). To find the optimal bandwidth, take the FOC:
$$\frac{d}{dh} MISE\, \tilde F(y) = -\frac{\psi(k)}{n} + h^3 \sigma_k^4 R\left(f^{(1)}\right) = 0$$
and solve to find
$$h_0 = \left(\frac{\psi(k)}{\sigma_k^4 R\left(f^{(1)}\right)}\right)^{1/3} n^{-1/3}.$$
The optimal bandwidth converges to zero at the fast $n^{-1/3}$ rate.

Does smoothing help? The unsmoothed estimator has MISE of order $n^{-1}$, and smoothing (with the optimal bandwidth) reduces the MISE by a term of order $n^{-4/3}$. We can thus think of the gain as being of order $n^{-4/3}$, which is of smaller order than the original $n^{-1}$ rate.

It is important that the bandwidth not be too large. Suppose you set $h \propto n^{-1/5}$ as for density estimation. Then the squared bias term is of order $h^4 \propto n^{-4/5}$, which is larger than the leading $n^{-1}$ term. In this case the smoothed estimator has larger MSE than the usual estimator! Indeed, you need $h$ to be of smaller order than $n^{-1/4}$ for the MSE to be no worse than the unsmoothed case.

For practical bandwidth selection, Li-Racine and Bowman et al. (1998) recommend a CV method. For fixed $y$ the criterion is
$$CV(h, y) = \frac{1}{n}\sum_{i=1}^n \left(1(y_i \le y) - \tilde F_{-i}(y)\right)^2,$$
the average of the squared leave-one-out residuals. For a global estimate the criterion is
$$CV(h) = \int CV(h, y)\, dy$$
and this can be approximated by a summation over a grid of values for $y$. This is essentially the same as the CV criterion we introduced above in the conditional case.
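As a worked illustration of the optimal bandwidth formula above, consider evaluating $h_0$ under a Gaussian reference density with a Gaussian kernel. This is my own worked example, not from the notes: then $\psi(k) = 1/\sqrt{\pi}$, $\sigma_k^2 = 1$, and $R(f^{(1)}) = 1/(4\sqrt{\pi}\, s^3)$ for an $N(0, s^2)$ density, so the formula collapses to $h_0 = (4 s^3)^{1/3} n^{-1/3} \approx 1.59\, s\, n^{-1/3}$.

```python
import math

def reference_bandwidth(data):
    """Gaussian-reference plug-in for the smoothed EDF bandwidth:
    h0 = (psi(k) / (sigma_k^4 R(f')))^{1/3} n^{-1/3} = (4 s^3)^{1/3} n^{-1/3},
    with s estimated by the sample standard deviation."""
    n = len(data)
    m = sum(data) / n
    s = math.sqrt(sum((v - m) ** 2 for v in data) / (n - 1))
    return (4.0 * s ** 3) ** (1.0 / 3.0) / n ** (1.0 / 3.0)
```

Note the exponent: this rule shrinks like $n^{-1/3}$, faster than the $n^{-1/5}$ density-estimation rate, consistent with the warning above about over-large bandwidths.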
4.5 Smoothed Conditional Distribution Estimators

The smoothed versions of the CDF estimators replace the indicator functions $1(y_i \le y)$ with the integrated kernel $G\left((y - y_i)/h_0\right)$, where we will use $h_0$ to denote the bandwidth smoothing in the $y$ direction. The NW version is
$$\tilde F(y \mid x) = \frac{\sum_{i=1}^n K\left(H^{-1}(X_i - x)\right) G\left(\frac{y - y_i}{h_0}\right)}{\sum_{i=1}^n K\left(H^{-1}(X_i - x)\right)}$$
with $H = \{h_1, \ldots, h_q\}$. The LL is obtained by a local linear regression of $G\left((y - y_i)/h_0\right)$ on $X_i$ with bandwidths $H$. And similarly the WNW.

What is its distribution? It is essentially that of $\hat F(y \mid x)$, plus an additional bias term, minus a variance term. First take bias. Recall
$$Bias\ \hat F(y \mid x) \simeq \sum_{j=1}^q h_j^2 B_j(y \mid x)$$
where for LL and WNW
$$B_j(y \mid x) = \frac{\sigma_k^2}{2}\frac{\partial^2}{\partial x_j^2} F(y \mid x),$$
and for smoothed EDF estimation the bias of $\tilde F(y)$ was $\frac{h_0^2 \sigma_k^2}{2} f^{(1)}(y)$. If you work out the bias of the smoothed CDF, you find it is the sum of these two, that is
$$Bias\ \tilde F(y \mid x) \simeq \sum_{j=0}^q h_j^2 B_j(y \mid x)$$
where for $j \ge 1$ the $B_j(y \mid x)$ are the same as before, and for $j = 0$
$$B_0(y \mid x) = \frac{\sigma_k^2}{2}\frac{\partial^2}{\partial y^2} F(y \mid x).$$
For variance, recall
$$var\ \hat F(y \mid x) = \frac{R(k)^q F(y \mid x)(1 - F(y \mid x))}{n|H| f(x)},$$
and for smoothed EDF estimation the variance was reduced by the term $h_0\, \psi(k) f(y)/n$. In the CDF
case it turns out to be similarly adjusted:
$$var\ \tilde F(y \mid x) = \frac{R(k)^q \left[F(y \mid x)(1 - F(y \mid x)) - h_0\, \psi(k) f(y \mid x)\right]}{n|H| f(x)}.$$
In sum, the MSE is
$$MSE\ \tilde F(y \mid x) \simeq \left(\sum_{j=0}^q h_j^2 B_j(y \mid x)\right)^2 + \frac{R(k)^q \left[F(y \mid x)(1 - F(y \mid x)) - h_0\, \psi(k) f(y \mid x)\right]}{n|H| f(x)}.$$
The WIMSE, in the $q = 1$ case, is
$$WIMSE = \int\!\!\int AMSE\ \tilde F(y \mid x)\, f(x) M(x)\, dx\, dy = \int\!\!\int \left(h_0^2 B_0(y \mid x) + h_1^2 B_1(y \mid x)\right)^2 dy\, f(x) M(x)\, dx + \frac{R(k)}{n h_1}\int\!\!\int F(y \mid x)(1 - F(y \mid x))\, dy\, M(x)\, dx - \frac{R(k)\, h_0\, \psi(k)}{n h_1}\int M(x)\, dx,$$
where the last term uses $\int f(y \mid x)\, dy = 1$.

4.6 Bandwidth Choice

First, consider the optimal bandwidth rates. As smoothing in the $y$ direction only affects the higher-order asymptotic distribution, it should be clear that the optimal rates for $h_1, \ldots, h_q$ are unchanged from the unsmoothed case, and are therefore equal to the regression setting. Thus the optimal bandwidth rates are $h_j \propto n^{-1/(4+q)}$ for $j \ge 1$. Substituting these rates into the MSE equation, and ignoring constants, we have
$$MSE\ \tilde F(y \mid x) \simeq \left(h_0^2 + n^{-2/(4+q)}\right)^2 + n^{-4/(4+q)} - h_0\, n^{-4/(4+q)} = h_0^4 + 2 h_0^2 n^{-2/(4+q)} + 2 n^{-4/(4+q)} - h_0\, n^{-4/(4+q)}.$$
Differentiating with respect to $h_0$ gives $4 h_0^3 + 4 h_0 n^{-2/(4+q)} - n^{-4/(4+q)}$, and since $h_0$ will be of smaller order than $n^{-1/(4+q)}$, we can ignore the $h_0^3$ term; setting the remainder to zero and solving, we obtain $h_0 \propto n^{-2/(4+q)}$. E.g. for $q = 1$ the optimal rate is $h_0 \propto n^{-2/5}$.

What is the gain from smoothing? With optimal bandwidths, the MISE is reduced by a term of order $h_0\, n^{-4/(4+q)} = n^{-6/(4+q)}$. This is $n^{-6/5}$ for $q = 1$ and $n^{-1}$ for $q = 2$. This gain increases as $q$ increases. Thus the gain in efficiency from smoothing is increased when $X$ is of higher dimension. Intuitively, increasing the dimension of $X$ is equivalent to reducing the effective sample size, increasing the gain from smoothing.

How should the bandwidth be selected? Li-Racine recommend picking the bandwidths by using a CV method for conditional density estimation, and then rescaling.
As an alternative, we can use CV directly for the CDF estimate. That is, define the CV criteria
$$CV(y, h) = \frac{1}{n}\sum_{i=1}^n \left(1(y_i \le y) - \tilde F_{-i}(y \mid X_i)\right)^2 M(X_i)$$
$$CV(h) = \int CV(y, h)\, dy$$
where $h = (h_0, h_1, \ldots, h_q)$ includes smoothing in both the $y$ and $x$ directions, and the estimator $\tilde F_{-i}$ is the smooth leave-one-out estimator of $F$. This formulation includes NW, LL and WNW estimation. The second integral can be approximated using a grid. To my knowledge, this procedure has not been formally investigated.
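The joint criterion can be sketched directly as a small grid search. This is a minimal sketch: the Gaussian kernels, uniform weight $M(X_i) = 1$, toy data, and candidate bandwidths are all illustrative choices of mine.

```python
import math

def G(u):
    """Integrated Gaussian kernel: the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def loo_smoothed_cdf(y, i, Y, X, h0, h1):
    """Leave-one-out smoothed NW estimate of F(y | X_i), omitting obs i;
    h0 smooths in the y direction, h1 in the x direction."""
    num = den = 0.0
    for j in range(len(Y)):
        if j == i:
            continue
        w = math.exp(-0.5 * ((X[j] - X[i]) / h1) ** 2)
        num += w * G((y - Y[j]) / h0)
        den += w
    return num / den

def cv(h0, h1, Y, X, grid):
    """CV(h0, h1): y-grid approximation of the integrated criterion,
    with uniform weight M(X_i) = 1."""
    n, delta = len(Y), grid[1] - grid[0]
    total = 0.0
    for y in grid:
        for i in range(n):
            e = (1.0 if Y[i] <= y else 0.0) - loo_smoothed_cdf(y, i, Y, X, h0, h1)
            total += e * e * delta / n
    return total

# Toy data and a small candidate set of (h0, h1) pairs.
X = [0.1 * i for i in range(15)]
Y = [x + 0.15 * ((-1) ** i) for i, x in enumerate(X)]
grid = [-0.5 + 0.25 * j for j in range(10)]
best = min(((h0, h1) for h0 in (0.05, 0.2) for h1 in (0.1, 0.4)),
           key=lambda p: cv(p[0], p[1], Y, X, grid))
```

In practice one would search a finer bandwidth grid (or use a numerical optimizer), but the structure of the computation, a leave-one-out fit per observation per grid point, is as described in the text.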
More informationECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More informationLecture 19. sup y 1,..., yn B d n
STAT 06A: Polyomials of adom Variables Lecture date: Nov Lecture 19 Grothedieck s Iequality Scribe: Be Hough The scribes are based o a guest lecture by ya O Doell. I this lecture we prove Grothedieck s
More informationRecurrence Relations
Recurrece Relatios Aalysis of recursive algorithms, such as: it factorial (it ) { if (==0) retur ; else retur ( * factorial(-)); } Let t be the umber of multiplicatios eeded to calculate factorial(). The
More informationLecture 33: Bootstrap
Lecture 33: ootstrap Motivatio To evaluate ad compare differet estimators, we eed cosistet estimators of variaces or asymptotic variaces of estimators. This is also importat for hypothesis testig ad cofidece
More informationSMOOTHING QUANTILE REGRESSIONS
SMOOTHING QUANTILE REGRESSIONS Emmauel Guerre Marcelo Ferades Eduardo Horta Scool of Ecoomics ad Fiace Scool of Ecoomics ad Fiace Departmet of Statistics Quee Mary Uiversity of Lodo Quee Mary Uiversity
More informationMaximum Likelihood Estimation and Complexity Regularization
ECE90 Sprig 004 Statistical Regularizatio ad Learig Theory Lecture: 4 Maximum Likelihood Estimatio ad Complexity Regularizatio Lecturer: Rob Nowak Scribe: Pam Limpiti Review : Maximum Likelihood Estimatio
More informationlim za n n = z lim a n n.
Lecture 6 Sequeces ad Series Defiitio 1 By a sequece i a set A, we mea a mappig f : N A. It is customary to deote a sequece f by {s } where, s := f(). A sequece {z } of (complex) umbers is said to be coverget
More information6.3 Testing Series With Positive Terms
6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial
More informationf(x) dx as we do. 2x dx x also diverges. Solution: We compute 2x dx lim
Math 3, Sectio 2. (25 poits) Why we defie f(x) dx as we do. (a) Show that the improper itegral diverges. Hece the improper itegral x 2 + x 2 + b also diverges. Solutio: We compute x 2 + = lim b x 2 + =
More informationThe Poisson Process *
OpeStax-CNX module: m11255 1 The Poisso Process * Do Johso This work is produced by OpeStax-CNX ad licesed uder the Creative Commos Attributio Licese 1.0 Some sigals have o waveform. Cosider the measuremet
More informationProduct measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.
Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the
More informationLet us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.
Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,
More informationComputation Of Asymptotic Distribution For Semiparametric GMM Estimators
Computatio Of Asymptotic Distributio For Semiparametric GMM Estimators Hideiko Icimura Departmet of Ecoomics Uiversity College Lodo Cemmap UCL ad IFS April 9, 2004 Abstract A set of su ciet coditios for
More informationDirection: This test is worth 250 points. You are required to complete this test within 50 minutes.
Term Test October 3, 003 Name Math 56 Studet Number Directio: This test is worth 50 poits. You are required to complete this test withi 50 miutes. I order to receive full credit, aswer each problem completely
More informationChapter 4. Fourier Series
Chapter 4. Fourier Series At this poit we are ready to ow cosider the caoical equatios. Cosider, for eample the heat equatio u t = u, < (4.) subject to u(, ) = si, u(, t) = u(, t) =. (4.) Here,
More information