Quantile regression with multilayer perceptrons.
- Tobias Anderson
- 5 years ago
S.-F. Dimby and J. Rynkiewicz
Université Paris 1 - SAMM, 90 Rue de Tolbiac, Paris - France

Abstract. We consider nonlinear quantile regression involving multilayer perceptrons (MLP). In this paper we investigate the asymptotic behavior of quantile regression in a general framework: first by allowing possibly non-identifiable regression models, like MLPs with redundant hidden units, then by relaxing the conditions on the density of the noise. We present a universal bound for the overfitting of such models under weak assumptions. The main application of this bound is to give a hint about determining the true architecture of the MLP quantile regression model. As an illustration, we use this theoretical result to propose and compare effective criteria to find the true architecture of such a regression model.

1 Introduction

Quantiles are points taken at regular intervals from the cumulative distribution function (CDF) of a random variable. Some q-quantiles have special names: the 2-quantile is called the median, the 4-quantiles are called quartiles and the 10-quantiles are called deciles. We can define the quantile through a simple alternative expedient, as an optimization problem. Just as we can define the sample mean as the solution to the problem of minimizing a sum of squared residuals, we can define the median as the solution to the problem of minimizing a sum of absolute residuals. More generally, if $y_1, \dots, y_n$ are observed values, the $\tau$-th sample quantile is obtained by solving

$$\min_{m \in \mathbb{R}} \sum_{i=1}^{n} \rho_\tau(y_i - m) \tag{1}$$

where the cost function $\rho_\tau(z) = \tau |z| \, 1_{\mathbb{R}^+}(z) + (1-\tau) |z| \, 1_{\mathbb{R}^-}(z)$ is the tilted absolute function. Having succeeded in defining the unconditional quantiles as an optimization problem, it is easy to define conditional quantiles in an analogous fashion. To obtain an estimate of the conditional quantile, we simply replace the scalar $m$ in equation (1) by a function $f(x_i)$, where the $x_i$ are the covariates.

2 The model

The basic model is a possibly nonlinear regression model with an additive error. It is given by

$$Y_t = f_\theta(X_t) + \varepsilon_t \tag{2}$$
where $(Y_t)_{1 \le t \le n}$ are the observations, $(X_t)_{1 \le t \le n}$ are random covariates and $(\varepsilon_t)_{1 \le t \le n}$ are unobserved error terms. The regression function $f_\theta$ is assumed to be an MLP function with $k$ hidden units, which can be written:

$$f_\theta(x) = \beta + \sum_{i=1}^{k} a_i \, \phi\left(w_i^T x + b_i\right),$$

with $\theta = (\beta, a_1, \dots, a_k, b_1, \dots, b_k, w_{11}, \dots, w_{1d}, \dots, w_{k1}, \dots, w_{kd})$ the parameter vector of the model and $\phi$ a bounded transfer function, usually a sigmoidal function. $\theta$ belongs to $\Theta_k \subset \mathbb{R}^{k(d+2)+1}$, a compact (i.e. closed and bounded) set of possible parameters. The quantile regression estimator $f_{\hat\theta_\tau}$ is obtained by solving the optimization problem:

$$\min_{\theta \in \Theta_k} M_n^\tau(f_\theta) \quad \text{with} \quad M_n^\tau(f_\theta) = \sum_{i=1}^{n} \rho_\tau(y_i - f_\theta(x_i)) \tag{3}$$

for a function $\rho_\tau(\cdot)$ equal to

$$\rho_\tau(z) = \tau |z| \, 1_{\mathbb{R}^+}(z) + (1-\tau) |z| \, 1_{\mathbb{R}^-}(z). \tag{4}$$

In the sequel, let $f_{\theta_\tau}$ be a, possibly not unique, function such that

$$f_{\theta_\tau} = \arg\min_{\theta \in \Theta_k} m(f_\theta) \quad \text{with} \quad m(f_\theta) = \int \rho_\tau(y - f_\theta(x)) \, dP(x, y). \tag{5}$$

$f_{\theta_\tau}$ is the optimal function for the theoretical quantile regression problem.

2.1 Asymptotic distribution

If the possible functions $f_\theta$ are parametric, identifiable and smooth enough, and if the density of the noise exists and is positive, then the asymptotic normality of the M-estimator can be shown (see Koenker and Bassett [1] for the linear case, and Weiss [6] for the nonlinear case and the $\frac{1}{2}$-quantile). However, it is possible to give more general results using empirical process theory. In this paper we prove a general bound that is valid even if the optimal functions $f_{\theta_\tau}$ are not unique, and without assumptions on the density of the noise except moment conditions.

A general bound for $M_n^\tau(f_\theta)$

We will prove an inequality bounding the difference $M_n^\tau(f_\theta) - M_n^\tau(f_{\theta_\tau})$. For a square integrable function $g(X, Y)$ the $L^2$ norm is:

$$\|g(X, Y)\|_2 := \sqrt{\int g^2(x, y) \, dP(x, y)}.$$
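The definitions above (the tilted absolute loss, the one-hidden-layer MLP and the summed empirical criterion) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names `rho_tau`, `mlp` and `empirical_risk`, the array layout, and the use of `tanh` as transfer function are assumptions made here, not part of the paper. The last lines check numerically that, for $\tau = 0.5$ and an odd sample size, the constant minimizing the summed loss is the sample median.

```python
import numpy as np

def rho_tau(z, tau):
    """Tilted absolute loss: tau*|z| for z >= 0, (1 - tau)*|z| for z < 0."""
    z = np.asarray(z, dtype=float)
    return np.where(z >= 0, tau * np.abs(z), (1.0 - tau) * np.abs(z))

def mlp(x, beta, a, W, b, phi=np.tanh):
    """One-hidden-layer MLP: beta + sum_i a_i * phi(w_i^T x + b_i).

    x: (n, d) inputs; W: (k, d) hidden weights; a, b: (k,) vectors.
    The layout and the default tanh transfer function are assumptions.
    """
    hidden = phi(x @ W.T + b)   # shape (n, k)
    return beta + hidden @ a    # shape (n,)

def empirical_risk(y, pred, tau):
    """Sum of tilted absolute losses over the sample."""
    return float(np.sum(rho_tau(y - pred, tau)))

# Unconditional case: find the constant m that minimizes the summed loss.
rng = np.random.default_rng(0)
y = rng.normal(size=201)
tau = 0.5
# The objective is convex and piecewise linear in m, so a minimizer
# can be found among the observed values themselves.
risks = [empirical_risk(y, m, tau) for m in y]
m_star = y[int(np.argmin(risks))]
```

Replacing the constant `m` by `mlp(x, beta, a, W, b)` in `empirical_risk` gives the criterion minimized by the quantile regression estimator; the optimizer used to fit the weights is left open here.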
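Let $\lambda > 0$ be a constant. The generalized derivative function is defined as:

$$d^\lambda_\theta(X, Y) = \frac{\dfrac{e^{-\lambda \rho_\tau(Y - f_\theta(X))} - e^{-\lambda \rho_\tau(Y - f_{\theta_\tau}(X))}}{e^{-\lambda \rho_\tau(Y - f_{\theta_\tau}(X))}}}{\left\| \dfrac{e^{-\lambda \rho_\tau(Y - f_\theta(X))} - e^{-\lambda \rho_\tau(Y - f_{\theta_\tau}(X))}}{e^{-\lambda \rho_\tau(Y - f_{\theta_\tau}(X))}} \right\|_2} = \frac{e^{-\lambda \rho_\tau(Y - f_\theta(X)) + \lambda \rho_\tau(Y - f_{\theta_\tau}(X))} - 1}{\left\| e^{-\lambda \rho_\tau(Y - f_\theta(X)) + \lambda \rho_\tau(Y - f_{\theta_\tau}(X))} - 1 \right\|_2}$$

and let us define $(d^\lambda_\theta)_-(x, y) = \min\{0, d^\lambda_\theta(x, y)\}$. For now, let us assume that $d^\lambda_\theta$ is well defined; this point will be discussed later. We can state the following inequality:

Inequality: for $\lambda > 0$,

$$\sup_{\theta \in \Theta_k} \left( M_n^\tau(f_{\theta_\tau}) - M_n^\tau(f_\theta) \right) \le \frac{1}{2\lambda} \sup_{\theta \in \Theta_k} \frac{\left( \sum_{i=1}^{n} d^\lambda_\theta(x_i, y_i) \right)^2}{\sum_{i=1}^{n} (d^\lambda_\theta)_-^2(x_i, y_i)}$$

Proof: The proof is very similar to the proof for the least squares estimator obtained by Rynkiewicz [4]. We have

$$\begin{aligned}
M_n^\tau(f_{\theta_\tau}) - M_n^\tau(f_\theta)
&= \frac{1}{\lambda} \sum_{i=1}^{n} \log\left( 1 + \left\| e^{-\lambda \rho_\tau(Y - f_\theta(X)) + \lambda \rho_\tau(Y - f_{\theta_\tau}(X))} - 1 \right\|_2 \, d^\lambda_\theta(x_i, y_i) \right) \\
&\le \sup_{0 \le p \le \left\| e^{-\lambda \rho_\tau(Y - f_\theta(X)) + \lambda \rho_\tau(Y - f_{\theta_\tau}(X))} - 1 \right\|_2} \frac{1}{\lambda} \sum_{i=1}^{n} \log\left( 1 + p \, d^\lambda_\theta(x_i, y_i) \right) \\
&\le \sup_{p \ge 0} \frac{1}{\lambda} \left( p \sum_{i=1}^{n} d^\lambda_\theta(x_i, y_i) - \frac{p^2}{2} \sum_{i=1}^{n} (d^\lambda_\theta)_-^2(x_i, y_i) \right),
\end{aligned} \tag{6}$$

since for any real number $u > -1$, $\log(1 + u) \le u - \frac{1}{2} u_-^2$, with $u_- = \min\{0, u\}$. Finally, replacing $p$ by the optimal value, we find

$$M_n^\tau(f_{\theta_\tau}) - M_n^\tau(f_\theta) \le \frac{1}{2\lambda} \frac{\left( \sum_{i=1}^{n} d^\lambda_\theta(x_i, y_i) \right)^2}{\sum_{i=1}^{n} (d^\lambda_\theta)_-^2(x_i, y_i)}. \tag{7}$$

This inequality allows us to prove that $M_n^\tau(f_{\theta_\tau}) - M_n^\tau(f_\theta)$ is bounded in probability under simple assumptions. This may be applied to model selection, as discussed in the next section.

2.2 Application: selection of models

In this section, the set $\Theta$ of possible parameters will be set to $\Theta = \bigcup_{k=1}^{K} \Theta_k$, with $\Theta_{k_1} \subset \Theta_{k_2}$ for $k_1 < k_2$, where $K$ is a, possibly huge, fixed constant. Let $k_0$ be the minimal dimension of the functional space needed to realize the true regression function $f_\tau$. For multilayer perceptrons, $\Theta_k$ may be the set of MLPs with $k$ hidden units. We define the minimum-penalized estimator of $k_0$ as the minimizer $\hat{k}$ of

$$T_n(k) = \min_{\theta \in \Theta_k} \left( M_n^\tau(f_\theta) + a_n(k) \right) \tag{8}$$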
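The two analytic steps used in the proof of the inequality above can be checked numerically. The sketch below is illustrative only: it verifies on a grid that $\log(1+u) \le u - \frac{1}{2}\min(0,u)^2$ for $u > -1$, and that the supremum over $p \ge 0$ of $p\,s - \frac{1}{2}p^2 q$ equals $s^2/(2q)$ when $s \ge 0$. The factor $1/\lambda$ is omitted, the helper names are hypothetical, and the values standing in for $d^\lambda_\theta(x_i, y_i)$ are random draws, not computed from any model.

```python
import numpy as np

def log_bound_gap(u):
    """Return u - 0.5*min(0, u)^2 - log(1 + u), expected >= 0 for u > -1."""
    u = np.asarray(u, dtype=float)
    u_minus = np.minimum(0.0, u)
    return u - 0.5 * u_minus**2 - np.log1p(u)

def quadratic_sup(s, q):
    """Supremum over p >= 0 of p*s - 0.5*p^2*q, for q > 0."""
    p_opt = max(s, 0.0) / q          # maximizer of the concave quadratic
    return p_opt * s - 0.5 * p_opt**2 * q

# Step 1: the logarithm bound on a grid of u values in (-1, 50].
u = np.linspace(-0.999, 50.0, 20001)
gap = log_bound_gap(u)

# Step 2: optimal p, using stand-in values for d(x_i, y_i) (illustrative).
rng = np.random.default_rng(1)
d = rng.normal(size=500)
s = float(np.sum(d))
q = float(np.sum(np.minimum(0.0, d) ** 2))
closed_form = max(s, 0.0) ** 2 / (2.0 * q)

# Brute-force supremum over a grid of p values for comparison.
p_grid = np.linspace(0.0, 2.0 * abs(s) / q + 1.0, 100001)
grid_sup = float(np.max(p_grid * s - 0.5 * p_grid**2 * q))
```

When the sum `s` is negative the supremum is attained at `p = 0` and equals zero, so the squared-sum expression $s^2/(2q)$ remains a valid upper bound in that case as well.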
Let us make the following assumptions:

(A1) $a_n(\cdot)$ is increasing, $n \left( a_n(k_1) - a_n(k_2) \right)$ tends to infinity as $n$ tends to infinity for any $k_1 > k_2$, and $a_n(k)$ tends to 0 as $n$ tends to infinity for any $k$.

(A2) There exists $\lambda > 0$ such that $\{ d^\lambda_\theta, \theta \in \Theta \}$ is a Donsker class (see van der Vaart [5]).

We now have:

Theorem: Under (A1) and (A2), $\hat{k}$ converges in probability to the true dimension $k_0$.

The proof of this theorem is exactly the same as in Rynkiewicz [4]. Assumption (A1) is fairly standard for model selection; in the Gaussian case, (A1) will be fulfilled by BIC-like criteria. Assumption (A2) is more difficult to check. First we note:

$$\left( e^{-\lambda \left( \rho_\tau(Y - f_\theta(X)) - \rho_\tau(Y - f_{\theta_\tau}(X)) \right)} - 1 \right)^2 = e^{-2\lambda \left( \rho_\tau(Y - f_\theta(X)) - \rho_\tau(Y - f_{\theta_\tau}(X)) \right)} - 2 e^{-\lambda \left( \rho_\tau(Y - f_\theta(X)) - \rho_\tau(Y - f_{\theta_\tau}(X)) \right)} + 1$$

So $d^\lambda_\theta$ is well defined if

$$E\left[ e^{-2\lambda \left( \rho_\tau(Y - f_\theta(X)) - \rho_\tau(Y - f_{\theta_\tau}(X)) \right)} \right] < \infty.$$

Since an MLP function is bounded, $d^\lambda_\theta$ is well defined if $Y$ admits exponential moments. Finally, using the same reparameterization techniques as in Rynkiewicz [3], assumption (A2) can be shown to be true for linear regressions or MLP models with sigmoidal transfer functions, if the set of possible parameters $\Theta$ is compact.

3 A little experiment

The theoretical penalization terms of the previous section can be chosen among a wide range of functions (see condition (A1)). In the sequel, a little experiment is conducted to assess the right rate of penalization to guess the true architecture of a model. Consider a simulated model:

$$Y_t = F_{\theta^0}(X_{1t}, X_{2t}) + \varepsilon_t, \quad t = 1, \dots, n,$$

with $\left( (X_{11}, X_{21}), \dots, (X_{1n}, X_{2n}) \right)$ i.i.d., $(X_{1t}, X_{2t}) \sim \mathcal{N}(0_{\mathbb{R}^2}, 3 I_2)$, where $I_2$ is the identity matrix. The noise sequence $\varepsilon_1, \dots, \varepsilon_n$ is independent and identically distributed, following a Gaussian distribution $\mathcal{N}(0, 1)$, and

$$F_{\theta^0}(x_1, x_2) = \tanh(6 \; [\ldots] \; x_1 \; [\ldots] \; 2 x_2) + 2 \tanh(8 \; [\ldots] \; x_1 + 3 x_2) \; [\ldots] \; 3 \tanh(2 \; [\ldots] \; 6 x_1 \; [\ldots] \; 2 x_2) \tag{9}$$

Here, the true model is an MLP with 2 inputs, 3 hidden units and one output. In order to avoid excessive computation time, the number of hidden units is assumed to be between 1 and 10.
Let $D$ be the size of the parameter vector (the dimension of the model, or the number of weights of the MLP). We consider quantile regression with $\tau = 0.5$, so we minimize the sum of absolute residuals. We compare 3 criteria, from the least penalized (AIC-like) to the most penalized (Strong Penalization). The following penalized criteria are assessed:

AIC-like: $\left( \frac{1}{n} \sum_{t=1}^{n} \rho_{0.5}(z_t - F_\theta(x_t, y_t)) \right) \left( 1 + \frac{2D}{n} \right)$

BIC-like: $\left( \frac{1}{n} \sum_{t=1}^{n} \rho_{0.5}(z_t - F_\theta(x_t, y_t)) \right) \left( 1 + \frac{D \log n}{n} \right)$

SP (Strong Penalization): $\left( \frac{1}{n} \sum_{t=1}^{n} \rho_{0.5}(z_t - F_\theta(x_t, y_t)) \right) \left( 1 + \frac{D}{\sqrt{n}} \right)$

We simulate $n = 100$, $n = 500$ and $n = 1000$ data according to the true model (9); for each $n$ the experiment is repeated 100 times. The following architectures are chosen by the penalized criteria:

[Table: for each sample size ($n = 100$, $n = 500$, $n = 1000$), the number of models selected by the AIC-like, BIC-like and SP criteria for each number of hidden units. The numerical entries are missing from this transcription.]

The BIC-like criterion and the Strong Penalization often chose the true architecture, even for a small number of data. According to the theory, the AIC-like criterion is not consistent (see condition (A1)) and the chosen architecture is always too large. The Strong Penalization chose an architecture that is too small when the number of data is small ($n = 100$); however, it is a consistent criterion, so its behavior is correct for larger numbers of data ($n = 500$ and $n = 1000$). The BIC-like criterion seems to be the best for this cost function.
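The selection procedure used in this experiment can be sketched as follows. The sketch rests on several assumptions that should be read as such: the penalty terms $2D/n$, $D \log n / n$ and $D / \sqrt{n}$ are plausible forms for the AIC-like, BIC-like and SP criteria, chosen here because they match the stated consistency behavior; the function names are hypothetical; and the mean losses in `toy_losses` are made-up numbers for illustration, not results from the paper. The parameter count $k(d+2)+1$ is the one given in Section 2.

```python
import math

def n_weights(k, d):
    """Number of parameters of an MLP with d inputs and k hidden units."""
    return k * (d + 2) + 1

# Penalty terms as functions of (D, n); these forms are assumptions.
PENALTIES = {
    "AIC-like": lambda D, n: 2.0 * D / n,
    "BIC-like": lambda D, n: D * math.log(n) / n,
    "SP": lambda D, n: D / math.sqrt(n),
}

def penalized_criterion(mean_loss, D, n, name):
    """Mean check loss multiplied by (1 + penalty)."""
    return mean_loss * (1.0 + PENALTIES[name](D, n))

def select_hidden_units(mean_losses, n, d, name):
    """Return the k minimizing the criterion; mean_losses maps k -> mean loss."""
    return min(
        mean_losses,
        key=lambda k: penalized_criterion(mean_losses[k], n_weights(k, d), n, name),
    )

# Made-up mean losses for k = 1..5 hidden units (illustrative only).
toy_losses = {1: 0.90, 2: 0.60, 3: 0.40, 4: 0.390, 5: 0.383}
n, d = 500, 2
chosen = {name: select_hidden_units(toy_losses, n, d, name) for name in PENALTIES}
```

With these made-up losses, which flatten after three hidden units, the weakest penalty keeps rewarding the small gains from extra units while the two stronger penalties stop at three. This only illustrates the mechanism and says nothing about the actual frequencies reported in the experiment.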
4 Conclusion

The conventional least squares estimator may be seriously deficient in the case of non-Gaussian errors. It seems reasonable to pay a small premium, in the form of sacrificed efficiency, in order to get more robust regression models. The class of statistical models called regression quantiles is known to have good properties under some restrictive assumptions. In this paper we have shown that some results may be obtained under more general assumptions. We have proven an inequality showing that the overfitting of these models is moderate if the noise admits exponential moments. This bound justifies the use of penalized criteria similar to the BIC criterion in order to fit the dimension of models. Finally, a more challenging task may be to get a more precise tuning of the penalization term which, according to our result, can be chosen among a wide range of functions.

References

[1] Koenker, R. and Bassett, G., Regression quantiles. Econometrica, 46:1, pages 33-50, 1978.
[2] Engel, E., Die Produktions- und Konsumptionverhältnisse des Königreichs Sachsen. International Statistical Institute Bulletin, 9, pages 1-125, 1857.
[3] Rynkiewicz, J., Consistent estimation of the architecture of multilayer perceptrons. In M. Verleysen, editor, Proceedings of the 14th European Symposium on Artificial Neural Networks (ESANN 2006), d-side pub., April 28-30, Bruges (Belgium).
[4] Rynkiewicz, J., General bound of overfitting for MLP regression models. Neurocomputing, to appear.
[5] van der Vaart, A.W., Asymptotic statistics, Cambridge University Press, Cambridge.
[6] Weiss, A., Estimating nonlinear dynamic models using least absolute error estimation. Econometric Theory, 7, pages 46-68, 1991.
ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],
More informationExpectation and Variance of a random variable
Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio
More informationSection 14. Simple linear regression.
Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo
More informationDistribution of Random Samples & Limit theorems
STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to
More informationSlide Set 13 Linear Model with Endogenous Regressors and the GMM estimator
Slide Set 13 Liear Model with Edogeous Regressors ad the GMM estimator Pietro Coretto pcoretto@uisa.it Ecoometrics Master i Ecoomics ad Fiace (MEF) Uiversità degli Studi di Napoli Federico II Versio: Friday
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationChapter 2 The Monte Carlo Method
Chapter 2 The Mote Carlo Method The Mote Carlo Method stads for a broad class of computatioal algorithms that rely o radom sampligs. It is ofte used i physical ad mathematical problems ad is most useful
More informationEcon 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara
Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio
More informationx = Pr ( X (n) βx ) =
Exercise 93 / page 45 The desity of a variable X i i 1 is fx α α a For α kow let say equal to α α > fx α α x α Pr X i x < x < Usig a Pivotal Quatity: x α 1 < x < α > x α 1 ad We solve i a similar way as
More information2 Banach spaces and Hilbert spaces
2 Baach spaces ad Hilbert spaces Tryig to do aalysis i the ratioal umbers is difficult for example cosider the set {x Q : x 2 2}. This set is o-empty ad bouded above but does ot have a least upper boud
More informationQuestions and Answers on Maximum Likelihood
Questios ad Aswers o Maximum Likelihood L. Magee Fall, 2008 1. Give: a observatio-specific log likelihood fuctio l i (θ) = l f(y i x i, θ) the log likelihood fuctio l(θ y, X) = l i(θ) a data set (x i,
More informationMOMENT-METHOD ESTIMATION BASED ON CENSORED SAMPLE
Vol. 8 o. Joural of Systems Sciece ad Complexity Apr., 5 MOMET-METHOD ESTIMATIO BASED O CESORED SAMPLE I Zhogxi Departmet of Mathematics, East Chia Uiversity of Sciece ad Techology, Shaghai 37, Chia. Email:
More informationRegression with an Evaporating Logarithmic Trend
Regressio with a Evaporatig Logarithmic Tred Peter C. B. Phillips Cowles Foudatio, Yale Uiversity, Uiversity of Aucklad & Uiversity of York ad Yixiao Su Departmet of Ecoomics Yale Uiversity October 5,
More informationy X F n (y), To see this, let y Y and apply property (ii) to find a sequence {y n } X such that y n y and lim sup F n (y n ) F (y).
Modica Mortola Fuctioal 2 Γ-Covergece Let X, d) be a metric space ad cosider a sequece {F } of fuctioals F : X [, ]. We say that {F } Γ-coverges to a fuctioal F : X [, ] if the followig properties hold:
More informationAn Introduction to Randomized Algorithms
A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis
More informationThe Maximum-Likelihood Decoding Performance of Error-Correcting Codes
The Maximum-Lielihood Decodig Performace of Error-Correctig Codes Hery D. Pfister ECE Departmet Texas A&M Uiversity August 27th, 2007 (rev. 0) November 2st, 203 (rev. ) Performace of Codes. Notatio X,
More informationEmpirical Processes: Glivenko Cantelli Theorems
Empirical Processes: Gliveko Catelli Theorems Mouliath Baerjee Jue 6, 200 Gliveko Catelli classes of fuctios The reader is referred to Chapter.6 of Weller s Torgo otes, Chapter??? of VDVW ad Chapter 8.3
More information