Semi-supervised Inference for Explained Variance in High-dimensional Linear Regression and Its Applications


T. Tony Cai and Zijian Guo
University of Pennsylvania and Rutgers University
March 8, 08

Abstract

We consider statistical inference for the explained variance $\beta^\top\Sigma\beta$ under the high-dimensional linear model $Y = X\beta + \epsilon$ in the semi-supervised setting, where $\beta$ is the regression vector and $\Sigma$ is the design covariance matrix. A calibrated estimator, which efficiently integrates both labelled and unlabelled data, is proposed. It is shown that the estimator achieves the minimax optimal rate of convergence in the general semi-supervised framework. The optimality result characterizes how the unlabelled data affects the minimax optimal rate. Moreover, the limiting distribution for the proposed estimator is established and data-driven confidence intervals for the explained variance are constructed. We further develop a randomized calibration technique for statistical inference in the presence of weak signals and apply the obtained inference results to a range of important statistical problems, including signal detection and global testing, prediction accuracy evaluation, and confidence ball construction. The numerical performance of the proposed methodology is demonstrated in simulation studies and an analysis of estimating heritability for a yeast segregant data set with multiple traits.

Keywords: Confidence interval, confidence ball, heritability, prediction accuracy, signal detection, minimaxity, sparsity.

Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 904. The research of Tony Cai was supported in part by NSF Grant DMS-7735 and NIH Grant R0 GM

1 Introduction

High-dimensional linear models are ubiquitous in contemporary statistical modeling, with a wide range of applications in many scientific fields. The early focus has been mainly on developing methods for the recovery of the whole regression vector via penalized or constrained $\ell_1$ minimization approaches. Examples include the Lasso [34], Dantzig Selector [4], MCP [4], square-root Lasso [4], and scaled Lasso [33]. There has been significant recent interest in statistical inference for low-dimensional functionals, including confidence intervals and hypothesis testing for individual regression coefficients [43, 35, 7, 6], minimaxity and adaptivity of confidence intervals for general linear functionals [7], estimation of the signal-to-noise ratio [38, 4], inference for the $\ell_q$ accuracy of a given estimator [8], and estimation of quadratic functionals [4, ]. Motivated by a range of applications, the present paper considers statistical inference for the explained variance, which is a one-dimensional weighted quadratic functional, in the high-dimensional and semi-supervised setting. We first develop in detail the theory for optimal estimation of the explained variance, which also leads to the construction of confidence intervals. The results are then applied to several other important statistical inference problems.

1.1 Problem Formulation and Motivations

We consider the high-dimensional linear model with a random design,

$$y_i = X_i^\top\beta + \epsilon_i, \quad \text{for } 1 \le i \le n, \qquad (1)$$

where $y_i \in \mathbb{R}$ and $X_i \in \mathbb{R}^p$ denote respectively the outcome and the measured covariates of the $i$-th observation, $\epsilon_i$ denotes the error and $\beta \in \mathbb{R}^p$ denotes the high-dimensional regression vector. The rows $X_i$ are i.i.d. $p$-dimensional sub-Gaussian random vectors with mean 0 and covariance matrix $\Sigma$, and the errors $\{\epsilon_i\}_{1\le i\le n}$ are i.i.d. sub-Gaussian random variables with mean 0 and variance $\sigma^2$, independent of $\{X_i\}_{1\le i\le n}$. The explained variance under the regression model (1) is represented by the weighted quadratic functional of $\beta$,

$$Q = \beta^\top\Sigma\beta. \qquad (2)$$

We study estimation and inference for the explained variance in the semi-supervised setting, where the data is a combination of the labelled data $\{y_i, X_i\}_{1\le i\le n}$ in the regression model (1) and the unlabelled data $\{X_i\}_{n+1\le i\le n+N}$.
Here the measured covariates of both the labelled and unlabelled data are assumed to be independent and to follow the same distribution. The more conventional supervised setting is treated as a special case. The setting of semi-supervised learning is commonly seen in applications where the outcomes are more expensive to collect than the covariates. For example, in the analysis of Electronic Health Records (EHR) databases, the covariates are easy to extract automatically while labelling the outcomes is costly and time-consuming [5, 9]. In addition,

semi-supervised learning naturally arises in the integrative analysis of multiple genetics data sets where the covariates are the same across all data sets but the outcomes measured vary from study to study due to the specific purposes of individual studies [36]. This can be naturally formulated as semi-supervised learning, where the pre-specified outcome is only measured over one or several but not all data sets while the covariates are measured across all data sets. See [5, 9, 3, 4] for more discussion about semi-supervised learning.

The development of the optimal estimator and confidence intervals for $Q = \beta^\top\Sigma\beta$ in the semi-supervised setting, along with the corresponding statistical analysis, is of significant interest in its own right and poses many challenges. This inference problem is also closely connected to several other important statistical problems.

1. Heritability. Heritability is among the most important concepts in genetics. High-dimensional linear regression has proven to be useful in modeling the phenotype-genotype relationship in the presence of a large number of genetic variants [3,, 38, 4]. Under the linear model with the outcome normalized to have unit variance, one heritability measure defined in the literature is the quadratic functional $\beta^\top\Sigma\beta$, which measures the total variance explained by genetic variants [4, 38].

2. Signal-to-Noise Ratio and Proportion of Variance Explained. The Signal-to-Noise Ratio (SNR) and the Proportion of Variance Explained (PVE) are important statistical concepts and are defined respectively as $\beta^\top\Sigma\beta/\sigma^2$ and $\beta^\top\Sigma\beta/(\beta^\top\Sigma\beta + \sigma^2)$ under model (1). The quadratic functional $\beta^\top\Sigma\beta$ is central to SNR and PVE. Together with a good estimator of $\sigma^2$ [, 33, 4], the results for $\beta^\top\Sigma\beta$ established in this paper are useful for inference of SNR and PVE.

3. Signal Detection and Global Testing. Inference for the explained variance can be applied to testing the global hypothesis $H_0: \beta = \beta^{\rm null}$ for $\beta^{\rm null} \in \mathbb{R}^p$, which includes signal detection as a special case with $\beta^{\rm null} = 0$. The connection is revealed in the following adjusted linear model,

$$y_i - X_i^\top\beta^{\rm null} = X_i^\top(\beta - \beta^{\rm null}) + \epsilon_i, \quad \text{for } 1 \le i \le n. \qquad (3)$$

Under model (3), testing for $H_0: \beta = \beta^{\rm null}$ is recast as testing the hypotheses $H_0: (\beta - \beta^{\rm null})^\top\Sigma(\beta - \beta^{\rm null}) = 0$ versus $H_1: (\beta - \beta^{\rm null})^\top\Sigma(\beta - \beta^{\rm null}) > 0$.

4. Prediction Accuracy Assessment. Accuracy assessment is of significant importance in applications. Inference for the explained variance is useful for assessing the out-of-sample prediction accuracy of a given estimator. We use $(X^{(0)}, y^{(0)})$ to denote the set of training observations and $\check\beta$ to denote a given estimator of $\beta$ based on the training data set $(X^{(0)}, y^{(0)})$; we use $(X, y)$ to denote the set of test

observations. The prediction accuracy for a future observation $x_{\rm new}$ is defined as $E_{x_{\rm new}}\big(x_{\rm new}^\top(\check\beta - \beta)\big)^2 = (\check\beta - \beta)^\top\Sigma(\check\beta - \beta)$. To obtain the inference results for this quantity, we rely on the following adjusted linear model,

$$y_i - X_i^\top\check\beta = X_i^\top(\beta - \check\beta) + \epsilon_i, \quad \text{for } 1 \le i \le n. \qquad (4)$$

Inference results developed for the explained variance can be applied to (4) to obtain the corresponding results for the prediction accuracy $E_{x_{\rm new}}\big(x_{\rm new}^\top(\check\beta - \beta)\big)^2$.

5. Confidence Ball for $\beta$. Construction of confidence balls for $\beta$ is another important application of inference for explained variance studied in this paper. We use $\check\beta$ to denote a pre-specified estimator of $\beta$, which serves as the center of the constructed confidence ball. Based on the adjusted linear model (4), a confidence interval $(L(Z), U(Z))$ for the explained variance $(\check\beta - \beta)^\top\Sigma(\check\beta - \beta)$ leads to a natural confidence ball for $\beta$,

$$\Big\{\beta : \|\beta - \check\beta\|_2 \le \sqrt{\tfrac{1}{\lambda_{\min}(\Sigma)}\,U(Z)}\Big\}, \qquad (5)$$

where $\lambda_{\min}(\Sigma)$ denotes the smallest eigenvalue of $\Sigma$.

The close connections to the above statistical applications provide further motivation for studying the inference problem for $\beta^\top\Sigma\beta$. In Section 4, we demonstrate in detail how to apply the obtained results for $\beta^\top\Sigma\beta$ to tackle some of these statistical applications.

1.2 Results and Contributions

We introduce a new estimator, Calibrated High-dimensional Inference for Variance Explained (CHIVE), in the semi-supervised setting. The CHIVE estimator for $Q = \beta^\top\Sigma\beta$ is constructed in two steps, which together efficiently integrate both labelled and unlabelled data. The first step is to plug in the estimators of $\beta$ and $\Sigma$, denoted by $\hat\beta$ and $\hat\Sigma$, respectively, and the second step is to calibrate this plug-in estimator $\hat\beta^\top\hat\Sigma\hat\beta$ by estimating its estimation error. The second step is essential in rebalancing the bias and variance to improve the estimation accuracy. The calibration technique is a general machinery as it can take different forms of $\hat\beta$ and $\hat\Sigma$ as its inputs. This flexibility is quite useful in the semi-supervised setting, where the unlabelled data can be efficiently used to estimate the design covariance matrix $\Sigma$.
We show the optimality of CHIVE by establishing the minimax optimal rate of convergence for estimating $\beta^\top\Sigma\beta$ in the general semi-supervised setting. We also quantify the uncertainty of the CHIVE estimator by establishing its limiting distribution under stronger conditions. Data-driven confidence intervals for $\beta^\top\Sigma\beta$ are constructed based on the limiting distribution. The supervised setting and the setting with known design covariance matrix

are discussed as the special cases with $N = 0$ and $N = \infty$, respectively. We further develop a randomized calibration technique for statistical inference in the presence of weak signals and apply the obtained results to several important statistical problems. The numerical performance of the proposed methodology is demonstrated in simulation studies and a real data analysis of estimating heritability for a yeast segregant data set with multiple traits.

The main contributions of the present paper are three-fold.

1. We propose a novel estimator, CHIVE, for the explained variance $\beta^\top\Sigma\beta$ that efficiently uses both the labelled and unlabelled data and is shown to achieve the minimax optimal rate. The results characterize how the unlabelled data affects the minimax optimal rate for estimating $\beta^\top\Sigma\beta$. Specifically, the optimal rate is $\|\beta\|_2/\sqrt{n} + \|\beta\|_2^2/\sqrt{N+n} + k\log p/n$, where $p$ is the dimension, $n$ is the size of the labelled data, $N$ is the size of the unlabelled data, and $k$ and $\|\beta\|_2$ denote respectively the sparsity and the $\ell_2$ norm of $\beta$. It is interesting to note that the unlabelled data only helps reduce the term $\|\beta\|_2^2/\sqrt{N+n}$ but not the other two terms.

2. We quantify the uncertainty for the CHIVE estimator by establishing its limiting distribution. It is shown that the limiting distribution is normal and its variance depends on the proportion of the labelled data. The result is then used for the construction of data-driven confidence intervals for $\beta^\top\Sigma\beta$.

3. The inference results obtained in this paper are applied to (i) signal detection and global testing, (ii) prediction accuracy evaluation, and (iii) confidence ball construction. For signal detection, we control the type I error and characterize the type II error by establishing the power function under a local alternative. The results can be easily extended to the general global testing problem. For evaluation of out-of-sample prediction accuracy of a given sparse estimator, both the point and interval estimators are developed.
We establish the estimation error bound for the point estimator of the prediction accuracy and control the length of the corresponding confidence interval. A confidence ball for the regression vector $\beta$ with controlled radius is also constructed. We stress that these procedures are data-driven and do not require a priori knowledge of the design covariance matrix $\Sigma$ or the noise level $\sigma$. See more details in Section 4 and the related numerical performance in Sections 5. and 5.3.

A central question in semi-supervised learning is how to efficiently use both labelled and unlabelled data to conduct statistical inference [5, 9]. The results obtained in the present paper illustrate how the unlabelled data can facilitate statistical inference for the explained variance and also the related statistical applications.

1.3 Related Work

Estimation and inference for quadratic functionals have been studied in the literature in a range of settings. In particular, minimax and adaptive estimation of quadratic functionals plays an important role in nonparametric inference and has been well studied in density estimation, nonparametric regression, and the white noise with drift model. See, for example, [5, 7, 8, 8,,, 6]. In high-dimensional linear regression, estimation and inference for quadratic functionals has also been studied in [4, 38, ]. In particular, [38] and [] considered estimation of $\beta^\top\Sigma\beta/\sigma^2$ and $\|\beta\|_2^2$, respectively, but not the uncertainty quantification problem. [4] studied the construction of confidence intervals for $\|\beta\|_2^2$ under the setting of $\Sigma = I$, moderate dimension where $n/p \to \xi \in (0, \infty)$, and no sparsity assumption on $\beta$. The inference problem in sparse high-dimensional linear regression considered in the current paper is significantly different from the setting considered in [4], mainly due to the complicated geometry induced by the sparsity structure and the unknown design covariance matrix $\Sigma$. Other works related to quadratic functional inference include construction of confidence intervals for the $\ell_2$ loss of the estimator considered in [8] and inference for the treatment effect and endogeneity parameter in instrumental variable regression [, 0]. In addition, [5, 44] considered hypothesis testing for high-dimensional linear regression.

The statistical applications studied in this paper have also been considered separately in the literature. Signal detection was studied in [3, ] under the linear model in a special setting where the design covariance matrix $\Sigma$ is equal to or close to the identity matrix. In this setting, [3, ] established an optimal signal detection method and theory. The inference results obtained in the present paper enable the study of the signal detection problem under a general setting where the design covariance matrix $\Sigma$ is unknown. The confidence ball construction for the whole regression vector was considered in [30] in the case of known $\sigma$, where the optimal size and the possibility of adaptive confidence balls were also established.
The results obtained in the current paper lead to a confidence ball construction for $\beta$ in the case of unknown $\sigma$. A problem related to prediction accuracy is inference for the estimation accuracy, which was considered in [8, 4]. However, inference for the prediction accuracy and that for the estimation accuracy are quite different problems.

1.4 Organization of the Paper

The rest of the paper is organized as follows. In Section 2, we introduce in detail the CHIVE estimator and establish its minimax rate optimality. Section 3 focuses on quantifying the uncertainty of the CHIVE estimator and construction of confidence intervals for $\beta^\top\Sigma\beta$. We apply in Section 4 the developed procedures to tackle three important problems: signal detection and global testing, prediction accuracy evaluation, and confidence ball construction. Simulation results are given in Section 5 and an analysis of a yeast data set is presented in Section 6. A discussion is provided in Section 7 and the proofs are given in Section 8. Additional proofs and simulation results are presented in the appendix.

2 Optimal Estimation of $\beta^\top\Sigma\beta$

In this section, we first introduce the calibration methodology for estimating the explained variance and then establish the minimax convergence rate for estimating $\beta^\top\Sigma\beta$ in the general semi-supervised framework. The results demonstrate the effect of the unlabelled data on the optimal convergence rate. The supervised setting and the setting with known design covariance matrix are then discussed as special cases.

We begin with the notation that will be used in the rest of the paper. We use $Z = (X, y)$ to denote the data set. For a matrix $A$, $A_{i\cdot}$, $A_{\cdot j}$, and $A_{i,j}$ denote respectively the $i$-th row, $j$-th column, and $(i, j)$ entry of the matrix $A$. The spectral norm of $A$ is $\|A\|_2 = \sup_{\|x\|_2 = 1}\|Ax\|_2$ and the matrix $\ell_1$ norm is $\|A\|_{L_1} = \sup_{1\le j\le p}\sum_{i=1}^{p}|A_{ij}|$. For a symmetric matrix $A$, $\lambda_{\min}(A)$ and $\lambda_{\max}(A)$ denote respectively the smallest and largest eigenvalue of $A$. For a set $S$, $|S|$ denotes the cardinality of $S$. For a vector $x \in \mathbb{R}^p$, $x_{-j}$ denotes the vector without the $j$-th index, ${\rm supp}(x)$ denotes the support of $x$ and the $\ell_q$ norm of $x$ is defined as $\|x\|_q = \big(\sum_{i=1}^{p}|x_i|^q\big)^{1/q}$ for $q \ge 0$, with $\|x\|_0 = |{\rm supp}(x)|$ and $\|x\|_\infty = \max_{1\le j\le p}|x_j|$. For $a \in \mathbb{R}$, $a_+ = \max\{a, 0\}$; for $a, b \in \mathbb{R}$, $a \vee b = \max\{a, b\}$. We use $c$ and $C$ to denote generic positive constants that may vary from place to place. For a sequence of random variables $X_n$ indexed by $n$, we use $X_n \overset{p}{\to} X$ to represent that $X_n$ converges to $X$ in probability. For a sequence of random variables $X_n$ and numbers $a_n$, we define $X_n = o_p(a_n)$ if $X_n/a_n$ converges to zero in probability. For two positive sequences $a_n$ and $b_n$, $a_n \lesssim b_n$ means $a_n \le Cb_n$ for all $n$, $a_n \gtrsim b_n$ if $b_n \lesssim a_n$, and $a_n \asymp b_n$ if $a_n \lesssim b_n$ and $b_n \lesssim a_n$; $a_n \ll b_n$ if $\lim_{n\to\infty} a_n/b_n = 0$, and $a_n \gg b_n$ if $b_n \ll a_n$.
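2.1 Calibration of Plug-in Estimators

In semi-supervised learning, the data is a mix of the labelled data $(X_1, y_1), \ldots, (X_n, y_n)$ and the unlabelled data $X_{n+1}, \ldots, X_{n+N}$, where $X_1, \ldots, X_n, X_{n+1}, \ldots, X_{n+N}$ are i.i.d. realizations of $p$-dimensional covariates. We use $\hat\beta$ and $\hat\Sigma$ to denote certain reasonably good estimators of $\beta$ and $\Sigma$, which will be specified later. Based on $\hat\beta$ and $\hat\Sigma$, a natural estimator of the quadratic functional $Q = \beta^\top\Sigma\beta$ is the plug-in estimator $\hat\beta^\top\hat\Sigma\hat\beta$, which has the following error decomposition,

$$\hat\beta^\top\hat\Sigma\hat\beta - \beta^\top\Sigma\beta = 2\hat\beta^\top\hat\Sigma(\hat\beta - \beta) - (\hat\beta - \beta)^\top\hat\Sigma(\hat\beta - \beta) + \beta^\top(\hat\Sigma - \Sigma)\beta. \qquad (6)$$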
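The error decomposition of the plug-in estimator is a purely algebraic identity, so it can be checked numerically. The sketch below is illustrative only: the function name `plug_in_error_terms` and the synthetic inputs are assumptions introduced here, not part of the paper.

```python
import numpy as np

def plug_in_error_terms(beta_hat, beta, sigma_hat, sigma):
    """Return the plug-in error and the three terms of its decomposition.

    Hypothetical helper for illustration; sigma_hat must be symmetric.
    """
    d = beta_hat - beta
    # Left-hand side: plug-in estimate minus the true explained variance.
    lhs = beta_hat @ sigma_hat @ beta_hat - beta @ sigma @ beta
    first = 2.0 * (beta_hat @ sigma_hat @ d)      # term targeted by the calibration step
    second = -(d @ sigma_hat @ d)                 # second-order term in beta_hat - beta
    third = beta @ (sigma_hat - sigma) @ beta     # error from estimating Sigma
    return lhs, (first, second, third)

# Synthetic inputs (assumed values, chosen only to exercise the identity).
rng = np.random.default_rng(0)
p = 6
beta = rng.normal(size=p)
beta_hat = beta + 0.1 * rng.normal(size=p)
A = rng.normal(size=(p, p))
sigma = A @ A.T / p + np.eye(p)        # a positive definite "true" covariance
B = rng.normal(size=(40, p))
sigma_hat = B.T @ B / 40               # a symmetric covariance estimate
lhs, terms = plug_in_error_terms(beta_hat, beta, sigma_hat, sigma)
```

Comparing `lhs` with `sum(terms)` confirms that the three terms add up to the plug-in error for any symmetric `sigma_hat`.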

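Based on the above decomposition, the estimator $\hat\beta^\top\hat\Sigma\hat\beta$ can be further improved since the estimation error due to the first term $2\hat\beta^\top\hat\Sigma(\hat\beta - \beta)$ on the right-hand side of (6) can be further reduced. We estimate the term $2\hat\beta^\top\hat\Sigma(\hat\beta - \beta)$ in the error decomposition (6) by $-2\hat\beta^\top\frac{1}{n}\sum_{i=1}^{n}X_i(y_i - X_i^\top\hat\beta)$ and, subtracting this estimate from the plug-in estimator, propose the following calibrated estimator,

$$\hat Q(\hat\beta, \hat\Sigma, Z) = \hat\beta^\top\hat\Sigma\hat\beta + 2\hat\beta^\top\frac{1}{n}\sum_{i=1}^{n}X_i(y_i - X_i^\top\hat\beta). \qquad (7)$$

This estimator is referred to as the CHIVE estimator, as a shorthand for Calibrated High-dimensional Inference for Variance Explained. The calibration step in (7) essentially improves the plug-in estimator $\hat\beta^\top\hat\Sigma\hat\beta$ by re-balancing the bias and variance. The calibrated estimator requires three inputs: the initial estimators $\hat\beta$ and $\hat\Sigma$ and the data $Z = (X, y)$. With this machinery, it remains to propose initial estimators for $\beta$ and $\Sigma$. We begin with estimators for $\beta$ and then move on to the estimators for $\Sigma$. Throughout the paper, unless stated otherwise, we make the following assumptions on the estimators $\hat\beta$ and $\hat\sigma$.

(B1) With probability larger than $1 - \gamma(n)$ where $\gamma(n) \to 0$, the estimator $\hat\beta$ satisfies
$$\max\Big\{\tfrac{1}{\sqrt{n}}\|X(\hat\beta - \beta)\|_2,\ \|\hat\beta - \beta\|_2\Big\} \lesssim \sqrt{\frac{k\log p}{n}}, \qquad \|\hat\beta - \beta\|_1 \lesssim k\sqrt{\frac{\log p}{n}}.$$

(B2) $\hat\sigma^2$ is a consistent estimator of $\sigma^2$, that is, $|\hat\sigma^2 - \sigma^2| \overset{p}{\to} 0$.

Examples of estimators satisfying (B1) and (B2). The scaled Lasso estimator $\{\hat\beta, \hat\sigma\}$ defined in the following equation (8) has been shown in [33] to satisfy (B1) and (B2) under regularity conditions,

$$\{\hat\beta, \hat\sigma\} = \arg\min_{\beta\in\mathbb{R}^p,\,\sigma\in\mathbb{R}^+}\ \frac{\|y - X\beta\|_2^2}{2n\sigma} + \frac{\sigma}{2} + \sqrt{\frac{2.01\log p}{n}}\sum_{j=1}^{p}\frac{\|X_{\cdot j}\|_2}{\sqrt{n}}|\beta_j|. \qquad (8)$$

See also Lemma in [] for more details. Since the square-root Lasso estimator [4] is numerically the same as the scaled Lasso estimator, the square-root Lasso estimators of $\beta$ and $\sigma$ also satisfy (B1) and (B2). In addition, with prior knowledge of $\sigma$, the Lasso estimator of $\beta$ and other variants are also shown to satisfy the above condition (B1); see [4, 4, 39] for more details.

Now we turn to the estimators of $\Sigma$. The additional unlabelled data is useful for estimating the design covariance matrix $\Sigma$. We pool the information contained in both the labelled and unlabelled data and estimate $\Sigma$ by $\hat\Sigma^S = \frac{1}{n+N}\sum_{i=1}^{n+N}X_iX_i^\top$.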
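Then we use $\hat\beta$ and $\hat\Sigma^S$ as inputs and utilize the calibration idea introduced in (7),

$$\hat Q(\hat\beta, \hat\Sigma^S, Z) = \hat\beta^\top\hat\Sigma^S\hat\beta + 2\hat\beta^\top\frac{1}{n}\sum_{i=1}^{n}X_i(y_i - X_i^\top\hat\beta), \quad \text{where } \hat\Sigma^S = \frac{1}{n+N}\sum_{i=1}^{n+N}X_iX_i^\top. \qquad (9)$$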
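The calibrated estimator with the pooled sample covariance can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `chive_estimate` and its argument names are hypothetical, and the initial estimator `beta_hat` (for example, a scaled Lasso fit) is assumed to be computed elsewhere and passed in.

```python
import numpy as np

def chive_estimate(beta_hat, X_lab, y, X_unlab=None):
    """Calibrated estimate of beta^T Sigma beta (hypothetical helper).

    beta_hat : initial estimate of beta, assumed supplied by the caller
    X_lab, y : the n labelled covariate rows and their outcomes
    X_unlab  : optional N unlabelled covariate rows (None means N = 0)
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    X_lab = np.asarray(X_lab, dtype=float)
    y = np.asarray(y, dtype=float)
    if X_unlab is None or len(X_unlab) == 0:
        X_all = X_lab                                   # supervised case
    else:
        X_all = np.vstack([X_lab, np.asarray(X_unlab, dtype=float)])
    # Plug-in term with the pooled covariance: mean of (X_i^T beta_hat)^2
    # over all n + N covariate rows.
    plug_in = np.mean((X_all @ beta_hat) ** 2)
    # Calibration term: (2/n) * sum_i (X_i^T beta_hat) * (y_i - X_i^T beta_hat),
    # computed on the labelled data only.
    fitted = X_lab @ beta_hat
    calibration = 2.0 * np.mean(fitted * (y - fitted))
    return plug_in + calibration
```

The quadratic form is computed from the fitted values, so the $p \times p$ covariance matrix is never formed explicitly, which matters when $p$ is large. With `X_unlab=None` the function reduces to the supervised version, where the result equals `np.mean(y**2) - np.mean((y - X_lab @ beta_hat)**2)` for any `beta_hat`.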

When there is no confusion, we use $\hat Q$ to denote the estimator proposed in (9). We first introduce the following regularity conditions and then establish the convergence rate of the proposed estimator (9) in Theorem 1.

(A1) The rows $X_i$ are i.i.d. $p$-dimensional sub-Gaussian random vectors with mean 0 and covariance matrix $\Sigma$ with $1/M_1 \le \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) \le M_1$ for $M_1 \ge 1$; the errors $\{\epsilon_i\}_{1\le i\le n}$ are independent of $\{X_i\}_{1\le i\le n+N}$ and are i.i.d. sub-Gaussian random variables with mean zero and variance $\sigma^2$; the high-dimensional vector $\beta$ is assumed to be of sparsity $k$.

(A2) There exists some positive constant $c_0 > 0$ such that $E\Big(\dfrac{\beta^\top X_iX_i^\top\beta}{\beta^\top\Sigma\beta} - 1\Big)^2 > c_0$.

Assumption (A1) requires that the spectrum of the covariance matrix $\Sigma$ is bounded away from zero and infinity and that the noise level $\sigma$ is upper bounded by a constant. Assumption (A1) also assumes that both the design and the noise are sub-Gaussian. Define $U = X_i^\top\beta/\sqrt{\beta^\top\Sigma\beta}$, where $EU = 0$ and $EU^2 = 1$. Assumption (A2) is placed on this random variable $U$ such that ${\rm Var}(U^2)$ is not vanishing. This assumption is imposed such that ${\rm Var}(U^2)$ can be well estimated; this type of assumption has been introduced in the covariance matrix estimation literature [0] for the same purpose.

Theorem 1. Suppose that Condition (A1) holds and $k \le cn/\log p$ for some constant $c > 0$. For any estimator $\hat\beta$ satisfying Condition (B1), with probability larger than $1 - p^{-c} - \exp(-cn) - N^{-c} - t^{-2} - \gamma(n)$, the estimator $\hat Q = \hat Q(\hat\beta, \hat\Sigma^S, Z)$ defined in (9) satisfies

$$|\hat Q - Q| \lesssim t\Big(\frac{\|\beta\|_2}{\sqrt{n}} + \frac{\|\beta\|_2^2}{\sqrt{N+n}}\Big) + (1 + \|\beta\|_2)\frac{k\log p}{n}. \qquad (10)$$

Under the additional assumption $k \ll \sqrt{n}/\log p$ and $\|\beta\|_2 \gg k\log p/\sqrt{n}$,

$$\frac{\sqrt{n}\,(\hat Q - Q)}{\sqrt{4\sigma^2\beta^\top\Sigma\beta + \rho E(\beta^\top X_iX_i^\top\beta - \beta^\top\Sigma\beta)^2}} \overset{d}{\to} N(0, 1), \quad \text{where } \rho = \lim\frac{n}{N+n}. \qquad (11)$$

Remark 1. Since $Q \ge 0$, the convergence rate (10) also holds for $\hat Q_+$, the positive part of $\hat Q$. To keep the notation simpler, we only present the results for $\hat Q$ throughout this paper. The convergence rate established in (10) is shown to be optimal in Section 2.2. Under the additional assumptions $k \ll \sqrt{n}/\log p$ and $\|\beta\|_2 \gg k\log p/\sqrt{n}$, we establish a more refined distributional result in (11). This normal limiting distribution is used in Section 3 to construct confidence intervals for $\beta^\top\Sigma\beta$.
One interesting phenomenon is that the limiting distribution established in (11) depends on the proportion of the labelled data. If the amount of unlabelled data dominates that of labelled data (that is, $\rho = 0$), then the limiting distribution in (11) simplifies to

$$\frac{\sqrt{n}\,(\hat Q - Q)}{\sqrt{4\sigma^2\beta^\top\Sigma\beta}} \overset{d}{\to} N(0, 1).$$

2.2 Optimal Rate of Convergence

In this section, we further investigate the optimality of the proposed estimator (9) by studying the minimax convergence rate of estimating $\beta^\top\Sigma\beta$ in the semi-supervised setting and consider the following specific parameter space,

$$\Theta(k, M) = \Big\{\theta = (\beta, \Sigma, \sigma) : \|\beta\|_0 \le k,\ M/2 \le \|\beta\|_2 \le M,\ \tfrac{1}{M_1} \le \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) \le M_1,\ \sigma \le M_2\Big\}, \qquad (12)$$

where $M_1 \ge 1$ and $M_2 > 0$ are positive constants. The parameter space defined in (12) requires the sparsity $\|\beta\|_0 \le k$ and $M/2 \le \|\beta\|_2 \le M$, where $k$ and $M$ are allowed to grow with $n$ and $p$. Here $k$ quantifies the sparsity of $\beta$ and $M$ quantifies the signal strength of the true signal $\beta$ in terms of its $\ell_2$ norm. The other conditions $1/M_1 \le \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) \le M_1$ and $\sigma \le M_2$ are regularity conditions. The following theorem establishes the minimax lower bound for the convergence rate of estimating $Q$ over the parameter space $\Theta(k, M)$.

Theorem 2. Suppose $k \le c\min\{n/\log p,\ p^{\nu}\}$ for some constants $c > 0$ and $0 \le \nu < 1/2$. Then

$$\inf_{\hat Q}\sup_{\theta\in\Theta(k,M)} P\Big(|\hat Q - Q| \ge c\min\Big\{\frac{M}{\sqrt{n}} + \frac{M^2}{\sqrt{N+n}} + \frac{k\log p}{n},\ M^2\Big\}\Big) \ge \frac{1}{4}. \qquad (13)$$

In the above theorem, only the term $M^2/\sqrt{N+n}$ in the lower bound involves the amount of the additional unlabelled data; that is to say, a larger amount of unlabelled data only helps lower the term $M^2/\sqrt{N+n}$ but not any other terms. Theorems 1 and 2 together show that the estimator proposed in Section 2.1 is minimax rate optimal under regularity conditions.

Corollary 1. Suppose that Condition (A1) holds and $k \le c\min\{n/\log p,\ p^{\nu}\}$ for some constants $c > 0$ and $0 \le \nu < 1/2$. For any estimator $\hat\beta$ satisfying Condition (B1), the estimator $\hat Q$ defined in (9) is minimax rate optimal over $\Theta(k, M)$ where $\sqrt{k\log p/n} \lesssim M \le C$ for some constant $C > 0$.

The above corollary shows that the proposed method attains the optimal convergence rate when the $\ell_2$ norm is relatively strong, that is, $\|\beta\|_2$ is bounded away from zero by $\sqrt{k\log p/n}$. As shown in Theorem 2, for the case where $M \lesssim \sqrt{k\log p/n}$, the lower bound of estimating $\beta^\top\Sigma\beta$ is $M^2$. This optimal convergence rate can be achieved by the trivial estimator 0 and hence the corresponding regime $M \lesssim \sqrt{k\log p/n}$ is not interesting in terms of studying optimal estimators. In Corollary 1, the lower bound (13) is only matched for the regime where $M \le C$ for some constant $C > 0$.
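To see how the unlabelled sample size enters the minimax lower bound, the rate expression can be evaluated directly. The sketch below is only an illustration: the function name `minimax_rate` is hypothetical, and the unspecified constant in front of the rate is set to 1, which is an assumption made here for convenience.

```python
import math

def minimax_rate(M, n, N, k, p):
    """Rate expression from the lower bound, with the constant set to 1.

    M : bound on the l2 norm of beta      n : labelled sample size
    N : unlabelled sample size            k : sparsity, p : dimension
    """
    full = M / math.sqrt(n) + M ** 2 / math.sqrt(N + n) + k * math.log(p) / n
    # For very weak signals, M^2 is the smaller quantity and is attained
    # by the trivial estimator 0.
    return min(full, M ** 2)
```

Increasing `N` with the other arguments fixed only shrinks the middle term `M ** 2 / math.sqrt(N + n)`, while the terms `M / math.sqrt(n)` and `k * math.log(p) / n` are unchanged, in line with the discussion of Theorem 2.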
For theoretical interest, we modify the proposed estimator $\hat Q$ defined in (9) such that the modified version achieves the lower bound (13) over the whole interesting regime $M \gtrsim \sqrt{k\log p/n}$. We randomly split the data $(y, X)$

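into two subsamples $Z^{(1)} = (y^{(1)}, X^{(1)})$ with sample size $n_1$ and $Z^{(2)} = (y^{(2)}, X^{(2)})$ with sample size $n_2$, where $n_1 \asymp n_2$. Let $\hat\beta$ denote an estimator which is produced by the first subsample $(y^{(1)}, X^{(1)})$ and satisfies Condition (B1). One example of such an estimator is the scaled Lasso estimator (8) applied to the subsample $Z^{(1)} = (y^{(1)}, X^{(1)})$. We propose the following estimator of $Q$,

$$\tilde Q(\hat\beta, \tilde\Sigma, Z^{(2)}) = \hat\beta^\top\tilde\Sigma\hat\beta + 2\hat\beta^\top\frac{1}{n_2}\sum_{i=n_1+1}^{n}X_i(y_i - X_i^\top\hat\beta), \quad \text{where } \tilde\Sigma = \frac{1}{n_2+N}\sum_{i=n_1+1}^{n+N}X_iX_i^\top. \qquad (14)$$

The following theorem establishes the convergence rate of $\tilde Q(\hat\beta, \tilde\Sigma, Z^{(2)})$ and shows that this estimator achieves the optimal convergence rate of estimating $Q$ for $M \gtrsim \sqrt{k\log p/n}$.

Theorem 3. Suppose that Condition (A1) holds and $k \le cn/\log p$ for some constant $c > 0$. Let $\hat\beta$ be an estimator depending on the first half sample $(y^{(1)}, X^{(1)})$ and satisfying Condition (B1). Then with probability larger than $1 - p^{-c} - \exp(-cn) - N^{-c} - t^{-2} - \gamma(n)$,

$$|\tilde Q(\hat\beta, \tilde\Sigma, Z^{(2)}) - Q| \lesssim \frac{t}{\sqrt{n}}\Big(\|\beta\|_2 + \sqrt{\frac{k\log p}{n}}\Big) + t\frac{\|\beta\|_2^2}{\sqrt{N+n}} + \frac{k\log p}{n}. \qquad (15)$$

Hence, the estimator $\tilde Q(\hat\beta, \tilde\Sigma, Z^{(2)})$ defined in (14) achieves the optimal estimation rate

$$\frac{M}{\sqrt{n}} + \frac{M^2}{\sqrt{N+n}} + \frac{k\log p}{n} \qquad (16)$$

over $\Theta(k, M)$ in the regime $k \le c\min\{n/\log p,\ p^{\nu}\}$ for some constants $c > 0$ and $0 \le \nu < 1/2$ and $M \gtrsim \sqrt{k\log p/n}$.

2.3 Two Special Cases

We now turn to inference in the supervised setting and the setting with known design covariance matrix. These two settings can be viewed as the special cases with $N = 0$ and $N = \infty$, respectively.

2.3.1 Case I: Supervised Inference

In the supervised setting, we only observe the labelled data and estimate $\Sigma$ by the sample covariance matrix $\hat\Sigma^L = \frac{1}{n}\sum_{i=1}^{n}X_iX_i^\top$. The following theorem establishes the convergence rate of the estimator $\hat Q = \hat Q(\hat\beta, \hat\Sigma^L, Z)$.

Theorem 4. Suppose that Condition (A1) holds and $k \le cn/\log p$ for some constant $c > 0$. For any estimator $\hat\beta$ satisfying (B1), with probability larger than $1 - p^{-c} - \exp(-cn) - t^{-2} - \gamma(n)$,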

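$\hat Q(\hat\beta, \hat\Sigma^L, Z)$ proposed in (7) with $\hat\Sigma^L = \frac{1}{n}\sum_{i=1}^{n}X_iX_i^\top$ satisfies

$$|\hat Q(\hat\beta, \hat\Sigma^L, Z) - Q| \lesssim t\Big(\frac{\|\beta\|_2}{\sqrt{n}} + \frac{\|\beta\|_2^2}{\sqrt{n}}\Big) + \frac{k\log p}{n}. \qquad (17)$$

Under the additional assumption (A2) and $\|\beta\|_2 \gg \min\big\{k\log p/\sqrt{n},\ (k\log p/\sqrt{n})^{1/2}\big\}$, then

$$\frac{\sqrt{n}\,\big(\hat Q(\hat\beta, \hat\Sigma^L, Z) - Q\big)}{\sqrt{4\sigma^2\beta^\top\Sigma\beta + E(\beta^\top X_iX_i^\top\beta - \beta^\top\Sigma\beta)^2}} \overset{d}{\to} N(0, 1). \qquad (18)$$

The estimator $\hat Q(\hat\beta, \hat\Sigma^L, Z)$ is a special case of the estimator (9) with $N = 0$. Comparing Theorem 4 with Theorem 1, if $\|\beta\|_2 \ge C$ for some positive constant $C$, then the unlabelled data leads to a faster convergence rate by reducing the term $\|\beta\|_2^2/\sqrt{n}$ in (17) to $\|\beta\|_2^2/\sqrt{N+n}$ in (10); however, the unlabelled data does not affect the other terms in the convergence rate. The effect of the unlabelled data is also revealed in the limiting distribution of the proposed estimator, where a comparison of (18) and (11) shows that the exact variance level is reduced from $4\sigma^2\beta^\top\Sigma\beta + E(\beta^\top X_iX_i^\top\beta - \beta^\top\Sigma\beta)^2$ to $4\sigma^2\beta^\top\Sigma\beta + \rho E(\beta^\top X_iX_i^\top\beta - \beta^\top\Sigma\beta)^2$, where $\rho$ denotes the limiting proportion of the amount of labelled data out of the total amount of both labelled and unlabelled data. The following corollary further establishes the minimax rate for estimating $\beta^\top\Sigma\beta$ in the supervised setting.

Corollary 2. Suppose that Condition (A1) holds and $k \le c\min\{n/\log p,\ p^{\nu}\}$ for some constants $c > 0$ and $0 \le \nu < 1/2$. For any estimator $\hat\beta$ satisfying Condition (B1), the estimator $\hat Q = \hat Q(\hat\beta, \hat\Sigma^L, Z)$ defined in (7) with $\hat\Sigma^L = \frac{1}{n}\sum_{i=1}^{n}X_iX_i^\top$ achieves the following optimal estimation rate over $\Theta(k, M)$ for $M \gtrsim \sqrt{k\log p/n}$,

$$\frac{M}{\sqrt{n}} + \frac{M^2}{\sqrt{n}} + \frac{k\log p}{n}. \qquad (19)$$

Remark 2. A related paper [] studies estimation of $\|\beta\|_2^2$ and shows that the optimal rate of estimating $\|\beta\|_2^2$ over $\Theta(k, M)$ for $M \gtrsim \sqrt{k\log p/n}$ is $M/\sqrt{n} + (M + 1)k\log p/n$ in the supervised setting. In contrast to (19), we can see that neither of these two problems is easier than the other: there is an additional term $M^2/\sqrt{n}$ in (19) and an additional term $Mk\log p/n$ in the optimal convergence rate of estimating $\|\beta\|_2^2$.

Remark 3. Inference for $\beta^\top\Sigma\beta$ is closely connected to [33, 38], where [33] studied the inference problem for $\sigma^2$ and [38] studied the estimation of $\beta^\top\Sigma\beta/\sigma^2$. In particular, [33] proposed the scaled Lasso estimator $\hat\sigma$ in (8) to estimate $\sigma$ and [38] proposed to estimate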

$\beta^\top\Sigma\beta$ by $\big(\frac{1}{n}\|y\|_2^2 - \hat\sigma^2\big)_+$ as an intermediate step of estimating $\beta^\top\Sigma\beta/\sigma^2$. For the estimator $\hat Q(\hat\beta, \hat\Sigma^L, Z)$ defined in (7), if $\hat\beta$ is taken as the scaled Lasso estimator, then $\hat Q(\hat\beta, \hat\Sigma^L, Z)$ reduces to the estimator proposed in [38], where the equivalence is shown by the following expression,

$$\hat\beta^\top\hat\Sigma^L\hat\beta + 2\hat\beta^\top\frac{1}{n}\sum_{i=1}^{n}X_i(y_i - X_i^\top\hat\beta) = \frac{1}{n}\|y\|_2^2 - \frac{1}{n}\|y - X\hat\beta\|_2^2 = \frac{1}{n}\|y\|_2^2 - \hat\sigma^2. \qquad (20)$$

We stress that the calibration idea in (7) provides a completely new perspective on estimation of $\beta^\top\Sigma\beta$: instead of using the expression $Q = Ey_i^2 - \sigma^2$ and estimating $\sigma^2$ first, we estimate $Q$ directly by calibrating the plug-in estimator. This new perspective establishes a general machinery taking reasonably good initial estimators of $\beta$ and $\Sigma$ as inputs. As shown in (9), the flexibility of the calibrated estimator has proven to be extremely useful in efficiently pooling additional information on $\Sigma$. Note that the estimation method introduced in [38] cannot be extended to the semi-supervised setting in the way that the calibration perspective is extended in (9). Additionally, [38] focused on the estimation problem instead of the confidence interval construction and hypothesis testing problems. In terms of technical details on estimation optimality, the results in [38] allowed for a more general regime $k \le p$ than Corollary 2 but only considered the optimality in the supervised setting and considered a fixed $M$ in the analysis.

2.4 Case II: Known $\Sigma$

The general semi-supervised results also shed light on another interesting setting where the design covariance $\Sigma$ is known. In the semi-supervised setting, the additional unlabelled data is used for estimating the design covariance matrix $\Sigma$. The case of known $\Sigma$ is an extreme case of the semi-supervised setting with $N$ taken as infinity. The estimator (14) can be modified such that the information on $\Sigma$ is incorporated,

$$\tilde Q(\hat\beta, \Sigma, Z^{(2)}) = \hat\beta^\top\Sigma\hat\beta + 2\hat\beta^\top\frac{1}{n_2}\sum_{i=n_1+1}^{n}X_i(y_i - X_i^\top\hat\beta). \qquad (21)$$

Similarly, the estimator proposed in (9) can be modified as

$$\hat Q(\hat\beta, \Sigma, Z) = \hat\beta^\top\Sigma\hat\beta + 2\hat\beta^\top\frac{1}{n}\sum_{i=1}^{n}X_i(y_i - X_i^\top\hat\beta). \qquad (22)$$

Corollary 3. Suppose that Condition (A1) holds and $k \le cn/\log p$ for some constant $c > 0$.
1. For any estimator $\hat\beta$ depending on the first half sample $(y^{(1)}, X^{(1)})$ and satisfying Condition (B1), with probability larger than $1 - p^{-c} - \exp(-cn) - t^{-2} - \gamma(n)$, the estimator

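defined in (21) satisfies

$$|\tilde Q(\hat\beta, \Sigma, Z^{(2)}) - Q| \lesssim \frac{t}{\sqrt{n}}\Big(\|\beta\|_2 + \sqrt{\frac{k\log p}{n}}\Big) + \frac{k\log p}{n}. \qquad (23)$$

2. For any estimator $\hat\beta$ satisfying Condition (B1), with probability larger than $1 - p^{-c} - \exp(-cn) - t^{-2} - \gamma(n)$, the estimator defined in (22) satisfies

$$|\hat Q(\hat\beta, \Sigma, Z) - Q| \lesssim t\frac{\|\beta\|_2}{\sqrt{n}} + (\|\beta\|_2 + 1)\frac{k\log p}{n}. \qquad (24)$$

Comparing (23) with (15) and (24) with (10), the uncertainty of estimating the design covariance matrix leads to the additional term $\|\beta\|_2^2/\sqrt{N+n}$. By applying Theorem 2, we can show that the convergence rate in (23) achieves the optimal convergence rate $M/\sqrt{n} + k\log p/n$. The term $M^2/\sqrt{N+n}$ disappears because the design covariance matrix $\Sigma$ is known.

3 Confidence Intervals for $\beta^\top\Sigma\beta$

In this section, we consider the problem of constructing confidence intervals for $\beta^\top\Sigma\beta$, which involves uncertainty quantification of the CHIVE estimator proposed in Section 2. We first construct confidence intervals for $\beta^\top\Sigma\beta$ in Section 3.1 and introduce a randomized calibration procedure in Section 3.2 to study inference for explained variance in the presence of weak signals.

3.1 Confidence Interval Construction

We start with uncertainty quantification of the CHIVE estimator $\hat Q$ proposed in (9). By the limiting distribution established in (11), the main next step is to consistently estimate the standard error $\big(4\sigma^2\beta^\top\Sigma\beta + \rho E(\beta^\top X_iX_i^\top\beta - \beta^\top\Sigma\beta)^2\big)^{1/2}/\sqrt{n}$. Specifically, we estimate $4\sigma^2\beta^\top\Sigma\beta$ by $4\hat\varphi_1$, $\rho$ by $\hat\rho = n/(N+n)$ and $E(\beta^\top X_iX_i^\top\beta - \beta^\top\Sigma\beta)^2$ by $\hat\varphi_2$, where

$$\hat\varphi_1 = \hat\sigma^2\hat\beta^\top\hat\Sigma^S\hat\beta \quad \text{and} \quad \hat\varphi_2 = \frac{1}{n+N}\sum_{i=1}^{n+N}\big(\hat\beta^\top X_iX_i^\top\hat\beta - \hat\beta^\top\hat\Sigma^S\hat\beta\big)^2,$$

with $\hat\Sigma^S$ defined in (9). Then we propose the following confidence interval centered at $\hat Q$,

$$CI(Z) = \Big(\big(\hat Q - z_{\alpha/2}\hat\varphi\big)_+,\ \hat Q + z_{\alpha/2}\hat\varphi\Big), \quad \text{where } \hat\varphi = \sqrt{\frac{4\hat\varphi_1 + \hat\rho\hat\varphi_2}{n}}, \qquad (25)$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution. The following theorem establishes the coverage and precision properties of $CI(Z)$, where the length of any interval $CI(Z) = (L(Z), U(Z))$ is defined as $L(CI(Z)) = U(Z) - L(Z)$.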
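The construction of the interval in (25) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: the function name `chive_confidence_interval` is hypothetical, and both `beta_hat` and `sigma_hat` (the noise level estimate, for instance from the scaled Lasso) are assumed to be computed beforehand and passed in.

```python
import numpy as np
from statistics import NormalDist

def chive_confidence_interval(beta_hat, sigma_hat, X_lab, y, X_unlab=None, alpha=0.05):
    """Return (lower, upper, q_hat, se) for beta^T Sigma beta (hypothetical helper)."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    X_lab = np.asarray(X_lab, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X_lab.shape[0]
    if X_unlab is None or len(X_unlab) == 0:
        X_all = X_lab
    else:
        X_all = np.vstack([X_lab, np.asarray(X_unlab, dtype=float)])
    total = X_all.shape[0]                      # n + N
    fit_all = X_all @ beta_hat                  # X_i^T beta_hat over all rows
    fit_lab = X_lab @ beta_hat
    plug_in = np.mean(fit_all ** 2)             # beta_hat^T Sigma_S beta_hat
    q_hat = plug_in + 2.0 * np.mean(fit_lab * (y - fit_lab))
    phi1 = sigma_hat ** 2 * plug_in             # estimates sigma^2 beta^T Sigma beta
    phi2 = np.mean((fit_all ** 2 - plug_in) ** 2)
    rho_hat = n / total                         # proportion of labelled data
    se = np.sqrt((4.0 * phi1 + rho_hat * phi2) / n)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0) # upper alpha/2 normal quantile
    # The lower endpoint is truncated at zero because Q is non-negative.
    return max(q_hat - z * se, 0.0), q_hat + z * se, q_hat, se
```

The standard normal quantile comes from the standard library's `statistics.NormalDist`, so only NumPy is needed. Passing more unlabelled rows lowers `rho_hat` and therefore the weight on `phi2` in the standard error.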

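Theorem 5. Suppose that Conditions (A1) and (A2) hold, $k \ll \min\{n/(\log(N+n)\log p),\ \sqrt{n}/\log p\}$ and $\|\beta\|_2 \gg k\log p/\sqrt{n}$. For $\hat\beta$ and $\hat\sigma$ satisfying Conditions (B1) and (B2), respectively, the confidence interval given in (25) satisfies the following coverage and precision properties,

$$\liminf_{n\to\infty} P\big(\beta^\top\Sigma\beta \in CI(Z)\big) \ge 1 - \alpha \qquad (26)$$

and

$$\lim_{n\to\infty} P\Big(L(CI(Z)) \ge (1 + \delta_0)\,2z_{\alpha/2}\sqrt{\frac{4\sigma^2\beta^\top\Sigma\beta}{n} + \frac{E(\beta^\top X_iX_i^\top\beta - \beta^\top\Sigma\beta)^2}{N+n}}\Big) = 0 \qquad (27)$$

for any positive constant $\delta_0 > 0$.

The effect of the additional data on the length of the confidence interval is demonstrated in (27), where the confidence interval gets shorter with a larger amount of unlabelled data. Furthermore, the length $\sqrt{4\sigma^2\beta^\top\Sigma\beta/n + E(\beta^\top X_iX_i^\top\beta - \beta^\top\Sigma\beta)^2/(N+n)}$ is upper bounded by $\|\beta\|_2/\sqrt{n} + \|\beta\|_2^2/\sqrt{N+n}$ up to a constant, which matches the optimal convergence rate of estimation $M/\sqrt{n} + M^2/\sqrt{N+n}$ over the parameter space $\Theta(k, M)$ for $k \ll \sqrt{n}/\log p$ and $M \gg k\log p/\sqrt{n}$. As shown in Theorem 5, the validity of the proposed confidence interval (25) requires the condition that $\|\beta\|_2$ is bounded away from zero by $k\log p/\sqrt{n}$. Although $k\log p/\sqrt{n}$ converges to zero over the extreme sparse regime $k \ll \sqrt{n}/\log p$, it reveals the difficulty of constructing stable confidence intervals for $\beta^\top\Sigma\beta$ when $\beta$ is in a local neighborhood of zero. The next section addresses the inference problem in the presence of such weak signals.

3.2 Inference for Weak Signals: Randomized Calibration

As discussed in the introduction, uncertainty quantification of $Q = \beta^\top\Sigma\beta$ is closely connected to other important statistical problems, including (1) signal detection and global testing; (2) prediction accuracy evaluation and (3) confidence ball construction. These applications provide a strong motivation for inference for the explained variance under the settings of weak signals (that is, $\|\beta\|_2 \lesssim k\log p/\sqrt{n}$). In the following, we focus on the inference problem in the presence of weak signals and introduce a randomized version of the CHIVE estimator (9). We first generate random variables $u_i \overset{\rm iid}{\sim} N(0, \tau_0^2)$ for $1 \le i \le n$, which are independent of the observed data $Z$. Similar to (9), we propose the following randomized calibrated estimator,

$$\hat Q^R = \hat Q^R(\hat\beta, \hat\Sigma^S, Z, u) = \hat\beta^\top\hat\Sigma^S\hat\beta + \frac{2}{n}\sum_{i=1}^{n}\big(X_i^\top\hat\beta + u_i\big)\big(y_i - X_i^\top\hat\beta\big). \qquad (28)$$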
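The randomized calibration step in (28) changes only the calibration term, as the following sketch shows. It is an illustration under assumptions rather than the authors' implementation: the function name `randomized_chive` and its arguments are hypothetical, `beta_hat` is assumed to be supplied by the caller, and the random generator is passed in so that results are reproducible.

```python
import numpy as np

def randomized_chive(beta_hat, X_lab, y, X_unlab=None, tau0=1.0, rng=None):
    """Randomized calibrated estimate of beta^T Sigma beta (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    beta_hat = np.asarray(beta_hat, dtype=float)
    X_lab = np.asarray(X_lab, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X_lab.shape[0]
    if X_unlab is None or len(X_unlab) == 0:
        X_all = X_lab
    else:
        X_all = np.vstack([X_lab, np.asarray(X_unlab, dtype=float)])
    plug_in = np.mean((X_all @ beta_hat) ** 2)   # beta_hat^T Sigma_S beta_hat
    fitted = X_lab @ beta_hat
    # u_i ~ N(0, tau0^2), drawn independently of the data.
    u = tau0 * rng.standard_normal(n)
    # Calibration term with the extra randomization:
    # (2/n) * sum_i (X_i^T beta_hat + u_i) * (y_i - X_i^T beta_hat).
    calibration = 2.0 * np.mean((fitted + u) * (y - fitted))
    return plug_in + calibration
```

Setting `tau0=0.0` makes every `u_i` zero and recovers the non-randomized calibrated estimator, so the two versions can be compared on the same data.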

When there is no confusion, we use $\hat Q^R$ to denote the estimator proposed in (28). In contrast to (9), the calibration step in (28) involves an additional term $\frac{2}{n}\sum_{i=1}^{n}u_i(y_i - X_i^\top\hat\beta)$. If $u_i$ is zero instead of being generated as a normal random variable in (28), the estimator $\hat Q^R(\hat\beta, \hat\Sigma^S, Z, 0)$ reduces exactly to $\hat Q(\hat\beta, \hat\Sigma^S, Z)$ defined in (9). Since the $u_i$ in (28) are randomly generated normal random variables, this additional term approximately follows a normal distribution with mean zero and variance $4\sigma^2\tau_0^2/n$. Even in the presence of weak signals, this additional term enlarges the variance level of the calibrated estimator so that the bias of the calibrated estimator is dominated by the corresponding variance level. The following theorem establishes the limiting distribution of the estimator $\hat Q^R$ after randomized calibration.

Theorem 6. Suppose that Condition A holds, $k \ll \sqrt{n}/\log p$ and $\tau_0 > 0$ is a positive constant. For any estimator $\hat\beta$ satisfying Condition (B1),

$\dfrac{\sqrt{n}\,(\hat Q^R - Q)}{\sqrt{4\sigma^2(\beta^\top\Sigma\beta + \tau_0^2) + \rho\,\mathbf{E}(\beta^\top X_i X_i^\top\beta - \beta^\top\Sigma\beta)^2}} \overset{d}{\to} N(0, 1)$, (29)

where $\rho = \lim n/(n+N)$.

In comparison to the limiting distribution in Theorem , Theorem 6 requires no condition on $\|\beta\|_2$ to conduct inference for the explained variance, while the variance level of $\hat Q^R$ is slightly larger than that of $\hat Q$ by the amount $4\sigma^2\tau_0^2/n$. This additional variance term is a side effect of the randomized calibration. However, it paves the way to quantify the uncertainty when $\beta$ is near zero, that is, $\|\beta\|_2 \lesssim k\log p/\sqrt{n}$. Similar to (25), the standard error of the randomized estimator $\hat Q^R$ in (29) is approximated as

$\hat\phi^R = \sqrt{\dfrac{4\hat\sigma^2\big(\hat\beta^\top\hat\Sigma^S\hat\beta + \tau_0^2\big)}{n} + \dfrac{1}{(n+N)^2}\sum_{i=1}^{n+N}\big(\hat\beta^\top X_i X_i^\top\hat\beta - \hat\beta^\top\hat\Sigma^S\hat\beta\big)^2}$, (30)

where $\hat\Sigma^S$ is defined in (9). Then we propose the following confidence interval,

$\mathrm{CI}^R(Z) = \big(\hat Q^R - z_{\alpha/2}\hat\phi^R,\ \hat Q^R + z_{\alpha/2}\hat\phi^R\big)_+$, (31)

where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution. The following corollary characterizes the coverage and precision properties of $\mathrm{CI}^R(Z)$.

Corollary 4. Suppose that Conditions (A1) and (A2) hold, $k \ll \min\{n/(\log(N+n)\log p),\ \sqrt{n}/\log p\}$ and $\tau_0 > 0$ is a positive constant. For $\hat\beta$ and $\hat\sigma$ satisfying Conditions (B1) and (B2), respectively, the confidence interval defined in (31) satisfies the following coverage and precision properties,

$\lim_{n\to\infty}\mathbf{P}\big(\beta^\top\Sigma\beta \in \mathrm{CI}^R(Z)\big) \ge 1-\alpha$ (32)

and

$\lim_{n\to\infty}\mathbf{P}\Big(\mathbf{L}(\mathrm{CI}^R(Z)) \ge (1+\delta_0)\sqrt{\tfrac{4\sigma^2(\beta^\top\Sigma\beta + \tau_0^2)}{n} + \tfrac{\mathbf{E}(\beta^\top X_i X_i^\top\beta - \beta^\top\Sigma\beta)^2}{N+n}}\Big) = 0$ (33)

for any positive constant $\delta_0 > 0$.

The algorithm for estimating $\beta^\top\Sigma\beta$ and quantifying its uncertainty is summarized in Algorithm 1.

Algorithm 1: Randomized CHIVE in the Semi-supervised Setting
Input: Labelled data $\{y_i, X_i\}_{1\le i\le n}$ and unlabelled covariates $\{X_i\}_{n+1\le i\le n+N}$; randomization level $\tau_0$
Output: Point estimator $\hat Q^R = \hat Q^R(y, X, \tau_0)$ and its standard error estimator $\hat\phi^R = \hat\phi^R(y, X, \tau_0)$
1 Initialization: Construct point estimators $\hat\beta$ and $\hat\sigma$ satisfying (B1) and (B2); estimate $\Sigma$ by $\hat\Sigma^S$ defined in (9);
2 Randomized Calibration: Estimate $Q$ by the estimator $\hat Q^R$ defined in (28), where the variables $\{u_i\}_{1\le i\le n}$ are generated independently of the observed data $X, y$ and are i.i.d. $N(0, \tau_0^2)$;
3 Uncertainty Quantification: Estimate the standard error of the proposed estimator by $\hat\phi^R$ defined in (30).

4 Statistical Applications

In this section, we apply Algorithm 1 to tackle several important statistical problems, including signal detection and global testing in Section 4.1, prediction accuracy evaluation in Section 4.2 and confidence ball construction in Section 4.3.

4.1 Application 1: Signal Detection and Global Testing

Signal detection is of great importance in statistics and related scientific applications, and the detection problem in high-dimensional linear regression was studied in [, 3]. The inference procedure stated in Algorithm 1 has profound implications for signal detection and for general global testing in high-dimensional linear regression. We consider the global hypothesis $H_0: \beta = \beta^{\mathrm{null}}$, which includes signal detection as a special case by taking $\beta^{\mathrm{null}} = 0$. The global testing problem is cast as

$H_0: (\beta - \beta^{\mathrm{null}})^\top\Sigma(\beta - \beta^{\mathrm{null}}) = 0$ vs. $H_1: (\beta - \beta^{\mathrm{null}})^\top\Sigma(\beta - \beta^{\mathrm{null}}) > 0$. (34)
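Algorithm 1, on which the testing procedure in this section relies, can be sketched end to end as follows. This is an illustration under stated assumptions rather than the authors' code: the initial estimators `beta_hat` and `sigma_hat` from the initialization step are taken as inputs (how they are computed is outside this sketch), the pooled covariance is the average of $X_i X_i^\top$ over all rows, and the function name is hypothetical.

```python
import numpy as np


def randomized_chive(y, X_lab, X_unlab, beta_hat, sigma_hat, tau0, rng=None):
    """Return (Q_R, phi_R): the randomized point estimate and its standard error."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    X_lab = np.asarray(X_lab, dtype=float)
    beta_hat = np.asarray(beta_hat, dtype=float)
    n = X_lab.shape[0]

    # Step 1 (initialization): pool labelled and unlabelled covariates.
    X_all = np.vstack([X_lab, np.asarray(X_unlab, dtype=float)])
    total = X_all.shape[0]                   # n + N
    proj_sq = (X_all @ beta_hat) ** 2        # beta' X_i X_i' beta for every row
    quad = proj_sq.mean()                    # beta' Sigma_S beta

    # Step 2 (randomized calibration): add the correction term with noise u_i.
    u = rng.normal(0.0, tau0, size=n) if tau0 > 0 else np.zeros(n)
    fitted = X_lab @ beta_hat
    q_r = quad + 2.0 / n * np.sum((fitted + u) * (y - fitted))

    # Step 3 (uncertainty quantification): combine the two variance components.
    variance = (4.0 * sigma_hat ** 2 * (quad + tau0 ** 2) / n
                + np.sum((proj_sq - quad) ** 2) / total ** 2)
    return float(q_r), float(np.sqrt(variance))
```

The first variance component is driven by the labelled sample size $n$ and the randomization level, while the second decreases with the total number of rows $n + N$.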

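We apply Algorithm 1 with a given $\tau_0 > 0$ and obtain the point estimator $\hat Q^R(y - X\beta^{\mathrm{null}}, X, \tau_0)$ and its standard error estimator $\hat\phi^R(y - X\beta^{\mathrm{null}}, X, \tau_0)$. Then we propose the detection procedure, with Type I error controlled at $\alpha \in (0, 1)$, as

$D(\tau_0) = \mathbf{1}\big\{\hat Q^R(y - X\beta^{\mathrm{null}}, X, \tau_0) \ge z_\alpha\,\hat\phi^R(y - X\beta^{\mathrm{null}}, X, \tau_0)\big\}$. (35)

We define the corresponding null parameter space as

$\mathcal{H}_0 = \big\{\theta = (\beta^{\mathrm{null}}, \Sigma, \sigma): 1/M_1 \le \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) \le M_1,\ \sigma \le M_2\big\}$ (36)

and the local alternative parameter space as

$\mathcal{H}_1(\Delta) = \big\{\theta = (\beta, \Sigma, \sigma): (\beta - \beta^{\mathrm{null}})^\top\Sigma(\beta - \beta^{\mathrm{null}}) = \Delta/\sqrt{n},\ 1/M_1 \le \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) \le M_1,\ \sigma \le M_2\big\}$. (37)

The following corollary establishes that $D(\tau_0)$ controls the type I error asymptotically and also establishes the asymptotic power function of the proposed test.

Corollary 5. Suppose that Conditions (A1) and (A2) hold, $\tau_0 > 0$ is a positive constant and the vector $\delta = \beta - \beta^{\mathrm{null}}$ satisfies the conditions that $\|\delta\|_0 \ll \min\{n/(\log(N+n)\log p),\ \sqrt{n}/\log p\}$ and E X X δ δ Σδ > c0 for some positive constant $c_0$. Then for any $\theta \in \mathcal{H}_0$,

$\lim_{n\to\infty}\mathbf{P}_\theta\big(D(\tau_0) = 1\big) = \alpha$. (38)

For $\rho > 0$ and any $\theta \in \mathcal{H}_1(\Delta)$ with some positive constant $\Delta > 0$,

$\lim_{n\to\infty}\mathbf{P}_\theta\big(D(\tau_0) = 1\big) = 1 - \Phi\Big(z_\alpha - \dfrac{\Delta}{\sqrt{4\sigma^2(\beta^\top\Sigma\beta + \tau_0^2) + \rho\,\mathbf{E}(\beta^\top X_i X_i^\top\beta - \beta^\top\Sigma\beta)^2}}\Big)$. (39)

The assumptions of Corollary 5 are the same as those of Corollary 4, in the sense that the conditions imposed on $\beta$ in Corollary 4 are now imposed on the difference vector $\delta = \beta - \beta^{\mathrm{null}}$. One sufficient condition for the difference vector $\delta$ to be sparse is that both the true signal $\beta$ and the null hypothesis $\beta^{\mathrm{null}}$ are sparse. Corollary 5 shows that for any positive constant $\tau_0$, $D(\tau_0)$ controls the type I error asymptotically. For the finite-sample performance, we investigate how to choose the randomization level $\tau_0$ in the simulation section; see Section 5.2 for the numerical performance.

4.2 Application 2: Prediction Accuracy Assessment

Inference for the explained variance has important applications to evaluating the out-of-sample prediction for a given sparse estimator $\check\beta$. To keep the notation consistent, we assume that $\check\beta$ is estimated based on a training data set $(X^0, y^0)$ and that $(X, y)$ is an independent test data set used to evaluate its prediction accuracy. We start by computing the residual on the test data set,

$y - X\check\beta = X(\beta - \check\beta) + \epsilon$. (40)

The out-of-sample prediction accuracy is defined as

$\mathrm{PA}(\check\beta) = \mathbf{E}_{x_{\mathrm{new}}}\big(x_{\mathrm{new}}^\top(\check\beta - \beta)\big)^2 = (\check\beta - \beta)^\top\Sigma(\check\beta - \beta)$,

and it reduces to the explained variance for the residual model (40) with outcome $r = y - X\check\beta$ and covariates $X$. Let $\hat Q^R(r, X, \tau_0)$ and $\hat\phi^R(r, X, \tau_0)$ denote the outputs of Algorithm 1 with the labelled data $\{r_i, X_i\}_{1\le i\le n}$ and unlabelled data $\{X_i\}_{n+1\le i\le n+N}$ as inputs. Then we propose the point estimator of $\mathrm{PA}(\check\beta)$ as $\hat Q^R(r, X, \tau_0)$ and the interval estimator for $\mathrm{PA}(\check\beta)$ as

$\mathrm{CI}(\mathrm{PA}(\check\beta)) = \big(\hat Q^R(r, X, \tau_0) - z_{\alpha/2}\hat\phi^R(r, X, \tau_0),\ \hat Q^R(r, X, \tau_0) + z_{\alpha/2}\hat\phi^R(r, X, \tau_0)\big)_+$. (41)

The following corollary establishes the convergence rate for the point estimator and the coverage and precision properties of the interval estimator.

Corollary 6. Suppose that Conditions (A1) and (A2) hold and $\tau_0 > 0$ is a positive constant. For any sparse estimator satisfying $\|\check\beta\|_0 \le C\|\beta\|_0$ and $C > 0$:

1. If $k \le cn/\log p$ for some positive constant $c > 0$, then with probability larger than p c exp c N c t γ,

Q R r, X, τ 0 Q t ˇβ β + τ 0 + t ˇβ β + ˇβ β + k log p N (42)

2. If $k \ll \min\{n/(\log(N+n)\log p),\ \sqrt{n}/\log p\}$ and E X X δ δ Σδ > c0 for some positive constant $c_0$, where $\delta = \beta - \check\beta$, then the confidence interval defined in (41) satisfies the following coverage and precision properties,

$\lim_{n\to\infty}\mathbf{P}\big(\mathrm{PA}(\check\beta) \in \mathrm{CI}(\mathrm{PA}(\check\beta))\big) \ge 1-\alpha$ (43)

and

$\lim_{n\to\infty}\mathbf{P}\Big(\mathbf{L}\big(\mathrm{CI}(\mathrm{PA}(\check\beta))\big) \ge C\Big(\dfrac{\|\check\beta - \beta\|_2 + \tau_0}{\sqrt{n}} + \dfrac{\|\check\beta - \beta\|_2^2}{\sqrt{N+n}}\Big)\Big) = 0$ (44)

for some constant $C > 0$.

The above corollary shows that the precision of the confidence interval for the prediction accuracy is related not only to the sample sizes $n$ and $N$, the sparsity $k$ and the dimension $p$, but also to the accuracy of the evaluated estimator, $\|\check\beta - \beta\|_2$. See Section 5.3 for the numerical performance.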
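Both applications above amount to running Algorithm 1 on a transformed outcome: $y - X\beta^{\mathrm{null}}$ for testing and $r = y - X\check\beta$ for prediction accuracy. The sketch below makes that reduction concrete. All function names are hypothetical, the initial estimate `delta_hat` of the shifted coefficient vector and the noise level `sigma_hat` are assumed to be supplied by the user, and truncating the interval at zero is an assumption of this illustration.

```python
from statistics import NormalDist

import numpy as np


def _randomized_chive(outcome, X_lab, X_unlab, coef_hat, sigma_hat, tau0, rng):
    """Hypothetical helper mirroring Algorithm 1; returns (Q_R, phi_R)."""
    n = X_lab.shape[0]
    X_all = np.vstack([X_lab, X_unlab])
    proj_sq = (X_all @ coef_hat) ** 2
    quad = proj_sq.mean()
    u = rng.normal(0.0, tau0, size=n) if tau0 > 0 else np.zeros(n)
    fitted = X_lab @ coef_hat
    q_r = quad + 2.0 / n * np.sum((fitted + u) * (outcome - fitted))
    var = (4.0 * sigma_hat ** 2 * (quad + tau0 ** 2) / n
           + np.sum((proj_sq - quad) ** 2) / X_all.shape[0] ** 2)
    return float(q_r), float(np.sqrt(var))


def detect(y, X_lab, X_unlab, beta_null, delta_hat, sigma_hat, tau0, alpha, rng):
    """Global test of H0: beta = beta_null; returns 1 when H0 is rejected."""
    shifted = y - X_lab @ beta_null          # outcome of the shifted model
    q_r, phi_r = _randomized_chive(shifted, X_lab, X_unlab, delta_hat, sigma_hat, tau0, rng)
    return int(q_r >= NormalDist().inv_cdf(1.0 - alpha) * phi_r)


def prediction_accuracy_interval(y, X_lab, X_unlab, beta_check, delta_hat,
                                 sigma_hat, tau0, alpha, rng):
    """Point and interval estimate of (beta_check - beta)' Sigma (beta_check - beta)."""
    residual = y - X_lab @ beta_check        # outcome r of the residual model
    q_r, phi_r = _randomized_chive(residual, X_lab, X_unlab, delta_hat, sigma_hat, tau0, rng)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return q_r, max(q_r - z * phi_r, 0.0), max(q_r + z * phi_r, 0.0)
```

The one-sided quantile $z_\alpha$ is used for the test and the two-sided quantile $z_{\alpha/2}$ for the interval, matching the two procedures described in Sections 4.1 and 4.2.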

4.3 Application 3: Confidence Ball Construction

The prediction accuracy evaluation established in (41) can be used to construct a confidence ball for $\beta$. For the setting where $\lambda_{\min}(\Sigma)$ is known, we have $\lambda_{\min}(\Sigma)\|\check\beta - \beta\|_2^2 \le (\check\beta - \beta)^\top\Sigma(\check\beta - \beta)$ and construct the confidence ball for $\beta$ as

CB ˇβ = β : β ˇβ z α/ λ mi Σ φ R r, X, τ 0 (45)

As shown in (44), the radius of the confidence ball $\mathrm{CB}(\check\beta)$ is upper bounded by $(\|\check\beta - \beta\|_2 + \tau_0)/\sqrt{n} + \|\check\beta - \beta\|_2^2/\sqrt{N+n}$. To minimize the radius, we need to select the center $\check\beta$ for the confidence ball in (45) such that $\check\beta$ is sparse and $\|\check\beta - \beta\|_2$ is small. In the high-dimensional literature, several penalized estimators are shown to satisfy such properties, such as the Lasso, scaled Lasso and Dantzig Selector.

5 Simulation Study

We carry out simulation studies in this section to demonstrate the numerical performance of the proposed methods, which consist of confidence interval construction for $\beta^\top\Sigma\beta$ in Section 5.1, signal detection in Section 5.2 and prediction accuracy evaluation in Section 5.3. Throughout all simulation studies in this paper, we generate the high-dimensional linear regression with the dimension $p = 800$ and the corresponding sample size $n$ across {00, 400, 600, 800, 000, 00, 400}. For the linear model, the covariates $\{X_i\}_{1\le i\le n}$ are generated i.i.d. from a multivariate normal distribution with mean zero and covariance matrix $\Sigma \in \mathbb{R}^{p\times p}$, where $\Sigma_{ij} = 0.5^{|i-j|}$, and the errors $\{\epsilon_i\}_{1\le i\le n}$ are generated as i.i.d. standard normal. In addition to the labelled data, we also generate the unlabelled data $\{X_i\}_{n+1\le i\le n+N}$ with N = , 000 to study the proposed inference procedures in the semi-supervised setting. Each simulation setting is replicated 500 times.

5.1 Inference for $\beta^\top\Sigma\beta$

The sample sizes are generated across 00, 400, 600, 800 and 000, and the high-dimensional regression vector $\beta$ is generated across the following three settings,

a. Setting 1: $\beta$ is generated with sparsity 10, where $\beta_j = j/10$ for $1 \le j \le 10$ and $\beta_j = 0$ for $j \ge 11$;
b. Setting 2: $\beta$ is generated with sparsity 50, where $\beta_j = j/50$ for $1 \le j \le 50$ and $\beta_j = 0$ for $j \ge 51$;
c. Setting 3: $\beta$ is generated as an approximately sparse vector with β j = 0.5 p.

We compare the estimation accuracy across two different types of estimators, the plug-in estimator and the CHIVE estimator, and across two different settings, the supervised setting and the semi-supervised setting. Recall that only labelled data is available in the supervised setting, while both the labelled and unlabelled data are available in the semi-supervised setting. The numerical comparison is reported in Figure 1. Across all three settings, the proposed CHIVE estimator achieves uniformly much better estimation accuracy than the plug-in estimators, in both the supervised and semi-supervised settings. This numerical observation demonstrates that the calibration step is useful in improving the estimation accuracy. In addition, the unlabelled data is useful in estimating $\beta^\top\Sigma\beta$: as demonstrated in Figure 1, the solid line (the CHIVE estimator in the semi-supervised setting) is always below the dotted line (the CHIVE estimator in the supervised setting). This matches the theoretical results.

Figure 1: Root Mean Squared Error (RMSE) of different estimators of $\beta^\top\Sigma\beta$. The x-axis stands for the sample size and the y-axis stands for the RMSE of the corresponding estimators. The dotted line and the solid line represent the RMSEs of the CHIVE estimator in the supervised setting and semi-supervised setting, respectively; the dashed line and the dotted-dashed line represent the RMSEs of the plug-in estimator in the supervised setting and semi-supervised setting, respectively. The true values for $\beta^\top\Sigma\beta$ are 9.4, and .9, from the leftmost to the rightmost.

In addition to the significant improvement in terms of estimation, the CHIVE estimator serves as the center of confidence intervals for $\beta^\top\Sigma\beta$. The coverage and precision properties of the constructed confidence interval CI are reported in Table 1. With a larger sample size, the empirical coverage of the proposed confidence interval achieves 95% and the average lengths of the confidence intervals get shorter. The integration of the unlabelled data in the semi-supervised setting shortens the confidence intervals significantly.

Table 1: Coverage and precision properties of the proposed CIs. Different rows correspond to different settings (Setting 1, 2, 3) and different sample sizes (n = 00, 400, 600, 800, 000) for the given setting. Each row reports the empirical coverage (indexed with Cov) and the average lengths (indexed with Len) of the proposed CIs. The columns indexed with Supervised represent the results for the supervised setting and the columns indexed with Semi-Supervised represent the results for the semi-supervised setting. For example, the first row of numbers 0.9, 3.750, 0.896, .796 corresponds to Setting 1 and sample size n = 00: in the supervised setting, CI has empirical coverage 0.9 and the average length is 3.750; in the semi-supervised setting, CI has empirical coverage 0.896 and the average length is .796.

5.2 Signal Detection

For the detection problem, we consider the following generation of $\beta$: $\beta_j = \delta$ for $1 \le j \le 50$ and $\beta_j = 0$ for $51 \le j \le 800$. We vary $\delta$ across {0.00, 0.05, 0, 05, 0.075, 00, 0.5, 0.5} and vary the sample size $n$ across {600, 00}. In Figure 2, we demonstrate the coverage and precision properties of the randomized confidence intervals across four methods, the non-randomized detector D(0) and the three randomized detectors D(), D(4) and D(6), where D is defined in (35).

Figure 2: Empirical coverage and average lengths of the proposed randomized confidence intervals in the supervised setting. The top two panels correspond to the sample size n = 600 and the bottom two panels correspond to n = 00. The left-hand panels show the empirical coverage for different $\delta$, while the right-hand panels show the average lengths of the CIs for different $\delta$. Different types of curves correspond to different randomization levels τ0 ∈ {0, , 4, 6}. The dashed horizontal lines in the left-hand panels correspond to the targeted coverage level.

The two plots on the top of Figure 2, corresponding to the supervised setting with n = 600, demonstrate the effect of randomization on the empirical coverage and average lengths: the randomization leads to an interval estimator that achieves the coverage property at the expense of wider intervals. With the randomization level τ0 reaching , the coverage property is guaranteed, while the empirical coverage for the procedure without randomization (τ0 = 0) is much lower than 0.95, especially for weak signals with a small $\delta$. The bottom two plots of Figure 2 correspond to the supervised setting with n = , 00; the main observation is similar to the case of n = 600, but the confidence intervals are much shorter than in the setting with n = 600.

The empirical detection rate is reported in Table 2, where the sample size is n = 600 and n = , 00 and the explained variance $\beta^\top\Sigma\beta$ is controlled via the scalar $\delta$. When $\delta = 0$, it corresponds to the null case and a proper detection procedure is expected to have type I error rate at the nominal level. As predicted by theory, the detection method without randomization, D(0), fails to give proper type I error control due to the presence of weak signals. With the randomization procedure, the type I error rate gets closer to the nominal level. When $\delta$ moves away from zero, the detection procedure is regarded as powerful as the empirical detection rate approaches 1. For the detection procedure with randomization level τ0 = , the setting with δ = 0.025 corresponds to an indistinguishable region, where it is challenging to detect the signal. However, as δ reaches 0.05, the detection rate reaches for n = 600 and for n = 00. As characterized by theory, a larger randomization level requires a higher value of $\delta$ for the signal to be detected; for example, for τ0 = 4, it is only when δ reaches 0.075 that the detection rate reaches 0.8 for n = 600 and for n = 00. The corresponding semi-supervised setting shows a similar phenomenon to the supervised setting, but detection tends to be easier than in the supervised setting due to the unlabelled data. The results are reported in the supplementary materials.

Table 2: Empirical detection rates in the supervised setting. The column indexed with $\delta$ represents the signal strength, where the signal is of sparsity 50 and of the form $\delta\cdot(1, \ldots, 1, 0, 0, \ldots, 0)$; the column indexed with $\beta^\top\Sigma\beta$ represents the value of $\beta^\top\Sigma\beta$; the columns under n = 600 and n = ,00 correspond to sample sizes 600 and ,00, respectively, where the columns indexed with D(τ0) report the empirical detection rates for the detector D(τ0).

5.3 Prediction Loss Evaluation

In this subsection, the high-dimensional regression vector β is generated with sparsity 0 where β j = j/5 for j 0 and β j = 0 for j . Let $\hat\beta(\lambda)$ denote the Lasso estimator based on an independent training data set $(X^0, y^0)$ with sample size $n_0 = 600$,

$\hat\beta(\lambda) = \arg\min_{\beta\in\mathbb{R}^p}\ \dfrac{\|y^0 - X^0\beta\|_2^2}{2n_0} + \lambda\sum_{j=1}^{p}\dfrac{\|X^0_{\cdot j}\|_2}{\sqrt{n_0}}|\beta_j|$.

We consider the inference problem for the out-of-sample prediction accuracy $(\hat\beta(\lambda) - \beta)^\top\Sigma(\hat\beta(\lambda) - \beta)$. Specifically, we consider three estimators βλ 0, β5λ 0 and β0λ 0 with λ 0 = 4 Z 0./p 0 and report the numerical performance of both point and interval estimators of the corresponding prediction accuracy. We consider the prediction accuracy problem across three different sample sizes, {600, 00, 400}, and introduce different randomization levels. We use PA(τ0) to denote the procedure with randomization level τ0.

Table 3: Inference for prediction accuracy $(\hat\beta(\lambda) - \beta)^\top\Sigma(\hat\beta(\lambda) - \beta)$. The table reports six settings, corresponding to three different sample sizes 600, 00, 400 and the supervised and semi-supervised settings. For example, "Super, 600" stands for the supervised setting with sample size n = 600 and "Semi, 600" stands for the semi-supervised setting with sample size n = 600. The true prediction accuracy of the three estimators βλ 0, β5λ 0 and βλ 0 is reported as 0.065, and .30. Three prediction accuracy evaluators PA0, PA and PA4 are reported, where PA0 is the evaluator with no randomization, PA is the evaluator with randomization level τ0 = and PA4 is the evaluator with randomization level τ0 = 4. For each setting, the row indexed with Coverage reports the empirical coverage of the corresponding confidence intervals over 500 simulations; the row indexed with Est Aver reports the sample average of the corresponding point estimators over 500 simulations; the rows indexed with Lower Aver and Upper Aver report the sample averages of the lower and upper limits of the interval estimators over 500 simulations.
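The simulation design of Section 5 can be reproduced in a few lines. The sketch below assumes that Setting 1 of Section 5.1 uses sparsity 10 with $\beta_j = j/10$, an assumption that is consistent with the reported true value 9.4 for $\beta^\top\Sigma\beta$; the function names and the sample sizes passed to `simulate` in any usage are illustrative only.

```python
import numpy as np


def ar1_covariance(p, rho=0.5):
    """Covariance matrix with entries rho^|i-j|."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def setting1_beta(p, sparsity=10):
    """Assumed Setting 1: beta_j = j / sparsity on the first `sparsity` coordinates."""
    beta = np.zeros(p)
    beta[:sparsity] = np.arange(1, sparsity + 1) / sparsity
    return beta


def simulate(n, N, beta, Sigma, rng):
    """Draw n labelled pairs and N unlabelled covariate rows from the Gaussian design."""
    chol = np.linalg.cholesky(Sigma)
    X_all = rng.standard_normal((n + N, beta.shape[0])) @ chol.T  # rows ~ N(0, Sigma)
    X_lab, X_unlab = X_all[:n], X_all[n:]
    y = X_lab @ beta + rng.standard_normal(n)                     # standard normal errors
    return y, X_lab, X_unlab


p = 800
Sigma = ar1_covariance(p)
beta = setting1_beta(p)
true_value = float(beta @ Sigma @ beta)   # explained variance under the assumed Setting 1
print(round(true_value, 2))               # 9.43
```

Multiplying standard normal draws by the Cholesky factor of $\Sigma$ is one standard way to obtain rows with covariance $\Sigma$; any equivalent sampler would serve the same purpose.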


More information

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12 Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig

More information

Optimal Estimation of Co-heritability in High-dimensional Linear Models

Optimal Estimation of Co-heritability in High-dimensional Linear Models Optimal Estimatio of Co-heritability i High-dimesioal Liear Models Zijia Guo, Wajie Wag,, T. Toy Cai, ad Hogzhe Li Departmet of Statistics, The Wharto School, Uiversity of Pesylvaia Departmet of Biostatistics

More information

This is an introductory course in Analysis of Variance and Design of Experiments.

This is an introductory course in Analysis of Variance and Design of Experiments. 1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

Investigating the Significance of a Correlation Coefficient using Jackknife Estimates

Investigating the Significance of a Correlation Coefficient using Jackknife Estimates Iteratioal Joural of Scieces: Basic ad Applied Research (IJSBAR) ISSN 2307-4531 (Prit & Olie) http://gssrr.org/idex.php?joural=jouralofbasicadapplied ---------------------------------------------------------------------------------------------------------------------------

More information

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?

More information

Output Analysis and Run-Length Control

Output Analysis and Run-Length Control IEOR E4703: Mote Carlo Simulatio Columbia Uiversity c 2017 by Marti Haugh Output Aalysis ad Ru-Legth Cotrol I these otes we describe how the Cetral Limit Theorem ca be used to costruct approximate (1 α%

More information

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator Ecoomics 24B Relatio to Method of Momets ad Maximum Likelihood OLSE as a Maximum Likelihood Estimator Uder Assumptio 5 we have speci ed the distributio of the error, so we ca estimate the model parameters

More information

Statistical Analysis on Uncertainty for Autocorrelated Measurements and its Applications to Key Comparisons

Statistical Analysis on Uncertainty for Autocorrelated Measurements and its Applications to Key Comparisons Statistical Aalysis o Ucertaity for Autocorrelated Measuremets ad its Applicatios to Key Comparisos Nie Fa Zhag Natioal Istitute of Stadards ad Techology Gaithersburg, MD 0899, USA Outlies. Itroductio.

More information

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates. 5. Data, Estimates, ad Models: quatifyig the accuracy of estimates. 5. Estimatig a Normal Mea 5.2 The Distributio of the Normal Sample Mea 5.3 Normal data, cofidece iterval for, kow 5.4 Normal data, cofidece

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

Lecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound

Lecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound Lecture 7 Ageda for the lecture Gaussia chael with average power costraits Capacity of additive Gaussia oise chael ad the sphere packig boud 7. Additive Gaussia oise chael Up to this poit, we have bee

More information

STAT431 Review. X = n. n )

STAT431 Review. X = n. n ) STAT43 Review I. Results related to ormal distributio Expected value ad variace. (a) E(aXbY) = aex bey, Var(aXbY) = a VarX b VarY provided X ad Y are idepedet. Normal distributios: (a) Z N(, ) (b) X N(µ,

More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

6.867 Machine learning, lecture 7 (Jaakkola) 1

6.867 Machine learning, lecture 7 (Jaakkola) 1 6.867 Machie learig, lecture 7 (Jaakkola) 1 Lecture topics: Kerel form of liear regressio Kerels, examples, costructio, properties Liear regressio ad kerels Cosider a slightly simpler model where we omit

More information

Singular Continuous Measures by Michael Pejic 5/14/10

Singular Continuous Measures by Michael Pejic 5/14/10 Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable

More information

Regression with quadratic loss

Regression with quadratic loss Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,

More information

LECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if

LECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if LECTURE 14 NOTES 1. Asymptotic power of tests. Defiitio 1.1. A sequece of -level tests {ϕ x)} is cosistet if β θ) := E θ [ ϕ x) ] 1 as, for ay θ Θ 1. Just like cosistecy of a sequece of estimators, Defiitio

More information

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2 Chapter 8 Comparig Two Treatmets Iferece about Two Populatio Meas We wat to compare the meas of two populatios to see whether they differ. There are two situatios to cosider, as show i the followig examples:

More information

Sample Size Estimation in the Proportional Hazards Model for K-sample or Regression Settings Scott S. Emerson, M.D., Ph.D.

Sample Size Estimation in the Proportional Hazards Model for K-sample or Regression Settings Scott S. Emerson, M.D., Ph.D. ample ie Estimatio i the Proportioal Haards Model for K-sample or Regressio ettigs cott. Emerso, M.D., Ph.D. ample ie Formula for a Normally Distributed tatistic uppose a statistic is kow to be ormally

More information

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002 ECE 330:541, Stochastic Sigals ad Systems Lecture Notes o Limit Theorems from robability Fall 00 I practice, there are two ways we ca costruct a ew sequece of radom variables from a old sequece of radom

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5 CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio

More information

Rank tests and regression rank scores tests in measurement error models

Rank tests and regression rank scores tests in measurement error models Rak tests ad regressio rak scores tests i measuremet error models J. Jurečková ad A.K.Md.E. Saleh Charles Uiversity i Prague ad Carleto Uiversity i Ottawa Abstract The rak ad regressio rak score tests

More information

1 Review of Probability & Statistics

1 Review of Probability & Statistics 1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5

More information

Journal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula

Journal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula Joural of Multivariate Aalysis 102 (2011) 1315 1319 Cotets lists available at ScieceDirect Joural of Multivariate Aalysis joural homepage: www.elsevier.com/locate/jmva Superefficiet estimatio of the margials

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

5. Likelihood Ratio Tests

5. Likelihood Ratio Tests 1 of 5 7/29/2009 3:16 PM Virtual Laboratories > 9. Hy pothesis Testig > 1 2 3 4 5 6 7 5. Likelihood Ratio Tests Prelimiaries As usual, our startig poit is a radom experimet with a uderlyig sample space,

More information

CHAPTER 10 INFINITE SEQUENCES AND SERIES

CHAPTER 10 INFINITE SEQUENCES AND SERIES CHAPTER 10 INFINITE SEQUENCES AND SERIES 10.1 Sequeces 10.2 Ifiite Series 10.3 The Itegral Tests 10.4 Compariso Tests 10.5 The Ratio ad Root Tests 10.6 Alteratig Series: Absolute ad Coditioal Covergece

More information

Summary. Recap ... Last Lecture. Summary. Theorem

Summary. Recap ... Last Lecture. Summary. Theorem Last Lecture Biostatistics 602 - Statistical Iferece Lecture 23 Hyu Mi Kag April 11th, 2013 What is p-value? What is the advatage of p-value compared to hypothesis testig procedure with size α? How ca

More information

Lecture 10 October Minimaxity and least favorable prior sequences

Lecture 10 October Minimaxity and least favorable prior sequences STATS 300A: Theory of Statistics Fall 205 Lecture 0 October 22 Lecturer: Lester Mackey Scribe: Brya He, Rahul Makhijai Warig: These otes may cotai factual ad/or typographic errors. 0. Miimaxity ad least

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Vector Quantization: a Limiting Case of EM

Vector Quantization: a Limiting Case of EM . Itroductio & defiitios Assume that you are give a data set X = { x j }, j { 2,,, }, of d -dimesioal vectors. The vector quatizatio (VQ) problem requires that we fid a set of prototype vectors Z = { z

More information

A proposed discrete distribution for the statistical modeling of

A proposed discrete distribution for the statistical modeling of It. Statistical Ist.: Proc. 58th World Statistical Cogress, 0, Dubli (Sessio CPS047) p.5059 A proposed discrete distributio for the statistical modelig of Likert data Kidd, Marti Cetre for Statistical

More information

x a x a Lecture 2 Series (See Chapter 1 in Boas)

x a x a Lecture 2 Series (See Chapter 1 in Boas) Lecture Series (See Chapter i Boas) A basic ad very powerful (if pedestria, recall we are lazy AD smart) way to solve ay differetial (or itegral) equatio is via a series expasio of the correspodig solutio

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to: STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio

More information

Probability, Expectation Value and Uncertainty

Probability, Expectation Value and Uncertainty Chapter 1 Probability, Expectatio Value ad Ucertaity We have see that the physically observable properties of a quatum system are represeted by Hermitea operators (also referred to as observables ) such

More information

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise First Year Quatitative Comp Exam Sprig, 2012 Istructio: There are three parts. Aswer every questio i every part. Questio I-1 Part I - 203A A radom variable X is distributed with the margial desity: >

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01 ENGI 44 Cofidece Itervals (Two Samples) Page -0 Two Sample Cofidece Iterval for a Differece i Populatio Meas [Navidi sectios 5.4-5.7; Devore chapter 9] From the cetral limit theorem, we kow that, for sufficietly

More information

Quantile regression with multilayer perceptrons.

Quantile regression with multilayer perceptrons. Quatile regressio with multilayer perceptros. S.-F. Dimby ad J. Rykiewicz Uiversite Paris 1 - SAMM 90 Rue de Tolbiac, 75013 Paris - Frace Abstract. We cosider oliear quatile regressio ivolvig multilayer

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

11 THE GMM ESTIMATION

11 THE GMM ESTIMATION Cotets THE GMM ESTIMATION 2. Cosistecy ad Asymptotic Normality..................... 3.2 Regularity Coditios ad Idetificatio..................... 4.3 The GMM Iterpretatio of the OLS Estimatio.................

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information

1 Review and Overview

1 Review and Overview DRAFT a fial versio will be posted shortly CS229T/STATS231: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #3 Scribe: Migda Qiao October 1, 2013 1 Review ad Overview I the first half of this course,

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic

More information

THE SPECTRAL RADII AND NORMS OF LARGE DIMENSIONAL NON-CENTRAL RANDOM MATRICES

THE SPECTRAL RADII AND NORMS OF LARGE DIMENSIONAL NON-CENTRAL RANDOM MATRICES COMMUN. STATIST.-STOCHASTIC MODELS, 0(3), 525-532 (994) THE SPECTRAL RADII AND NORMS OF LARGE DIMENSIONAL NON-CENTRAL RANDOM MATRICES Jack W. Silverstei Departmet of Mathematics, Box 8205 North Carolia

More information

Correlation Regression

Correlation Regression Correlatio Regressio While correlatio methods measure the stregth of a liear relatioship betwee two variables, we might wish to go a little further: How much does oe variable chage for a give chage i aother

More information

Rates of Convergence by Moduli of Continuity

Rates of Convergence by Moduli of Continuity Rates of Covergece by Moduli of Cotiuity Joh Duchi: Notes for Statistics 300b March, 017 1 Itroductio I this ote, we give a presetatio showig the importace, ad relatioship betwee, the modulis of cotiuity

More information

Open book and notes. 120 minutes. Cover page and six pages of exam. No calculators.

Open book and notes. 120 minutes. Cover page and six pages of exam. No calculators. IE 330 Seat # Ope book ad otes 120 miutes Cover page ad six pages of exam No calculators Score Fial Exam (example) Schmeiser Ope book ad otes No calculator 120 miutes 1 True or false (for each, 2 poits

More information

Confidence interval for the two-parameter exponentiated Gumbel distribution based on record values

Confidence interval for the two-parameter exponentiated Gumbel distribution based on record values Iteratioal Joural of Applied Operatioal Research Vol. 4 No. 1 pp. 61-68 Witer 2014 Joural homepage: www.ijorlu.ir Cofidece iterval for the two-parameter expoetiated Gumbel distributio based o record values

More information

Pattern Classification, Ch4 (Part 1)

Pattern Classification, Ch4 (Part 1) Patter Classificatio All materials i these slides were take from Patter Classificatio (2d ed) by R O Duda, P E Hart ad D G Stork, Joh Wiley & Sos, 2000 with the permissio of the authors ad the publisher

More information

Asymptotic Results for the Linear Regression Model

Asymptotic Results for the Linear Regression Model Asymptotic Results for the Liear Regressio Model C. Fli November 29, 2000 1. Asymptotic Results uder Classical Assumptios The followig results apply to the liear regressio model y = Xβ + ε, where X is

More information