Dimensionality Reduction and Learning
CMSC (Spring 2009) Large Scale Learning — Lecture 3: Dimensionality Reduction and Learning
Instructors: Sham Kakade and Greg Shakhnarovich

1 L2 Supervised Methods and Dimensionality Reduction

The theme of these two lectures is that for L2 methods we need not work in infinite dimensional spaces. In particular, we can unadaptively find and work in a low dimensional space and achieve about as good results. These results question the need for explicitly working in infinite (or high) dimensional spaces for L2 methods. In contrast, for sparsity based methods (including L1 regularization), such non-adaptive projection methods significantly lose predictive power.

2 Margin Based Classification

For now, assume we have a distribution over X ∈ 𝒳 ⊂ R^d and Y ∈ {−1, 1}. Assume that there exists a weight vector β such that sign(β · X) = Y with probability one. Hence, the distribution is separable. Furthermore, let us scale the distribution so that it is separable at margin 1, i.e.:

    Y (β · X) ≥ 1.

What learning algorithm should we use? The VC dimension of halfspaces is Ω(d), so naively minimizing the 0/1 loss in d dimensions may not lead to good generalization properties (and it is not clear how to do this anyways). Instead, maximizing the margin can be shown to provide good generalization properties; however, computationally this may be a little cumbersome (even though it is poly-time).

Let us say we have a training set T = {(X_i, Y_i)}_{i=1}^n. Often, what is done is that the perceptron algorithm is run on the training set. The perceptron algorithm, run on any sequence of points {(X_i, Y_i)} ⊂ T sampled from this distribution, makes at most

    M ≤ ‖X‖² ‖β‖²

mistakes (regardless of the length of the sequence), where ‖X‖ = max_{X ∈ 𝒳} ‖X‖. Hence, if we repeatedly cycle through the dataset, then eventually we will no longer make mistakes. But what about generalization? Naively using this perceptron predictor does not necessarily lead to good generalization behavior, since the VC dimension of halfspaces is Ω(d) (and no bound is known for this convergent point of the perceptron).

2.1 Random Projections and Margin Preservation

Now let us project β and X by P = (1/√k) A, where A ∈ R^{k×d} and each entry A_ij is sampled independently from N(0, 1). Is separability preserved under our training set?

Lemma 2.1. Assume ‖X_i‖ ≤ ‖X‖ for all i. If k = O(‖β‖² ‖X‖² log(n/δ)), then with probability greater than 1 − δ, for all i:

    Y_i (Pβ · PX_i) ≥ 1/2,    (1/2)‖X_i‖ ≤ ‖PX_i‖ ≤ 2‖X_i‖,    (1/2)‖β‖ ≤ ‖Pβ‖ ≤ 2‖β‖.
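Both the perceptron mistake bound and the margin-preservation statement above can be illustrated numerically. The following is a hypothetical sketch (not part of the notes; it assumes numpy and synthetic Gaussian data): it rescales points so that Y_i (β · X_i) = 1 exactly, runs the perceptron, checks the mistake count against ‖X‖²‖β‖², and then inspects the margins after a Gaussian projection P = (1/√k)A.

```python
import numpy as np

rng = np.random.default_rng(0)

def perceptron_mistakes(X, Y, epochs=50):
    """Cycle through (X, Y); return the total number of mistakes made."""
    w = np.zeros(X.shape[1])
    mistakes = 0
    for _ in range(epochs):
        clean = True
        for x, y in zip(X, Y):
            if y * (w @ x) <= 0:      # a mistake (or a point on the boundary)
                w += y * x
                mistakes += 1
                clean = False
        if clean:                     # a full pass with no mistakes: converged
            break
    return mistakes

# Synthetic margin-1 separable data in d dimensions.
d, n = 200, 500
beta = rng.normal(size=d)
X = rng.normal(size=(n, d))
margins = X @ beta
X = X / np.abs(margins)[:, None]      # rescale rows so Y_i * (beta . X_i) == 1
Y = np.sign(margins)

M = perceptron_mistakes(X, Y)
bound = (np.linalg.norm(X, axis=1).max() ** 2) * np.linalg.norm(beta) ** 2
print(f"perceptron mistakes M = {M}, mistake bound = {bound:.1f}")

# Project with P = (1/sqrt(k)) A, A_ij ~ N(0, 1), and look at projected margins.
k = 100
P = rng.normal(size=(k, d)) / np.sqrt(k)
proj_margins = ((X @ P.T) @ (P @ beta)) * Y
print(f"min projected margin = {proj_margins.min():.3f} (original margin = 1)")
```

The mistake count stays below the bound on every run (the perceptron theorem guarantees this for any pass order), while the projected margins concentrate near the original margin of 1, degrading only for points whose norms blew up under the rescaling.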
Proof. Choose ε = 1/(2‖β‖‖X‖) and apply the inner product preserving lemma, which implies that for any particular X_i and β,

    |Pβ · PX_i − β · X_i| ≤ 1/2,

so that

    Y_i (Pβ · PX_i) ≥ Y_i (β · X_i) − 1/2 ≥ 1/2.

For the O(n) events, we use failure probability O(δ/n) each, so the total error probability is δ. The final claim follows from the norm preserving lemma.

2.2 Generalization

If we run the perceptron algorithm, in the projected space, on the training set, then the total number of mistakes made is

    M ≤ O(‖β‖² ‖X‖²).

Note that this implies that after O(n ‖β‖² ‖X‖²) iterations the perceptron will stabilize to a constant solution, which has zero error. For generalization, we are now working with a space of dimension O(‖β‖² ‖X‖² log(n/δ)). There are other methods to obtain generalization, but the important point here is that under the margin assumption we are essentially working in a finite dimensional space (and this subspace can be determined non-adaptively, independently of the labels {Y_i}).

3 Ridge Regression and Dimensionality Reduction

Let us now consider the normal means problem, sometimes referred to as the fixed design setting. Here, we have a set of n points {X_i} ⊂ R^d, and let X denote the n × d matrix whose i-th row is X_i. We also observe an output vector Y ∈ R^n. We desire to learn E[Y]. In particular, we seek to predict E[Y] as Xβ̂. The square loss of an estimator w is

    L(w) = (1/n) E_Y ‖Y − Xw‖²,

where the expectation is with respect to Y. Let β be the optimal predictor:

    β = argmin_w L(w).

The risk of an estimator β̂ is defined as

    R(β̂) = L(β̂) − L(β) = (1/n) ‖Xβ̂ − Xβ‖²

(which is the fixed design risk). Denoting Σ := (1/n) X^⊤X, we can write the risk as

    R(β̂) = (β̂ − β)^⊤ Σ (β̂ − β) =: ‖β̂ − β‖²_Σ.

Another interpretation of the risk is how accurately we learn the parameters of the model. Assume that β̂(Y) is an estimator constructed with the outcome Y; we drop the explicit Y dependence as this is clear from context. Let β̄ = E_Y[β̂] be the expected weight. We can decompose the expected risk as

    E_Y[R(β̂)] = (1/n) E_Y ‖Xβ̂ − Xβ̄‖² + (1/n) ‖Xβ̄ − Xβ‖² = E_Y ‖β̂ − β̄‖²_Σ + ‖β̄ − β‖²_Σ,
where we have that

    (average) variance = (1/n) E_Y ‖Xβ̂ − Xβ̄‖²    and    prediction bias vector = Xβ̄ − Xβ,

which shows a certain bias/variance decomposition of the error.

3.1 Risk Bounds for Ridge Regression

The ridge regression estimator using an outcome Y is just

    β̂_λ = argmin_w (1/n) ‖Y − Xw‖² + λ ‖w‖².

The estimator is then

    β̂_λ = (Σ + λI)^{−1} (1/n) X^⊤ Y.

For simplicity, let us rotate X such that

    Σ = (1/n) X^⊤X = diag(λ_1, λ_2, …, λ_d)

(note this rotation does not alter the predictions of rotationally invariant algorithms). With this choice, we have that

    [β̂_λ]_j = (1/n) (Y · [X]_j) / (λ_j + λ),

where [X]_j denotes the j-th column of X. It is straightforward to see that

    β = E[β̂_0],

and it follows that

    [β̄_λ]_j := E[[β̂_λ]_j] = (λ_j / (λ_j + λ)) β_j

by just taking expectations.

Lemma 3.1 (Risk Bound). If Var(Y_i) ≤ 1, we have that

    E_Y[R(β̂_λ)] ≤ (1/n) Σ_j λ_j² / (λ_j + λ)² + Σ_j β_j² λ_j / (1 + λ_j/λ)².

This holds with equality if Var(Y_i) = 1.

Proof. For the variance term, we have

    E_Y ‖β̂_λ − β̄_λ‖²_Σ = Σ_j λ_j E_Y ([β̂_λ]_j − [β̄_λ]_j)²
      = Σ_j (λ_j / (λ_j + λ)²) (1/n²) E[ Σ_{i,i'} (Y_i − E[Y_i])[X_i]_j (Y_{i'} − E[Y_{i'}])[X_{i'}]_j ]
      = Σ_j (λ_j / (λ_j + λ)²) (1/n²) Σ_i Var(Y_i) [X_i]_j²
      ≤ Σ_j (λ_j / (λ_j + λ)²) (1/n²) Σ_i [X_i]_j²
      = (1/n) Σ_j λ_j² / (λ_j + λ)²,

using that (1/n) Σ_i [X_i]_j² = λ_j.
This holds with equality if Var(Y_i) = 1. For the bias term,

    ‖β̄_λ − β‖²_Σ = Σ_j λ_j ([β̄_λ]_j − β_j)² = Σ_j λ_j β_j² (λ / (λ_j + λ))² = Σ_j β_j² λ_j / (1 + λ_j/λ)²,

and the result follows from algebraic manipulations.

The following bound characterizes the risk for two natural settings of λ.

Corollary 3.2. Assume Var(Y_i) ≤ 1.

(Finite Dims) For λ = 0,

    E_Y[R(β̂_λ)] ≤ d/n,

and if Var(Y_i) = 1, then E_Y[R(β̂_λ)] = d/n.

(Infinite Dims) For λ = √(Σ_trace / (n ‖β‖²)),

    E_Y[R(β̂_λ)] ≤ √(‖β‖² Σ_trace / n) ≤ ‖β‖ ‖X‖ / √n,

where the trace norm Σ_trace is the sum of the eigenvalues of Σ and ‖X‖ = max_i ‖X_i‖. Furthermore, for all n there exists a distribution Pr[Y] and an X such that inf_λ E_Y[R(β̂_λ)] is, up to log factors, Ω(√(‖β‖² Σ_trace / n)) (so the above bound is tight up to log factors).

Conceptually, the second bound is dimension free, i.e. it does not depend explicitly on d, which could be infinite, and we are effectively doing regression in a large (potentially infinite) dimensional space.

Proof. The λ = 0 case follows directly from the previous lemma. Using that (a + b)² ≥ 4ab, we can bound the variance term for general λ as follows:

    (1/n) Σ_j λ_j² / (λ_j + λ)² ≤ (1/n) Σ_j λ_j² / (4 λ_j λ) = Σ_trace / (4nλ).

Again, using that (a + b)² ≥ 4ab, the bias term is bounded as

    Σ_j β_j² λ_j / (1 + λ_j/λ)² ≤ Σ_j β_j² λ_j / (4 λ_j/λ) = λ ‖β‖² / 4.

So we have that

    E_Y[R(β̂_λ)] ≤ Σ_trace / (4nλ) + λ ‖β‖² / 4,

and using the choice of λ completes the proof.
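The bound in the proof above can be checked numerically. The following is a hypothetical sketch (not from the notes; it assumes numpy and a synthetic Gaussian design): it estimates the fixed-design risk of the ridge estimator by Monte Carlo, with unit-variance noise and the "infinite dims" choice of λ, and compares it to Σ_trace/(4nλ) + λ‖β‖²/4.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 50
X = rng.normal(size=(n, d))
beta = rng.normal(size=d) / np.sqrt(d)       # ||beta|| ~ 1
Sigma = X.T @ X / n
Sigma_trace = np.trace(Sigma)
b2 = beta @ beta

lam = np.sqrt(Sigma_trace / (n * b2))        # the "infinite dims" choice of lambda

def ridge(Y, lam):
    # beta_hat_lambda = (Sigma + lam I)^{-1} (1/n) X^T Y
    return np.linalg.solve(Sigma + lam * np.eye(d), X.T @ Y / n)

# Monte Carlo estimate of E_Y[R(beta_hat_lambda)] with Var(Y_i) = 1,
# so the variance part of Lemma 3.1 holds with equality.
trials = 2000
risks = []
for _ in range(trials):
    Y = X @ beta + rng.normal(size=n)
    diff = X @ (ridge(Y, lam) - beta)
    risks.append(diff @ diff / n)            # R(bhat) = (1/n) ||X bhat - X beta||^2
emp_risk = np.mean(risks)

bound = Sigma_trace / (4 * n * lam) + lam * b2 / 4
print(f"empirical risk  = {emp_risk:.4f}")
print(f"corollary bound = {bound:.4f}")
```

The empirical risk should land below the bound, with the gap reflecting the slack in (a + b)² ≥ 4ab away from λ_j = λ.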
To see the above bound is tight, consider the following problem. Let X be such that λ_j = 1/j and β_j² = 1/j for j = 1, …, n (with d = n), and let Y = Xβ + η where η has unit-variance coordinates. Here, we have Σ_trace = Σ_j 1/j ≤ 1 + log n and ‖β‖² ≤ 1 + log n, so the upper bound is O(log n / √n). Now one can write the risk as

    E_Y[R(β̂_λ)] = Σ_{j=1}^n (1/n + λ²) / (1 + jλ)²            (3)
      ≈ (1/n + λ²) ∫_0^n dx / (1 + xλ)²                        (4)
      = (1/n + λ²) · n / (1 + nλ) = (1 + nλ²) / (1 + nλ),      (5)

and this is Ω(1/√n) for all λ.

However, now we show that with L2 complexity, we can effectively work in finite dimensions (where the dimension is chosen as a function of n).

3.2 Random Projections and Maximum Likelihood Estimation

First note that if we project to k = O(log(n/δ)/ε²) dimensions (using P = (1/√k)A), then we have that for all i,

    |Pβ · PX_i − β · X_i| ≤ ε ‖β‖ ‖X_i‖.                       (6)

Let us define the loss using only XP^⊤ as

    L_P(w) = (1/n) E_Y ‖Y − XP^⊤ w‖².

Let β_P be the best fit of Y with XP^⊤, i.e.

    β_P = argmin_w L_P(w),

and let β̂_P be the MLE fit of Y with XP^⊤ (so λ = 0). Now, by the previous corollary,

    E_Y[L_P(β̂_P)] − L_P(β_P) = (1/n) E_Y ‖XP^⊤ β̂_P − XP^⊤ β_P‖² ≤ k/n.

Also note that

    L_P(β_P) ≤ L_P(Pβ) = (1/n) E ‖Y − XP^⊤ Pβ‖²
      = (1/n) E ‖Y − Xβ‖² + (1/n) ‖Xβ − XP^⊤ Pβ‖²
      = L(β) + (1/n) Σ_i (β · X_i − Pβ · PX_i)²
      ≤ L(β) + (1/n) Σ_i ‖β‖² ‖X_i‖² ε²
      = L(β) + ‖β‖² Σ_trace ε².
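The construction above can be sketched numerically. The following is a hypothetical illustration (not from the notes; it assumes numpy, synthetic Gaussian data, and arbitrary constants): project a high-dimensional fixed design X to k = O(√n) dimensions with a Gaussian P, fit ordinary least squares (the λ = 0 fit) in the projected space, and compare its risk to a full-dimensional ridge fit.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 400, 2000                            # fixed design with d >> n
X = rng.normal(size=(n, d)) / np.sqrt(d)    # rows have norm ~ 1, so Sigma_trace ~ 1
beta = rng.normal(size=d) / np.sqrt(d)      # ||beta|| ~ 1
EY = X @ beta                               # the target E[Y] = X beta

k = int(2 * np.sqrt(n))                     # k grows as O(sqrt(n)); constant arbitrary
P = rng.normal(size=(k, d)) / np.sqrt(k)    # P = (1/sqrt(k)) A
Z = X @ P.T                                 # projected design; row i is (P X_i)^T

Y = EY + rng.normal(size=n)                 # one outcome draw, Var(Y_i) = 1

# Least squares in the k-dimensional projected space.
w_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
risk_proj = np.mean((Z @ w_hat - EY) ** 2)  # (1/n) ||Z w_hat - X beta||^2

# Full d-dimensional ridge with the "infinite dims" lambda, for comparison.
Sigma = X.T @ X / n
lam = np.sqrt(np.trace(Sigma) / (n * (beta @ beta)))
b_ridge = np.linalg.solve(Sigma + lam * np.eye(d), X.T @ Y / n)
risk_ridge = np.mean((X @ b_ridge - EY) ** 2)

print(f"k = {k}: projected-OLS risk = {risk_proj:.4f}, ridge risk = {risk_ridge:.4f}")
```

Both risks come out far below the trivial risk of 1 (the noise variance), with the projected fit paying roughly k/n in variance plus a small projection bias, in line with the theorem that follows.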
Theorem 3.3 (Risk Bound after Random Projection). Assuming Var(Y_i) ≤ 1, and that P is ε inner product preserving for k = O(log(n/δ)/ε²), then:

    (1/n) E_Y ‖XP^⊤ β̂_P − Xβ‖² = E_Y[L_P(β̂_P)] − L(β)
      ≤ k/n + ‖β‖² Σ_trace ε²
      = O(log(n/δ) / (nε²)) + ‖β‖² Σ_trace ε².

Hence, choosing ε² = O(√(log n / (n ‖β‖² Σ_trace))) implies that k = O(√(n ‖β‖² Σ_trace log n)) and:

    (1/n) E_Y ‖XP^⊤ β̂_P − Xβ‖² ≤ O( √( log n · ‖β‖² Σ_trace / n ) ).

Proof. From above we have that

    L(β) ≥ L_P(β_P) − ‖β‖² Σ_trace ε²,

so that

    E_Y[L_P(β̂_P)] − L(β) ≤ E_Y[L_P(β̂_P)] − L_P(β_P) + ‖β‖² Σ_trace ε²
      = (1/n) E_Y ‖XP^⊤ β̂_P − XP^⊤ β_P‖² + ‖β‖² Σ_trace ε²,

and we have bounded the risk in the last term by k/n.

This matches the ridge regression risk bound up to log factors. Also, our algorithm is simply an MLE estimate in k = O(√(n ‖β‖² Σ_trace log n)) dimensions. Note that the number of dimensions we choose grows as O(√n).

References

The discussion on classification used results from Santosh Vempala's monograph on random projections. The ridge regression results, to my knowledge, are new.