1 Review and Overview

Size: px
Start display at page:

Download "1 Review and Overview"

Transcription

1 CS9T/STATS3: Statstcal Learg Teory Lecturer: Tegyu Ma Lecture #7 Scrbe: Bra Zag October 5, 08 Revew ad Overvew We wll frst gve a bref revew of wat as bee covered so far I te frst few lectures, we stated ad proved asymptotcs o te maxmum lkelood estmator I partcular, te well-specfed case, we sowed tat L(ĥ) L( ) p + o ( ) were ĥ s te emprcal MLE, s te groud trut fucto, p s te umber of parameters te model, s te umber of trag examples, ad L s te expected egatve log lkelood (test error) We te trastoed to o-asymptotcs results, ad smultaeously we dropped te assumpto of well-specfcty We frst observed tat L(ĥ) L( ) ˆL() L() wc allows us to use uform covergece results to establs bouds o te geeralzato error Our atteto tus tured to boudg te rgt ad sde of te above equato I te case of a fte ypotess class H, usg Hoeffdg s equalty ad te a uo boud, we sowed tat, wt probablty at least δ, log H + log(/δ) ˆL() L() I te case of a fte ypotess class H parametrzed by a bouded parameter θ R p, we cocluded tat, wt probablty O ( p 0), p log ˆL() L() as log as te loss fuctos are suffcetly ce (Lpsctz ad bouded) Last week, we defed te oto of Rademacer complexty ad used t to smplfy our dscusso of uform covergece bouds I partcular, we saw tat log(/δ) ˆL() L() R S (F ) + were R S (F ) s te sample Rademacer complexty o a sample S of sze, ad F s a famly of loss fuctos We ote tat R S (F ) ca be replaced wt te expected Rademacer complexty R (F ) = E R S (F ), but tat t s usually easer to reaso about R S (F ) because tere s oe fewer expectato to worry about If we use te γ-marg loss, te R S (F ) R S(H) γ

2 I ts lecture, we wll focus o boudg R S (H), wc by Talagrad s Lemma ad te above dscusso wll lead to bouds o L(ĥ) L( ) We wll start wt te case of a lear fucto wt bouded parameter, ad move to te more geeral settg of a eural etwork wt a sgle dde layer We ote tat focusg o R S (H) stead of R S (F ) carres aoter advatage: t allows us to gore te labels ad look at oly te puts, wc ofte smplfes te aalyss Rademacer Complexty of Lear Models Teorem Let H = { x w T x : w B } Assume tat E x p x C Te R S (H) B x () ad Proof We ave R S (H) = E R (H) BC () w B = E w B w T x w T ( = B E x sce te L -orm s self-dual = B E x ) x by Jese s equalty: E z = E z E z for oegatve z, sce s cocave B E 0 / x + j x T x j were te secod sum vases because te s are depedet wt mea 0 / j = B x Ts proves Eq () Eq () follows from takg a expectato: R (H) = E R S (H) = B E x B E x BC were te frst equalty s aoter applcato of Jese, ad te secod follows by defto of C

3 Teorem Let H = { x w T x : w B } Assume tat x C for all Let d be te dmeso of te put; e x R d for all Te R (H) BC log d We ote frst te smlartes ad dffereces betwee Teorems ad Teorem looks somewat weaker: we ave to make a stroger assumpto, amely, x C for all, stead of E x C, ad te addto of a factor of log d We also remark tat t s possble eve lkely tat Teorems lke ad apply for oter pars of orm-dual orm pars However, sce te most commo orms are L, L, L, we wll restrct our atteto to tese for ts class We wll ow gve a sketc of te proof of Teorem Te full proof s left as a omework problem Proof Sketc We wll prove a slgtly dfferet statemet, amely tat wt probablty d O() over te radom coce of, we ave w B w T x BC log d (3) Ts looks smlar eoug to our teorem tat te teorem souds belevable as log as te tal of te dstrbuto of te LHS s ot too eavy ( ) w T x = w T x = B x w B w B Let v = x We clam tat, wt probablty d O() over te radom coce of, v C log d, from wc (3) follows mmedately To see ts, fx a dex j Te v j = x j Sce te x j are all bouded by C ad te are depedet, Hoeffdg s equalty mples tat ( ( )) ε Pr[ v j ε] exp O C Takg ε = O(C log d), we coclude tat Pr[ v j ε] d O() Now a uo boud over j =,, d gves tat Pr [ v j C log d ] d O(), as desred Notce tat te costat dde by te O() ca be made small by creasg te costat dde te expresso for ε; we wll tus ot cocer ourselves wt fgurg out te costat factors sce t s uecessary 3 Rademacer Complexty of Neural Networks Somewat surprsgly, t as bee emprcally observed [] tat, a eural etwork wt a sgle dde layer ad o regularzato, creasg te umber of dde uts mproves (or, at least, does ot urt) te geeralzato loss eve past te pot we te umber of dde uts suffces to aceve zero trag error A recet paper [] gves some teoretcal bouds tat justfy tese practcal observatos I te ext two lectures, we wll see some of tese bouds Trougout ts secto, we wll use te followg otato: 3

4 Θ = (w, U) are te parameters of te model f Θ (x) = w T φ(ux) s te model fucto m s te umber of dde uts (e te umber of rows of U, ad te umber of elemets of w) φ(x) = max(x, 0) s te (elemet-wse) ReLU fucto {x } = Rd s te trag set Te ma teorem we wll seek to prove as te form Teorem 3 (Teorem 3 [], formally) Suppose we tra a two-layer eural etwork to mmze a logstc loss fucto Te te geeralzato error decreases as te trag error creases Te proof ad precse statemet of ts result wll be deferred to te ext lecture We wll frst warm up by sowg a smpler statemet Teorem 4 (Specal case of Teorem 43, Secto 64 of Percy Lag s otes) Let H = { f Θ : w B ad u B for all } were u s te t colum of U Te R (H) BB C m Proof Let g U (x) φ(ux) for matrces U Te f Θ (x) = w T g U (x) Tus: R S (H) = E Θ = E Θ w T g U (x ) w T ( = E w Θ ) g U (x ) g U (x ) sce we ca always pck w to pot te same drecto as g U (x ) = B E g U: u j B j U (x ) B m E max φ(u T j x ) j u j B by symmetry = B m E u B φ(u T x ) 4

5 ({ Te expectato looks very suspcously lke R S x φ(u T x) : u B }) ; we oly dffer from te defto of Rademacer complexty by a absolute value sg I fact, some sources, cludg te orgal paper o Rademacer complexty [3] defe t wt te absolute value sg A aalogue of Talagrad s Lemma for ts defto s stated Teorem (Secto 3) of [3], wc we lose a extra factor of Usg ts, sce φ s -Lpsctz, we ca cotue R S (H) B m E u T x u B = B m E u T x u B = B ({ mr S x u T x : u B }) R (H) B ({ mr x u T x : u B }) m BB C were te last le follows from Teorem Teorem 4 s weaker ta te teorem we would lke to prove, sce tere s a m te umerator wc we d lke to get rd of We wll refe ts boud te ext lecture Refereces [] Beam Neysabur, Srad Bojaapall, Davd McAllester, ad Nat Srebro Explorg geeralzato deep learg I Advaces Neural Iformato Processg Systems, pages , 07 [] C We, J D Lee, Q Lu, ad T Ma O te Marg Teory of Feedforward Neural Networks ArXv e-prts, October 08 [3] Peter L Bartlett ad Saar Medelso Rademacer ad gaussa complextes: Rsk bouds ad structural results Joural of Mace Learg Researc, 3(Nov):463 48, 00 5

Rademacher Complexity. Examples

Rademacher Complexity. Examples Algorthmc Foudatos of Learg Lecture 3 Rademacher Complexty. Examples Lecturer: Patrck Rebesch Verso: October 16th 018 3.1 Itroducto I the last lecture we troduced the oto of Rademacher complexty ad showed

More information

Lecture 02: Bounding tail distributions of a random variable

Lecture 02: Bounding tail distributions of a random variable CSCI-B609: A Theorst s Toolkt, Fall 206 Aug 25 Lecture 02: Boudg tal dstrbutos of a radom varable Lecturer: Yua Zhou Scrbe: Yua Xe & Yua Zhou Let us cosder the ubased co flps aga. I.e. let the outcome

More information

3. Basic Concepts: Consequences and Properties

3. Basic Concepts: Consequences and Properties : 3. Basc Cocepts: Cosequeces ad Propertes Markku Jutt Overvew More advaced cosequeces ad propertes of the basc cocepts troduced the prevous lecture are derved. Source The materal s maly based o Sectos.6.8

More information

Part 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971))

Part 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971)) art 4b Asymptotc Results for MRR usg RESS Recall that the RESS statstc s a specal type of cross valdato procedure (see Alle (97)) partcular to the regresso problem ad volves fdg Y $,, the estmate at the

More information

Bayes (Naïve or not) Classifiers: Generative Approach

Bayes (Naïve or not) Classifiers: Generative Approach Logstc regresso Bayes (Naïve or ot) Classfers: Geeratve Approach What do we mea by Geeratve approach: Lear p(y), p(x y) ad the apply bayes rule to compute p(y x) for makg predctos Ths s essetally makg

More information

Bounds on the expected entropy and KL-divergence of sampled multinomial distributions. Brandon C. Roy

Bounds on the expected entropy and KL-divergence of sampled multinomial distributions. Brandon C. Roy Bouds o the expected etropy ad KL-dvergece of sampled multomal dstrbutos Brado C. Roy bcroy@meda.mt.edu Orgal: May 18, 2011 Revsed: Jue 6, 2011 Abstract Iformato theoretc quattes calculated from a sampled

More information

Lecture 4 Sep 9, 2015

Lecture 4 Sep 9, 2015 CS 388R: Radomzed Algorthms Fall 205 Prof. Erc Prce Lecture 4 Sep 9, 205 Scrbe: Xagru Huag & Chad Voegele Overvew I prevous lectures, we troduced some basc probablty, the Cheroff boud, the coupo collector

More information

9 U-STATISTICS. Eh =(m!) 1 Eh(X (1),..., X (m ) ) i.i.d

9 U-STATISTICS. Eh =(m!) 1 Eh(X (1),..., X (m ) ) i.i.d 9 U-STATISTICS Suppose,,..., are P P..d. wth CDF F. Our goal s to estmate the expectato t (P)=Eh(,,..., m ). Note that ths expectato requres more tha oe cotrast to E, E, or Eh( ). Oe example s E or P((,

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

Dimensionality Reduction and Learning

Dimensionality Reduction and Learning CMSC 35900 (Sprg 009) Large Scale Learg Lecture: 3 Dmesoalty Reducto ad Learg Istructors: Sham Kakade ad Greg Shakharovch L Supervsed Methods ad Dmesoalty Reducto The theme of these two lectures s that

More information

18.413: Error Correcting Codes Lab March 2, Lecture 8

18.413: Error Correcting Codes Lab March 2, Lecture 8 18.413: Error Correctg Codes Lab March 2, 2004 Lecturer: Dael A. Spelma Lecture 8 8.1 Vector Spaces A set C {0, 1} s a vector space f for x all C ad y C, x + y C, where we take addto to be compoet wse

More information

Point Estimation: definition of estimators

Point Estimation: definition of estimators Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.

More information

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture)

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture) CSE 546: Mache Learg Lecture 6 Feature Selecto: Part 2 Istructor: Sham Kakade Greedy Algorthms (cotued from the last lecture) There are varety of greedy algorthms ad umerous amg covetos for these algorthms.

More information

Idea is to sample from a different distribution that picks points in important regions of the sample space. Want ( ) ( ) ( ) E f X = f x g x dx

Idea is to sample from a different distribution that picks points in important regions of the sample space. Want ( ) ( ) ( ) E f X = f x g x dx Importace Samplg Used for a umber of purposes: Varace reducto Allows for dffcult dstrbutos to be sampled from. Sestvty aalyss Reusg samples to reduce computatoal burde. Idea s to sample from a dfferet

More information

Chapter 14 Logistic Regression Models

Chapter 14 Logistic Regression Models Chapter 4 Logstc Regresso Models I the lear regresso model X β + ε, there are two types of varables explaatory varables X, X,, X k ad study varable y These varables ca be measured o a cotuous scale as

More information

TESTS BASED ON MAXIMUM LIKELIHOOD

TESTS BASED ON MAXIMUM LIKELIHOOD ESE 5 Toy E. Smth. The Basc Example. TESTS BASED ON MAXIMUM LIKELIHOOD To llustrate the propertes of maxmum lkelhood estmates ad tests, we cosder the smplest possble case of estmatg the mea of the ormal

More information

Lecture 16: Backpropogation Algorithm Neural Networks with smooth activation functions

Lecture 16: Backpropogation Algorithm Neural Networks with smooth activation functions CO-511: Learg Theory prg 2017 Lecturer: Ro Lv Lecture 16: Bacpropogato Algorthm Dsclamer: These otes have ot bee subected to the usual scruty reserved for formal publcatos. They may be dstrbuted outsde

More information

Chapter 5 Properties of a Random Sample

Chapter 5 Properties of a Random Sample Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample

More information

X ε ) = 0, or equivalently, lim

X ε ) = 0, or equivalently, lim Revew for the prevous lecture Cocepts: order statstcs Theorems: Dstrbutos of order statstcs Examples: How to get the dstrbuto of order statstcs Chapter 5 Propertes of a Radom Sample Secto 55 Covergece

More information

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then Secto 5 Vectors of Radom Varables Whe workg wth several radom varables,,..., to arrage them vector form x, t s ofte coveet We ca the make use of matrx algebra to help us orgaze ad mapulate large umbers

More information

Supplementary Material for Limits on Sparse Support Recovery via Linear Sketching with Random Expander Matrices

Supplementary Material for Limits on Sparse Support Recovery via Linear Sketching with Random Expander Matrices Joata Scarlett ad Volka Cever Supplemetary Materal for Lmts o Sparse Support Recovery va Lear Sketcg wt Radom Expader Matrces (AISTATS 26, Joata Scarlett ad Volka Cever) Note tat all ctatos ere are to

More information

Lecture Notes Types of economic variables

Lecture Notes Types of economic variables Lecture Notes 3 1. Types of ecoomc varables () Cotuous varable takes o a cotuum the sample space, such as all pots o a le or all real umbers Example: GDP, Polluto cocetrato, etc. () Dscrete varables fte

More information

A tighter lower bound on the circuit size of the hardest Boolean functions

A tighter lower bound on the circuit size of the hardest Boolean functions Electroc Colloquum o Computatoal Complexty, Report No. 86 2011) A tghter lower boud o the crcut sze of the hardest Boolea fuctos Masak Yamamoto Abstract I [IPL2005], Fradse ad Mlterse mproved bouds o the

More information

Lecture 9: Tolerant Testing

Lecture 9: Tolerant Testing Lecture 9: Tolerat Testg Dael Kae Scrbe: Sakeerth Rao Aprl 4, 07 Abstract I ths lecture we prove a quas lear lower boud o the umber of samples eeded to do tolerat testg for L dstace. Tolerat Testg We have

More information

Functions of Random Variables

Functions of Random Variables Fuctos of Radom Varables Chapter Fve Fuctos of Radom Varables 5. Itroducto A geeral egeerg aalyss model s show Fg. 5.. The model output (respose) cotas the performaces of a system or product, such as weght,

More information

Nonlinear Piecewise-Defined Difference Equations with Reciprocal Quadratic Terms

Nonlinear Piecewise-Defined Difference Equations with Reciprocal Quadratic Terms Joural of Matematcs ad Statstcs Orgal Researc Paper Nolear Pecewse-Defed Dfferece Equatos wt Recprocal Quadratc Terms Ramada Sabra ad Saleem Safq Al-Asab Departmet of Matematcs, Faculty of Scece, Jaza

More information

STK4011 and STK9011 Autumn 2016

STK4011 and STK9011 Autumn 2016 STK4 ad STK9 Autum 6 Pot estmato Covers (most of the followg materal from chapter 7: Secto 7.: pages 3-3 Secto 7..: pages 3-33 Secto 7..: pages 35-3 Secto 7..3: pages 34-35 Secto 7.3.: pages 33-33 Secto

More information

18.657: Mathematics of Machine Learning

18.657: Mathematics of Machine Learning 8.657: Mathematcs of Mache Learg Lecturer: Phlppe Rgollet Lecture 3 Scrbe: James Hrst Sep. 6, 205.5 Learg wth a fte dctoary Recall from the ed of last lecture our setup: We are workg wth a fte dctoary

More information

MATH 247/Winter Notes on the adjoint and on normal operators.

MATH 247/Winter Notes on the adjoint and on normal operators. MATH 47/Wter 00 Notes o the adjot ad o ormal operators I these otes, V s a fte dmesoal er product space over, wth gve er * product uv, T, S, T, are lear operators o V U, W are subspaces of V Whe we say

More information

Numerical Analysis Formulae Booklet

Numerical Analysis Formulae Booklet Numercal Aalyss Formulae Booklet. Iteratve Scemes for Systems of Lear Algebrac Equatos:.... Taylor Seres... 3. Fte Dfferece Approxmatos... 3 4. Egevalues ad Egevectors of Matrces.... 3 5. Vector ad Matrx

More information

CHAPTER 4 RADICAL EXPRESSIONS

CHAPTER 4 RADICAL EXPRESSIONS 6 CHAPTER RADICAL EXPRESSIONS. The th Root of a Real Number A real umber a s called the th root of a real umber b f Thus, for example: s a square root of sce. s also a square root of sce ( ). s a cube

More information

Lecture 3 Probability review (cont d)

Lecture 3 Probability review (cont d) STATS 00: Itroducto to Statstcal Iferece Autum 06 Lecture 3 Probablty revew (cot d) 3. Jot dstrbutos If radom varables X,..., X k are depedet, the ther dstrbuto may be specfed by specfyg the dvdual dstrbuto

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted

More information

arxiv: v1 [cs.lg] 22 Feb 2015

arxiv: v1 [cs.lg] 22 Feb 2015 SDCA wthout Dualty Sha Shalev-Shwartz arxv:50.0677v cs.lg Feb 05 Abstract Stochastc Dual Coordate Ascet s a popular method for solvg regularzed loss mmzato for the case of covex losses. I ths paper we

More information

Kernel-based Methods and Support Vector Machines

Kernel-based Methods and Support Vector Machines Kerel-based Methods ad Support Vector Maches Larr Holder CptS 570 Mache Learg School of Electrcal Egeerg ad Computer Scece Washgto State Uverst Refereces Muller et al. A Itroducto to Kerel-Based Learg

More information

1 Onto functions and bijections Applications to Counting

1 Onto functions and bijections Applications to Counting 1 Oto fuctos ad bectos Applcatos to Coutg Now we move o to a ew topc. Defto 1.1 (Surecto. A fucto f : A B s sad to be surectve or oto f for each b B there s some a A so that f(a B. What are examples of

More information

1 Solution to Problem 6.40

1 Solution to Problem 6.40 1 Soluto to Problem 6.40 (a We wll wrte T τ (X 1,...,X where the X s are..d. wth PDF f(x µ, σ 1 ( x µ σ g, σ where the locato parameter µ s ay real umber ad the scale parameter σ s > 0. Lettg Z X µ σ we

More information

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek Partally Codtoal Radom Permutato Model 7- vestgato of Partally Codtoal RP Model wth Respose Error TRODUCTO Ed Staek We explore the predctor that wll result a smple radom sample wth respose error whe a

More information

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The

More information

CHAPTER VI Statistical Analysis of Experimental Data

CHAPTER VI Statistical Analysis of Experimental Data Chapter VI Statstcal Aalyss of Expermetal Data CHAPTER VI Statstcal Aalyss of Expermetal Data Measuremets do ot lead to a uque value. Ths s a result of the multtude of errors (maly radom errors) that ca

More information

Complete Convergence and Some Maximal Inequalities for Weighted Sums of Random Variables

Complete Convergence and Some Maximal Inequalities for Weighted Sums of Random Variables Joural of Sceces, Islamc Republc of Ira 8(4): -6 (007) Uversty of Tehra, ISSN 06-04 http://sceces.ut.ac.r Complete Covergece ad Some Maxmal Iequaltes for Weghted Sums of Radom Varables M. Am,,* H.R. Nl

More information

Homework 1: Solutions Sid Banerjee Problem 1: (Practice with Asymptotic Notation) ORIE 4520: Stochastics at Scale Fall 2015

Homework 1: Solutions Sid Banerjee Problem 1: (Practice with Asymptotic Notation) ORIE 4520: Stochastics at Scale Fall 2015 Fall 05 Homework : Solutos Problem : (Practce wth Asymptotc Notato) A essetal requremet for uderstadg scalg behavor s comfort wth asymptotc (or bg-o ) otato. I ths problem, you wll prove some basc facts

More information

Asymptotic Formulas Composite Numbers II

Asymptotic Formulas Composite Numbers II Iteratoal Matematcal Forum, Vol. 8, 203, o. 34, 65-662 HIKARI Ltd, www.m-kar.com ttp://d.do.org/0.2988/mf.203.3854 Asymptotc Formulas Composte Numbers II Rafael Jakmczuk Dvsó Matemátca, Uversdad Nacoal

More information

arxiv:math/ v1 [math.gm] 8 Dec 2005

arxiv:math/ v1 [math.gm] 8 Dec 2005 arxv:math/05272v [math.gm] 8 Dec 2005 A GENERALIZATION OF AN INEQUALITY FROM IMO 2005 NIKOLAI NIKOLOV The preset paper was spred by the thrd problem from the IMO 2005. A specal award was gve to Yure Boreko

More information

4 Inner Product Spaces

4 Inner Product Spaces 11.MH1 LINEAR ALGEBRA Summary Notes 4 Ier Product Spaces Ier product s the abstracto to geeral vector spaces of the famlar dea of the scalar product of two vectors or 3. I what follows, keep these key

More information

8.1 Hashing Algorithms

8.1 Hashing Algorithms CS787: Advaced Algorthms Scrbe: Mayak Maheshwar, Chrs Hrchs Lecturer: Shuch Chawla Topc: Hashg ad NP-Completeess Date: September 21 2007 Prevously we looked at applcatos of radomzed algorthms, ad bega

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

D KL (P Q) := p i ln p i q i

D KL (P Q) := p i ln p i q i Cheroff-Bouds 1 The Geeral Boud Let P 1,, m ) ad Q q 1,, q m ) be two dstrbutos o m elemets, e,, q 0, for 1,, m, ad m 1 m 1 q 1 The Kullback-Lebler dvergece or relatve etroy of P ad Q s defed as m D KL

More information

General Method for Calculating Chemical Equilibrium Composition

General Method for Calculating Chemical Equilibrium Composition AE 6766/Setzma Sprg 004 Geeral Metod for Calculatg Cemcal Equlbrum Composto For gve tal codtos (e.g., for gve reactats, coose te speces to be cluded te products. As a example, for combusto of ydroge wt

More information

The Arithmetic-Geometric mean inequality in an external formula. Yuki Seo. October 23, 2012

The Arithmetic-Geometric mean inequality in an external formula. Yuki Seo. October 23, 2012 Sc. Math. Japocae Vol. 00, No. 0 0000, 000 000 1 The Arthmetc-Geometrc mea equalty a exteral formula Yuk Seo October 23, 2012 Abstract. The classcal Jese equalty ad ts reverse are dscussed by meas of terally

More information

Introduction to Matrices and Matrix Approach to Simple Linear Regression

Introduction to Matrices and Matrix Approach to Simple Linear Regression Itroducto to Matrces ad Matrx Approach to Smple Lear Regresso Matrces Defto: A matrx s a rectagular array of umbers or symbolc elemets I may applcatos, the rows of a matrx wll represet dvduals cases (people,

More information

The Mathematical Appendix

The Mathematical Appendix The Mathematcal Appedx Defto A: If ( Λ, Ω, where ( λ λ λ whch the probablty dstrbutos,,..., Defto A. uppose that ( Λ,,..., s a expermet type, the σ-algebra o λ λ λ are defed s deoted by ( (,,...,, σ Ω.

More information

Generalized Linear Regression with Regularization

Generalized Linear Regression with Regularization Geeralze Lear Regresso wth Regularzato Zoya Bylsk March 3, 05 BASIC REGRESSION PROBLEM Note: I the followg otes I wll make explct what s a vector a what s a scalar usg vec t or otato, to avo cofuso betwee

More information

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements Aoucemets No-Parametrc Desty Estmato Techques HW assged Most of ths lecture was o the blacboard. These sldes cover the same materal as preseted DHS Bometrcs CSE 90-a Lecture 7 CSE90a Fall 06 CSE90a Fall

More information

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best Error Aalyss Preamble Wheever a measuremet s made, the result followg from that measuremet s always subject to ucertaty The ucertaty ca be reduced by makg several measuremets of the same quatty or by mprovg

More information

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 THE ROYAL STATISTICAL SOCIETY 06 EAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 The Socety s provdg these solutos to assst cadtes preparg for the examatos 07. The solutos are teded as learg ads ad should

More information

Special Instructions / Useful Data

Special Instructions / Useful Data JAM 6 Set of all real umbers P A..d. B, p Posso Specal Istructos / Useful Data x,, :,,, x x Probablty of a evet A Idepedetly ad detcally dstrbuted Bomal dstrbuto wth parameters ad p Posso dstrbuto wth

More information

PTAS for Bin-Packing

PTAS for Bin-Packing CS 663: Patter Matchg Algorthms Scrbe: Che Jag /9/00. Itroducto PTAS for B-Packg The B-Packg problem s NP-hard. If we use approxmato algorthms, the B-Packg problem could be solved polyomal tme. For example,

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Mache Learg Problem set Due Frday, September 9, rectato Please address all questos ad commets about ths problem set to 6.867-staff@a.mt.edu. You do ot eed to use MATLAB for ths problem set though

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

CS286.2 Lecture 4: Dinur s Proof of the PCP Theorem

CS286.2 Lecture 4: Dinur s Proof of the PCP Theorem CS86. Lecture 4: Dur s Proof of the PCP Theorem Scrbe: Thom Bohdaowcz Prevously, we have prove a weak verso of the PCP theorem: NP PCP 1,1/ (r = poly, q = O(1)). Wth ths result we have the desred costat

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Marquette Uverst Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Coprght 08 b Marquette Uverst Maxmum Lkelhood Estmato We have bee sag that ~

More information

Chapter 9 Jordan Block Matrices

Chapter 9 Jordan Block Matrices Chapter 9 Jorda Block atrces I ths chapter we wll solve the followg problem. Gve a lear operator T fd a bass R of F such that the matrx R (T) s as smple as possble. f course smple s a matter of taste.

More information

CIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights

CIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights CIS 800/002 The Algorthmc Foudatos of Data Prvacy October 13, 2011 Lecturer: Aaro Roth Lecture 9 Scrbe: Aaro Roth Database Update Algorthms: Multplcatve Weghts We ll recall aga) some deftos from last tme:

More information

Multivariate Transformation of Variables and Maximum Likelihood Estimation

Multivariate Transformation of Variables and Maximum Likelihood Estimation Marquette Uversty Multvarate Trasformato of Varables ad Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Assocate Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 03 by Marquette Uversty

More information

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b CS 70 Dscrete Mathematcs ad Probablty Theory Fall 206 Sesha ad Walrad DIS 0b. Wll I Get My Package? Seaky delvery guy of some compay s out delverg packages to customers. Not oly does he had a radom package

More information

Laboratory I.10 It All Adds Up

Laboratory I.10 It All Adds Up Laboratory I. It All Adds Up Goals The studet wll work wth Rema sums ad evaluate them usg Derve. The studet wll see applcatos of tegrals as accumulatos of chages. The studet wll revew curve fttg sklls.

More information

Non-uniform Turán-type problems

Non-uniform Turán-type problems Joural of Combatoral Theory, Seres A 111 2005 106 110 wwwelsevercomlocatecta No-uform Turá-type problems DhruvMubay 1, Y Zhao 2 Departmet of Mathematcs, Statstcs, ad Computer Scece, Uversty of Illos at

More information

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA THE ROYAL STATISTICAL SOCIETY 3 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA PAPER I STATISTICAL THEORY & METHODS The Socety provdes these solutos to assst caddates preparg for the examatos future years ad

More information

= lim. (x 1 x 2... x n ) 1 n. = log. x i. = M, n

= lim. (x 1 x 2... x n ) 1 n. = log. x i. = M, n .. Soluto of Problem. M s obvously cotuous o ], [ ad ], [. Observe that M x,..., x ) M x,..., x ) )..) We ext show that M s odecreasg o ], [. Of course.) mles that M s odecreasg o ], [ as well. To show

More information

Qualifying Exam Statistical Theory Problem Solutions August 2005

Qualifying Exam Statistical Theory Problem Solutions August 2005 Qualfyg Exam Statstcal Theory Problem Solutos August 5. Let X, X,..., X be d uform U(,),

More information

Lecture 3. Sampling, sampling distributions, and parameter estimation

Lecture 3. Sampling, sampling distributions, and parameter estimation Lecture 3 Samplg, samplg dstrbutos, ad parameter estmato Samplg Defto Populato s defed as the collecto of all the possble observatos of terest. The collecto of observatos we take from the populato s called

More information

Exercises for Square-Congruence Modulo n ver 11

Exercises for Square-Congruence Modulo n ver 11 Exercses for Square-Cogruece Modulo ver Let ad ab,.. Mark True or False. a. 3S 30 b. 3S 90 c. 3S 3 d. 3S 4 e. 4S f. 5S g. 0S 55 h. 8S 57. 9S 58 j. S 76 k. 6S 304 l. 47S 5347. Fd the equvalece classes duced

More information

5 Short Proofs of Simplified Stirling s Approximation

5 Short Proofs of Simplified Stirling s Approximation 5 Short Proofs of Smplfed Strlg s Approxmato Ofr Gorodetsky, drtymaths.wordpress.com Jue, 20 0 Itroducto Strlg s approxmato s the followg (somewhat surprsg) approxmato of the factoral,, usg elemetary fuctos:

More information

Transforms that are commonly used are separable

Transforms that are commonly used are separable Trasforms s Trasforms that are commoly used are separable Eamples: Two-dmesoal DFT DCT DST adamard We ca the use -D trasforms computg the D separable trasforms: Take -D trasform of the rows > rows ( )

More information

Lecture Note to Rice Chapter 8

Lecture Note to Rice Chapter 8 ECON 430 HG revsed Nov 06 Lecture Note to Rce Chapter 8 Radom matrces Let Y, =,,, m, =,,, be radom varables (r.v. s). The matrx Y Y Y Y Y Y Y Y Y Y = m m m s called a radom matrx ( wth a ot m-dmesoal dstrbuto,

More information

Class 13,14 June 17, 19, 2015

Class 13,14 June 17, 19, 2015 Class 3,4 Jue 7, 9, 05 Pla for Class3,4:. Samplg dstrbuto of sample mea. The Cetral Lmt Theorem (CLT). Cofdece terval for ukow mea.. Samplg Dstrbuto for Sample mea. Methods used are based o CLT ( Cetral

More information

Beam Warming Second-Order Upwind Method

Beam Warming Second-Order Upwind Method Beam Warmg Secod-Order Upwd Method Petr Valeta Jauary 6, 015 Ths documet s a part of the assessmet work for the subject 1DRP Dfferetal Equatos o Computer lectured o FNSPE CTU Prague. Abstract Ths documet

More information

A Remark on the Uniform Convergence of Some Sequences of Functions

A Remark on the Uniform Convergence of Some Sequences of Functions Advaces Pure Mathematcs 05 5 57-533 Publshed Ole July 05 ScRes. http://www.scrp.org/joural/apm http://dx.do.org/0.436/apm.05.59048 A Remark o the Uform Covergece of Some Sequeces of Fuctos Guy Degla Isttut

More information

Third handout: On the Gini Index

Third handout: On the Gini Index Thrd hadout: O the dex Corrado, a tala statstca, proposed (, 9, 96) to measure absolute equalt va the mea dfferece whch s defed as ( / ) where refers to the total umber of dvduals socet. Assume that. The

More information

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn:

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn: Chapter 3 3- Busess Statstcs: A Frst Course Ffth Edto Chapter 2 Correlato ad Smple Lear Regresso Busess Statstcs: A Frst Course, 5e 29 Pretce-Hall, Ic. Chap 2- Learg Objectves I ths chapter, you lear:

More information

Module 7: Probability and Statistics

Module 7: Probability and Statistics Lecture 4: Goodess of ft tests. Itroducto Module 7: Probablty ad Statstcs I the prevous two lectures, the cocepts, steps ad applcatos of Hypotheses testg were dscussed. Hypotheses testg may be used to

More information

22 Nonparametric Methods.

22 Nonparametric Methods. 22 oparametrc Methods. I parametrc models oe assumes apror that the dstrbutos have a specfc form wth oe or more ukow parameters ad oe tres to fd the best or atleast reasoably effcet procedures that aswer

More information

CHAPTER 3 POSTERIOR DISTRIBUTIONS

CHAPTER 3 POSTERIOR DISTRIBUTIONS CHAPTER 3 POSTERIOR DISTRIBUTIONS If scece caot measure the degree of probablt volved, so much the worse for scece. The practcal ma wll stck to hs apprecatve methods utl t does, or wll accept the results

More information

Chain Rules for Entropy

Chain Rules for Entropy Cha Rules for Etroy The etroy of a collecto of radom varables s the sum of codtoal etroes. Theorem: Let be radom varables havg the mass robablty x x.x. The...... The roof s obtaed by reeatg the alcato

More information

means the first term, a2 means the term, etc. Infinite Sequences: follow the same pattern forever.

means the first term, a2 means the term, etc. Infinite Sequences: follow the same pattern forever. 9.4 Sequeces ad Seres Pre Calculus 9.4 SEQUENCES AND SERIES Learg Targets:. Wrte the terms of a explctly defed sequece.. Wrte the terms of a recursvely defed sequece. 3. Determe whether a sequece s arthmetc,

More information

b. There appears to be a positive relationship between X and Y; that is, as X increases, so does Y.

b. There appears to be a positive relationship between X and Y; that is, as X increases, so does Y. .46. a. The frst varable (X) s the frst umber the par ad s plotted o the horzotal axs, whle the secod varable (Y) s the secod umber the par ad s plotted o the vertcal axs. The scatterplot s show the fgure

More information

å 1 13 Practice Final Examination Solutions - = CS109 Dec 5, 2018

å 1 13 Practice Final Examination Solutions - = CS109 Dec 5, 2018 Chrs Pech Fal Practce CS09 Dec 5, 08 Practce Fal Examato Solutos. Aswer: 4/5 8/7. There are multle ways to obta ths aswer; here are two: The frst commo method s to sum over all ossbltes for the rak of

More information

Introduction to local (nonparametric) density estimation. methods

Introduction to local (nonparametric) density estimation. methods Itroducto to local (oparametrc) desty estmato methods A slecture by Yu Lu for ECE 66 Sprg 014 1. Itroducto Ths slecture troduces two local desty estmato methods whch are Parze desty estmato ad k-earest

More information

SPECIAL CONSIDERATIONS FOR VOLUMETRIC Z-TEST FOR PROPORTIONS

SPECIAL CONSIDERATIONS FOR VOLUMETRIC Z-TEST FOR PROPORTIONS SPECIAL CONSIDERAIONS FOR VOLUMERIC Z-ES FOR PROPORIONS Oe s stctve reacto to the questo of whether two percetages are sgfcatly dfferet from each other s to treat them as f they were proportos whch the

More information

Chapter 8: Statistical Analysis of Simulated Data

Chapter 8: Statistical Analysis of Simulated Data Marquette Uversty MSCS600 Chapter 8: Statstcal Aalyss of Smulated Data Dael B. Rowe, Ph.D. Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 08 by Marquette Uversty MSCS600 Ageda 8. The Sample

More information

Vapnik Chervonenkis classes

Vapnik Chervonenkis classes Vapk Chervoeks classes Maxm Ragsky September 23, 2014 A key result o the ERM algorthm, proved the prevous lecture, was that P( f ) L (F ) + 4ER (F (Z )) + 2log(1/δ) wth probablty at least 1 δ. The quatty

More information

6. Nonparametric techniques

6. Nonparametric techniques 6. Noparametrc techques Motvato Problem: how to decde o a sutable model (e.g. whch type of Gaussa) Idea: just use the orgal data (lazy learg) 2 Idea 1: each data pot represets a pece of probablty P(x)

More information

Chapter 3. Differentiation 3.3 Differentiation Rules

Chapter 3. Differentiation 3.3 Differentiation Rules 3.3 Dfferetato Rules 1 Capter 3. Dfferetato 3.3 Dfferetato Rules Dervatve of a Costat Fucto. If f as te costat value f(x) = c, te f x = [c] = 0. x Proof. From te efto: f (x) f(x + ) f(x) o c c 0 = 0. QED

More information

Chapter 3. Differentiation 3.2 Differentiation Rules for Polynomials, Exponentials, Products and Quotients

Chapter 3. Differentiation 3.2 Differentiation Rules for Polynomials, Exponentials, Products and Quotients 3.2 Dfferetato Rules 1 Capter 3. Dfferetato 3.2 Dfferetato Rules for Polyomals, Expoetals, Proucts a Quotets Rule 1. Dervatve of a Costat Fucto. If f as te costat value f(x) = c, te f x = [c] = 0. x Proof.

More information

Algorithms Design & Analysis. Hash Tables

Algorithms Design & Analysis. Hash Tables Algorthms Desg & Aalyss Hash Tables Recap Lower boud Order statstcs 2 Today s topcs Drect-accessble table Hash tables Hash fuctos Uversal hashg Perfect Hashg Ope addressg 3 Symbol-table problem Symbol

More information

Chapter 8. Inferences about More Than Two Population Central Values

Chapter 8. Inferences about More Than Two Population Central Values Chapter 8. Ifereces about More Tha Two Populato Cetral Values Case tudy: Effect of Tmg of the Treatmet of Port-We tas wth Lasers ) To vestgate whether treatmet at a youg age would yeld better results tha

More information

The number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter

The number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter LOGISTIC REGRESSION Notato Model Logstc regresso regresses a dchotomous depedet varable o a set of depedet varables. Several methods are mplemeted for selectg the depedet varables. The followg otato s

More information

Likewise, properties of the optimal policy for equipment replacement & maintenance problems can be used to reduce the computation.

Likewise, properties of the optimal policy for equipment replacement & maintenance problems can be used to reduce the computation. Whe solvg a vetory repleshmet problem usg a MDP model, kowg that the optmal polcy s of the form (s,s) ca reduce the computatoal burde. That s, f t s optmal to replesh the vetory whe the vetory level s,

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Exam: ECON430 Statstcs Date of exam: Frday, December 8, 07 Grades are gve: Jauary 4, 08 Tme for exam: 0900 am 00 oo The problem set covers 5 pages Resources allowed:

More information