16 EXPECTATION MAXIMIZATION


A hen is only an egg's way of making another egg. -- Samuel Butler

Suppose you were building a naive Bayes model for a text categorization problem. After you were done, your boss told you that it became prohibitively expensive to obtain labeled data. You now have a probabilistic model that assumes access to labels, but you don't have any labels! Can you still do something? Amazingly, you can. You can treat the labels as hidden variables, and attempt to learn them at the same time as you learn the parameters of your model. A very broad family of algorithms for solving problems just like this is the expectation maximization family. In this chapter, you will derive expectation maximization (EM) algorithms for clustering and dimensionality reduction, and then see why EM works.

Learning Objectives:
- Explain the relationship between parameters and hidden variables.
- Construct generative stories for clustering and dimensionality reduction.
- Draw a graph explaining how EM works by constructing convex lower bounds.
- Implement EM for clustering with mixtures of Gaussians, and contrast it with k-means.
- Evaluate the differences between EM and gradient descent for hidden variable models.

Dependencies:

16.1 Grading an Exam without an Answer Key

Alice's machine learning professor Carlos gives out an exam that consists of 50 true/false questions. Alice's class of 100 students takes the exam and Carlos goes to grade their solutions. If Carlos made an answer key, this would be easy: he would just count the fraction of correctly answered questions each student got, and that would be their score. But, like many professors, Carlos was really busy and didn't have time to make an answer key. Can he still grade the exam?

There are two insights that suggest that he might be able to. Suppose he knows ahead of time that Alice is an awesome student, and is basically guaranteed to get 100% on the exam. In that case, Carlos can simply use Alice's answers as the ground truth. More generally, if Carlos assumes that on average students are better than random guessing, he can hope that the majority answer for each question is likely to be correct. Combining this with the previous insight, when doing the voting, he might want to pay more attention to the answers of the better students.

To be a bit more pedantic, suppose there are $N = 100$ students and $M = 50$ questions. Each student $n$ has a score $s_n$, between 0 and 1, that denotes how well they do on the exam. The score is what we really want to compute.

For each question $m$ and each student $n$, the student has provided an answer $a_{n,m}$, which is either zero or one. There is also an unknown ground truth answer for each question, which we'll call $t_m$, which is also either zero or one.

As a starting point, let's consider a simple heuristic and then complexify it. The heuristic is the "majority vote" heuristic, and works as follows. First, we estimate $t_m$ as the most common answer for question $m$: $t_m = \arg\max_t \sum_n \mathbf{1}[a_{n,m} = t]$. Once we have a guess for each true answer, we estimate each student's score as how many answers they produced that match this guessed key: $s_n = \frac{1}{M} \sum_m \mathbf{1}[a_{n,m} = t_m]$.

Once we have these scores, however, we might want to trust some of the students more than others. In particular, answers from students with high scores are perhaps more likely to be correct, so we can recompute the ground truth according to weighted votes. The weight of the votes will be precisely the score of the corresponding student:

$$t_m = \arg\max_t \sum_n s_n \, \mathbf{1}[a_{n,m} = t] \qquad (16.1)$$

You can recognize this as a chicken and egg problem. If you knew the students' scores, you could estimate an answer key. If you had an answer key, you could compute student scores. A very common strategy in computer science for dealing with such chicken and egg problems is to iterate: take a guess at the first, compute the second, recompute the first, and so on.

In order to develop this idea formally, we have to cast the problem in terms of a probabilistic model with a generative story. The generative story we'll use is:

1. For each question $m$, choose a true answer $t_m \sim \text{Ber}(0.5)$
2. For each student $n$, choose a score $s_n \sim \text{Uni}(0, 1)$
3. For each question $m$ and each student $n$, choose an answer $a_{n,m} \sim \text{Ber}(s_n)^{t_m} \, \text{Ber}(1 - s_n)^{1 - t_m}$

In the first step, we generate the true answers independently by flipping a fair coin. In the second step, each student's overall score is determined to be a uniform random number between zero and one. The tricky step is step three, where each student's answer is generated for each question. Consider student $n$ answering question $m$, and suppose that $s_n = 0.9$. If $t_m = 1$, then $a_{n,m}$ should be 1 (i.e., correct) 90% of the time; this can be accomplished by drawing the answer from Ber(0.9). On the other hand, if $t_m = 0$, then $a_{n,m}$ should be 1 (i.e., incorrect) 10% of the time; this can be accomplished by drawing the answer from Ber(0.1). The exponent in step 3 selects which of the two Bernoulli distributions to draw from, and thereby implements this rule.
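To make the generative story concrete, here is a minimal sketch that samples an exam dataset from it; the code and the variable names (num_students, num_questions, and so on) are our own illustration, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
num_students, num_questions = 100, 50          # N and M from the text

t = rng.binomial(1, 0.5, size=num_questions)   # step 1: true answers ~ Ber(0.5)
s = rng.uniform(0.0, 1.0, size=num_students)   # step 2: student scores ~ Uni(0, 1)

# Step 3: a_{n,m} should equal t_m with probability s_n, so the chance
# that a_{n,m} = 1 is s_n when t_m = 1 and (1 - s_n) when t_m = 0.
prob_one = np.where(t == 1, s[:, None], 1 - s[:, None])
a = rng.binomial(1, prob_one)                  # answer matrix, shape (N, M)
```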

This can be translated into the following likelihood:

$$p(a, t, s) = \left[ \prod_m 0.5 \right] \prod_m \prod_n s_n^{a_{n,m} t_m} (1-s_n)^{(1-a_{n,m}) t_m} \, s_n^{(1-a_{n,m})(1-t_m)} (1-s_n)^{a_{n,m}(1-t_m)} \qquad (16.2)$$
$$= 0.5^M \prod_m \prod_n s_n^{a_{n,m} t_m} (1-s_n)^{(1-a_{n,m}) t_m} \, s_n^{(1-a_{n,m})(1-t_m)} (1-s_n)^{a_{n,m}(1-t_m)} \qquad (16.3)$$

Suppose we knew the true labels $t$. We could take the log of this likelihood and differentiate it with respect to the score $s_n$ of some student (note: we can drop the $0.5^M$ term because it is just a constant):

$$\log p(a, t, s) = \sum_m \sum_n \Big[ a_{n,m} t_m \log s_n + (1-a_{n,m})(1-t_m) \log s_n + (1-a_{n,m}) t_m \log(1-s_n) + a_{n,m}(1-t_m) \log(1-s_n) \Big] + \text{const} \qquad (16.4)$$

$$\frac{\partial \log p(a, t, s)}{\partial s_n} = \sum_m \left[ \frac{a_{n,m} t_m + (1-a_{n,m})(1-t_m)}{s_n} - \frac{(1-a_{n,m}) t_m + a_{n,m}(1-t_m)}{1 - s_n} \right] \qquad (16.5)$$

The derivative has the form $\frac{A}{s_n} - \frac{B}{1-s_n}$. If we set this equal to zero and solve for $s_n$, we get an optimum of $s_n = \frac{A}{A+B}$. In this case:

$$A = \sum_m \big[ a_{n,m} t_m + (1-a_{n,m})(1-t_m) \big] \qquad (16.6)$$
$$B = \sum_m \big[ (1-a_{n,m}) t_m + a_{n,m}(1-t_m) \big] \qquad (16.7)$$
$$A + B = \sum_m 1 = M \qquad (16.8)$$

Putting this together, we get:

$$s_n = \frac{1}{M} \sum_m \big[ a_{n,m} t_m + (1-a_{n,m})(1-t_m) \big] \qquad (16.9)$$

In the case of known $t_m$s, this matches exactly what we had in the heuristic. However, we do not know $t_m$, so instead of using the true values of $t_m$, we're going to use their expectations. In particular, we will compute $s_n$ by maximizing its likelihood under the expected values of $t_m$, hence the name "expectation maximization."
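In code, Eq (16.9) is a one-line computation. Continuing the simulation sketch above (array names a and t are our own), and pretending for a moment that the truths are known:

```python
# Eq (16.9): with known truths t, a student's score is the fraction of
# questions on which they agree with the key (a=1,t=1 or a=0,t=0).
s_hat = (a * t + (1 - a) * (1 - t)).mean(axis=1)   # shape (N,)
```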

If we are going to compute expectations of $t_m$, we have to say: expectations according to which probability distribution? We will use the distribution $p(t_m \mid a, s)$. Let $\tilde{t}_m$ denote $\mathbb{E}_{t_m \sim p(t_m \mid a, s)}[t_m]$. Because $t_m$ is a binary variable, its expectation is equal to its probability; namely: $\tilde{t}_m = p(t_m = 1 \mid a, s)$. How can we compute this? We will compute $C_m = p(t_m = 1, a \mid s)$ and $D_m = p(t_m = 0, a \mid s)$ and then compute $\tilde{t}_m = C_m / (C_m + D_m)$. The computation is straightforward:

$$C_m = 0.5 \prod_n s_n^{a_{n,m}} (1-s_n)^{1-a_{n,m}} = 0.5 \prod_{n :\, a_{n,m} = 1} s_n \prod_{n :\, a_{n,m} = 0} (1-s_n) \qquad (16.10)$$
$$D_m = 0.5 \prod_n s_n^{1-a_{n,m}} (1-s_n)^{a_{n,m}} = 0.5 \prod_{n :\, a_{n,m} = 1} (1-s_n) \prod_{n :\, a_{n,m} = 0} s_n \qquad (16.11)$$

If you inspect the value of $C_m$, it is basically voting (in a product form, not a sum form) the scores of those students who agree that the answer is 1 with one-minus-the-score of those students who do not. The value of $D_m$ is doing the reverse. This is a form of multiplicative voting, which has the effect that if a given student has a perfect score of 1.0, their results will carry the vote completely.

We now have a way to:

1. Compute expected ground truth values $\tilde{t}_m$, given scores.
2. Optimize scores $s_n$ given expected ground truth values.

The full solution is then to alternate between these two, as in the sketch below. You can start by initializing the ground truth values at the majority vote (this seems like a safe initialization). Given those, compute new scores. Given those new scores, compute new ground truth values. And repeat until tired.
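One way to implement the alternation, assuming the answer matrix a from the earlier simulation, is the sketch below (our own rendering, not the book's). It computes $C_m$ and $D_m$ in log space to avoid numerical underflow, a detail the text does not dwell on:

```python
import numpy as np

def grade_exam(a, num_iters=50, eps=1e-12):
    """EM-style grading: a is an (N, M) 0/1 answer matrix."""
    # Initialize expected truths at the majority vote.
    t_tilde = (a.mean(axis=0) > 0.5).astype(float)
    for _ in range(num_iters):
        # M step, Eq (16.9): scores under the expected truths.
        s = (a * t_tilde + (1 - a) * (1 - t_tilde)).mean(axis=1)
        s = np.clip(s, eps, 1 - eps)                   # keep the logs finite
        # E step, Eqs (16.10)-(16.11), computed in log space per question.
        log_C = np.log(0.5) + (a * np.log(s[:, None])
                               + (1 - a) * np.log(1 - s[:, None])).sum(axis=0)
        log_D = np.log(0.5) + ((1 - a) * np.log(s[:, None])
                               + a * np.log(1 - s[:, None])).sum(axis=0)
        t_tilde = 1.0 / (1.0 + np.exp(log_D - log_C))  # = C / (C + D)
    return s, t_tilde
```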

In the next two sections, we will consider a more complex unsupervised learning model for clustering, and then a generic mathematical framework for expectation maximization, which will answer questions like: will this process converge, and, if so, to what?

16.2 Clustering with a Mixture of Gaussians

In Chapter 9, you learned about probabilistic models for classification based on density estimation. Let's start with a fairly simple classification model that assumes we have labeled data. We will shortly remove this assumption. Our model will state that we have $K$ classes, and data from class $k$ is drawn from a Gaussian with mean $\mu_k$ and variance $\sigma_k^2$. The choice of classes is parameterized by $\theta$. The generative story for this model is:

1. For each example $n = 1 \dots N$:
   (a) Choose a label $y_n \sim \text{Disc}(\theta)$
   (b) Choose example $x_n \sim \text{Nor}(\mu_{y_n}, \sigma_{y_n}^2)$

This generative story can be directly translated into a likelihood as before:

$$p(D) = \prod_n \text{Mult}(y_n \mid \theta) \, \text{Nor}(x_n \mid \mu_{y_n}, \sigma_{y_n}^2) \qquad (16.12)$$
$$= \prod_n \underbrace{\theta_{y_n}}_{\text{choose label}} \underbrace{\left( 2\pi\sigma_{y_n}^2 \right)^{-D/2} \exp\!\left[ -\frac{1}{2\sigma_{y_n}^2} \left\| x_n - \mu_{y_n} \right\|^2 \right]}_{\text{choose feature values}} \qquad (16.13)$$

If you had access to labels, this would be all well and good, and you could obtain closed form solutions for the maximum likelihood estimates of all parameters by taking a log and then taking gradients of the log likelihood:

$$\theta_k = \text{fraction of training examples in class } k = \frac{1}{N} \sum_n \mathbf{1}[y_n = k] \qquad (16.14)$$
$$\mu_k = \text{mean of training examples in class } k = \frac{\sum_{n :\, y_n = k} x_n}{\sum_n \mathbf{1}[y_n = k]} \qquad (16.15)$$
$$\sigma_k^2 = \text{variance of training examples in class } k = \frac{\sum_{n :\, y_n = k} \left\| x_n - \mu_k \right\|^2}{\sum_n \mathbf{1}[y_n = k]} \qquad (16.16)$$

? You should be able to derive the maximum likelihood solution results formally by now.
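Eqs (16.14)-(16.16) are simple per-class averages. A minimal sketch, assuming labeled data X of shape (N, D) and integer labels y (the function name is ours):

```python
import numpy as np

def supervised_mle(X, y, K):
    """Closed-form MLE for the labeled Gaussian model, Eqs (16.14)-(16.16)."""
    theta = np.array([(y == k).mean() for k in range(K)])           # Eq (16.14)
    mu = np.stack([X[y == k].mean(axis=0) for k in range(K)])       # Eq (16.15)
    var = np.array([((X[y == k] - mu[k]) ** 2).sum(axis=1).mean()   # Eq (16.16)
                    for k in range(K)])
    return theta, mu, var
```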

Suppose that you don't have labels. Analogously to the K-means algorithm, one potential solution is to iterate. You can start off with guesses for the values of the unknown variables, and then iteratively improve them over time. In K-means, the approach was to assign examples to labels (or clusters). This time, instead of making hard assignments ("example 10 belongs to cluster 4"), we'll make soft assignments ("example 10 belongs half to cluster 4, a quarter to cluster 2 and a quarter to cluster 5"). So as not to confuse ourselves too much, we'll introduce a new variable, $z_n = \langle z_{n,1}, \dots, z_{n,K} \rangle$ (that sums to one), to denote a fractional assignment of examples to clusters.

This notion of soft-assignments is visualized in Figure 16.1. Here, we've depicted each example as a pie chart, and its coloring denotes the degree to which it's been assigned to each (of three) clusters. The size of the pie pieces corresponds to the $z_n$ values.

[Figure 16.1: examples depicted as pie charts, colored by their fractional assignment to each of three clusters.]

Formally, $z_{n,k}$ denotes the probability that example $n$ is assigned to cluster $k$:

$$z_{n,k} = p(y_n = k \mid x_n) \qquad (16.17)$$
$$= \frac{p(y_n = k, x_n)}{p(x_n)} \qquad (16.18)$$
$$= \frac{1}{Z_n} \text{Mult}(k \mid \theta) \, \text{Nor}(x_n \mid \mu_k, \sigma_k^2) \qquad (16.19)$$

Here, the normalizer $Z_n$ is to ensure that $z_n$ sums to one.

Given a set of parameters (the $\theta$s, $\mu$s and $\sigma^2$s), the fractional assignments $z_{n,k}$ are easy to compute. Now, akin to K-means, given fractional assignments, you need to recompute estimates of the model parameters. In analogy to the maximum likelihood solution (Eqs (16.14)-(16.16)), you can do this by counting fractional points rather than full points. This gives the following re-estimation updates:

$$\theta_k = \text{fraction of training examples in class } k = \frac{1}{N} \sum_n z_{n,k} \qquad (16.20)$$
$$\mu_k = \text{mean of fractional examples in class } k = \frac{\sum_n z_{n,k} \, x_n}{\sum_n z_{n,k}} \qquad (16.21)$$
$$\sigma_k^2 = \text{variance of fractional examples in class } k = \frac{\sum_n z_{n,k} \left\| x_n - \mu_k \right\|^2}{\sum_n z_{n,k}} \qquad (16.22)$$

All that has happened here is that the hard assignments $\mathbf{1}[y_n = k]$ have been replaced with soft assignments $z_{n,k}$. As a bit of foreshadowing of what is to come, what we've done is essentially replace known labels with expected labels, hence the name "expectation maximization."
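One practical detail that Eq (16.19) glosses over: the unnormalized assignments can underflow when the dimensionality D is large, so implementations typically work in log space. A sketch for a single example x, with parameter arrays theta (K,), mu (K, D) and var (K,) in our own notation:

```python
import numpy as np

def soft_assignments(x, theta, mu, var):
    """Eq (16.19): z_k ∝ θ_k · Nor(x | µ_k, σ²_k), computed in log space."""
    D = x.shape[0]
    log_z = (np.log(theta)
             - 0.5 * D * np.log(2 * np.pi * var)
             - 0.5 * ((x - mu) ** 2).sum(axis=1) / var)
    log_z -= log_z.max()            # shift for numerical stability
    z = np.exp(log_z)
    return z / z.sum()              # divide by the normalizer Z_n
```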

Putting this together yields Algorithm 38. This is the GMM ("Gaussian Mixture Models") algorithm, because the probabilistic model being learned describes a dataset as being drawn from a mixture distribution, where each component of this distribution is a Gaussian. Just as in the K-means algorithm, this approach is susceptible to local optima and to the quality of initialization. The heuristics for computing better initializers for K-means are also useful here.

? Aside from the fact that GMMs use soft assignments and K-means uses hard assignments, there are other differences between the two approaches. What are they?

Algorithm 38 GMM(X, K)
 1: for k = 1 to K do
 2:    µ_k ← some random location    // randomly initialize mean for kth cluster
 3:    σ²_k ← 1    // initialize variances
 4:    θ_k ← 1/K    // each cluster equally likely a priori
 5: end for
 6: repeat
 7:    for n = 1 to N do
 8:       for k = 1 to K do
 9:          z_{n,k} ← θ_k (2π σ²_k)^(−D/2) exp[−‖x_n − µ_k‖² / (2σ²_k)]    // compute (unnormalized) fractional assignments
10:       end for
11:       z_n ← z_n / Σ_k z_{n,k}    // normalize fractional assignments
12:    end for
13:    for k = 1 to K do
14:       θ_k ← (1/N) Σ_n z_{n,k}    // re-estimate prior probability of cluster k
15:       µ_k ← (Σ_n z_{n,k} x_n) / (Σ_n z_{n,k})    // re-estimate mean of cluster k
16:       σ²_k ← (Σ_n z_{n,k} ‖x_n − µ_k‖²) / (Σ_n z_{n,k})    // re-estimate variance of cluster k
17:    end for
18: until converged
19: return z    // return cluster assignments
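For concreteness, here is one possible NumPy rendering of Algorithm 38, vectorized over examples and, as in the earlier sketch, computing assignments in log space; the function name, initialization choices and fixed iteration count are our assumptions, not the book's:

```python
import numpy as np

def gmm(X, K, num_iters=100, seed=0):
    """EM for a mixture of K spherical Gaussians; X has shape (N, D)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    mu = X[rng.choice(N, size=K, replace=False)]   # random initial means
    var = np.ones(K)                               # initial variances
    theta = np.full(K, 1.0 / K)                    # uniform cluster priors
    for _ in range(num_iters):
        # E step: (unnormalized) log fractional assignments, shape (N, K).
        sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        log_z = (np.log(theta) - 0.5 * D * np.log(2 * np.pi * var)
                 - 0.5 * sq_dist / var)
        log_z -= log_z.max(axis=1, keepdims=True)  # stability shift
        z = np.exp(log_z)
        z /= z.sum(axis=1, keepdims=True)          # normalize each z_n
        # M step: re-estimate parameters from fractional counts.
        counts = z.sum(axis=0)                     # Σ_n z_{n,k}
        theta = counts / N                         # Eq (16.20)
        mu = (z.T @ X) / counts[:, None]           # Eq (16.21)
        sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        var = (z * sq_dist).sum(axis=0) / counts   # Eq (16.22)
    return z, theta, mu, var
```

In practice one would also floor the variances to avoid degenerate clusters, and stop when the change in log likelihood falls below a tolerance rather than running a fixed number of iterations.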

16.3 The Expectation Maximization Framework

At this point, you've seen a method for learning in a particular probabilistic model with hidden variables. Two questions remain: (1) can you apply this idea more generally and (2) why is it even a reasonable thing to do?

Expectation maximization is a family of algorithms for performing maximum likelihood estimation in probabilistic models with hidden variables. The general flavor of how we will proceed is as follows. We want to maximize the log likelihood $\mathcal{L}$, but this will turn out to be difficult to do directly. Instead, we'll pick a surrogate function $\tilde{\mathcal{L}}$ that's a lower bound on $\mathcal{L}$ (i.e., $\tilde{\mathcal{L}} \leq \mathcal{L}$ everywhere) that's (hopefully) easier to maximize. We'll construct the surrogate in such a way that increasing it will force the true likelihood to also go up. After maximizing $\tilde{\mathcal{L}}$, we'll construct a new lower bound and optimize that. This process is shown pictorially in Figure 16.2.

[Figure 16.2: a figure showing successive lower bounds on the log likelihood.]

To proceed, consider an arbitrary probabilistic model $p(x, y \mid \theta)$, where $x$ denotes the observed data, $y$ denotes the hidden data and $\theta$ denotes the parameters. In the case of Gaussian Mixture Models, $x$ was the data points, $y$ was the (unknown) labels and $\theta$ included the cluster prior probabilities, the cluster means and the cluster variances. Now, given access only to a number of examples $x_1, \dots, x_N$, you would like to estimate the parameters ($\theta$) of the model. Probabilistically, this means that some of the variables are unknown and therefore you need to marginalize (or sum) over their possible values. Your data consists only of $X = \langle x_1, x_2, \dots, x_N \rangle$, not the $(x, y)$ pairs in $D$. You can then write the likelihood as:

$$p(X \mid \theta) = \sum_{y_1} \sum_{y_2} \cdots \sum_{y_N} p(X, y_1, y_2, \dots, y_N \mid \theta) \qquad (16.23) \quad \text{(marginalization)}$$
$$= \sum_{y_1} \sum_{y_2} \cdots \sum_{y_N} \prod_n p(x_n, y_n \mid \theta) \qquad (16.24) \quad \text{(examples are independent)}$$
$$= \prod_n \sum_{y_n} p(x_n, y_n \mid \theta) \qquad (16.25) \quad \text{(algebra)}$$

At this point, the natural thing to do is to take logs and then start taking gradients. However, once you start taking logs, you run into a problem: the log cannot eat the sum!

$$\mathcal{L}(X \mid \theta) = \sum_n \log \sum_{y_n} p(x_n, y_n \mid \theta) \qquad (16.26)$$

Namely, the log gets stuck outside the sum and cannot move in to decompose the rest of the likelihood term!

The next step is to apply the somewhat strange, but strangely useful, trick of multiplying by 1. In particular, let $q(\cdot)$ be an arbitrary probability distribution. We will multiply the $p(\dots)$ term above by $q(y_n)/q(y_n)$, a valid step so long as $q$ is never zero. This leads to:

$$\mathcal{L}(X \mid \theta) = \sum_n \log \sum_{y_n} q(y_n) \frac{p(x_n, y_n \mid \theta)}{q(y_n)} \qquad (16.27)$$

We will now construct a lower bound using Jensen's inequality. This is a very useful (and easy to prove!) result that states that $f\left(\sum_i \lambda_i x_i\right) \geq \sum_i \lambda_i f(x_i)$, so long as (a) $\lambda_i \geq 0$ for all $i$, (b) $\sum_i \lambda_i = 1$, and (c) $f$ is concave. If this looks familiar, that's just because it's a direct result of the definition of concavity. Recall that $f$ is concave if $f(ax + by) \geq a f(x) + b f(y)$ whenever $a, b \geq 0$ and $a + b = 1$.

? Prove Jensen's inequality using the definition of concavity and induction.

You can now apply Jensen's inequality to the log likelihood by identifying the list of $q(y_n)$s as the $\lambda$s, $\log$ as $f$ (which is, indeed, concave) and each "$p/q$" term as the $x_i$ term. This yields:

$$\mathcal{L}(X \mid \theta) = \sum_n \log \sum_{y_n} q(y_n) \frac{p(x_n, y_n \mid \theta)}{q(y_n)} \qquad (16.28)$$
$$\geq \sum_n \sum_{y_n} q(y_n) \log \frac{p(x_n, y_n \mid \theta)}{q(y_n)} \qquad (16.29)$$
$$= \sum_n \sum_{y_n} \big[ q(y_n) \log p(x_n, y_n \mid \theta) - q(y_n) \log q(y_n) \big] \triangleq \tilde{\mathcal{L}}(X \mid \theta) \qquad (16.30)$$

Note that this inequality holds for any choice of function $q$, so long as it's non-negative and sums to one. In particular, it needn't even be the same function $q_n$ for each $n$. We will need to take advantage of both of these properties.
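To see the bound of Eq (16.29) in action numerically, here is a tiny check (entirely our own illustration) that the surrogate never exceeds the true log likelihood for an arbitrary q:

```python
import numpy as np

rng = np.random.default_rng(1)
p = rng.uniform(0.1, 1.0, size=5)   # stand-in for p(x_n, y_n | θ) over 5 values of y_n
q = rng.dirichlet(np.ones(5))       # an arbitrary distribution q(y_n)

log_lik = np.log(p.sum())                  # log Σ_y p(x, y | θ), as in Eq (16.26)
surrogate = (q * np.log(p / q)).sum()      # Σ_y q log(p/q), as in Eq (16.29)

assert surrogate <= log_lik + 1e-12        # Jensen: the surrogate is a lower bound
```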

We have succeeded in our first goal: constructing a lower bound on $\mathcal{L}$. When you go to optimize this lower bound for $\theta$, the only part that matters is the first term. The second term, $q \log q$, drops out as a function of $\theta$. This means that the maximization you need to be able to compute, for fixed $q_n$s, is:

$$\theta^{(\text{new})} \leftarrow \arg\max_\theta \sum_n \sum_{y_n} q_n(y_n) \log p(x_n, y_n \mid \theta) \qquad (16.31)$$

This is exactly the sort of maximization done for Gaussian mixture models when we recomputed new means, variances and cluster prior probabilities.

The second question is: what should $q_n(\cdot)$ actually be? Any reasonable $q$ will lead to a lower bound, so in order to choose one $q$ over another, we need another criterion. Recall that we are hoping to maximize $\mathcal{L}$ by instead maximizing a lower bound. In order to ensure that an increase in the lower bound implies an increase in $\mathcal{L}$, we need to ensure that $\mathcal{L}(X \mid \theta) = \tilde{\mathcal{L}}(X \mid \theta)$. In words: $\tilde{\mathcal{L}}$ should be a lower bound on $\mathcal{L}$ that makes contact at the current point, $\theta$. (It turns out that choosing $q_n(y_n) = p(y_n \mid x_n, \theta)$, the posterior over the hidden variables under the current parameters, achieves exactly this contact; computing that posterior is the "expectation" step.)

16.4 Further Reading

TODO further reading
