1 Principal Component Analysis in High Dimensions and the Spike Model

1.1 Dimension Reduction and PCA

When faced with a high dimensional dataset, a natural approach is to try to reduce its dimension, either by projecting it to a lower dimensional space or by finding a better representation for the data. During this course we will see a few different ways of doing dimension reduction. We will start with Principal Component Analysis (PCA). In fact, PCA continues to be one of the best (and simplest) tools for exploratory data analysis. Remarkably, it dates back to a 1901 paper by Karl Pearson [Pea01]!

Let's say we have n data points x_1, ..., x_n in R^p, for some p, and we are interested in (linearly) projecting the data to d < p dimensions. This is particularly useful if, say, one wants to visualize the data in two or three dimensions. There are a couple of different ways we can try to choose this projection:

1. Finding the d-dimensional affine subspace for which the projections of x_1, ..., x_n on it best approximate the original points x_1, ..., x_n.

2. Finding the d-dimensional projection of x_1, ..., x_n that preserves as much variance of the data as possible.

As we will see below, these two approaches are equivalent and they correspond to Principal Component Analysis.

Before proceeding, we recall a couple of simple statistical quantities associated with x_1, ..., x_n that will reappear below. Given x_1, ..., x_n, we define the sample mean as

    μ_n = (1/n) ∑_{k=1}^n x_k,    (4)

and the sample covariance as

    Σ_n = (1/(n−1)) ∑_{k=1}^n (x_k − μ_n)(x_k − μ_n)^T.    (5)

Remark 1.1 If x_1, ..., x_n are independently sampled from a distribution, then μ_n and Σ_n are unbiased estimators for, respectively, the mean and covariance of that distribution.

We will start with the first interpretation of PCA and then show that it is equivalent to the second.

1.1.1 PCA as best d-dimensional affine fit

We are trying to approximate each x_k by

    x_k ≈ μ + ∑_{i=1}^d (β_k)_i v_i,    (6)

where v_1, ..., v_d is an orthonormal basis for the d-dimensional subspace, μ ∈ R^p represents the translation, and β_k corresponds to the coefficients of x_k. If we represent the subspace by the matrix

    V = [v_1 ⋯ v_d] ∈ R^{p×d},

then we can rewrite (6) as

    x_k ≈ μ + V β_k,    (7)

where V^T V = I_{d×d}, as the vectors v_i are orthonormal.

We will measure goodness of fit in terms of least squares and attempt to solve

    min_{μ, V, β_k : V^T V = I} ∑_{k=1}^n || x_k − (μ + V β_k) ||_2^2.    (8)

We start by optimizing for μ. It is easy to see that the first-order conditions for μ correspond to

    ∇_μ ∑_{k=1}^n || x_k − (μ + V β_k) ||_2^2 = 0  ⇔  ∑_{k=1}^n ( x_k − (μ + V β_k) ) = 0.

Thus, the optimal value μ* of μ satisfies

    ∑_{k=1}^n x_k − n μ* − V ( ∑_{k=1}^n β_k ) = 0.

Because we may assume ∑_{k=1}^n β_k = 0 without loss of generality (otherwise we could simply absorb the term (1/n) V ∑_k β_k into μ), the optimal μ is given by

    μ* = (1/n) ∑_{k=1}^n x_k = μ_n,

the sample mean. We can then proceed to find the solution of (8) by solving

    min_{V, β_k : V^T V = I} ∑_{k=1}^n || x_k − μ_n − V β_k ||_2^2.    (9)

Let us proceed by optimizing for β_k. Since the problem decouples for each k, we can focus on, for each k,

    min_{β_k} || x_k − μ_n − V β_k ||_2^2 = min_{β_k} || x_k − μ_n − ∑_{i=1}^d (β_k)_i v_i ||_2^2.    (10)

Since v_1, ..., v_d are orthonormal, it is easy to see that the solution is given by (β_k)_i = v_i^T (x_k − μ_n), which can be succinctly written as β_k = V^T (x_k − μ_n). Thus, (9) is equivalent to

    min_{V : V^T V = I} ∑_{k=1}^n || (x_k − μ_n) − V V^T (x_k − μ_n) ||_2^2.    (11)
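The quantities derived so far — the sample mean, the sample covariance, and the least-squares coefficients β_k = V^T(x_k − μ_n) — are easy to check numerically. A minimal sketch, assuming numpy is available; the synthetic data, the sizes, and the particular orthonormal V are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, d = 200, 5, 2
X = rng.normal(size=(n, p))          # rows are the data points x_1, ..., x_n

# Sample mean and (unbiased) sample covariance.
mu_n = X.mean(axis=0)
Sigma_n = (X - mu_n).T @ (X - mu_n) / (n - 1)

# An arbitrary matrix V with orthonormal columns (QR factor of a random matrix).
V, _ = np.linalg.qr(rng.normal(size=(p, d)))

# Optimal coefficients beta_k = V^T (x_k - mu_n): the orthogonal projection.
B = (X - mu_n) @ V                   # row k holds beta_k

def fit_error(coeffs):
    """Total squared residual of the affine approximation x_k ~ mu_n + V beta_k."""
    return np.sum(((X - mu_n) - coeffs @ V.T) ** 2)

# Any perturbation of the coefficients can only increase the residual.
base = fit_error(B)
worse = fit_error(B + 0.1 * rng.normal(size=B.shape))
print(base < worse)
```

The last check is just the projection theorem: for fixed V, the residual decouples per point, and the projection coefficients are the unique minimizer.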

Note that

    || (x_k − μ_n) − V V^T (x_k − μ_n) ||_2^2
      = (x_k − μ_n)^T (x_k − μ_n) − 2 (x_k − μ_n)^T V V^T (x_k − μ_n) + (x_k − μ_n)^T V V^T V V^T (x_k − μ_n)
      = (x_k − μ_n)^T (x_k − μ_n) − (x_k − μ_n)^T V V^T (x_k − μ_n),

where the last step uses V^T V = I. Since (x_k − μ_n)^T (x_k − μ_n) does not depend on V, minimizing (11) is equivalent to

    max_{V : V^T V = I} ∑_{k=1}^n (x_k − μ_n)^T V V^T (x_k − μ_n).    (12)

A few more simple algebraic manipulations using properties of the trace:

    ∑_{k=1}^n (x_k − μ_n)^T V V^T (x_k − μ_n)
      = ∑_{k=1}^n Tr[ (x_k − μ_n)^T V V^T (x_k − μ_n) ]
      = ∑_{k=1}^n Tr[ V^T (x_k − μ_n)(x_k − μ_n)^T V ]
      = Tr[ V^T ( ∑_{k=1}^n (x_k − μ_n)(x_k − μ_n)^T ) V ]
      = (n − 1) Tr[ V^T Σ_n V ].

This means that the solution to (12) is given by the solution to

    max_{V : V^T V = I} Tr[ V^T Σ_n V ].    (13)

As we saw above (by the variational characterization of the eigenvalues of a symmetric matrix), the solution is given by V = [v_1, ..., v_d], where v_1, ..., v_d correspond to the d leading eigenvectors of Σ_n.

Let us now show that interpretation (2), finding the d-dimensional projection of x_1, ..., x_n that preserves the most variance, also arrives at the optimization problem (13).

1.1.2 PCA as d-dimensional projection that preserves the most variance

We aim to find an orthonormal basis v_1, ..., v_d (organized as V = [v_1, ..., v_d] with V^T V = I_{d×d}) of a d-dimensional space such that the projection of x_1, ..., x_n on this subspace has the most variance. Equivalently, we can ask for the points

    { ( v_1^T x_k, ..., v_d^T x_k ) }_{k=1}^n

to have as much variance as possible. Hence, we are interested in solving

    max_{V : V^T V = I} ∑_{k=1}^n || V^T x_k − (1/n) ∑_{r=1}^n V^T x_r ||^2.    (14)

Note that

    ∑_{k=1}^n || V^T x_k − (1/n) ∑_{r=1}^n V^T x_r ||^2 = ∑_{k=1}^n || V^T (x_k − μ_n) ||^2 = (n − 1) Tr( V^T Σ_n V ),

showing that (14) is equivalent to (13), and that the two interpretations of PCA are indeed equivalent.

1.1.3 Finding the Principal Components

When given a dataset x_1, ..., x_n ∈ R^p, in order to compute the Principal Components one needs to find the leading eigenvectors of

    Σ_n = (1/(n−1)) ∑_{k=1}^n (x_k − μ_n)(x_k − μ_n)^T.

A naive way of doing this would be to construct Σ_n (which takes O(n p^2) work) and then find its spectral decomposition (which takes O(p^3) work). This means that the computational complexity of this procedure is O(max{n p^2, p^3}) (see [HJ85] and/or [Gol96]).

An alternative is to use the Singular Value Decomposition (SVD). Let X = [x_1 ⋯ x_n] and recall that

    Σ_n = (1/(n−1)) ( X − μ_n 1^T )( X − μ_n 1^T )^T.

Let us take the SVD of X − μ_n 1^T = U_L D U_R^T, with U_L ∈ O(p), D diagonal, and U_R^T U_R = I. Then,

    Σ_n = (1/(n−1)) ( X − μ_n 1^T )( X − μ_n 1^T )^T = (1/(n−1)) U_L D U_R^T U_R D U_L^T = (1/(n−1)) U_L D^2 U_L^T,

meaning that the columns of U_L correspond to the eigenvectors of Σ_n. Computing the SVD of X − μ_n 1^T takes O(min{n^2 p, n p^2}) work, but if one is interested in simply computing the top d eigenvectors then this computational cost reduces to O(d n p). This can be further improved with randomized algorithms: there are randomized algorithms that compute an approximate solution in O(p n log d + (p + n) d^2) time (see for example [HMT09, RST09, MM15]).[1]

[1] If there is time, we might discuss some of these methods later in the course.

1.1.4 Which d should we pick?

Given a dataset, if the objective is to visualize it then picking d = 2 or d = 3 might make the most sense. However, PCA is useful for many other purposes. For example: (1) often the data belongs to a lower dimensional space but is corrupted by high dimensional noise; when using PCA it is oftentimes possible to reduce the noise while keeping the signal. (2) One may be interested in running an algorithm that would be too computationally expensive to run in high dimensions,

and dimension reduction may help there, etc. In these applications (and many others) it is not clear how to pick d.

If we denote the k-th largest eigenvalue of Σ_n as λ_k(Σ_n), then the k-th principal component accounts for a λ_k(Σ_n) / Tr(Σ_n) proportion of the variance of the data.[2]

A fairly popular heuristic is to try to choose the cut-off at a component that has significantly more variance than the one immediately after. This is usually visualized by a scree plot: a plot of the values of the ordered eigenvalues.

[Figure: an example of a scree plot, the ordered eigenvalues λ_k plotted against k.]

It is common to then try to identify an "elbow" on the scree plot to choose the cut-off. In the next section we will look into random matrix theory to try to better understand the behavior of the eigenvalues of Σ_n, and it will help us understand when to cut off.

1.1.5 A related open problem

We now describe an interesting open problem posed by Mallat and Zeitouni in [MZ].

Open Problem 1.1 (Mallat and Zeitouni [MZ]) Let g ~ N(0, Σ) be a gaussian random vector in R^p with a known covariance matrix Σ, and let d < p. Now, for any orthonormal basis V = [v_1, ..., v_p] of R^p, consider the following random variable Γ_V: given a draw of the random vector g, Γ_V is the squared ℓ2 norm of the largest projection of g on a subspace generated by d elements of the basis V. The question is: For which basis V is E[Γ_V] maximized?

[2] Note that Tr(Σ_n) = ∑_{k=1}^p λ_k(Σ_n).
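A quick Monte Carlo experiment makes the random variable Γ_V concrete. The sketch below (assuming numpy; the diagonal Σ, the sizes, and the comparison basis are arbitrary choices, not from the problem statement) estimates E[Γ_V] for the eigenbasis of a diagonal Σ and for a random orthonormal basis:

```python
import numpy as np

rng = np.random.default_rng(1)
p, d, trials = 4, 2, 50000
Sigma = np.diag([16.0, 4.0, 1.0, 0.25])   # WLOG diagonal, so the eigenbasis is canonical

def expected_gamma(V):
    """Monte Carlo estimate of E[Gamma_V]: squared norm of the largest
    projection of g ~ N(0, Sigma) onto a span of d elements of the basis V."""
    G = rng.multivariate_normal(np.zeros(p), Sigma, size=trials)
    proj2 = (G @ V) ** 2                   # squared coefficients along each basis vector
    # the best d-element subset collects the d largest squared coefficients
    top_d = np.sort(proj2, axis=1)[:, -d:]
    return top_d.sum(axis=1).mean()

eigenbasis = np.eye(p)
random_basis, _ = np.linalg.qr(rng.normal(size=(p, p)))

score_eigen = expected_gamma(eigenbasis)
score_random = expected_gamma(random_basis)
print(score_eigen, score_random)
```

In this example the eigenbasis scores higher than the random basis, in line with the conjecture discussed next; of course a simulation on one instance proves nothing about the general problem.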

The conjecture in [MZ] is that the optimal basis is the eigendecomposition of Σ. It is known that this is the case for d = 1 (see [MZ]), but the question remains open for d > 1. It is not very difficult to see that one can assume, without loss of generality, that Σ is diagonal.

A particularly intuitive way of stating the problem is:

1. Given Σ ∈ R^{p×p} and d.
2. Pick an orthonormal basis v_1, ..., v_p.
3. Given g ~ N(0, Σ).
4. Pick d elements ṽ_1, ..., ṽ_d of the basis.
5. Score: ∑_{i=1}^d ( ṽ_i^T g )^2.

The objective is to pick the basis in order to maximize the expected value of the Score.

Notice that if the steps of the procedure were taken in a slightly different order, in which step 4 would take place before having access to the draw of g (step 3), then the best basis is indeed the eigenbasis of Σ and the best subset of the basis is simply the d leading eigenvectors (notice the resemblance with PCA, as described above).

More formally, we can write the problem as finding

    argmax_{V ∈ R^{p×p} : V^T V = I}  E[ max_{S ⊆ [p], |S| = d} ∑_{i ∈ S} ( v_i^T g )^2 ],

where g ~ N(0, Σ). The observation regarding the different ordering of the steps amounts to saying that the eigenbasis of Σ is the optimal solution for

    argmax_{V ∈ R^{p×p} : V^T V = I}  max_{S ⊆ [p], |S| = d}  E[ ∑_{i ∈ S} ( v_i^T g )^2 ].

1.2 PCA in high dimensions and Marchenko-Pastur

Let us assume that the data points x_1, ..., x_n ∈ R^p are independent draws of a gaussian random variable g ~ N(0, Σ) for some covariance Σ ∈ R^{p×p}. In this case, when we use PCA we are hoping to find low dimensional structure in the distribution, which should correspond to large eigenvalues of Σ (and their corresponding eigenvectors). For this reason (and since PCA depends on the spectral properties of Σ_n) we would like to understand whether the spectral properties of Σ_n (eigenvalues and eigenvectors) are close to those of Σ.

Since E[Σ_n] = Σ, if p is fixed and n → ∞ the law of large numbers guarantees that indeed Σ_n → Σ. However, in many modern applications it is not uncommon to have p on the order of n (or, sometimes, even larger!).
For example, if our dataset is composed of images, then n is the number of images and p the number of pixels per image; it is conceivable that the number of pixels is on the order of the number of images in a set. Unfortunately, in that case, it is no longer clear that Σ_n ≈ Σ. Dealing with this type of difficulty is the realm of high dimensional statistics.
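This contrast is easy to see numerically. A small sketch (assuming numpy; Σ = I and the particular sizes are arbitrary choices): for fixed p the sample covariance converges as n grows, but when p is comparable to n the estimation error does not vanish:

```python
import numpy as np

rng = np.random.default_rng(0)

def cov_error(n, p):
    """Operator-norm error ||Sigma_n - I|| for n draws of N(0, I_p)."""
    X = rng.normal(size=(n, p))
    mu = X.mean(axis=0)
    Sigma_n = (X - mu).T @ (X - mu) / (n - 1)
    return np.linalg.norm(Sigma_n - np.eye(p), ord=2)

# Fixed p, growing n: the error shrinks (law of large numbers).
err_small_n = cov_error(100, 10)
err_large_n = cov_error(10000, 10)

# p proportional to n: the error stays of constant order.
err_high_dim = cov_error(1000, 500)

print(err_small_n, err_large_n, err_high_dim)
```

In the last case the eigenvalues of Σ_n spread far away from 1 even though the true covariance is the identity — exactly the phenomenon quantified next.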

For simplicity we will instead try to understand the spectral properties of

    S_n = (1/n) X X^T.

Since x ~ N(0, Σ), we know that μ_n ≈ 0 (and, clearly, n/(n−1) ≈ 1), so the spectral properties of S_n will be essentially the same as those of Σ_n.[3]

[3] In this case, S_n is actually the Maximum Likelihood Estimator for Σ; we'll talk about Maximum Likelihood Estimation later in the course.

Let us start by looking into a simple example, Σ = I. In that case the distribution has no low dimensional structure, as the distribution is rotation invariant. The following is a histogram (left) and a scree plot (right) of the eigenvalues of a sample of S_n (when Σ = I) for p = 500 and n = 1000. The red line is the eigenvalue distribution predicted by the Marchenko-Pastur distribution (15), which we will discuss below.

[Figure: histogram (left) and scree plot (right) of the eigenvalues of a sample of S_n for Σ = I, p = 500, n = 1000, together with the Marchenko-Pastur density in red.]

As one can see in the image, there are many eigenvalues considerably larger than 1 (and some considerably larger than others). Notice that, given this profile of eigenvalues of Σ_n, one could potentially be led to believe that the data has low dimensional structure, when in truth the distribution it was drawn from is isotropic.

Understanding the distribution of eigenvalues of random matrices is at the core of Random Matrix Theory (there are many good books on Random Matrix Theory, e.g. [Tao12] and [AGZ10]). This particular limiting distribution was first established in 1967 by Marchenko and Pastur [MP67] and is now referred to as the Marchenko-Pastur distribution. They showed that, if p and n are both going to ∞ with their ratio fixed, p/n = γ ≤ 1, the sample distribution of the eigenvalues of S_n (like the histogram above) will, in the limit, be

    dF_γ(λ) = (1 / (2π γ λ)) √( (γ_+ − λ)(λ − γ_−) ) 1_{[γ_−, γ_+]}(λ) dλ,    (15)

where γ_± = (1 ± √γ)^2,

with support [γ_−, γ_+]. This is plotted as the red line in the figure above.

Remark 1.2 We will not show the proof of the Marchenko-Pastur Theorem here (you can see, for example, [Bai99] for several different proofs of it), but an approach to a proof uses the so-called moment method. The core of the idea is to note that one can compute moments of the eigenvalue distribution in two ways, and note that (in the limit), for any k,

    E[ (1/p) Tr( ((1/n) X X^T)^k ) ] = E[ (1/p) Tr( S_n^k ) ] = E[ (1/p) ∑_{i=1}^p λ_i^k(S_n) ] = ∫_{γ_−}^{γ_+} λ^k dF_γ(λ),

and that the quantities E[ (1/p) Tr( ((1/n) X X^T)^k ) ] can be estimated (these estimates rely essentially on combinatorics). The distribution dF_γ(λ) can then be recovered from its moments.

1.2.1 A related open problem

Open Problem 1.2 (Monotonicity of singular values [BKS13a]) Consider the setting above but with p = n; then X ∈ R^{n×n} is a matrix with iid N(0,1) entries. Let σ_i( (1/√n) X ) denote the i-th singular value[4] of (1/√n) X, and define

    α_R(n) := E[ (1/n) ∑_{i=1}^n σ_i( (1/√n) X ) ],

the expected value of the average singular value of (1/√n) X. The conjecture is that, for every n ≥ 1,

    α_R(n+1) ≥ α_R(n).

Moreover, for the analogous quantity α_C(n) defined over the complex numbers, meaning simply that each entry of X is an iid complex valued standard gaussian CN(0,1), the reverse inequality is conjectured for all n ≥ 1:

    α_C(n+1) ≤ α_C(n).

Notice that the singular values of (1/√n) X are simply the square roots of the eigenvalues of S_n,

    σ_i( (1/√n) X ) = √( λ_i(S_n) ).

[4] The i-th diagonal element of Σ in the SVD (1/√n) X = U Σ V^T.
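This identity makes α_R(n) easy to estimate by simulation. A sketch (assuming numpy; the sample sizes are arbitrary choices), which can be compared against the two exact values computed next, α_R(1) = √(2/π) ≈ 0.7979 and lim α_R(n) = 8/(3π) ≈ 0.8488:

```python
import numpy as np

rng = np.random.default_rng(0)

# alpha_R(1) is just E|g| for a standard gaussian g.
alpha_1 = np.abs(rng.normal(size=200000)).mean()

# For larger n, the average singular value of X/sqrt(n) concentrates
# around its expectation, so a single sample already gives a good estimate.
n = 400
X = rng.normal(size=(n, n))
alpha_n = np.linalg.svd(X / np.sqrt(n), compute_uv=False).mean()

print(alpha_1, alpha_n)
```

With these parameters the estimates land close to the two exact values, and the estimate for n = 400 exceeds the one for n = 1, consistent with the conjectured monotonicity.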

This means that we can compute α_R in the limit (since we know the limiting distribution of the λ_i(S_n)) and get (since p = n we have γ = 1, γ_− = 0, and γ_+ = 4)

    lim_{n→∞} α_R(n) = ∫_0^4 √λ dF_1(λ) = ∫_0^4 √λ (1/(2πλ)) √( (4−λ) λ ) dλ = 8/(3π) ≈ 0.8488.

Also, α_R(1) simply corresponds to the expected value of the absolute value of a standard gaussian g,

    α_R(1) = E|g| = √(2/π) ≈ 0.7979,

which is compatible with the conjecture. On the complex valued side, the Marchenko-Pastur distribution also holds, and so lim_{n→∞} α_C(n) = lim_{n→∞} α_R(n); α_C(1) can also be easily calculated and seen to be larger than the limit.

1.3 Spike Models and BBP transition

What if there actually is some (linear) low dimensional structure in the data? When can we expect to capture it with PCA? A particularly simple, yet relevant, example to analyse is when the covariance matrix Σ is an identity with a rank 1 perturbation, which we refer to as a spike model,

    Σ = I + β v v^T,

for v a unit norm vector and β ≥ 0.

One way to think about this instance is as each data point x consisting of a signal part √β g_0 v, where g_0 is a one-dimensional standard gaussian (so that √β g_0 v is a gaussian multiple of the fixed vector v), and a noise part g ~ N(0, I) (independent of g_0). Then x = g + √β g_0 v is a gaussian random variable,

    x ~ N(0, I + β v v^T).

A natural question is whether this rank 1 perturbation can be seen in S_n. Let us build some intuition with an example. The following is the histogram of the eigenvalues of a sample of S_n for p = 500, n = 1000, v equal to the first element of the canonical basis (v = e_1), and β = 1.5:

[Figure: histogram of the eigenvalues of a sample of S_n for the spike model with p = 500, n = 1000, β = 1.5; a red x marks the location estimate derived below.]
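This experiment is easy to reproduce. A minimal sketch (assuming numpy), using the same parameters, together with a run below the transition at β = 0.5; the predicted location (1 + β)(1 + γ/β) for the outlying eigenvalue is derived later in this section:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 500, 1000
gamma = p / n                        # aspect ratio, gamma = 1/2
edge = (1 + np.sqrt(gamma)) ** 2     # right edge of Marchenko-Pastur, ~2.914

def top_eigenvalue(beta):
    """Largest eigenvalue of S_n = (1/n) X X^T when Sigma = I + beta e_1 e_1^T."""
    X = rng.normal(size=(p, n))
    X[0, :] *= np.sqrt(1 + beta)     # the spiked coordinate has variance 1 + beta
    return np.linalg.eigvalsh(X @ X.T / n)[-1]

# Above the critical value sqrt(gamma) ~ 0.707 an eigenvalue pops out,
# near the predicted location (1 + beta)(1 + gamma/beta).
beta = 1.5
prediction = (1 + beta) * (1 + gamma / beta)   # = 10/3 ~ 3.33
lam_above = top_eigenvalue(beta)

# Below the critical value the top eigenvalue sticks to the bulk edge.
lam_below = top_eigenvalue(0.5)

print(lam_above, prediction, lam_below, edge)
```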

The image suggests that there is an eigenvalue of S_n that pops out of the support of the Marchenko-Pastur distribution (below we will estimate the location of this eigenvalue, and that estimate corresponds to the red x). It is worth noticing that the largest eigenvalue of Σ is simply 1 + β = 2.5, while the largest eigenvalue of S_n appears considerably larger than that. Let us now try the same experiment for β = 0.5:

[Figure: histogram of the eigenvalues of a sample of S_n for the spike model with p = 500, n = 1000, β = 0.5.]

It appears that, for β = 0.5, the distribution of the eigenvalues is indistinguishable from the one obtained when Σ = I. This motivates the following question:

Question 1.3 For which values of γ and β do we expect to see an eigenvalue of S_n popping out of the support of the Marchenko-Pastur distribution, and what is the limit value that we expect it to take?

As we will see below, there is a critical value of β below which we don't expect to see a change in the distribution of eigenvalues and above which we expect one of the eigenvalues to pop out of the support. This is known as the BBP transition (after Baik, Ben Arous, and Péché [BBAP05]). There are many very nice papers about this and similar phenomena, including [Pau, Joh01, BBAP05, Pau07, BS05, Kar05, BGN11, BGN12].[5]

[5] Notice that the Marchenko-Pastur theorem does not imply that all eigenvalues are actually in the support of the Marchenko-Pastur distribution; it just rules out that a non-vanishing proportion are outside of it. However, it is possible to show that, indeed, in the limit all eigenvalues will be in the support (see, for example, [Pau]).

In what follows we will find the critical value of β and estimate the location of the largest eigenvalue of S_n. While the argument we will use can be made precise (and is borrowed from [Pau]), we will be ignoring a few details for the sake of exposition. In short, the argument below can be transformed into a rigorous proof, but it is not one in its present form!

First of all, it is not difficult to see that we can assume that v = e_1 (since everything else is rotation invariant). We want to understand the behavior of the leading eigenvalue of

    S_n = (1/n) ∑_{i=1}^n x_i x_i^T = (1/n) X X^T,

where X = [x_1, ..., x_n] ∈ R^{p×n}. We can write X as

    X = [ √(1+β) Z_1 ; Z_2 ],

where Z_1 ∈ R^{1×n} and Z_2 ∈ R^{(p−1)×n} are both populated with i.i.d. standard gaussian entries (N(0,1)). Then

    S_n = (1/n) X X^T = (1/n) [ (1+β) Z_1 Z_1^T ,  √(1+β) Z_1 Z_2^T ;  √(1+β) Z_2 Z_1^T ,  Z_2 Z_2^T ].

Now, let λ̂ and v = [v_1 ; v_2], where v_1 ∈ R and v_2 ∈ R^{p−1}, denote, respectively, an eigenvalue and an associated eigenvector of S_n. By the definition of eigenvalue and eigenvector we have

    (1/n) [ (1+β) Z_1 Z_1^T ,  √(1+β) Z_1 Z_2^T ;  √(1+β) Z_2 Z_1^T ,  Z_2 Z_2^T ] [ v_1 ; v_2 ] = λ̂ [ v_1 ; v_2 ],

which can be rewritten as

    (1/n) (1+β) Z_1 Z_1^T v_1 + (1/n) √(1+β) Z_1 Z_2^T v_2 = λ̂ v_1,    (16)
    (1/n) √(1+β) Z_2 Z_1^T v_1 + (1/n) Z_2 Z_2^T v_2 = λ̂ v_2.    (17)

(17) is equivalent to

    (1/n) √(1+β) Z_2 Z_1^T v_1 = ( λ̂ I − (1/n) Z_2 Z_2^T ) v_2.

If λ̂ I − (1/n) Z_2 Z_2^T is invertible (this won't be justified here, but it is in [Pau]), then we can rewrite this as

    v_2 = ( λ̂ I − (1/n) Z_2 Z_2^T )^{−1} (1/n) √(1+β) Z_2 Z_1^T v_1,

which we can then plug into (16) to get

    (1/n) (1+β) Z_1 Z_1^T v_1 + (1/n) √(1+β) Z_1 Z_2^T ( λ̂ I − (1/n) Z_2 Z_2^T )^{−1} (1/n) √(1+β) Z_2 Z_1^T v_1 = λ̂ v_1.

If v_1 ≠ 0 (again, not properly justified here, see [Pau]), then this means that

    λ̂ = (1/n) (1+β) Z_1 Z_1^T + (1+β) (1/n) Z_1 Z_2^T ( λ̂ I − (1/n) Z_2 Z_2^T )^{−1} (1/n) Z_2 Z_1^T.    (18)

A first observation is that, because Z_1 ∈ R^{1×n} has standard gaussian entries, (1/n) Z_1 Z_1^T → 1, meaning that

    λ̂ ≈ (1+β) [ 1 + (1/n) Z_1 Z_2^T ( λ̂ I − (1/n) Z_2 Z_2^T )^{−1} (1/n) Z_2 Z_1^T ].    (19)

Consider the SVD of Z_2: Z_2 = U Σ V^T, where U ∈ R^{(p−1)×(p−1)} and V ∈ R^{n×(p−1)} have orthonormal columns (meaning that U^T U = I and V^T V = I), and Σ is a diagonal matrix. Take D = (1/n) Σ^2; then

    (1/n) Z_2 Z_2^T = (1/n) U Σ^2 U^T = U D U^T,

meaning that the diagonal entries of D correspond to the eigenvalues of (1/n) Z_2 Z_2^T, which we expect to be distributed (in the limit) according to the Marchenko-Pastur distribution for (p−1)/n ≈ γ. Replacing back in (19),

    λ̂ = (1+β) [ 1 + (1/n) Z_1 ( V Σ U^T ) ( λ̂ I − U D U^T )^{−1} (1/n) ( U Σ V^T ) Z_1^T ]
       = (1+β) [ 1 + (1/n) ( Z_1 V ) D^{1/2} U^T U ( λ̂ I − D )^{−1} U^T U D^{1/2} ( V^T Z_1^T ) ]
       = (1+β) [ 1 + (1/n) ( Z_1 V ) D^{1/2} ( λ̂ I − D )^{−1} D^{1/2} ( V^T Z_1^T ) ].

Since the columns of V are orthonormal, g := V^T Z_1^T ∈ R^{p−1} is an isotropic gaussian (g ~ N(0, I)); in fact,

    E[ g g^T ] = E[ V^T Z_1^T Z_1 V ] = V^T E[ Z_1^T Z_1 ] V = V^T V = I_{(p−1)×(p−1)}.

We proceed:

    λ̂ = (1+β) [ 1 + (1/n) g^T D^{1/2} ( λ̂ I − D )^{−1} D^{1/2} g ] = (1+β) [ 1 + (1/n) ∑_{j=1}^{p−1} g_j^2 D_jj / ( λ̂ − D_jj ) ].

Because we expect the diagonal entries of D to be distributed according to the Marchenko-Pastur distribution, and g to be independent of them, we expect that (again, not properly justified here, see [Pau])

    (1/n) ∑_{j=1}^{p−1} g_j^2 D_jj / ( λ̂ − D_jj ) ≈ (p/n) (1/p) ∑_{j=1}^{p−1} D_jj / ( λ̂ − D_jj ) ≈ γ ∫_{γ_−}^{γ_+} x / ( λ̂ − x ) dF_γ(x).

We thus get an equation for λ̂,

    λ̂ = (1+β) [ 1 + γ ∫_{γ_−}^{γ_+} x / ( λ̂ − x ) dF_γ(x) ],

which can be easily solved with the help of a program that computes integrals symbolically (such as Mathematica) to give (you can also see [Pau] for a derivation)

    λ̂ = (1+β) ( 1 + γ/β ),    (20)

which is particularly elegant (especially considering the size of some of the equations used in the derivation).

An important thing to notice is that for β = √γ we have

    λ̂ = (1 + √γ) ( 1 + γ/√γ ) = (1 + √γ)^2 = γ_+,

suggesting that β = √γ is the critical point. Indeed this is the case, and it is possible to make the above argument rigorous[6] and show that, in the model described above,

    if β ≤ √γ, then λ_max(S_n) → γ_+,

and

    if β > √γ, then λ_max(S_n) → (1+β)(1 + γ/β) > γ_+.

Another important question is whether the leading eigenvector actually correlates with the planted perturbation (in this case e_1). It turns out that very similar techniques can answer this question as well [Pau], and show that the leading eigenvector v_max of S_n will be non-trivially correlated with e_1 if and only if β > √γ. More precisely,

    if β ≤ √γ, then |⟨v_max, e_1⟩|^2 → 0,

and

    if β > √γ, then |⟨v_max, e_1⟩|^2 → (1 − γ/β^2) / (1 + γ/β).

[6] Note that in the argument above it wasn't even completely clear where it was used that the eigenvalue in question was actually the leading one. In the actual proof one first needs to make sure that there is an eigenvalue outside the support, and the proof only holds for that one; see [Pau].

1.3.1 A brief mention of Wigner matrices

Another very important random matrix model is the Wigner matrix (it will show up later in this course). Given an integer n, a standard gaussian Wigner matrix W ∈ R^{n×n} is a symmetric matrix with independent N(0,1) entries (except for the symmetry constraint W_ij = W_ji). In the limit, the eigenvalues of (1/√n) W are distributed according to the so-called semi-circular law,

    dSC(x) = (1/(2π)) √(4 − x^2) 1_{[−2,2]}(x) dx,

and there is also a BBP-like transition for this matrix ensemble [FP06]. More precisely, if v is a unit-norm vector in R^n and ξ ≥ 0, then the largest eigenvalue of (1/√n) W + ξ v v^T satisfies:

If ξ ≤ 1, then

    λ_max( (1/√n) W + ξ v v^T ) → 2,

and if ξ > 1, then

    λ_max( (1/√n) W + ξ v v^T ) → ξ + 1/ξ.

1.3.2 An open problem about spike models

Open Problem 1.3 (Spike Model for cut SDP [MS15]; it has since been solved [MS15]) Let W denote a symmetric Wigner matrix with i.i.d. entries W_ij ~ N(0,1). Also, given B ∈ R^{n×n} symmetric, define

    Q(B) = max { Tr(B X) : X ⪰ 0, X_ii = 1 for all i }.

Define q(ξ) as

    q(ξ) = lim_{n→∞} (1/n) E[ Q( ξ (1/n) 1 1^T + (1/√n) W ) ].

What is the value of ξ*, defined as

    ξ* = inf { ξ ≥ 0 : q(ξ) > 2 }?

It is known that, if 0 ≤ ξ ≤ 1, then q(ξ) = 2 [MS15]. One can show that Q(B) ≤ n λ_max(B). In fact,

    max { Tr(B X) : X ⪰ 0, X_ii = 1 } ≤ max { Tr(B X) : X ⪰ 0, Tr(X) = n }.

It is also not difficult to show (hint: take the spectral decomposition of X) that

    max { Tr(B X) : X ⪰ 0, Tr(X) = n } = n λ_max(B).

This means that, for ξ > 1, q(ξ) ≤ ξ + 1/ξ.

Remark 1.4 Optimization problems of the type max { Tr(B X) : X ⪰ 0, X_ii = 1 } are semidefinite programs; they will be a major player later in the course!

Since (1/n) E[ Tr( ( ξ (1/n) 1 1^T + (1/√n) W ) 1 1^T ) ] = ξ, by taking X = 1 1^T we expect that q(ξ) ≥ ξ. These observations imply that 1 ≤ ξ* ≤ 2 (see [MS15]). A reasonable conjecture is that it is equal to 1. This would imply that a certain semidefinite programming based algorithm for clustering under the Stochastic Block Model on 2 clusters (we will discuss these things later in the course) is optimal for detection (see [MS15]).[7]

Remark 1.5 We remark that Open Problem 1.3 has since been solved [MS15].

[7] Later in the course we will discuss clustering under the Stochastic Block Model quite thoroughly, and will see how this same SDP is known to be optimal for exact recovery [ABH14, HWX14, Ban15c].
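The two bounds around Q(B) can be checked on a tiny instance by brute force. The lower bound below is a standard observation not spelled out above: X = x x^T for any sign vector x ∈ {−1, 1}^n is feasible (it is PSD with unit diagonal), so Q(B) ≥ max_x x^T B x. A sketch, assuming numpy; the small n and the random B are arbitrary choices:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
n = 8
A = rng.normal(size=(n, n))
B = (A + A.T) / np.sqrt(2)           # a small symmetric "Wigner-like" matrix

# Upper bound from the text: Q(B) <= n * lambda_max(B).
upper = n * np.linalg.eigvalsh(B)[-1]

# Lower bound: X = x x^T is feasible for every sign vector x, so
# Q(B) >= max over all 2^n sign vectors of x^T B x.
lower = max(np.array(s) @ B @ np.array(s)
            for s in product([-1.0, 1.0], repeat=n))

print(lower, upper)   # lower <= Q(B) <= upper
```

For n = 8 the enumeration is only 2^8 = 256 sign vectors; solving the SDP itself (to pin Q(B) between the two bounds) would require a semidefinite programming solver, which is not assumed here.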

MIT OpenCourseWare
18.S096 Topics in Mathematics of Data Science
Fall 2015

For information about citing these materials or our Terms of Use, visit:

18.S096: Principal Component Analysis in High Dimensions and the Spike Model
Topics in Mathematics of Data Science (Fall 2015)
Afonso S. Bandeira — bandeira@mit.edu — http://math.mit.edu/~bandeira
September 18, 2015

More information

THE KALMAN FILTER RAUL ROJAS

THE KALMAN FILTER RAUL ROJAS THE KALMAN FILTER RAUL ROJAS Abstract. This paper provides a getle itroductio to the Kalma filter, a umerical method that ca be used for sesor fusio or for calculatio of trajectories. First, we cosider

More information

LINEAR ALGEBRA. Paul Dawkins

LINEAR ALGEBRA. Paul Dawkins LINEAR ALGEBRA Paul Dawkis Table of Cotets Preface... ii Outlie... iii Systems of Equatios ad Matrices... Itroductio... Systems of Equatios... Solvig Systems of Equatios... 5 Matrices... 7 Matrix Arithmetic

More information

Section 4.3. Boolean functions

Section 4.3. Boolean functions Sectio 4.3. Boolea fuctios Let us take aother look at the simplest o-trivial Boolea algebra, ({0}), the power-set algebra based o a oe-elemet set, chose here as {0}. This has two elemets, the empty set,

More information

The Growth of Functions. Theoretical Supplement

The Growth of Functions. Theoretical Supplement The Growth of Fuctios Theoretical Supplemet The Triagle Iequality The triagle iequality is a algebraic tool that is ofte useful i maipulatig absolute values of fuctios. The triagle iequality says that

More information

Notes for Lecture 11

Notes for Lecture 11 U.C. Berkeley CS78: Computatioal Complexity Hadout N Professor Luca Trevisa 3/4/008 Notes for Lecture Eigevalues, Expasio, ad Radom Walks As usual by ow, let G = (V, E) be a udirected d-regular graph with

More information

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1 EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum

More information

Optimization Methods MIT 2.098/6.255/ Final exam

Optimization Methods MIT 2.098/6.255/ Final exam Optimizatio Methods MIT 2.098/6.255/15.093 Fial exam Date Give: December 19th, 2006 P1. [30 pts] Classify the followig statemets as true or false. All aswers must be well-justified, either through a short

More information

Factor Analysis. Lecture 10: Factor Analysis and Principal Component Analysis. Sam Roweis

Factor Analysis. Lecture 10: Factor Analysis and Principal Component Analysis. Sam Roweis Lecture 10: Factor Aalysis ad Pricipal Compoet Aalysis Sam Roweis February 9, 2004 Whe we assume that the subspace is liear ad that the uderlyig latet variable has a Gaussia distributio we get a model

More information

MATH10212 Linear Algebra B Proof Problems

MATH10212 Linear Algebra B Proof Problems MATH22 Liear Algebra Proof Problems 5 Jue 26 Each problem requests a proof of a simple statemet Problems placed lower i the list may use the results of previous oes Matrices ermiats If a b R the matrix

More information

THE ASYMPTOTIC COMPLEXITY OF MATRIX REDUCTION OVER FINITE FIELDS

THE ASYMPTOTIC COMPLEXITY OF MATRIX REDUCTION OVER FINITE FIELDS THE ASYMPTOTIC COMPLEXITY OF MATRIX REDUCTION OVER FINITE FIELDS DEMETRES CHRISTOFIDES Abstract. Cosider a ivertible matrix over some field. The Gauss-Jorda elimiatio reduces this matrix to the idetity

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

if j is a neighbor of i,

if j is a neighbor of i, We see that if i = j the the coditio is trivially satisfied. Otherwise, T ij (i) = (i)q ij mi 1, (j)q ji, ad, (i)q ij T ji (j) = (j)q ji mi 1, (i)q ij. (j)q ji Now there are two cases, if (j)q ji (i)q

More information

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Statistics

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Statistics ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER 1 018/019 DR. ANTHONY BROWN 8. Statistics 8.1. Measures of Cetre: Mea, Media ad Mode. If we have a series of umbers the

More information

Seunghee Ye Ma 8: Week 5 Oct 28

Seunghee Ye Ma 8: Week 5 Oct 28 Week 5 Summary I Sectio, we go over the Mea Value Theorem ad its applicatios. I Sectio 2, we will recap what we have covered so far this term. Topics Page Mea Value Theorem. Applicatios of the Mea Value

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

Math 155 (Lecture 3)

Math 155 (Lecture 3) Math 55 (Lecture 3) September 8, I this lecture, we ll cosider the aswer to oe of the most basic coutig problems i combiatorics Questio How may ways are there to choose a -elemet subset of the set {,,,

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

ALGEBRAIC GEOMETRY COURSE NOTES, LECTURE 5: SINGULARITIES.

ALGEBRAIC GEOMETRY COURSE NOTES, LECTURE 5: SINGULARITIES. ALGEBRAIC GEOMETRY COURSE NOTES, LECTURE 5: SINGULARITIES. ANDREW SALCH 1. The Jacobia criterio for osigularity. You have probably oticed by ow that some poits o varieties are smooth i a sese somethig

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator Ecoomics 24B Relatio to Method of Momets ad Maximum Likelihood OLSE as a Maximum Likelihood Estimator Uder Assumptio 5 we have speci ed the distributio of the error, so we ca estimate the model parameters

More information

Symmetric Matrices and Quadratic Forms

Symmetric Matrices and Quadratic Forms 7 Symmetric Matrices ad Quadratic Forms 7.1 DIAGONALIZAION OF SYMMERIC MARICES SYMMERIC MARIX A symmetric matrix is a matrix A such that. A = A Such a matrix is ecessarily square. Its mai diagoal etries

More information

Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman:

Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman: Math 224 Fall 2017 Homework 4 Drew Armstrog Problems from 9th editio of Probability ad Statistical Iferece by Hogg, Tais ad Zimmerma: Sectio 2.3, Exercises 16(a,d),18. Sectio 2.4, Exercises 13, 14. Sectio

More information

(3) If you replace row i of A by its sum with a multiple of another row, then the determinant is unchanged! Expand across the i th row:

(3) If you replace row i of A by its sum with a multiple of another row, then the determinant is unchanged! Expand across the i th row: Math 5-4 Tue Feb 4 Cotiue with sectio 36 Determiats The effective way to compute determiats for larger-sized matrices without lots of zeroes is to ot use the defiitio, but rather to use the followig facts,

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 9 Multicolliearity Dr Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Multicolliearity diagostics A importat questio that

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

Sequences. Notation. Convergence of a Sequence

Sequences. Notation. Convergence of a Sequence Sequeces A sequece is essetially just a list. Defiitio (Sequece of Real Numbers). A sequece of real umbers is a fuctio Z (, ) R for some real umber. Do t let the descriptio of the domai cofuse you; it

More information

n outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n,

n outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n, CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 9 Variace Questio: At each time step, I flip a fair coi. If it comes up Heads, I walk oe step to the right; if it comes up Tails, I walk oe

More information

MA131 - Analysis 1. Workbook 3 Sequences II

MA131 - Analysis 1. Workbook 3 Sequences II MA3 - Aalysis Workbook 3 Sequeces II Autum 2004 Cotets 2.8 Coverget Sequeces........................ 2.9 Algebra of Limits......................... 2 2.0 Further Useful Results........................

More information

ECON 3150/4150, Spring term Lecture 3

ECON 3150/4150, Spring term Lecture 3 Itroductio Fidig the best fit by regressio Residuals ad R-sq Regressio ad causality Summary ad ext step ECON 3150/4150, Sprig term 2014. Lecture 3 Ragar Nymoe Uiversity of Oslo 21 Jauary 2014 1 / 30 Itroductio

More information

Definitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients.

Definitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients. Defiitios ad Theorems Remember the scalar form of the liear programmig problem, Miimize, Subject to, f(x) = c i x i a 1i x i = b 1 a mi x i = b m x i 0 i = 1,2,, where x are the decisio variables. c, b,

More information

Homework Set #3 - Solutions

Homework Set #3 - Solutions EE 15 - Applicatios of Covex Optimizatio i Sigal Processig ad Commuicatios Dr. Adre Tkaceko JPL Third Term 11-1 Homework Set #3 - Solutios 1. a) Note that x is closer to x tha to x l i the Euclidea orm

More information

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled 1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

Bayesian Methods: Introduction to Multi-parameter Models

Bayesian Methods: Introduction to Multi-parameter Models Bayesia Methods: Itroductio to Multi-parameter Models Parameter: θ = ( θ, θ) Give Likelihood p(y θ) ad prior p(θ ), the posterior p proportioal to p(y θ) x p(θ ) Margial posterior ( θ, θ y) is Iterested

More information

Lecture 19. sup y 1,..., yn B d n

Lecture 19. sup y 1,..., yn B d n STAT 06A: Polyomials of adom Variables Lecture date: Nov Lecture 19 Grothedieck s Iequality Scribe: Be Hough The scribes are based o a guest lecture by ya O Doell. I this lecture we prove Grothedieck s

More information

Stochastic Matrices in a Finite Field

Stochastic Matrices in a Finite Field Stochastic Matrices i a Fiite Field Abstract: I this project we will explore the properties of stochastic matrices i both the real ad the fiite fields. We first explore what properties 2 2 stochastic matrices

More information

TMA4205 Numerical Linear Algebra. The Poisson problem in R 2 : diagonalization methods

TMA4205 Numerical Linear Algebra. The Poisson problem in R 2 : diagonalization methods TMA4205 Numerical Liear Algebra The Poisso problem i R 2 : diagoalizatio methods September 3, 2007 c Eiar M Røquist Departmet of Mathematical Scieces NTNU, N-749 Trodheim, Norway All rights reserved A

More information

State Space Representation

State Space Representation Optimal Cotrol, Guidace ad Estimatio Lecture 2 Overview of SS Approach ad Matrix heory Prof. Radhakat Padhi Dept. of Aerospace Egieerig Idia Istitute of Sciece - Bagalore State Space Represetatio Prof.

More information

THE SPECTRAL RADII AND NORMS OF LARGE DIMENSIONAL NON-CENTRAL RANDOM MATRICES

THE SPECTRAL RADII AND NORMS OF LARGE DIMENSIONAL NON-CENTRAL RANDOM MATRICES COMMUN. STATIST.-STOCHASTIC MODELS, 0(3), 525-532 (994) THE SPECTRAL RADII AND NORMS OF LARGE DIMENSIONAL NON-CENTRAL RANDOM MATRICES Jack W. Silverstei Departmet of Mathematics, Box 8205 North Carolia

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

The Binomial Theorem

The Binomial Theorem The Biomial Theorem Robert Marti Itroductio The Biomial Theorem is used to expad biomials, that is, brackets cosistig of two distict terms The formula for the Biomial Theorem is as follows: (a + b ( k

More information

Probability, Expectation Value and Uncertainty

Probability, Expectation Value and Uncertainty Chapter 1 Probability, Expectatio Value ad Ucertaity We have see that the physically observable properties of a quatum system are represeted by Hermitea operators (also referred to as observables ) such

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

Lecture 20. Brief Review of Gram-Schmidt and Gauss s Algorithm

Lecture 20. Brief Review of Gram-Schmidt and Gauss s Algorithm 8.409 A Algorithmist s Toolkit Nov. 9, 2009 Lecturer: Joatha Keler Lecture 20 Brief Review of Gram-Schmidt ad Gauss s Algorithm Our mai task of this lecture is to show a polyomial time algorithm which

More information

Introduction to Machine Learning DIS10

Introduction to Machine Learning DIS10 CS 189 Fall 017 Itroductio to Machie Learig DIS10 1 Fu with Lagrage Multipliers (a) Miimize the fuctio such that f (x,y) = x + y x + y = 3. Solutio: The Lagragia is: L(x,y,λ) = x + y + λ(x + y 3) Takig

More information

Recursive Algorithms. Recurrences. Recursive Algorithms Analysis

Recursive Algorithms. Recurrences. Recursive Algorithms Analysis Recursive Algorithms Recurreces Computer Sciece & Egieerig 35: Discrete Mathematics Christopher M Bourke cbourke@cseuledu A recursive algorithm is oe i which objects are defied i terms of other objects

More information

Lecture 1: Basic problems of coding theory

Lecture 1: Basic problems of coding theory Lecture 1: Basic problems of codig theory Error-Correctig Codes (Sprig 016) Rutgers Uiversity Swastik Kopparty Scribes: Abhishek Bhrushudi & Aditya Potukuchi Admiistrivia was discussed at the begiig of

More information

CS284A: Representations and Algorithms in Molecular Biology

CS284A: Representations and Algorithms in Molecular Biology CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by

More information

1 Duality revisited. AM 221: Advanced Optimization Spring 2016

1 Duality revisited. AM 221: Advanced Optimization Spring 2016 AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R

More information

CHAPTER 5. Theory and Solution Using Matrix Techniques

CHAPTER 5. Theory and Solution Using Matrix Techniques A SERIES OF CLASS NOTES FOR 2005-2006 TO INTRODUCE LINEAR AND NONLINEAR PROBLEMS TO ENGINEERS, SCIENTISTS, AND APPLIED MATHEMATICIANS DE CLASS NOTES 3 A COLLECTION OF HANDOUTS ON SYSTEMS OF ORDINARY DIFFERENTIAL

More information

Efficient GMM LECTURE 12 GMM II

Efficient GMM LECTURE 12 GMM II DECEMBER 1 010 LECTURE 1 II Efficiet The estimator depeds o the choice of the weight matrix A. The efficiet estimator is the oe that has the smallest asymptotic variace amog all estimators defied by differet

More information

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo

More information

Axis Aligned Ellipsoid

Axis Aligned Ellipsoid Machie Learig for Data Sciece CS 4786) Lecture 6,7 & 8: Ellipsoidal Clusterig, Gaussia Mixture Models ad Geeral Mixture Models The text i black outlies high level ideas. The text i blue provides simple

More information

The Random Walk For Dummies

The Random Walk For Dummies The Radom Walk For Dummies Richard A Mote Abstract We look at the priciples goverig the oe-dimesioal discrete radom walk First we review five basic cocepts of probability theory The we cosider the Beroulli

More information

The Jordan Normal Form: A General Approach to Solving Homogeneous Linear Systems. Mike Raugh. March 20, 2005

The Jordan Normal Form: A General Approach to Solving Homogeneous Linear Systems. Mike Raugh. March 20, 2005 The Jorda Normal Form: A Geeral Approach to Solvig Homogeeous Liear Sstems Mike Raugh March 2, 25 What are we doig here? I this ote, we describe the Jorda ormal form of a matrix ad show how it ma be used

More information

Principle Of Superposition

Principle Of Superposition ecture 5: PREIMINRY CONCEP O RUCUR NYI Priciple Of uperpositio Mathematically, the priciple of superpositio is stated as ( a ) G( a ) G( ) G a a or for a liear structural system, the respose at a give

More information

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more

More information

Math 113 Exam 3 Practice

Math 113 Exam 3 Practice Math Exam Practice Exam will cover.-.9. This sheet has three sectios. The first sectio will remid you about techiques ad formulas that you should kow. The secod gives a umber of practice questios for you

More information

Mon Apr Second derivative test, and maybe another conic diagonalization example. Announcements: Warm-up Exercise:

Mon Apr Second derivative test, and maybe another conic diagonalization example. Announcements: Warm-up Exercise: Math 2270-004 Week 15 otes We will ot ecessarily iish the material rom a give day's otes o that day We may also add or subtract some material as the week progresses, but these otes represet a i-depth outlie

More information