Greedy Subspace Clustering


Technical Report 111
Greedy Subspace Clustering
Research Supervisor: Constantine Caramanis
Wireless Networking and Communications Group
September 2016

Data-Supported Transportation Operations & Planning Center (D-STOP)
A Tier 1 USDOT University Transportation Center at The University of Texas at Austin

D-STOP is a collaborative initiative by researchers at the Center for Transportation Research and the Wireless Networking and Communications Group at The University of Texas at Austin.

Technical Report Documentation Page

1. Report No.: D-STOP/2016/
2. Government Accession No.:
3. Recipient's Catalog No.:
4. Title and Subtitle: Greedy Subspace Clustering
5. Report Date: September 2016
6. Performing Organization Code:
7. Author(s): Dohyung Park, Constantine Caramanis, Sujay Sanghavi
8. Performing Organization Report No.: Report 111
9. Performing Organization Name and Address: Data-Supported Transportation Operations & Planning Center (D-STOP), The University of Texas at Austin, 1616 Guadalupe Street, Suite 4.202, Austin, Texas
10. Work Unit No. (TRAIS):
11. Contract or Grant No.: DTRT13-G-UTC58
12. Sponsoring Agency Name and Address: Data-Supported Transportation Operations & Planning Center (D-STOP), The University of Texas at Austin, 1616 Guadalupe Street, Suite 4.202, Austin, Texas
13. Type of Report and Period Covered:
14. Sponsoring Agency Code:
15. Supplementary Notes: Supported by a grant from the U.S. Department of Transportation, University Transportation Centers Program.
16. Abstract: We consider the problem of subspace clustering: given points that lie on or near the union of many low-dimensional linear subspaces, recover the subspaces. To this end, one first identifies sets of points close to the same subspace and uses the sets to estimate the subspaces. As the geometric structure of the clusters (linear subspaces) forbids proper performance of general distance-based approaches such as K-means, many model-specific methods have been proposed. In this paper, we provide new simple and efficient algorithms for this problem. Our statistical analysis shows that the algorithms are guaranteed exact (perfect) clustering performance under certain conditions on the number of points and the affinity between subspaces. These conditions are weaker than those considered in the standard statistical literature. Experimental results on synthetic data generated from the standard unions of subspaces model demonstrate our theory. We also show that our algorithm performs competitively against state-of-the-art algorithms on real-world applications such as motion segmentation and face clustering, but with much simpler implementation and lower computational cost.
17. Key Words: Subspace clustering; algorithms
18. Distribution Statement: No restrictions. This document is available to the public through NTIS (National Technical Information Service), 5285 Port Royal Road, Springfield, Virginia 22161.
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified
21. No. of Pages:
22. Price:

Form DOT F 1700.7    Reproduction of completed page authorized

Disclaimer

The contents of this report reflect the views of the authors, who are responsible for the facts and the accuracy of the information presented herein. This document is disseminated under the sponsorship of the U.S. Department of Transportation's University Transportation Centers Program, in the interest of information exchange. The U.S. Government assumes no liability for the contents or use thereof.

The contents of this report reflect the views of the authors, who are responsible for the facts and the accuracy of the information presented herein. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

Acknowledgements

The authors recognize that support for this research was provided by a grant from the U.S. Department of Transportation, University Transportation Centers.

Project Update: Greedy Subspace Clustering
PI: Constantine Caramanis

Subspace clustering is the problem of fitting a collection of high-dimensional data points to a union of subspaces. This problem can be regarded as a mixture of PCA, because the points are to be partitioned into subsets, each of which is modeled by a low-dimensional subspace. This model naturally arises in settings where data from multiple latent phenomena are mixed together and need to be separated.

Recent algorithms for subspace clustering are based on two steps. In the first step, a number of neighbor points are collected for each data point. Then the points are segmented using spectral clustering. The state-of-the-art algorithms solve a convex program whose size is as large as the square of the number of data points. As the amount of data increases, this computational cost becomes critical. To reduce the cost, we propose greedy algorithms. Our algorithms are provable in the sense that exactly correct clustering is guaranteed under certain conditions in the standard models. For example, when d-dimensional subspaces are drawn uniformly at random in R^p, and random data points are iid uniformly generated on each subspace, exact clustering is guaranteed with high probability provided the number of points per subspace is large enough relative to d and the subspace dimension d is small enough relative to the ambient dimension p (the precise conditions are stated in Theorem 1 of the report below). This condition is no worse than the existing statistical guarantees. In practice, our greedy algorithm, which is in general significantly faster than solving a convex program, performs competitively against those algorithms on real-world benchmark datasets. We observe that our algorithm outperforms the existing greedy algorithms. This was the result reported in the following report.

Since the publication of those results in the Proceedings of the Neural Information Processing Systems conference (NIPS) in December 2014, we continued to work on this project in collaboration with Prof. Sujay Sanghavi (UT Austin) and Dr. Dohyung Park (now a researcher at Facebook) through August 2016. Our empirical results continue to represent the state of the art among all currently available algorithms. Therefore we believe it important to seek deeper theoretical insight into precisely why our algorithm seems to perform better than others. While we have been able to understand much about how our algorithm operates and why it succeeds, we were unable to obtain new theoretical guarantees beyond those already reported. As a consequence, no further publications were a direct consequence of this line of work, though in general this topic then led us to consider mixed regression, which has produced further publications as reported elsewhere.

Greedy Subspace Clustering

Dohyung Park
Dept. of Electrical and Computer Engineering
The University of Texas at Austin

Constantine Caramanis
Dept. of Electrical and Computer Engineering
The University of Texas at Austin

Sujay Sanghavi
Dept. of Electrical and Computer Engineering
The University of Texas at Austin

Abstract

We consider the problem of subspace clustering: given points that lie on or near the union of many low-dimensional linear subspaces, recover the subspaces. To this end, one first identifies sets of points close to the same subspace and uses the sets to estimate the subspaces. As the geometric structure of the clusters (linear subspaces) forbids proper performance of general distance-based approaches such as K-means, many model-specific methods have been proposed. In this paper, we provide new simple and efficient algorithms for this problem. Our statistical analysis shows that the algorithms are guaranteed exact (perfect) clustering performance under certain conditions on the number of points and the affinity between subspaces. These conditions are weaker than those considered in the standard statistical literature. Experimental results on synthetic data generated from the standard unions of subspaces model demonstrate our theory. We also show that our algorithm performs competitively against state-of-the-art algorithms on real-world applications such as motion segmentation and face clustering, but with much simpler implementation and lower computational cost.

1 Introduction

Subspace clustering is a classic problem where one is given points in a high-dimensional ambient space and would like to approximate them by a union of lower-dimensional linear subspaces. In particular, each subspace contains a subset of the points. This problem is hard because one needs to jointly find the subspaces and the points corresponding to each; the data we are given are unlabeled. The unions of subspaces model naturally arises in settings where data from multiple latent phenomena are mixed together and need to be separated. Applications of subspace clustering include motion segmentation [23], face clustering [8], gene expression analysis [10], and system identification [22]. In these applications, data points with the same label (e.g., face images of a person under varying illumination conditions, feature points of a moving rigid object in a video sequence) lie on a low-dimensional subspace, and the mixed dataset can be modeled by a union of subspaces. For a detailed description of the applications, we refer the readers to the reviews [10, 20] and references therein.

There is now a sizable literature on empirical methods for this particular problem and some statistical analysis as well. Many recently proposed methods, which perform remarkably well and have theoretical guarantees on their performance, can be characterized as involving two steps: (a) finding a neighborhood for each data point, and (b) finding the subspaces and/or clustering the points given these neighborhoods. Here, neighbors of a point are other points that the algorithm estimates to lie on the same subspace as the point (and not necessarily just the closest in Euclidean distance).
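To make the two-step structure concrete, the sketch below shows step (b) in its most common form: given a 0/1 neighborhood matrix W produced by any step-(a) method, symmetrize it and run spectral clustering. This is an illustrative sketch using scikit-learn, not code from the paper; the paper's own step-(b) alternative, GSR, is described in Section 2.

```python
# Generic step (b): spectral clustering on a neighborhood matrix W (illustrative sketch).
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_from_neighborhoods(W, L):
    """W: (N, N) 0/1 neighborhood matrix from any step-(a) method; L: number of clusters."""
    Z = W + W.T                                    # symmetrize so Z can serve as an affinity
    model = SpectralClustering(n_clusters=L, affinity="precomputed", random_state=0)
    return model.fit_predict(Z)                    # cluster label for each of the N points
```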

Algorithm      | What is guaranteed    | Subspace condition | Fully random model          | Semi-random model
SSC [4, 16]    | Correct neighborhoods | None               | d/p = O(log(n/d)/log(nL))   | max aff = O(sqrt(log(n/d))/log(nL))
LRR [14]       | Exact clustering      | No intersection    | -                           | -
SSC-OMP [3]    | Correct neighborhoods | No intersection    | -                           | -
TSC [6, 7]     | Exact clustering      | None               | d/p = O(1/log(nL))          | max aff = O(1/log(nL))
LRSSC [24]     | Correct neighborhoods | None               | d/p = O(1/log(nL))          | -
NSN+GSR        | Exact clustering      | None               | d/p = O(log n/log(nL))      | max aff = O((log n)/(sqrt(d log L) log(nL)))
NSN+Spectral   | Exact clustering      | None               | d/p = O(log n/log(nL))      | -

Table 1: Subspace clustering algorithms with theoretical guarantees. LRR and SSC-OMP have only deterministic guarantees, not statistical ones. In the two standard statistical models, there are n data points on each of L d-dimensional subspaces in R^p. For the definition of max aff, we refer the readers to Section 3.1.

Our contributions: In this paper we devise new algorithms for each of the two steps above; (a) we develop a new method, Nearest Subspace Neighbor (NSN), to determine a neighborhood set for each point, and (b) a new method, Greedy Subspace Recovery (GSR), to recover subspaces from given neighborhoods. Each of these two methods can be used in conjunction with other methods for the corresponding other step; however, in this paper we focus on two algorithms that use NSN followed by GSR and Spectral clustering, respectively. Our main result is establishing statistical guarantees for exact clustering with general subspace conditions, in the standard models considered in recent analytical literature on subspace clustering. Our condition for exact recovery is weaker than the conditions of other existing algorithms that only guarantee correct neighborhoods(1), which do not always lead to correct clustering. We provide numerical results which demonstrate our theory. We also show that for real-world applications our algorithm performs competitively against state-of-the-art algorithms, while its computational cost is much lower. Moreover, our algorithms are much simpler to implement.

1.1 Related work

The problem was first formulated in the data mining community [10]. Most of the related work in this field assumes that an underlying subspace is parallel to some canonical axes. Subspace clustering for unions of arbitrary subspaces is considered mostly in the machine learning and the computer vision communities [20]. Most of the results from those communities are based on empirical justification. They provide algorithms derived from theoretical intuition and show that they perform empirically well on practical datasets. To name a few, GPCA [21], Spectral curvature clustering (SCC) [2], and many iterative methods [1, 19, 26] show good empirical performance for subspace clustering. However, they lack theoretical analysis that guarantees exact clustering.

As described above, several algorithms with a common structure have recently been proposed with both theoretical guarantees and remarkable empirical performance. Elhamifar and Vidal [4] proposed an algorithm called Sparse Subspace Clustering (SSC), which uses l1-minimization for neighborhood construction. They proved that if the subspaces have no intersection, SSC always finds a correct neighborhood matrix. Later, Soltanolkotabi and Candès [16] provided a statistical guarantee for the algorithm for subspaces with intersection. Dyer et al. [3] proposed another algorithm called SSC-OMP, which uses Orthogonal Matching Pursuit (OMP) instead of l1-minimization in SSC. Another algorithm called Low-Rank Representation (LRR), which uses nuclear norm minimization, was proposed by Liu et al. [14]. Wang et al.
[24] proposed a hybrid algorithm, Low-Rank and Sparse Subspace Clustering (LRSSC), which involves both the l1-norm and the nuclear norm. Heckel and Bölcsei [6] presented Thresholding-based Subspace Clustering (TSC), which constructs neighborhoods based on the inner products between data points. All of these algorithms use spectral clustering for the clustering step.

The analysis in those papers focuses on neither exact recovery of the subspaces nor exact clustering in general subspace conditions. SSC, SSC-OMP, and LRSSC only guarantee correct neighborhoods, which do not always lead to exact clustering. LRR guarantees exact clustering only when

(1) By correct neighborhood, we mean that for each point every neighbor point lies on the same subspace. By no intersection between subspaces, we mean that they share only the null point.

the subspaces have no intersections. In this paper, we provide novel algorithms that guarantee exact clustering in general subspace conditions. While we were preparing this manuscript, it was proved that TSC guarantees exact clustering under certain conditions [7], but the conditions are stricter than ours. (See Table 1.)

1.2 Notation

There is a set of N data points in R^p, denoted by Y = {y_1, ..., y_N}. The data points are lying on or near a union of L subspaces D = ∪_{i=1}^{L} D_i. Each subspace D_i is of dimension d_i, which is smaller than p. For each point y_j, w_j denotes the index of the nearest subspace. Let N_i denote the number of points whose nearest subspace is D_i, i.e., N_i = Σ_{j=1}^{N} I{w_j = i}. Throughout this paper, sets and subspaces are denoted by calligraphic letters. Matrices and key parameters are denoted by letters in upper case, and vectors and scalars are denoted by letters in lower case. We frequently denote the set of n indices by [n] = {1, 2, ..., n}. As usual, span{ } denotes the subspace spanned by a set of vectors; for example, span{v_1, ..., v_n} = {v : v = Σ_{i=1}^{n} α_i v_i, α_1, ..., α_n ∈ R}. Proj_U y is defined as the projection of y onto the subspace U, that is, Proj_U y = arg min_{u ∈ U} ||y - u||_2. I{ } denotes the indicator function, which is one if the statement is true and zero otherwise. Finally, ⊕ denotes the direct sum.

2 Algorithms

We propose two algorithms for subspace clustering as follows.

NSN+GSR: Run Nearest Subspace Neighbor (NSN) to construct a neighborhood matrix W ∈ {0,1}^{N×N}, and then run Greedy Subspace Recovery (GSR) for W.
NSN+Spectral: Run Nearest Subspace Neighbor (NSN) to construct a neighborhood matrix W ∈ {0,1}^{N×N}, and then run spectral clustering for Z = W + W^T.

2.1 Nearest Subspace Neighbor (NSN)

NSN approaches the problem of finding the neighbor points most likely to be on the same subspace in a greedy fashion. At first, given a point y without any other knowledge, the single point that is most likely to be a neighbor of y is the point nearest to the line span{y}. In the following steps, if we have found a few correct neighbor points lying on the same true subspace and have no other knowledge about the true subspace and the rest of the points, then the most potentially correct point is the one closest to the subspace spanned by the correct neighbors we have. This motivates us to propose NSN, described in the following.

Algorithm 1 Nearest Subspace Neighbor (NSN)
Input: A set of N samples Y = {y_1, ..., y_N}, the number of required neighbors K, maximum subspace dimension k_max
Output: A neighborhood matrix W ∈ {0,1}^{N×N}
  y_i ← y_i / ||y_i||_2, for all i ∈ [N]                      (Normalize magnitudes)
  for i = 1, ..., N do                                        (Run NSN for each data point)
    I_i ← {i}
    for k = 1, ..., K do                                      (Iteratively add the closest point to the current subspace)
      if k ≤ k_max then
        U ← span{y_j : j ∈ I_i}
      end if
      j* ← arg max_{j ∈ [N] \ I_i} ||Proj_U y_j||_2
      I_i ← I_i ∪ {j*}
    end for
    W_ij ← I{j ∈ I_i or y_j ∈ U}, for all j ∈ [N]             (Construct the neighborhood matrix)
  end for

NSN collects K neighbors sequentially for each point. At each step k, a k-dimensional subspace U spanned by the point and its k - 1 neighbors is constructed, and the point closest to the subspace is

newly collected. After step k_max, the subspace U constructed at the k_max-th step is used for collecting neighbors. At the last step, if there are more points lying exactly on U, they are also counted as neighbors. The subspace U can be stored in the form of a matrix U ∈ R^{p×dim(U)} whose columns form an orthonormal basis of U. Then ||Proj_U y_j|| can be computed easily because it is equal to ||U^T y_j||_2. While a naive implementation requires O(K^2 pN) computational cost, this can be reduced to O(KpN); the faster implementation is described in Section A.1. We note that this computational cost is much lower than that of the convex optimization based methods (e.g., SSC [4] and LRR [14]), which solve a convex program with N^2 variables and pN constraints.

NSN for subspace clustering shares the same philosophy as Orthogonal Matching Pursuit (OMP) for sparse recovery, in the sense that it incrementally picks the point (dictionary element) that is most likely to be correct, assuming that the algorithm has found the correct ones so far. In subspace clustering, that point is the one closest to the subspace spanned by the currently selected points, while in sparse recovery it is the one closest to the residual of linear regression by the selected points. In the sparse recovery literature, the performance of OMP is shown to be comparable to that of Basis Pursuit (l1-minimization) both theoretically and empirically [18, 11]. One of the contributions of this work is to show that this high-level intuition is indeed borne out, provably, as we show that NSN also performs well in collecting neighbors lying on the same subspace.

2.2 Greedy Subspace Recovery (GSR)

Suppose that NSN has found correct neighbors for a data point. How can we check if they are indeed correct, that is, lying on the same true subspace? One natural way is to count the number of points close to the subspace spanned by the neighbors. If they span one of the true subspaces, then many other points will be lying on the span. If they do not span any true subspace, few points will be close to it. This fact motivates us to use a greedy algorithm to recover the subspaces. Using the neighborhood constructed by NSN (or some other algorithm), we recover the L subspaces. If there is, for each subspace, a neighborhood set containing only the points on the same subspace, the algorithm successfully recovers the union of the true subspaces exactly.

Algorithm 2 Greedy Subspace Recovery (GSR)
Input: N points Y = {y_1, ..., y_N}, a neighborhood matrix W ∈ {0,1}^{N×N}, error bound ε
Output: Estimated subspaces D̂ = ∪_{l=1}^{L} D̂_l; estimated labels ŵ_1, ..., ŵ_N
  y_i ← y_i / ||y_i||_2, for all i ∈ [N]                                   (Normalize magnitudes)
  Ŵ_i ← Top-d{y_j : W_ij = 1}, for all i ∈ [N]                             (Estimate a subspace using the neighbors of each point)
  I ← [N]
  while I ≠ ∅ do                                                           (Iteratively pick the best subspace estimates)
    i* ← arg max_{i ∈ I} Σ_{j=1}^{N} I{ ||Proj_{Ŵ_i} y_j|| ≥ 1 - ε }
    D̂_l ← Ŵ_{i*}
    I ← I \ {j : ||Proj_{Ŵ_{i*}} y_j|| ≥ 1 - ε}
  end while
  ŵ_i ← arg max_{l ∈ [L]} ||Proj_{D̂_l} y_i||, for all i ∈ [N]              (Label the points using the subspace estimates)

Recall that the matrix W contains the labelings of the points, so that W_ij = 1 if point i is assigned to subspace j. Top-d{y_j : W_ij = 1} denotes the d-dimensional principal subspace of the set of vectors {y_j : W_ij = 1}. This can be obtained by taking the first d left singular vectors of the matrix whose columns are the vectors in the set. If there are only d vectors in the set, Gram-Schmidt orthogonalization will give us the subspace. As in NSN, it is efficient to store a subspace Ŵ_i in the form of its orthogonal basis because we can easily compute the norm of a projection onto the subspace.
Testing a candidate subspace by counting the number of nearby points has already been considered in the subspace clustering literature. In [25], the authors proposed to run RANdom SAmple Consensus (RANSAC) iteratively. RANSAC randomly selects a few points and checks if there are many other points near the subspace spanned by the collected points. Instead of randomly choosing sample points, GSR receives some candidate subspaces in the form of sets of points from NSN (or possibly some other algorithm) and selects subspaces in a greedy way, as specified in the algorithm above.
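Putting the two pieces together, the following is a minimal, illustrative sketch of the NSN+GSR pipeline in Python/NumPy. It is our own rendering under simplifying assumptions, not the code used for the experiments: it uses the naive O(K^2 pN) projection update rather than the faster scheme of Section A.1, and the numerical tolerance for "lying on U" is an arbitrary choice. For NSN+Spectral, one would instead symmetrize the output of nsn, Z = W + W^T, and hand Z to a spectral clustering routine.

```python
# Illustrative sketch of the NSN+GSR pipeline (Algorithms 1 and 2).
import numpy as np

def nsn(Y, K, k_max):
    """Nearest Subspace Neighbor. Y: (p, N) matrix whose columns are the data points."""
    p, N = Y.shape
    Y = Y / np.linalg.norm(Y, axis=0, keepdims=True)            # normalize magnitudes
    W = np.zeros((N, N), dtype=bool)
    for i in range(N):
        idx = [i]
        U = Y[:, [i]]                                            # current orthonormal basis
        for k in range(1, K + 1):
            if k <= k_max:
                U, _ = np.linalg.qr(Y[:, idx])                   # basis of span{y_j : j in I_i}
            proj = np.linalg.norm(U.T @ Y, axis=0)               # ||Proj_U y_j|| = ||U^T y_j||
            proj[idx] = -np.inf                                  # never re-pick a chosen point
            idx.append(int(np.argmax(proj)))
        W[i, idx] = True
        W[i, np.linalg.norm(U.T @ Y, axis=0) >= 1 - 1e-10] = True  # points (numerically) on U
    return W

def top_d_subspace(cols, d):
    """d-dimensional principal subspace (top-d left singular vectors) of the columns."""
    U, _, _ = np.linalg.svd(cols, full_matrices=False)
    return U[:, :d]

def gsr(Y, W, d, eps=1e-3):
    """Greedy Subspace Recovery. Returns the list of estimated bases and point labels."""
    Y = Y / np.linalg.norm(Y, axis=0, keepdims=True)
    N = Y.shape[1]
    cand = [top_d_subspace(Y[:, W[i]], d) for i in range(N)]     # one candidate per point
    remaining = set(range(N))
    picked = []
    while remaining:
        rem = sorted(remaining)
        counts = [np.sum(np.linalg.norm(cand[i].T @ Y, axis=0) >= 1 - eps) for i in rem]
        i_star = rem[int(np.argmax(counts))]                     # best-covering candidate
        picked.append(cand[i_star])
        covered = np.linalg.norm(cand[i_star].T @ Y, axis=0) >= 1 - eps
        remaining -= set(np.nonzero(covered)[0].tolist())
        remaining.discard(i_star)                                # ensure progress
    scores = np.stack([np.linalg.norm(B.T @ Y, axis=0) for B in picked])
    return picked, np.argmax(scores, axis=0)                     # label = nearest estimated subspace
```

For example, on noiseless data with K = k_max = d and a small eps, calling gsr(Y, nsn(Y, d, d), d) returns the estimated bases together with a label for every point.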

3 Theoretical results

We analyze our algorithms in two standard noiseless models. The main theorems present sufficient conditions under which the algorithms cluster the points exactly with high probability. For simplicity of analysis, we assume that every subspace is of the same dimension, and that the number of data points on each subspace is the same, i.e.,

    d ≜ d_1 = ... = d_L,    n ≜ N_1 = ... = N_L.

We assume that d is known to the algorithm. Nonetheless, our analysis can extend to the general case.

3.1 Statistical models

We consider two models which have been used in the subspace clustering literature:

Fully random model: The subspaces are drawn iid uniformly at random, and the points are also iid randomly generated.
Semi-random model: The subspaces are arbitrarily determined, but the points are iid randomly generated.

Let D_i ∈ R^{p×d}, i ∈ [L], be a matrix whose columns form an orthonormal basis of D_i. An important measure that we use in the analysis is the affinity between two subspaces, defined as

    aff(i, j) ≜ ||D_i^T D_j||_F / sqrt(d) = sqrt( Σ_{k=1}^{d} cos^2 θ_k^{i,j} ) / sqrt(d) ∈ [0, 1],

where θ_k^{i,j} is the k-th principal angle between D_i and D_j. Two subspaces D_i and D_j are identical if and only if aff(i, j) = 1. If aff(i, j) = 0, every vector on D_i is orthogonal to every vector on D_j. We also define the maximum affinity as

    max aff ≜ max_{i, j ∈ [L], i ≠ j} aff(i, j) ∈ [0, 1].

There are N = nL points, and there are n points exactly lying on each subspace. We assume that each data point y_i is drawn iid uniformly at random from S^{p-1} ∩ D_{w_i}, where S^{p-1} is the unit sphere in R^p. Equivalently,

    y_i = D_{w_i} x_i,    x_i ~ Unif(S^{d-1}),    for all i ∈ [N].

As the points are generated randomly on their corresponding subspaces, there are no points lying on an intersection of two subspaces, almost surely. This implies that with probability one the points are clustered correctly provided that the true subspaces are recovered exactly.

3.2 Main theorems

The first theorem gives a statistical guarantee for the fully random model.

Theorem 1. Suppose L d-dimensional subspaces and n points on each subspace are generated in the fully random model with n polynomial in d. There are constants C_1, C_2 > 0 such that if

    n > C_1 d log(ne/dδ),    d/p < C_2 (log n) / log(nLδ^{-1}),        (1)

then with probability at least 1 - 3Lδ, NSN+GSR(3) clusters the points exactly. Also, there are other constants C'_1, C'_2 > 0 such that if (1) holds with C_1 and C_2 replaced by C'_1 and C'_2, then NSN+Spectral(4) clusters the points exactly with probability at least 1 - 3Lδ.

2: e is the exponential constant.
3: NSN with K = k_max = d, followed by GSR with arbitrarily small ε.
4: NSN with K = k_max = d.
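As a concrete illustration of the fully random model and the affinity just defined, the short sketch below (ours, not from the paper) draws L random d-dimensional subspaces in R^p, samples n unit-norm points on each, and evaluates max aff; the semi-random model differs only in that the bases would be fixed rather than drawn at random.

```python
# Sketch: data from the fully random model of Section 3.1, plus the affinity measure.
import numpy as np

def random_basis(p, d, rng):
    """Orthonormal basis of a d-dimensional subspace drawn uniformly at random in R^p."""
    Q, _ = np.linalg.qr(rng.standard_normal((p, d)))
    return Q

def fully_random_model(p, d, L, n, seed=0):
    rng = np.random.default_rng(seed)
    bases = [random_basis(p, d, rng) for _ in range(L)]
    X = rng.standard_normal((d, L * n))
    X /= np.linalg.norm(X, axis=0)                       # x_i ~ Unif(S^{d-1})
    Y = np.hstack([D @ X[:, i * n:(i + 1) * n] for i, D in enumerate(bases)])
    labels = np.repeat(np.arange(L), n)                  # w_i: index of the true subspace
    return Y, labels, bases

def affinity(Di, Dj):
    """aff(i, j) = ||Di^T Dj||_F / sqrt(d), a value in [0, 1]."""
    return np.linalg.norm(Di.T @ Dj, "fro") / np.sqrt(Di.shape[1])

def max_affinity(bases):
    L = len(bases)
    return max(affinity(bases[i], bases[j]) for i in range(L) for j in range(L) if i != j)
```

Setting d/p = 3/5, as in the synthetic experiments of Section 4.1 below, forces every pair of these random subspaces to intersect, so max aff is bounded away from zero.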

Our sufficient conditions for exact clustering explain when subspace clustering becomes easy or difficult, and they are consistent with our intuition. For NSN to find correct neighbors, the points on the same subspace should be numerous enough that they look like they lie on a subspace. This condition is spelled out in the first inequality of (1). We note that the condition holds even when n/d is a constant, i.e., n is linear in d. The second inequality implies that the dimension of the subspaces should not be too high for the subspaces to be distinguishable. If d is high, the random subspaces are more likely to be close to each other, and hence they become more difficult to distinguish. However, as n increases, the points become dense on the subspaces, and hence it becomes easier to identify different subspaces.

Let us compare our result with the conditions required for success in the fully random model in the existing literature. In [16], it is required for SSC to have correct neighborhoods that n should be superlinear in d when d/p is fixed. In [6, 24], the conditions on d/p become worse as we have more points. On the other hand, our algorithms guarantee exact clustering of the points, and the sufficient condition is order-wise at least as good as the conditions for correct neighborhoods of the existing algorithms (see Table 1). Moreover, exact clustering is guaranteed even when n is linear in d and d/p is fixed.

For the semi-random model, we have the following general theorem.

Theorem 2. Suppose L d-dimensional subspaces are arbitrarily chosen, and n points on each subspace are generated in the semi-random model with n polynomial in d. There are constants C_1, C_2 > 0 such that if

    n > C_1 d log(ne/dδ),    max aff < C_2 (log n) / ( sqrt(d log(Lδ^{-1})) · log(nLδ^{-1}) ),        (2)

then with probability at least 1 - 3Lδ, NSN+GSR(5) clusters the points exactly.

In the semi-random model, the sufficient condition does not depend on the ambient dimension p. When the affinities between subspaces are fixed, and the points are exactly lying on the subspaces, the difficulty of the problem does not depend on the ambient dimension. It rather depends on max aff, which measures how close the subspaces are. As they become closer to each other, it becomes more difficult to distinguish the subspaces. The second inequality of (2) explains this intuition. The inequality also shows that if we have more data points, it becomes easier to identify different subspaces.

Compared with other algorithms, NSN+GSR is guaranteed exact clustering, and more importantly, the condition on max aff improves as n grows. This remark is consistent with the practical performance of the algorithm, which improves as the number of data points increases, while the existing guarantees of other algorithms do not. In [16], correct neighborhoods in SSC are guaranteed if max aff = O(sqrt(log(n/d)) / log(nL)). In [6], exact clustering of TSC is guaranteed if max aff = O(1 / log(nL)). However, these algorithms do perform empirically better as the number of data points increases.

4 Experimental results

In this section, we empirically compare our algorithms with the existing algorithms in terms of clustering performance and computational time (on a single desktop). For NSN, we use the fast implementation described in Section A.1. The compared algorithms are K-means, K-flats(6), SSC, LRR, SCC, TSC(7), and SSC-OMP(8). The numbers of replicates in K-means, K-flats, and the K-means used in the spectral clustering are all fixed to 10.

5: NSN with K = d - 1 and k_max = 2 log d, followed by GSR with arbitrarily small ε.
6: K-flats is similar to K-means.
At each iteration, it computes the top-d principal subspaces of the points with the same label, and then labels every point based on its distances to those subspaces.
7: The MATLAB codes for SSC, LRR, SCC, and TSC are obtained from jhu.edu/~ehsan/code.htm, glchen/scc.html, and commth/research/downloads/sc.html, respectively.
8: For each data point, SSC-OMP constructs a neighborhood by regressing the point on the other points up to 10^-4 accuracy.
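Since footnote 6 describes the K-flats baseline only in words, here is a minimal sketch of that baseline (our own illustration; the function name and defaults are ours, and this is not the implementation used in the experiments).

```python
# Minimal K-flats sketch: alternate between fitting a d-dimensional principal subspace
# (a "flat" through the origin) to each cluster and re-assigning points to the nearest flat.
import numpy as np

def k_flats(Y, L, d, n_iter=50, seed=0):
    """Y: (p, N) data matrix; L: number of flats; d: flat dimension."""
    rng = np.random.default_rng(seed)
    p, N = Y.shape
    labels = rng.integers(L, size=N)                    # random initial labeling
    for _ in range(n_iter):
        bases = []
        for l in range(L):
            Yl = Y[:, labels == l]
            if Yl.shape[1] < d:                         # degenerate cluster: re-seed it
                Yl = Y[:, rng.choice(N, size=d, replace=False)]
            U, _, _ = np.linalg.svd(Yl, full_matrices=False)
            bases.append(U[:, :d])                      # top-d principal subspace
        # distance to a flat = norm of the residual after projection onto it
        resid = np.stack([np.linalg.norm(Y - B @ (B.T @ Y), axis=0) for B in bases])
        new_labels = np.argmin(resid, axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```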

[Figure 1: CE of algorithms on random d-dimensional subspaces with n random points on each subspace, shown for different values of n/d and ambient dimension p; d/p is fixed to be 3/5. Algorithms: SSC, SSC-OMP, LRR, TSC, NSN+Spectral, NSN+GSR. Brighter cells represent that fewer data points are clustered incorrectly.]

[Figure 2: NSE for the same model parameters as those in Figure 1. Neighborhood selection methods: l1-minimization (SSC), OMP (SSC-OMP), nuclear norm minimization (LRR), nearest neighbor (TSC), NSN. Brighter cells represent that more data points have all correct neighbors.]

[Figure 3: Average computational time of the neighborhood selection algorithms: l1-minimization (SSC), OMP (SSC-OMP), nuclear norm minimization (LRR), thresholding (TSC), NSN. Left: five 10-dimensional subspaces, as a function of the number of data points per subspace n. Right: 100-dimensional ambient space, 10-dimensional subspaces, as a function of the number of subspaces L.]

The algorithms are compared in terms of Clustering error (CE) and Neighborhood selection error (NSE), defined as

    CE = min_{π ∈ Π_L} (1/N) Σ_{i=1}^{N} I{w_i ≠ π(ŵ_i)},        NSE = (1/N) Σ_{i=1}^{N} I{∃ j : W_ij ≠ 0, w_i ≠ w_j},

where Π_L is the permutation space of [L]. CE is the proportion of incorrectly labeled data points. Since clustering is invariant up to permutation of the label indices, the error is equal to the minimum disagreement over the permutations of label indices. NSE measures the proportion of the points which do not have all correct neighbors.(9)

9: For the neighborhood matrices from SSC, LRR, and SSC-OMP, the d points with the maximum weights are regarded as neighbors for each point. For TSC, the d nearest neighbors are collected for each point.

4.1 Synthetic data

We compare the performances on synthetic data generated from the fully random model. In R^p, five d-dimensional subspaces are generated uniformly at random. Then for each subspace, n unit-norm points are generated iid uniformly at random on the subspace. To see the agreement with the theoretical results, we ran the algorithms under fixed d/p and varied n and d. We set d/p = 3/5 so that each pair of subspaces has intersection. Figures 1 and 2 show CE and NSE, respectively. Each error value is averaged over 100 trials. Figure 1 indicates that our algorithm clusters the data points better than the other algorithms. As predicted in the theorems, the clustering performance improves

as the number of points increases. However, it also improves as the dimension of the subspaces grows, in contrast to the theoretical analysis. We believe that this is because our analysis of GSR is not tight. In Figure 2, we can see that more data points obtain correct neighbors as n increases or d decreases, which conforms to the theoretical analysis. We also compare the computational time of the neighborhood selection algorithms for different numbers of subspaces and data points. As shown in Figure 3, the greedy algorithms (OMP, thresholding, and NSN) are significantly more scalable than the convex optimization based algorithms (l1-minimization and nuclear norm minimization).

[Table 2: CE and computational time (mean CE %, median CE %, average time in seconds) of algorithms on the Hopkins155 dataset, for each number of clusters (motions) L. Algorithms: K-means, K-flats, SSC, LRR, SCC, SSC-OMP(8), TSC(10), NSN+Spectral. The numbers in the parentheses represent the number of neighbors for each point collected in the corresponding algorithms.]

[Table 3: CE and computational time (mean CE %, median CE %, average time in seconds) of algorithms on the Extended Yale B dataset. Algorithms: K-means, K-flats, SSC, SSC-OMP, TSC, NSN+Spectral. For each number of clusters (faces) L, the algorithms were run over 100 random subsets drawn from the overall 38 clusters.]

4.2 Real-world data: motion segmentation and face clustering

We compare our algorithm with the existing ones in the applications of motion segmentation and face clustering. For the motion segmentation, we use the Hopkins155 dataset [17], which contains 155 video sequences of 2 or 3 motions. For the face clustering, we use the Extended Yale B dataset with cropped images from [5, 13]. The dataset contains 64 images for each of 38 individuals in frontal view and different illumination conditions. To compare with the existing algorithms, we use the set of 48 x 42 resized raw images provided by the authors of [4]. The parameters of the existing algorithms were set as provided in their source codes.(10) Tables 2 and 3 show CE and average computational time.(11) We can see that NSN+Spectral performs competitively with the methods with the lowest errors, but much faster. Compared to the other greedy neighborhood construction based algorithms, SSC-OMP and TSC, our algorithm performs significantly better.

Acknowledgments

The authors would like to acknowledge NSF grants 1302435, 0954059, 1017525, and DTRA grant HDTRA for supporting this research. This research was also partially supported by the U.S. Department of Transportation through the Data-Supported Transportation Operations and Planning (D-STOP) Tier 1 University Transportation Center.

10: As SSC-OMP and TSC do not have proposed numbers of parameters for motion segmentation, we found the numbers minimizing the mean CE. The numbers are given in the table.
11: The LRR code provided by the author did not perform properly with the face clustering dataset that we use. We did not run NSN+GSR since the data points are not well distributed in their corresponding subspaces.

References

[1] P. S. Bradley and O. L. Mangasarian. k-plane clustering. Journal of Global Optimization, 16(1):23-32, 2000.
[2] G. Chen and G. Lerman. Spectral curvature clustering (SCC). International Journal of Computer Vision, 81(3), 2009.
[3] E. L. Dyer, A. C. Sankaranarayanan, and R. G. Baraniuk. Greedy feature selection for subspace clustering. The Journal of Machine Learning Research (JMLR), 14(1):2487-2517, 2013.
[4] E. Elhamifar and R. Vidal. Sparse subspace clustering: Algorithm, theory, and applications. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 35(11):2765-2781, 2013.
[5] A. S. Georghiades, P. N. Belhumeur, and D. J. Kriegman. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Trans. Pattern Anal. Mach. Intelligence, 23(6), 2001.
[6] R. Heckel and H. Bölcsei. Subspace clustering via thresholding and spectral clustering. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013.
[7] R. Heckel and H. Bölcsei. Robust subspace clustering via thresholding. arXiv preprint, 2014.
[8] J. Ho, M.-H. Yang, J. Lim, K.-C. Lee, and D. Kriegman. Clustering appearances of objects under varying illumination conditions. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2003.
[9] T. Inglot. Inequalities for quantiles of the chi-square distribution. Probability and Mathematical Statistics, 30:339-351, 2010.
[10] H.-P. Kriegel, P. Kröger, and A. Zimek. Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering. ACM Transactions on Knowledge Discovery from Data (TKDD), 3(1):1, 2009.
[11] S. Kunis and H. Rauhut. Random sampling of sparse trigonometric polynomials, II. Orthogonal matching pursuit versus basis pursuit. Foundations of Computational Mathematics, 8(6), 2008.
[12] M. Ledoux. The concentration of measure phenomenon, volume 89. AMS Bookstore, 2005.
[13] K. C. Lee, J. Ho, and D. Kriegman. Acquiring linear subspaces for face recognition under variable lighting. IEEE Trans. Pattern Anal. Mach. Intelligence, 27(5), 2005.
[14] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma. Robust recovery of subspace structures by low-rank representation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 35(1), 2013.
[15] V. D. Milman and G. Schechtman. Asymptotic Theory of Finite Dimensional Normed Spaces: Isoperimetric Inequalities in Riemannian Manifolds. Lecture Notes in Mathematics. Springer.
[16] M. Soltanolkotabi and E. J. Candès. A geometric analysis of subspace clustering with outliers. The Annals of Statistics, 40(4):2195-2238, 2012.
[17] R. Tron and R. Vidal. A benchmark for the comparison of 3-D motion segmentation algorithms. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007.
[18] J. A. Tropp and A. C. Gilbert. Signal recovery from random measurements via orthogonal matching pursuit. Information Theory, IEEE Transactions on, 53(12), 2007.
[19] P. Tseng. Nearest q-flat to m points. Journal of Optimization Theory and Applications, 105(1):249-252, 2000.
[20] R. Vidal. Subspace clustering. Signal Processing Magazine, IEEE, 28(2):52-68, 2011.
[21] R. Vidal, Y. Ma, and S. Sastry. Generalized principal component analysis (GPCA). In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2003.
[22] R. Vidal, S. Soatto, Y. Ma, and S. Sastry. An algebraic geometric approach to the identification of a class of linear hybrid systems. In Decision and Control, 2003. Proceedings of the 42nd IEEE Conference on, volume 1. IEEE, 2003.
[23] R. Vidal, R. Tron, and R. Hartley. Multiframe motion segmentation with missing data using power factorization and GPCA. International Journal of Computer Vision, 79(1):85-105, 2008.
[24] Y.-X. Wang, H. Xu, and C. Leng. Provable subspace clustering: When LRR meets SSC. In Advances in Neural Information Processing Systems (NIPS), December 2013.
[25] A. Y. Yang, S. R. Rao, and Y. Ma. Robust statistical estimation and segmentation of multiple subspaces. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
[26] T. Zhang, A. Szlam, Y. Wang, and G. Lerman. Hybrid linear modeling via local best-fit flats. International Journal of Computer Vision, 100(3):217-240, 2012.

A Discussion on implementation issues

A.1 A faster implementation for NSN

At each step of NSN, the algorithm computes the projections of all points onto a subspace and finds the one with the largest norm. A naive implementation of the algorithm requires O(pK^2 N) time complexity. In fact, we can reduce the complexity to O(pKN). Instead of finding the maximum norm of the projections, we can find the maximum squared norm of the projections. Let U_k be the subspace U at step k. For any data point y, we have

    ||Proj_{U_k} y||^2 = ||Proj_{U_{k-1}} y||^2 + (u_k^T y)^2,

where u_k is the new orthogonal axis added to U_{k-1} to make U_k. That is, u_k ⟂ U_{k-1} and U_k = U_{k-1} ⊕ span{u_k}. As ||Proj_{U_{k-1}} y||^2 is already computed at the (k-1)-th step, we do not need to compute it again at step k. Based on this fact, we have the faster implementation described in the following. Note that P_j at the k-th step is equal to ||Proj_{U_k} y_j||^2 in the original NSN algorithm.

Algorithm 3 Fast Nearest Subspace Neighbor (F-NSN)
Input: A set of N samples Y = {y_1, ..., y_N}, the number of required neighbors K, maximum subspace dimension k_max
Output: A neighborhood matrix W ∈ {0,1}^{N×N}
  y_i ← y_i / ||y_i||_2, for all i ∈ [N]
  for i = 1, ..., N do
    I_i ← {i}, u_1 ← y_i
    P_j ← 0, for all j ∈ [N]
    for k = 1, ..., K do
      if k ≤ k_max then
        P_j ← P_j + (u_k^T y_j)^2, for all j ∈ [N]
      end if
      j* ← arg max_{j ∈ [N], j ∉ I_i} P_j
      I_i ← I_i ∪ {j*}
      if k < k_max then
        u_{k+1} ← ( y_{j*} - Σ_{l=1}^{k} (u_l^T y_{j*}) u_l ) / || y_{j*} - Σ_{l=1}^{k} (u_l^T y_{j*}) u_l ||_2
      end if
    end for
    W_ij ← I{j ∈ I_i or P_j = 1}, for all j ∈ [N]
  end for

A.2 Estimation of the number of clusters

When L is unknown, it can be estimated at the clustering step. For spectral clustering, a well-known approach to estimating L is to find a knee point in the singular values of the neighborhood matrix, i.e., the point where the difference between two consecutive singular values is the largest. For GSR, we do not need to estimate the number of clusters a priori. Once the algorithm finishes, the number of resulting groups is our estimate of L.

A.3 Parameter setting

The choices of K and k_max depend on the dimension of the subspaces. If the data points are lying exactly on the model subspaces, K = k_max = d is enough for GSR to recover a subspace. In practical situations where the points are only near the subspaces, it is better to set K to be larger than d. k_max can also be larger than d, because the (k_max - d) additional dimensions, which may be induced from the noise, do not intersect with the other subspaces in practice. For the Extended Yale B dataset and the Hopkins155 dataset, we found that NSN+Spectral performs well if K is set to be around 2d.
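To make the O(pKN) bookkeeping of Section A.1 concrete, the following sketch (ours, not the authors' code; the function name is our own) maintains the squared projection norms P_j incrementally, adding only the contribution of the newly added orthonormal direction u_k at each step.

```python
# Sketch of the incremental projection update behind F-NSN (Algorithm 3).
import numpy as np

def f_nsn_neighbors(Y, i, K, k_max):
    """Collect K neighbor indices for column i of the (p, N) matrix Y (unit-norm columns)."""
    p, N = Y.shape
    idx = [i]
    u = Y[:, i].copy()                      # u_1: first orthonormal direction
    P = np.zeros(N)                         # P_j = ||Proj_{U_k} y_j||^2 accumulated so far
    U = []                                  # orthonormal directions collected so far
    for k in range(1, K + 1):
        if k <= k_max:
            U.append(u)
            P = P + (u @ Y) ** 2            # add contribution of the new direction only
        scores = P.copy()
        scores[idx] = -np.inf               # never re-pick a chosen point
        j_star = int(np.argmax(scores))
        idx.append(j_star)
        if k < k_max:
            # Gram-Schmidt: orthogonalize y_{j*} against the current directions
            r = Y[:, j_star] - sum((ul @ Y[:, j_star]) * ul for ul in U)
            u = r / np.linalg.norm(r)
    return idx[1:]                          # neighbors of point i (excluding i itself)
```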

B Proofs

B.1 Proof outline

We describe the first few high-level steps in the proofs of our main theorems. Exact clustering of our algorithms depends on whether NSN can find all correct neighbors for the data points, so that the subsequent algorithm (GSR or spectral clustering) can cluster the points exactly. For NSN+GSR, exact clustering is guaranteed when there is a point on each subspace that has all correct neighbors, of which there are at least d - 1. For NSN+Spectral, exact clustering is guaranteed when each data point has only the n - 1 other points on the same subspace as neighbors. In the following, we explain why these are true.

Step 1-1: Exact clustering condition for GSR. The two statistical models have the property that for any d-dimensional subspace in R^p other than the true subspaces D_1, ..., D_L, the probability of any points lying on that subspace is zero. Hence, we claim the following.

Fact 3 (Best d-dimensional fit): With probability one, the true subspaces D_1, ..., D_L are the L subspaces containing the most points among the set of all possible d-dimensional subspaces.

Then it suffices for each subspace to have one point whose d - 1 neighbors are all correct points on the same subspace. This is because the subspace spanned by those points is almost surely identical to the true subspace they are lying on, and that subspace will be picked by GSR.

Fact 4: If NSN with K ≥ d - 1 finds all correct neighbors for at least one point on each subspace, GSR recovers all the true subspaces and clusters the data points exactly with probability one.

In the following steps, we consider one data point for each subspace. We show that NSN with K = k_max = d finds all correct neighbors for the point with probability at least 1 - 3δ. Then the union bound and Fact 4 establish exact clustering with probability at least 1 - 3Lδ.

Step 1-2: Exact clustering condition for spectral clustering. It is difficult to analyze spectral clustering for the resulting neighborhood matrix of NSN. A trivial case in which a neighborhood matrix results in exact clustering is when the points on the same subspace form a single fully connected component. If NSN with K = k_max = d finds all correct neighbors for every data point, the subspace U at the last step k = K is almost surely identical to the true subspace that the points lie on. Hence, the resulting neighborhood matrix W forms L fully connected components, each of which contains all of the points on the same subspace.

In the rest of the proof, we show that if (1) holds, NSN finds all correct neighbors for a fixed point with probability at least 1 - 3δ. Let us assume that this is true. If (1) holds with C_1 and C_2 replaced by larger constants C'_1 and C'_2, we have

    n > C_1 d log(ne/(d · δ/n)),    d/p < C_2 (log n) / log(nL · (δ/n)^{-1}),

i.e., the conditions hold with δ replaced by δ/n. Then it follows from the union bound that NSN finds all correct neighbors for all of the n points on each subspace with probability at least 1 - 3Lδ, and hence we obtain W_ij = I{w_i = w_j} for every i, j ∈ [N]. Exact clustering is guaranteed.

Step 2: Success condition for NSN. Now the only proposition that we need to prove is that for each subspace D_i, NSN finds all correct neighbors for a data point (a uniformly random unit vector on the subspace) with probability at least 1 - 3δ. As our analysis is independent of the subspaces, we only consider D_1. Without loss of generality, we assume that y_1 lies on D_1 (w_1 = 1) and focus on this data point.

When NSN finds neighbors of y_1, the algorithm constructs k_max subspaces incrementally. At each step k = 1, ..., K, if the largest projection onto U_k among the uncollected points on the same true subspace D_1 is greater than the largest projection among the points on different subspaces, then NSN collects a correct neighbor. In a mathematical expression, we want to satisfy, for each step k = 1, ..., K,

    max_{j : w_j = 1, j ∉ I_1} ||Proj_{U_k} y_j|| > max_{j : w_j ≠ 1, j ∉ I_1} ||Proj_{U_k} y_j||.        (3)

The rest of the proof is to show that (1) and (2) lead to (3) with probability at least 1 - 3δ in their corresponding models. It is difficult to prove (3) directly because the subspaces, the data points, and the index set I_1 are all dependent on each other. Instead, we introduce an Oracle algorithm whose success is equivalent to the success of NSN, but whose analysis is easier. The Oracle algorithm is then analyzed using stochastic ordering, bounds on order statistics of random projections, and measure concentration inequalities for random subspaces. The rest of the proof is provided in Sections B.3 and B.4.

B.2 Preliminary lemmas

Before we step into the technical parts of the proof, we introduce the main ingredients which will be used. The following lemma gives upper and lower bounds on the order statistics of the projections of iid uniformly random unit vectors.

Lemma 5: Let x_1, ..., x_n be drawn iid uniformly at random from the unit sphere S^{d-1}. Let z_{(n-m+1)} denote the m-th largest value of {z_i ≜ ||A x_i||_2, 1 ≤ i ≤ n}, where A ∈ R^{k×d}.

a. Suppose that the rows of A are orthonormal to each other. For any α ∈ (0, 1), there exists a constant C > 0 such that for n, m, k, d ∈ N satisfying

    n - m + 1 ≥ C m log(ne/mδ),        (4)

we have

    z_{(n-m+1)}^2 > ( k + min{ log( (n - m + 1) / (C m log(ne/mδ)) ), α k } ) / d        (5)

with probability at least 1 - δ^m.

b. For any matrix A,

    z_{(n-m+1)} < ( ||A||_F + ||A||_2 ( sqrt(2π) + sqrt( 2 log(ne/mδ) ) ) ) / sqrt(d)        (6)

with probability at least 1 - δ^m.

Lemma 5(b) can be proved by using measure concentration inequalities [12]. Not only do these provide inequalities for random unit vectors, they also give us inequalities for random subspaces.

Lemma 6: Let the columns of X ∈ R^{d×k} be an orthonormal basis of a k-dimensional random subspace drawn uniformly at random in d-dimensional space.

a. For any matrix A ∈ R^{p×d},

    E[ ||AX||_F^2 ] = (k/d) ||A||_F^2.

b. [15, 12] If ||A||_2 ≤ 1, then we have

    Pr{ ||AX||_F > sqrt(k/d) ||A||_F + ||A||_2 ( sqrt(8π/(d-1)) + t ) } ≤ e^{-(d-1) t^2 / 8}.

B.3 Proof of Theorem 2

Following Section B.1, we show in this section that if (2) holds, then NSN finds all correct neighbors for y_1 (which is assumed to lie on D_1) with probability at least 1 - 3δ.

Step 3: NSN Oracle algorithm. Consider the Oracle algorithm described in the following. Unlike NSN, this algorithm knows the true label of each data point. It picks the point closest to the current subspace among the points with the same label. Since we assume w_1 = 1, the Oracle algorithm for y_1 picks a point in {y_j : w_j = 1} at every step.

Algorithm 4 NSN Oracle algorithm for y_1 (assuming w_1 = 1)
Input: A set of N samples Y = {y_1, ..., y_N}, the number of required neighbors K = d - 1, maximum subspace dimension k_max = 2 log d
  I_1^(1) ← {1}
  for k = 1, ..., K do
    if k ≤ k_max then
      V_k ← span{y_j : j ∈ I_1^(k)}
      j_k* ← arg max_{j ∈ [N] : w_j = 1, j ∉ I_1^(k)} ||Proj_{V_k} y_j||
    else
      j_k* ← arg max_{j ∈ [N] : w_j = 1, j ∉ I_1^(k)} ||Proj_{V_{k_max}} y_j||
    end if
    if max_{j ∈ [N] : w_j = 1, j ∉ I_1^(k)} ||Proj_{V_{min(k, k_max)}} y_j|| ≤ max_{j ∈ [N] : w_j ≠ 1} ||Proj_{V_{min(k, k_max)}} y_j|| then
      Return FAILURE
    end if
    I_1^(k+1) ← I_1^(k) ∪ {j_k*}
  end for
  Return SUCCESS

Note that the Oracle algorithm returns failure if and only if the original algorithm picks an incorrect neighbor for y_1. The reason is as follows. Suppose that NSN for y_1 picks its first incorrect point at step k. Up to step k - 1, correct points have been chosen because they are the nearest points to the subspaces in the corresponding steps. The Oracle algorithm will also pick those points because they are the nearest points among the correct points. Hence U_k = V_k. At step k, NSN picks an incorrect point as it is the closest to U_k. The Oracle algorithm will declare failure because that incorrect point is closer than the closest point among the correct points. In the same manner, we see that NSN fails if the Oracle NSN fails. Therefore, we can instead analyze the success of the Oracle algorithm. The success condition is written as

    ||Proj_{V_k} y_{j_k*}|| > max_{j ∈ [N] : w_j ≠ 1} ||Proj_{V_k} y_j||,    k = 1, ..., k_max,
    ||Proj_{V_{k_max}} y_{j_k*}|| > max_{j ∈ [N] : w_j ≠ 1} ||Proj_{V_{k_max}} y_j||,    k = k_max + 1, ..., K.        (7)

Note that the V_k's are independent of the points {y_j : j ∈ [N], w_j ≠ 1}. We will use this fact in the following steps.

Step 4: Lower bounds on the projection of correct points (the LHS of (7)). Let V_k ∈ R^{d×k} be such that the columns of D_1 V_k form an orthonormal basis of V_k. Such a V_k exists because V_k is a k-dimensional subspace of D_1. Then we have

    ||Proj_{V_k} y_j|| = ||V_k^T D_1^T D_1 x_j|| = ||V_k^T x_j||.

In this step, we obtain lower bounds on ||V_k^T x_j|| for k ≤ k_max and on ||V_{k_max}^T x_j|| for k > k_max.

It is difficult to analyze ||V_k^T x_j|| directly because V_k and x_j are dependent. We instead analyze another random variable that is stochastically dominated by ||V_k^T x_j||. We then use a high-probability lower bound on that variable, which also lower bounds ||V_k^T x_j|| with high probability.

Define P_{k,m} as the m-th largest norm of the projections of n - 1 iid uniformly random unit vectors on S^{d-1} onto a k-dimensional subspace. Since the distribution of a random unit vector is isotropic, the distribution of P_{k,m} is identical for any k-dimensional subspace independent of the random unit vectors. We have the following key lemma.

Lemma 7: ||V_k^T x_{j_k*}|| stochastically dominates P_{k,k}, i.e., Pr{ ||V_k^T x_{j_k*}|| ≤ t } ≤ Pr{ P_{k,k} ≤ t } for any t ≥ 0. Moreover, P_{k,k} stochastically dominates P_{k̂,k} for any k̂ ≤ k.

The proof of the lemma is provided in Appendix B.5. Now we can use the lower bound on P_{k,k} given in Lemma 5(a) to bound ||V_k^T x_{j_k*}||. Let us pick α and C for which the lemma holds. The first inequality of (2) with C_1 = C + 1 leads to n > C d log(ne/dδ), and also

    n - k + 1 > C k log(ne/kδ),    k = 1, ..., d - 1.        (8)

Hence, it follows from Lemma 5(a) that for each k = 1, ..., k_max, we have

    ||V_k^T x_{j_k*}||^2 > ( k + min{ log( (n - k + 1) / (C k log(ne/kδ)) ), α k } ) / d
                        ≥ ( k + min{ log( n / (C d log(ne/dδ)) ), α k } ) / d        (9)

with probability at least 1 - δ/d. For k > k_max, we want to bound ||Proj_{V_{k_max}} y_{j_k*}||. We again use Lemma 7 to obtain the bound. Since the condition for the lemma holds, as shown in (8), we have

    ||V_{k_max}^T x_{j_k*}||^2 > ( 2 log d + min{ log( (n - k + 1) / (C k log(ne/kδ)) ), 2α log d } ) / d
                              ≥ ( 2 log d + min{ log( n / (C d log(ne/dδ)) ), 2α log d } ) / d        (10)

with probability at least 1 - δ/d, for every k = k_max + 1, ..., d - 1. The union bound gives that (9) and (10) hold for all k = 1, ..., d - 1 simultaneously with probability at least 1 - δ.

Step 5: Upper bounds on the projection of incorrect points (the RHS of (7)). Since we have ||Proj_{V_k} y_j|| = ||V_k^T D_1^T D_{w_j} x_j||, the RHS of (7) can be written as

    max_{j ∈ [N], w_j ≠ 1} ||V_k^T D_1^T D_{w_j} x_j||.        (11)

In this step, we want to bound (11) for every k = 1, ..., d - 1 by using the concentration inequalities for V_k and x_j. Since V_k and x_j are independent, the inequality for x_j holds for any fixed V_k.


More information

Computing Exact Confidence Coefficients of Simultaneous Confidence Intervals for Multinomial Proportions and their Functions

Computing Exact Confidence Coefficients of Simultaneous Confidence Intervals for Multinomial Proportions and their Functions Working Paper 2013:5 Department of Statistics Computing Exact Confience Coefficients of Simultaneous Confience Intervals for Multinomial Proportions an their Functions Shaobo Jin Working Paper 2013:5

More information

Convergence of Random Walks

Convergence of Random Walks Chapter 16 Convergence of Ranom Walks This lecture examines the convergence of ranom walks to the Wiener process. This is very important both physically an statistically, an illustrates the utility of

More information

Optimization of Geometries by Energy Minimization

Optimization of Geometries by Energy Minimization Optimization of Geometries by Energy Minimization by Tracy P. Hamilton Department of Chemistry University of Alabama at Birmingham Birmingham, AL 3594-140 hamilton@uab.eu Copyright Tracy P. Hamilton, 1997.

More information

arxiv: v1 [cs.lg] 22 Mar 2014

arxiv: v1 [cs.lg] 22 Mar 2014 CUR lgorithm with Incomplete Matrix Observation Rong Jin an Shenghuo Zhu Dept. of Computer Science an Engineering, Michigan State University, rongjin@msu.eu NEC Laboratories merica, Inc., zsh@nec-labs.com

More information

Math Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors

Math Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors Math 18.02 Notes on ifferentials, the Chain Rule, graients, irectional erivative, an normal vectors Tangent plane an linear approximation We efine the partial erivatives of f( xy, ) as follows: f f( x+

More information

Collapsed Gibbs and Variational Methods for LDA. Example Collapsed MoG Sampling

Collapsed Gibbs and Variational Methods for LDA. Example Collapsed MoG Sampling Case Stuy : Document Retrieval Collapse Gibbs an Variational Methos for LDA Machine Learning/Statistics for Big Data CSE599C/STAT59, University of Washington Emily Fox 0 Emily Fox February 7 th, 0 Example

More information

Leaving Randomness to Nature: d-dimensional Product Codes through the lens of Generalized-LDPC codes

Leaving Randomness to Nature: d-dimensional Product Codes through the lens of Generalized-LDPC codes Leaving Ranomness to Nature: -Dimensional Prouct Coes through the lens of Generalize-LDPC coes Tavor Baharav, Kannan Ramchanran Dept. of Electrical Engineering an Computer Sciences, U.C. Berkeley {tavorb,

More information

LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION

LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION The Annals of Statistics 1997, Vol. 25, No. 6, 2313 2327 LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION By Eva Riccomagno, 1 Rainer Schwabe 2 an Henry P. Wynn 1 University of Warwick, Technische

More information

arxiv: v4 [cs.ds] 7 Mar 2014

arxiv: v4 [cs.ds] 7 Mar 2014 Analysis of Agglomerative Clustering Marcel R. Ackermann Johannes Blömer Daniel Kuntze Christian Sohler arxiv:101.697v [cs.ds] 7 Mar 01 Abstract The iameter k-clustering problem is the problem of partitioning

More information

Bohr Model of the Hydrogen Atom

Bohr Model of the Hydrogen Atom Class 2 page 1 Bohr Moel of the Hyrogen Atom The Bohr Moel of the hyrogen atom assumes that the atom consists of one electron orbiting a positively charge nucleus. Although it oes NOT o a goo job of escribing

More information

ensembles When working with density operators, we can use this connection to define a generalized Bloch vector: v x Tr x, v y Tr y

ensembles When working with density operators, we can use this connection to define a generalized Bloch vector: v x Tr x, v y Tr y Ph195a lecture notes, 1/3/01 Density operators for spin- 1 ensembles So far in our iscussion of spin- 1 systems, we have restricte our attention to the case of pure states an Hamiltonian evolution. Toay

More information

Math 1B, lecture 8: Integration by parts

Math 1B, lecture 8: Integration by parts Math B, lecture 8: Integration by parts Nathan Pflueger 23 September 2 Introuction Integration by parts, similarly to integration by substitution, reverses a well-known technique of ifferentiation an explores

More information

Subspace Estimation from Incomplete Observations: A High-Dimensional Analysis

Subspace Estimation from Incomplete Observations: A High-Dimensional Analysis Subspace Estimation from Incomplete Observations: A High-Dimensional Analysis Chuang Wang, Yonina C. Elar, Fellow, IEEE an Yue M. Lu, Senior Member, IEEE Abstract We present a high-imensional analysis

More information

arxiv: v2 [cs.ds] 11 May 2016

arxiv: v2 [cs.ds] 11 May 2016 Optimizing Star-Convex Functions Jasper C.H. Lee Paul Valiant arxiv:5.04466v2 [cs.ds] May 206 Department of Computer Science Brown University {jasperchlee,paul_valiant}@brown.eu May 3, 206 Abstract We

More information

Integration Review. May 11, 2013

Integration Review. May 11, 2013 Integration Review May 11, 2013 Goals: Review the funamental theorem of calculus. Review u-substitution. Review integration by parts. Do lots of integration eamples. 1 Funamental Theorem of Calculus In

More information

Homework 2 Solutions EM, Mixture Models, PCA, Dualitys

Homework 2 Solutions EM, Mixture Models, PCA, Dualitys Homewor Solutions EM, Mixture Moels, PCA, Dualitys CMU 0-75: Machine Learning Fall 05 http://www.cs.cmu.eu/~bapoczos/classes/ml075_05fall/ OUT: Oct 5, 05 DUE: Oct 9, 05, 0:0 AM An EM algorithm for a Mixture

More information

On combinatorial approaches to compressed sensing

On combinatorial approaches to compressed sensing On combinatorial approaches to compresse sensing Abolreza Abolhosseini Moghaam an Hayer Raha Department of Electrical an Computer Engineering, Michigan State University, East Lansing, MI, U.S. Emails:{abolhos,raha}@msu.eu

More information

Topic 7: Convergence of Random Variables

Topic 7: Convergence of Random Variables Topic 7: Convergence of Ranom Variables Course 003, 2016 Page 0 The Inference Problem So far, our starting point has been a given probability space (S, F, P). We now look at how to generate information

More information

FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction

FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS ALINA BUCUR, CHANTAL DAVID, BROOKE FEIGON, MATILDE LALÍN 1 Introuction In this note, we stuy the fluctuations in the number

More information

Entanglement is not very useful for estimating multiple phases

Entanglement is not very useful for estimating multiple phases PHYSICAL REVIEW A 70, 032310 (2004) Entanglement is not very useful for estimating multiple phases Manuel A. Ballester* Department of Mathematics, University of Utrecht, Box 80010, 3508 TA Utrecht, The

More information

A Sketch of Menshikov s Theorem

A Sketch of Menshikov s Theorem A Sketch of Menshikov s Theorem Thomas Bao March 14, 2010 Abstract Let Λ be an infinite, locally finite oriente multi-graph with C Λ finite an strongly connecte, an let p

More information

19 Eigenvalues, Eigenvectors, Ordinary Differential Equations, and Control

19 Eigenvalues, Eigenvectors, Ordinary Differential Equations, and Control 19 Eigenvalues, Eigenvectors, Orinary Differential Equations, an Control This section introuces eigenvalues an eigenvectors of a matrix, an iscusses the role of the eigenvalues in etermining the behavior

More information

Expected Value of Partial Perfect Information

Expected Value of Partial Perfect Information Expecte Value of Partial Perfect Information Mike Giles 1, Takashi Goa 2, Howar Thom 3 Wei Fang 1, Zhenru Wang 1 1 Mathematical Institute, University of Oxfor 2 School of Engineering, University of Tokyo

More information

Chapter 6: Energy-Momentum Tensors

Chapter 6: Energy-Momentum Tensors 49 Chapter 6: Energy-Momentum Tensors This chapter outlines the general theory of energy an momentum conservation in terms of energy-momentum tensors, then applies these ieas to the case of Bohm's moel.

More information

TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS. Yannick DEVILLE

TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS. Yannick DEVILLE TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS Yannick DEVILLE Université Paul Sabatier Laboratoire Acoustique, Métrologie, Instrumentation Bât. 3RB2, 8 Route e Narbonne,

More information

Algorithms and matching lower bounds for approximately-convex optimization

Algorithms and matching lower bounds for approximately-convex optimization Algorithms an matching lower bouns for approximately-convex optimization Yuanzhi Li Department of Computer Science Princeton University Princeton, NJ, 08450 yuanzhil@cs.princeton.eu Anrej Risteski Department

More information

Diophantine Approximations: Examining the Farey Process and its Method on Producing Best Approximations

Diophantine Approximations: Examining the Farey Process and its Method on Producing Best Approximations Diophantine Approximations: Examining the Farey Process an its Metho on Proucing Best Approximations Kelly Bowen Introuction When a person hears the phrase irrational number, one oes not think of anything

More information

Acute sets in Euclidean spaces

Acute sets in Euclidean spaces Acute sets in Eucliean spaces Viktor Harangi April, 011 Abstract A finite set H in R is calle an acute set if any angle etermine by three points of H is acute. We examine the maximal carinality α() of

More information

Influence of weight initialization on multilayer perceptron performance

Influence of weight initialization on multilayer perceptron performance Influence of weight initialization on multilayer perceptron performance M. Karouia (1,2) T. Denœux (1) R. Lengellé (1) (1) Université e Compiègne U.R.A. CNRS 817 Heuiasyc BP 649 - F-66 Compiègne ceex -

More information

Parameter estimation: A new approach to weighting a priori information

Parameter estimation: A new approach to weighting a priori information Parameter estimation: A new approach to weighting a priori information J.L. Mea Department of Mathematics, Boise State University, Boise, ID 83725-555 E-mail: jmea@boisestate.eu Abstract. We propose a

More information

On the minimum distance of elliptic curve codes

On the minimum distance of elliptic curve codes On the minimum istance of elliptic curve coes Jiyou Li Department of Mathematics Shanghai Jiao Tong University Shanghai PRChina Email: lijiyou@sjtueucn Daqing Wan Department of Mathematics University of

More information

Concentration of Measure Inequalities for Compressive Toeplitz Matrices with Applications to Detection and System Identification

Concentration of Measure Inequalities for Compressive Toeplitz Matrices with Applications to Detection and System Identification Concentration of Measure Inequalities for Compressive Toeplitz Matrices with Applications to Detection an System Ientification Borhan M Sananaji, Tyrone L Vincent, an Michael B Wakin Abstract In this paper,

More information

Qubit channels that achieve capacity with two states

Qubit channels that achieve capacity with two states Qubit channels that achieve capacity with two states Dominic W. Berry Department of Physics, The University of Queenslan, Brisbane, Queenslan 4072, Australia Receive 22 December 2004; publishe 22 March

More information

Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes

Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes Online ICA: Unerstaning Global Dynamics of Nonconvex Optimization via Diffusion Processes Chris Junchi Li Zhaoran Wang Han Liu Department of Operations Research an Financial Engineering, Princeton University

More information

CHAPTER 1 : DIFFERENTIABLE MANIFOLDS. 1.1 The definition of a differentiable manifold

CHAPTER 1 : DIFFERENTIABLE MANIFOLDS. 1.1 The definition of a differentiable manifold CHAPTER 1 : DIFFERENTIABLE MANIFOLDS 1.1 The efinition of a ifferentiable manifol Let M be a topological space. This means that we have a family Ω of open sets efine on M. These satisfy (1), M Ω (2) the

More information

On Characterizing the Delay-Performance of Wireless Scheduling Algorithms

On Characterizing the Delay-Performance of Wireless Scheduling Algorithms On Characterizing the Delay-Performance of Wireless Scheuling Algorithms Xiaojun Lin Center for Wireless Systems an Applications School of Electrical an Computer Engineering, Purue University West Lafayette,

More information

Sparse Reconstruction of Systems of Ordinary Differential Equations

Sparse Reconstruction of Systems of Ordinary Differential Equations Sparse Reconstruction of Systems of Orinary Differential Equations Manuel Mai a, Mark D. Shattuck b,c, Corey S. O Hern c,a,,e, a Department of Physics, Yale University, New Haven, Connecticut 06520, USA

More information

DIFFERENTIAL GEOMETRY, LECTURE 15, JULY 10

DIFFERENTIAL GEOMETRY, LECTURE 15, JULY 10 DIFFERENTIAL GEOMETRY, LECTURE 15, JULY 10 5. Levi-Civita connection From now on we are intereste in connections on the tangent bunle T X of a Riemanninam manifol (X, g). Out main result will be a construction

More information

Lower Bounds for Local Monotonicity Reconstruction from Transitive-Closure Spanners

Lower Bounds for Local Monotonicity Reconstruction from Transitive-Closure Spanners Lower Bouns for Local Monotonicity Reconstruction from Transitive-Closure Spanners Arnab Bhattacharyya Elena Grigorescu Mahav Jha Kyomin Jung Sofya Raskhonikova Davi P. Wooruff Abstract Given a irecte

More information

Research Article When Inflation Causes No Increase in Claim Amounts

Research Article When Inflation Causes No Increase in Claim Amounts Probability an Statistics Volume 2009, Article ID 943926, 10 pages oi:10.1155/2009/943926 Research Article When Inflation Causes No Increase in Claim Amounts Vytaras Brazauskas, 1 Bruce L. Jones, 2 an

More information

Sharp Thresholds. Zachary Hamaker. March 15, 2010

Sharp Thresholds. Zachary Hamaker. March 15, 2010 Sharp Threshols Zachary Hamaker March 15, 2010 Abstract The Kolmogorov Zero-One law states that for tail events on infinite-imensional probability spaces, the probability must be either zero or one. Behavior

More information

On the number of isolated eigenvalues of a pair of particles in a quantum wire

On the number of isolated eigenvalues of a pair of particles in a quantum wire On the number of isolate eigenvalues of a pair of particles in a quantum wire arxiv:1812.11804v1 [math-ph] 31 Dec 2018 Joachim Kerner 1 Department of Mathematics an Computer Science FernUniversität in

More information

Lower Bounds for the Smoothed Number of Pareto optimal Solutions

Lower Bounds for the Smoothed Number of Pareto optimal Solutions Lower Bouns for the Smoothe Number of Pareto optimal Solutions Tobias Brunsch an Heiko Röglin Department of Computer Science, University of Bonn, Germany brunsch@cs.uni-bonn.e, heiko@roeglin.org Abstract.

More information

Database-friendly Random Projections

Database-friendly Random Projections Database-frienly Ranom Projections Dimitris Achlioptas Microsoft ABSTRACT A classic result of Johnson an Linenstrauss asserts that any set of n points in -imensional Eucliean space can be embee into k-imensional

More information

WESD - Weighted Spectral Distance for Measuring Shape Dissimilarity

WESD - Weighted Spectral Distance for Measuring Shape Dissimilarity 1 WESD - Weighte Spectral Distance for Measuring Shape Dissimilarity Ener Konukoglu, Ben Glocker, Antonio Criminisi an Kilian M. Pohl Abstract This article presents a new istance for measuring shape issimilarity

More information

Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets

Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets Proceeings of the 4th East-European Conference on Avances in Databases an Information Systems ADBIS) 200 Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets Eleftherios Tiakas, Apostolos.

More information

arxiv: v4 [stat.ml] 21 Dec 2016

arxiv: v4 [stat.ml] 21 Dec 2016 Online Active Linear Regression via Thresholing Carlos Riquelme Stanfor University riel@stanfor.eu Ramesh Johari Stanfor University rjohari@stanfor.eu Baosen Zhang University of Washington zhangbao@uw.eu

More information

The total derivative. Chapter Lagrangian and Eulerian approaches

The total derivative. Chapter Lagrangian and Eulerian approaches Chapter 5 The total erivative 51 Lagrangian an Eulerian approaches The representation of a flui through scalar or vector fiels means that each physical quantity uner consieration is escribe as a function

More information

Lecture 6 : Dimensionality Reduction

Lecture 6 : Dimensionality Reduction CPS290: Algorithmic Founations of Data Science February 3, 207 Lecture 6 : Dimensionality Reuction Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will consier the roblem of maing

More information

Construction of Equiangular Signatures for Synchronous CDMA Systems

Construction of Equiangular Signatures for Synchronous CDMA Systems Construction of Equiangular Signatures for Synchronous CDMA Systems Robert W Heath Jr Dept of Elect an Comp Engr 1 University Station C0803 Austin, TX 78712-1084 USA rheath@eceutexaseu Joel A Tropp Inst

More information

A note on asymptotic formulae for one-dimensional network flow problems Carlos F. Daganzo and Karen R. Smilowitz

A note on asymptotic formulae for one-dimensional network flow problems Carlos F. Daganzo and Karen R. Smilowitz A note on asymptotic formulae for one-imensional network flow problems Carlos F. Daganzo an Karen R. Smilowitz (to appear in Annals of Operations Research) Abstract This note evelops asymptotic formulae

More information

Lectures - Week 10 Introduction to Ordinary Differential Equations (ODES) First Order Linear ODEs

Lectures - Week 10 Introduction to Ordinary Differential Equations (ODES) First Order Linear ODEs Lectures - Week 10 Introuction to Orinary Differential Equations (ODES) First Orer Linear ODEs When stuying ODEs we are consiering functions of one inepenent variable, e.g., f(x), where x is the inepenent

More information

Problem Sheet 2: Eigenvalues and eigenvectors and their use in solving linear ODEs

Problem Sheet 2: Eigenvalues and eigenvectors and their use in solving linear ODEs Problem Sheet 2: Eigenvalues an eigenvectors an their use in solving linear ODEs If you fin any typos/errors in this problem sheet please email jk28@icacuk The material in this problem sheet is not examinable

More information

II. First variation of functionals

II. First variation of functionals II. First variation of functionals The erivative of a function being zero is a necessary conition for the etremum of that function in orinary calculus. Let us now tackle the question of the equivalent

More information

Generalizing Kronecker Graphs in order to Model Searchable Networks

Generalizing Kronecker Graphs in order to Model Searchable Networks Generalizing Kronecker Graphs in orer to Moel Searchable Networks Elizabeth Boine, Babak Hassibi, Aam Wierman California Institute of Technology Pasaena, CA 925 Email: {eaboine, hassibi, aamw}@caltecheu

More information

Linear and quadratic approximation

Linear and quadratic approximation Linear an quaratic approximation November 11, 2013 Definition: Suppose f is a function that is ifferentiable on an interval I containing the point a. The linear approximation to f at a is the linear function

More information

2Algebraic ONLINE PAGE PROOFS. foundations

2Algebraic ONLINE PAGE PROOFS. foundations Algebraic founations. Kick off with CAS. Algebraic skills.3 Pascal s triangle an binomial expansions.4 The binomial theorem.5 Sets of real numbers.6 Surs.7 Review . Kick off with CAS Playing lotto Using

More information

CUSTOMER REVIEW FEATURE EXTRACTION Heng Ren, Jingye Wang, and Tony Wu

CUSTOMER REVIEW FEATURE EXTRACTION Heng Ren, Jingye Wang, and Tony Wu CUSTOMER REVIEW FEATURE EXTRACTION Heng Ren, Jingye Wang, an Tony Wu Abstract Popular proucts often have thousans of reviews that contain far too much information for customers to igest. Our goal for the

More information

A Novel Decoupled Iterative Method for Deep-Submicron MOSFET RF Circuit Simulation

A Novel Decoupled Iterative Method for Deep-Submicron MOSFET RF Circuit Simulation A Novel ecouple Iterative Metho for eep-submicron MOSFET RF Circuit Simulation CHUAN-SHENG WANG an YIMING LI epartment of Mathematics, National Tsing Hua University, National Nano evice Laboratories, an

More information

Homework 2 EM, Mixture Models, PCA, Dualitys

Homework 2 EM, Mixture Models, PCA, Dualitys Homework 2 EM, Mixture Moels, PCA, Dualitys CMU 10-715: Machine Learning (Fall 2015) http://www.cs.cmu.eu/~bapoczos/classes/ml10715_2015fall/ OUT: Oct 5, 2015 DUE: Oct 19, 2015, 10:20 AM Guielines The

More information

Logarithmic spurious regressions

Logarithmic spurious regressions Logarithmic spurious regressions Robert M. e Jong Michigan State University February 5, 22 Abstract Spurious regressions, i.e. regressions in which an integrate process is regresse on another integrate

More information

Spurious Significance of Treatment Effects in Overfitted Fixed Effect Models Albrecht Ritschl 1 LSE and CEPR. March 2009

Spurious Significance of Treatment Effects in Overfitted Fixed Effect Models Albrecht Ritschl 1 LSE and CEPR. March 2009 Spurious Significance of reatment Effects in Overfitte Fixe Effect Moels Albrecht Ritschl LSE an CEPR March 2009 Introuction Evaluating subsample means across groups an time perios is common in panel stuies

More information

Robust Low Rank Kernel Embeddings of Multivariate Distributions

Robust Low Rank Kernel Embeddings of Multivariate Distributions Robust Low Rank Kernel Embeings of Multivariate Distributions Le Song, Bo Dai College of Computing, Georgia Institute of Technology lsong@cc.gatech.eu, boai@gatech.eu Abstract Kernel embeing of istributions

More information

Equations of lines in

Equations of lines in Roberto s Notes on Linear Algebra Chapter 6: Lines, planes an other straight objects Section 1 Equations of lines in What ou nee to know alrea: The ot prouct. The corresponence between equations an graphs.

More information

Calculus of Variations

Calculus of Variations 16.323 Lecture 5 Calculus of Variations Calculus of Variations Most books cover this material well, but Kirk Chapter 4 oes a particularly nice job. x(t) x* x*+ αδx (1) x*- αδx (1) αδx (1) αδx (1) t f t

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 12 EFFICIENT LEARNING So far, our focus has been on moels of learning an basic algorithms for those moels. We have not place much emphasis on how to learn quickly.

More information