Text Mining Approaches for Surveillance
|
|
- Edmund Owen
- 5 years ago
- Views:
Transcription
1 Text Mining Approaches for Surveillance Document Space Workshop, IPAM/UCLA Michael W. Berry and Murray Browne Department of Computer Science, UTK January 23, / 61
2 Outline Enron Background Electronic Mail Surveillance Conclusions and References 2 / 61
3 Enron Background Collection Enron Background Collection Historical Events Electronic Mail Surveillance Conclusions and References 3 / 61
4 Enron Background Collection Collection By-product of the FERC investigation of Enron (originally contained 15 million messages). 4 / 61
5 Enron Background Collection Collection By-product of the FERC investigation of Enron (originally contained 15 million messages). This study used the improved corpus known as the Enron set, which was edited by Dr. William Cohen at CMU. 5 / 61
6 Enron Background Collection Collection By-product of the FERC investigation of Enron (originally contained 15 million messages). This study used the improved corpus known as the Enron set, which was edited by Dr. William Cohen at CMU. This set had over 500,000 messages. The majority were sent in the 1999 to 2001 timeframe. 6 / 61
7 Enron Background Historical Events Enron Background Collection Historical Events Electronic Mail Surveillance Conclusions and References 7 / 61
8 Enron Background Historical Events Enron Historical Ongoing, problematic, development of the Dabhol Power Company (DPC) in the Indian state of Maharashtra. 8 / 61
9 Enron Background Historical Events Enron Historical Ongoing, problematic, development of the Dabhol Power Company (DPC) in the Indian state of Maharashtra. Deregulation of the Calif. energy industry, which led to rolling electricity blackouts in the summer of 2000 (and subsequent investigations). 9 / 61
10 Enron Background Historical Events Enron Historical Ongoing, problematic, development of the Dabhol Power Company (DPC) in the Indian state of Maharashtra. Deregulation of the Calif. energy industry, which led to rolling electricity blackouts in the summer of 2000 (and subsequent investigations). Revelation of Enron s deceptive business and accounting practices that led to an abrupt collapse of the energy colossus in October, 2001; Enron filed for bankruptcy in December, / 61
11 Motivation Enron Background Motivation Underlying Optimization Problem MM Method (Lee and Seung) Enforcing Statistical Sparsity Hybrid NMF Approach Electronic Mail Surveillance Conclusions and References 11 / 61
12 Motivation NMF Origins NMF (Nonnegative Matrix Factorization) can be used to approximate high-dimensional data having nonnegative components. 12 / 61
13 Motivation NMF Origins NMF (Nonnegative Matrix Factorization) can be used to approximate high-dimensional data having nonnegative components. Lee and Seung (1999) demonstrated its use as a sum-by-parts representation of image data in order to both identify and classify image features. 13 / 61
14 Motivation NMF Origins NMF (Nonnegative Matrix Factorization) can be used to approximate high-dimensional data having nonnegative components. Lee and Seung (1999) demonstrated its use as a sum-by-parts representation of image data in order to both identify and classify image features. [Xu et al., 2003] demonstrated how NMF-based indexing could outperform SVD-based Latent Semantic Indexing (LSI) for some information retrieval tasks. 14 / 61
15 Motivation NMF for Image Processing A i SVD W Hi U Sparse NMF verses Dense SVD Bases; Lee and Seung (1999) Σ V i 15 / 61
16 Motivation NMF for Text Mining (Medlars) Highest Weighted Terms in Basis Vector W *1 Highest Weighted Terms in Basis Vector W *2 term 1 ventricular 2 aortic 3 septal 4 left 5 defect 6 regurgitation 7 ventricle 8 valve 9 cardiac 10 pressure weight Highest Weighted Terms in Basis Vector W *5 term 1 oxygen flow pressure blood cerebral hypothermia fluid venous arterial perfusion weight Highest Weighted Terms in Basis Vector W *6 term 1 children child autistic speech group early visual anxiety emotional autism weight term 1 kidney marrow dna cells nephrectomy unilateral lymphocytes bone thymidine rats weight Interpretable NMF feature vectors; Langville et al. (2006) 16 / 61
17 Underlying Optimization Problem Derivation Given an m n term-by-message (sparse) matrix X. 17 / 61
18 Underlying Optimization Problem Derivation Given an m n term-by-message (sparse) matrix X. Compute two reduced-dim. matrices W,H so that X WH; W is m r and H is r n, with r n. 18 / 61
19 Underlying Optimization Problem Derivation Given an m n term-by-message (sparse) matrix X. Compute two reduced-dim. matrices W,H so that X WH; W is m r and H is r n, with r n. Optimization problem: min X W,H WH 2 F, subject to W ij 0 and H ij 0, i, j. 19 / 61
20 Underlying Optimization Problem Derivation Given an m n term-by-message (sparse) matrix X. Compute two reduced-dim. matrices W,H so that X WH; W is m r and H is r n, with r n. Optimization problem: min X W,H WH 2 F, subject to W ij 0 and H ij 0, i, j. General approach: construct initial estimates for W and H and then improve them via alternating iterations. 20 / 61
21 MM Method (Lee and Seung) Multiplicative Method (MM) Multiplicative update rules for W and H (Lee and Seung, 1999): 1. Initialize W and H with non-negative values, and scale the columns of W to unit norm. 2. Iterate for each c, j, and i until convergence or after k iterations: 2.1 H cj H cj (W T X ) cj (W T WH) cj + ɛ 2.2 W ic W ic (XH T ) ic (WHH T ) ic + ɛ 2.3 Scale the columns of W to unit norm. 21 / 61
22 MM Method (Lee and Seung) Multiplicative Method (MM) Multiplicative update rules for W and H (Lee and Seung, 1999): 1. Initialize W and H with non-negative values, and scale the columns of W to unit norm. 2. Iterate for each c, j, and i until convergence or after k iterations: 2.1 H cj H cj (W T X ) cj (W T WH) cj + ɛ 2.2 W ic W ic (XH T ) ic (WHH T ) ic + ɛ 2.3 Scale the columns of W to unit norm. Setting ɛ = 10 9 will suffice [Shahnaz et al., 2006]. 22 / 61
23 MM Method (Lee and Seung) Normalization, Complexity, and Convergence Important to normalize X initially and the basis matrix W at each iteration. 23 / 61
24 MM Method (Lee and Seung) Normalization, Complexity, and Convergence Important to normalize X initially and the basis matrix W at each iteration. When optimizing on a unit hypersphere, the column (or feature) vectors of W, denoted by W k, are effectively mapped to the surface of the hypersphere by repeated normalization. 24 / 61
25 MM Method (Lee and Seung) Normalization, Complexity, and Convergence Important to normalize X initially and the basis matrix W at each iteration. When optimizing on a unit hypersphere, the column (or feature) vectors of W, denoted by W k, are effectively mapped to the surface of the hypersphere by repeated normalization. MM implementation of NMF requires O(rmn) operations per iteration; Lee and Seung (1999) proved that X WH 2 F is monotonically non-increasing with MM. 25 / 61
26 MM Method (Lee and Seung) Normalization, Complexity, and Convergence Important to normalize X initially and the basis matrix W at each iteration. When optimizing on a unit hypersphere, the column (or feature) vectors of W, denoted by W k, are effectively mapped to the surface of the hypersphere by repeated normalization. MM implementation of NMF requires O(rmn) operations per iteration; Lee and Seung (1999) proved that X WH 2 F is monotonically non-increasing with MM. From a nonlinear optimization perspective, MM/NMF can be considered a diagonally-scaled gradient descent method. 26 / 61
27 Enforcing Statistical Sparsity Hoyer s Method From neural network applications, Hoyer (2002) enforced statistical sparsity for the weight matrix H in order to enhance the parts-based data representations in the matrix W. 27 / 61
28 Enforcing Statistical Sparsity Hoyer s Method From neural network applications, Hoyer (2002) enforced statistical sparsity for the weight matrix H in order to enhance the parts-based data representations in the matrix W. Mu et al. (2003) suggested a regularization approach to achieve statistical sparsity in the matrix H: point count regularization; penalize the number of nonzeros in H rather than ij H ij. 28 / 61
29 Enforcing Statistical Sparsity Hoyer s Method From neural network applications, Hoyer (2002) enforced statistical sparsity for the weight matrix H in order to enhance the parts-based data representations in the matrix W. Mu et al. (2003) suggested a regularization approach to achieve statistical sparsity in the matrix H: point count regularization; penalize the number of nonzeros in H rather than ij H ij. Goal of increased sparsity better representation of parts or features spanned by the corpus (X ) [Shahnaz et al., 2006]. 29 / 61
30 Hybrid NMF Approach GD-CLS Hybrid Approach First use MM to compute an approximation to W for each iteration a gradient descent (GD) optimization step. 30 / 61
31 Hybrid NMF Approach GD-CLS Hybrid Approach First use MM to compute an approximation to W for each iteration a gradient descent (GD) optimization step. Then, compute the weight matrix H using a constrained least squares (CLS) model to penalize non-smoothness (i.e., non-sparsity) in H common Tikohonov regularization technique used in image processing (Prasad et al., 2003). 31 / 61
32 Hybrid NMF Approach GD-CLS Hybrid Approach First use MM to compute an approximation to W for each iteration a gradient descent (GD) optimization step. Then, compute the weight matrix H using a constrained least squares (CLS) model to penalize non-smoothness (i.e., non-sparsity) in H common Tikohonov regularization technique used in image processing (Prasad et al., 2003). Convergence to a non-stationary point evidenced (but no formal proof given to date). 32 / 61
33 Hybrid NMF Approach GD-CLS Algorithm 1. Initialize W and H with non-negative values, and scale the columns of W to unit norm. 33 / 61
34 Hybrid NMF Approach GD-CLS Algorithm 1. Initialize W and H with non-negative values, and scale the columns of W to unit norm. 2. Iterate until convergence or after k iterations: (XH 2.1 W ic W T ) ic ic (WHH T, for c and i ) ic + ɛ 2.2 Rescale the columns of W to unit norm. 2.3 Solve the constrained least squares problem: min H j { X j WH j λ H j 2 2}, where the subscript j denotes the j th column, for j = 1,..., m. 34 / 61
35 Hybrid NMF Approach GD-CLS Algorithm 1. Initialize W and H with non-negative values, and scale the columns of W to unit norm. 2. Iterate until convergence or after k iterations: (XH 2.1 W ic W T ) ic ic (WHH T, for c and i ) ic + ɛ 2.2 Rescale the columns of W to unit norm. 2.3 Solve the constrained least squares problem: min H j { X j WH j λ H j 2 2}, where the subscript j denotes the j th column, for j = 1,..., m. Any negative values in H j are set to zero. The parameter λ is a regularization value that is used to balance the reduction of the metric X j WH j 2 2 with enforcement of smoothness and sparsity in H [Shahnaz et al., 2006]. 35 / 61
36 Electronic Mail Surveillance Message Parsing Enron Background Electronic Mail Surveillance Message Parsing Term Weighting GD-CLS Benchmarks Clustering and Topic Extraction Topic Tracking (Through Time) Smoothing Effects Comparison Conclusions and References 36 / 61
37 Electronic Mail Surveillance Message Parsing inbox Collection Parsed inbox folder of all 150 accounts (users) via GTP (General Text Parser); 495-term stoplist used and extracted terms must appear in more than 1 and more than once globally. 37 / 61
38 Electronic Mail Surveillance Message Parsing private Collection Parsed all mail directories (of all 150 accounts) with the exception of all documents, calendar, contacts, deleted items, discussion threads, inbox, notes inbox, sent, sent items, and sent mail; 495-term stoplist used and extracted terms must appear in more than 1 and more than once globally. 38 / 61
39 Electronic Mail Surveillance Message Parsing private Collection Parsed all mail directories (of all 150 accounts) with the exception of all documents, calendar, contacts, deleted items, discussion threads, inbox, notes inbox, sent, sent items, and sent mail; 495-term stoplist used and extracted terms must appear in more than 1 and more than once globally. Distribution of messages sent in the year 2001: Month Msgs Terms Month Msgs Terms Jan 3,621 17,888 Jul 3,077 17,617 Feb 2,804 16,958 Aug 2,828 16,417 Mar 3,525 20,305 Sep 2,330 15,405 Apr 4,273 24,010 Oct 2,821 20,995 May 4,261 24,335 Nov 2,204 18,693 Jun 4,324 18,599 Dec 1,489 8, / 61
40 Electronic Mail Surveillance Term Weighting Term Weighting Schemes For m n term-by-message matrix X = [x ij ], define x ij = l ij g i d j, where l ij is the local weight for term i occurring in message j, g i is the global weight for term i in the subcollection, and d j is a document normalization factor (set d j = 1). 40 / 61
41 Electronic Mail Surveillance Term Weighting Term Weighting Schemes For m n term-by-message matrix X = [x ij ], define x ij = l ij g i d j, where l ij is the local weight for term i occurring in message j, g i is the global weight for term i in the subcollection, and d j is a document normalization factor (set d j = 1). Schemes used in parsing inbox and private subcollections: Name Local Global txx Term Frequency None l ij = f ij g i = 1 lex Logarithmic Entropy (Define: p ij = f ij / j f ij) l ij = log(1 + f ij ) g i = 1+ ( j p ijlog(p ij ))/ log n) 41 / 61
42 Electronic Mail Surveillance GD-CLS Benchmarks Computational Complexity Rank-50 NMF (X WH) computed on a 450MHz (Dual) UltraSPARC-II processor using 100 iterations: Mail Dictionary Time Collection Messages Terms λ (sec.) inbox 44, , , , , 521 private 65, , , , , / 61
43 Electronic Mail Surveillance Clustering and Topic Extraction private with Log-Entropy Weighting Identify rows of H from X WH or H k with λ = 0.1; r = 50 feature vectors (W k ) generated by GD-CLS: Feature Cluster Topic Dominant Index (k) Size Description Terms California ca, cpuc, gov, socalgas, sempra, org, sce, gmssr, aelaw, ci Louise Kitchen evp, fortune, named top britain, woman, woman by ceo, avon, fiorina, Fortune cfo, hewlett, packard Fantasy game, wr, qb, play, football rb, season, injury, updated, fantasy, image (Cluster size no. of H k elements > row max/10) 43 / 61
44 Electronic Mail Surveillance Clustering and Topic Extraction private with Log-Entropy Weighting Additional topic clusters of significant size: Feature Cluster Topic Dominant Index (k) Size Description Terms Texas UT, orange, longhorn longhorn[s], texas, football true, truorange, newsletter recruiting, oklahoma, defensive Enron partnership[s], collapse fastow, shares, sec, stock, shareholder, investors, equity, lay s dabhol, dpc, india, about India mseb, maharashtra, indian, lenders, delhi, foreign, minister 44 / 61
45 Electronic Mail Surveillance Topic Tracking (Through Time) 2001 Topics Tracked by GD-CLS r = 50 features, lex term weighting, λ = / 61
46 Electronic Mail Surveillance Smoothing Effects Comparison Two Penalty Term Formulation Introduce smoothing on W k (feature vectors) in addition to H k : min W,H { X WH 2 F + α W 2 F + β H 2 F }, where F is the Frobenius norm. 46 / 61
47 Electronic Mail Surveillance Smoothing Effects Comparison Two Penalty Term Formulation Introduce smoothing on W k (feature vectors) in addition to H k : min W,H { X WH 2 F + α W 2 F + β H 2 F }, where F is the Frobenius norm. Constrained NMF (CNMF) iteration [Piper et al., 2004]: H cj H cj (W T X ) cj βh cj (W T WH) cj + ɛ W ic W ic (XH T ) ic αw ic (WHH T ) ic + ɛ 47 / 61
48 Electronic Mail Surveillance Smoothing Effects Comparison Term Distribution in Feature Vectors Terms Wt Lambda Alpha Topics Blackouts Cal Stocks Collapse UT Texasfoot Chronicle Indian India Fastow Collapse Gas CFO Kitchen Californians Cal Solar Partnerships Collapse Workers Maharashtra India Mseb India Beach Ljm Collapse Tues IPPS Cal Rebates Ljm Collapse 48 / 61
49 Electronic Mail Surveillance Smoothing Effects Comparison British National Corpus (BNC) Noun Conservation In collaboration with P. Keila and D. Skillicorn (Queens Univ.) 289,695 subset (all mail folders - not just private) Smoothing solely applied to NMF W matrix (α = 0.001, 0.01, 0.1, 0.25, 0.50, 0.75, 1.00 with β = 0) Log-entropy term weighting applied to the term-by-message matrix X Monitor top ten nouns for each feature vector (ranked by descending component values) and extract those appearing in two or more features; topics assigned manually. 49 / 61
50 Electronic Mail Surveillance Smoothing Effects Comparison BNC Noun Distribution in Feature Vectors Noun GF Entropy Alpha Topic Waxman Downfall Lieberman Downfall Scandal Downfall Nominee(s) Barone Downfall MEADE Downfall Fichera California blackout Prabhu India-strong Tata India-weak Rupee(s) India-strong Soybean(s) Rushing Football - college Dlrs Janus India-weak BSES India-weak Caracas Escondido California/Blackout Promoters Energy/Scottish Aramco India-weak DOORMAN Bawdy/Real Estate 50 / 61
51 Conclusions and References Enron Background Electronic Mail Surveillance Conclusions and References Conclusions Future Work References 51 / 61
52 Conclusions and References Conclusions Conclusions GD-CLS Algorithm can effectively produce a parts-based approximation X WH of a sparse term-by-message matrix X. 52 / 61
53 Conclusions and References Conclusions Conclusions GD-CLS Algorithm can effectively produce a parts-based approximation X WH of a sparse term-by-message matrix X. Smoothing on the features matrix (W ) as opposed to the weight matrix H forces more reuse of higher weighted terms. 53 / 61
54 Conclusions and References Conclusions Conclusions GD-CLS Algorithm can effectively produce a parts-based approximation X WH of a sparse term-by-message matrix X. Smoothing on the features matrix (W ) as opposed to the weight matrix H forces more reuse of higher weighted terms. Surveillance systems based on GD-CLS could be used to monitor discussions without the need to isolate or perhaps incriminate individual employees. 54 / 61
55 Conclusions and References Conclusions Conclusions GD-CLS Algorithm can effectively produce a parts-based approximation X WH of a sparse term-by-message matrix X. Smoothing on the features matrix (W ) as opposed to the weight matrix H forces more reuse of higher weighted terms. Surveillance systems based on GD-CLS could be used to monitor discussions without the need to isolate or perhaps incriminate individual employees. Potential applications include the monitoring/tracking of company morale, employee feedback to policy decisions, and extracurricular activities 55 / 61
56 Conclusions and References Future Work Future Work Further work needed in determining effects of alternative term weighting schemes (for X ) and choices of control parameters (α, β, λ) on quality of the basis vectors W k. 56 / 61
57 Conclusions and References Future Work Future Work Further work needed in determining effects of alternative term weighting schemes (for X ) and choices of control parameters (α, β, λ) on quality of the basis vectors W k. How does document (or message) clustering change with different column ranks (r) in the matrix W? 57 / 61
58 Conclusions and References Future Work Future Work Further work needed in determining effects of alternative term weighting schemes (for X ) and choices of control parameters (α, β, λ) on quality of the basis vectors W k. How does document (or message) clustering change with different column ranks (r) in the matrix W? Use MailMiner and similar text mining software to produce a topic annotated Enron subset for the public domain. 58 / 61
59 Conclusions and References Future Work Future Work Further work needed in determining effects of alternative term weighting schemes (for X ) and choices of control parameters (α, β, λ) on quality of the basis vectors W k. How does document (or message) clustering change with different column ranks (r) in the matrix W? Use MailMiner and similar text mining software to produce a topic annotated Enron subset for the public domain. Explore use of NMF for automated gene classification; Semantic Gene Organizer (K. Heinrich, PhD Thesis 2006) 59 / 61
60 Conclusions and References References For Further Reading F. Shahnaz, M.W. Berry, V.P. Pauca, and R.J. Plemmons. Document Clustering Using Nonnegative Matrix Factorization. Information Processing & Management 42(2), 2006, pp J. Piper, P. Pauca, R. Plemmons, and M. Giffin. Object Characterization from Spectral Data using Independent Component Analysis and Information Theory. Proc. AMOS Technical Conference, Maui, HI, September W. Xu, X. Liu, and Y. Gong. Document-Clustering based on Non-Negative Matrix Factorization. Proceedings of SIGIR 03, July 28 - August 1, Toronto, CA, pp , / 61
61 Conclusions and References References SMD06 Text Mining Workshop Hyatt Regency Bethesda Bethesda, Maryland April 22, 2006 to be held in conjunction with Sixth SIAM International Conference on Data Mining (SDM 2006) and also in conjunction with SIAM's Link Analysis, Counterterrorism Security Workshop, which is also being held on April 22, Website: 61 / 61
Collaborators. NMF Origins
Collaborators Using Nonnegative Matrix and Tensor Factorizations for Email Surveillance Semi-Plenary Session IX, EURO XXII, Prague, CR Michael W. Berry, Murray Browne, and Brett W. Bader University of
More informationSurveillance Using Nonnegative Matrix Factorization
Email Surveillance Using Nonnegative Matrix Factorization Michael W. Berry Murray Browne February 25, 2005 Abstract In this study, we apply a non-negative matrix factorization approach for the extraction
More informationNonnegative Matrix Factorization. Data Mining
The Nonnegative Matrix Factorization in Data Mining Amy Langville langvillea@cofc.edu Mathematics Department College of Charleston Charleston, SC Yahoo! Research 10/18/2005 Outline Part 1: Historical Developments
More informationCollaborators. NNMF Origins
Collaborators Using Nonnegative Matrix and Tensor Factorizations for Topic and Scenario Detection and Tracking Math and Stat Colloquim, Utah State University Michael W. Berry Department of Electrical Engineering
More informationAlgorithms and Applications for Approximate Nonnegative Matrix Factorization
Algorithms and Applications for Approximate Nonnegative Matrix Factorization Michael W. Berry and Murray Browne Department of Computer Science, University of Tennessee, Knoxville, TN 37996-3450 Amy N.
More informationAlgorithms and applications for approximate nonnegative matrix factorization
Computational Statistics & Data Analysis 52 (2007) 155 173 www.elsevier.com/locate/csda Algorithms and applications for approximate nonnegative matrix factorization Michael W. Berry a,, Murray Browne a,
More informationGraph Inference with Imperfect Edge Classifiers
Graph Inference with Imperfect Edge Classifiers Michael W. Trosset Department of Statistics Indiana University Joint work with David Brinda (Yale University) and Shantanu Jain (Indiana University), supported
More informationClustering. SVD and NMF
Clustering with the SVD and NMF Amy Langville Mathematics Department College of Charleston Dagstuhl 2/14/2007 Outline Fielder Method Extended Fielder Method and SVD Clustering with SVD vs. NMF Demos with
More informationMULTIPLICATIVE ALGORITHM FOR CORRENTROPY-BASED NONNEGATIVE MATRIX FACTORIZATION
MULTIPLICATIVE ALGORITHM FOR CORRENTROPY-BASED NONNEGATIVE MATRIX FACTORIZATION Ehsan Hosseini Asl 1, Jacek M. Zurada 1,2 1 Department of Electrical and Computer Engineering University of Louisville, Louisville,
More informationPreserving Privacy in Data Mining using Data Distortion Approach
Preserving Privacy in Data Mining using Data Distortion Approach Mrs. Prachi Karandikar #, Prof. Sachin Deshpande * # M.E. Comp,VIT, Wadala, University of Mumbai * VIT Wadala,University of Mumbai 1. prachiv21@yahoo.co.in
More informationObject Characterization from Spectral Data Using Nonnegative Factorization and Information Theory
Object Characterization from Spectral Data Using Nonnegative Factorization and Information Theory J. Piper V. Paul Pauca Robert J. Plemmons Maile Giffin Abstract The identification and classification of
More informationOrthogonal Nonnegative Matrix Factorization: Multiplicative Updates on Stiefel Manifolds
Orthogonal Nonnegative Matrix Factorization: Multiplicative Updates on Stiefel Manifolds Jiho Yoo and Seungjin Choi Department of Computer Science Pohang University of Science and Technology San 31 Hyoja-dong,
More informationCONVOLUTIVE NON-NEGATIVE MATRIX FACTORISATION WITH SPARSENESS CONSTRAINT
CONOLUTIE NON-NEGATIE MATRIX FACTORISATION WITH SPARSENESS CONSTRAINT Paul D. O Grady Barak A. Pearlmutter Hamilton Institute National University of Ireland, Maynooth Co. Kildare, Ireland. ABSTRACT Discovering
More informationNon-Negative Matrix Factorization
Chapter 3 Non-Negative Matrix Factorization Part 2: Variations & applications Geometry of NMF 2 Geometry of NMF 1 NMF factors.9.8 Data points.7.6 Convex cone.5.4 Projections.3.2.1 1.5 1.5 1.5.5 1 3 Sparsity
More informationPerformance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization
Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization Tabitha Samuel, Master s Candidate Dr. Michael W. Berry, Major Professor Abstract: Increasingly
More informationUsing Matrix Decompositions in Formal Concept Analysis
Using Matrix Decompositions in Formal Concept Analysis Vaclav Snasel 1, Petr Gajdos 1, Hussam M. Dahwa Abdulla 1, Martin Polovincak 1 1 Dept. of Computer Science, Faculty of Electrical Engineering and
More information2019 Settlement Calendar for ASX Cash Market Products. ASX Settlement
2019 Settlement Calendar for ASX Cash Market Products ASX Settlement Settlement Calendar for ASX Cash Market Products 1 ASX Settlement Pty Limited (ASX Settlement) operates a trade date plus two Business
More informationNon-Negative Matrix Factorization with Quasi-Newton Optimization
Non-Negative Matrix Factorization with Quasi-Newton Optimization Rafal ZDUNEK, Andrzej CICHOCKI Laboratory for Advanced Brain Signal Processing BSI, RIKEN, Wako-shi, JAPAN Abstract. Non-negative matrix
More informationGAMINGRE 8/1/ of 7
FYE 09/30/92 JULY 92 0.00 254,550.00 0.00 0 0 0 0 0 0 0 0 0 254,550.00 0.00 0.00 0.00 0.00 254,550.00 AUG 10,616,710.31 5,299.95 845,656.83 84,565.68 61,084.86 23,480.82 339,734.73 135,893.89 67,946.95
More informationCOMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017
COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University TOPIC MODELING MODELS FOR TEXT DATA
More informationTracking Accuracy: An Essential Step to Improve Your Forecasting Process
Tracking Accuracy: An Essential Step to Improve Your Forecasting Process Presented by Eric Stellwagen President & Co-founder Business Forecast Systems, Inc. estellwagen@forecastpro.com Business Forecast
More informationPublished by ASX Settlement Pty Limited A.B.N Settlement Calendar for ASX Cash Market Products
Published by Pty Limited A.B.N. 49 008 504 532 2012 Calendar for Cash Market Products Calendar for Cash Market Products¹ Pty Limited ( ) operates a trade date plus three Business (T+3) settlement discipline
More informationA Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views Presenter: Yao Zhou joint work with: Jingrui He - 1 - Roadmap Motivation Proposed framework: M2VW Experimental results Conclusion
More informationLecture 5: Web Searching using the SVD
Lecture 5: Web Searching using the SVD Information Retrieval Over the last 2 years the number of internet users has grown exponentially with time; see Figure. Trying to extract information from this exponentially
More information9 Searching the Internet with the SVD
9 Searching the Internet with the SVD 9.1 Information retrieval Over the last 20 years the number of internet users has grown exponentially with time; see Figure 1. Trying to extract information from this
More information13 Searching the Web with the SVD
13 Searching the Web with the SVD 13.1 Information retrieval Over the last 20 years the number of internet users has grown exponentially with time; see Figure 1. Trying to extract information from this
More informationP7.7 A CLIMATOLOGICAL STUDY OF CLOUD TO GROUND LIGHTNING STRIKES IN THE VICINITY OF KENNEDY SPACE CENTER, FLORIDA
P7.7 A CLIMATOLOGICAL STUDY OF CLOUD TO GROUND LIGHTNING STRIKES IN THE VICINITY OF KENNEDY SPACE CENTER, FLORIDA K. Lee Burns* Raytheon, Huntsville, Alabama Ryan K. Decker NASA, Marshall Space Flight
More informationInteraction on spectral data with: Kira Abercromby (NASA-Houston) Related Papers at:
Low-Rank Nonnegative Factorizations for Spectral Imaging Applications Bob Plemmons Wake Forest University Collaborators: Christos Boutsidis (U. Patras), Misha Kilmer (Tufts), Peter Zhang, Paul Pauca, (WFU)
More informationDetect and Track Latent Factors with Online Nonnegative Matrix Factorization
Detect and Track Latent Factors with Online Nonnegative Matrix Factorization Bin Cao 1, Dou Shen 2, Jian-Tao Sun 3, Xuanhui Wang 4, Qiang Yang 2 and Zheng Chen 3 1 Peking university, Beijing, China 2 Hong
More informationIntroduction to Course
.. Introduction to Course Oran Kittithreerapronchai 1 1 Department of Industrial Engineering, Chulalongkorn University Bangkok 10330 THAILAND last updated: September 17, 2016 COMP METH v2.00: intro 1/
More information2017 Settlement Calendar for ASX Cash Market Products ASX SETTLEMENT
2017 Settlement Calendar for ASX Cash Market Products ASX SETTLEMENT Settlement Calendar for ASX Cash Market Products 1 ASX Settlement Pty Limited (ASX Settlement) operates a trade date plus two Business
More informationarxiv: v2 [math.oc] 6 Oct 2011
Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization Nicolas Gillis 1 and François Glineur 2 arxiv:1107.5194v2 [math.oc] 6 Oct 2011 Abstract Nonnegative
More informationNon-negative matrix factorization and term structure of interest rates
Non-negative matrix factorization and term structure of interest rates Hellinton H. Takada and Julio M. Stern Citation: AIP Conference Proceedings 1641, 369 (2015); doi: 10.1063/1.4906000 View online:
More informationPerformance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization
Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization Tabitha Samuel, Master s Candidate Dr. Michael W. Berry, Major Professor What is the Parallel Computing
More informationDAILY QUESTIONS 28 TH JUNE 18 REASONING - CALENDAR
DAILY QUESTIONS 28 TH JUNE 18 REASONING - CALENDAR LEAP AND NON-LEAP YEAR *A non-leap year has 365 days whereas a leap year has 366 days. (as February has 29 days). *Every year which is divisible by 4
More informationNon-negative matrix factorization with fixed row and column sums
Available online at www.sciencedirect.com Linear Algebra and its Applications 9 (8) 5 www.elsevier.com/locate/laa Non-negative matrix factorization with fixed row and column sums Ngoc-Diep Ho, Paul Van
More informationData Mining and Matrices
Data Mining and Matrices 6 Non-Negative Matrix Factorization Rainer Gemulla, Pauli Miettinen May 23, 23 Non-Negative Datasets Some datasets are intrinsically non-negative: Counters (e.g., no. occurrences
More informationNon-Negative Matrix Factorization
Chapter 3 Non-Negative Matrix Factorization Part : Introduction & computation Motivating NMF Skillicorn chapter 8; Berry et al. (27) DMM, summer 27 2 Reminder A T U Σ V T T Σ, V U 2 Σ 2,2 V 2.8.6.6.3.6.5.3.6.3.6.4.3.6.4.3.3.4.5.3.5.8.3.8.3.3.5
More informationNote on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing
Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing 1 Zhong-Yuan Zhang, 2 Chris Ding, 3 Jie Tang *1, Corresponding Author School of Statistics,
More informationDOZENALS. A project promoting base 12 counting and measuring. Ideas and designs by DSA member (#342) and board member, Timothy F. Travis.
R AENBO DOZENALS A project promoting base 12 counting and measuring. Ideas and designs by DSA member (#342) and board member Timothy F. Travis. I became aware as a teenager of base twelve numbering from
More informationDimension Reduction and Iterative Consensus Clustering
Dimension Reduction and Iterative Consensus Clustering Southeastern Clustering and Ranking Workshop August 24, 2009 Dimension Reduction and Iterative 1 Document Clustering Geometry of the SVD Centered
More informationFour Basic Steps for Creating an Effective Demand Forecasting Process
Four Basic Steps for Creating an Effective Demand Forecasting Process Presented by Eric Stellwagen President & Cofounder Business Forecast Systems, Inc. estellwagen@forecastpro.com Business Forecast Systems,
More informationSTATISTICAL FORECASTING and SEASONALITY (M. E. Ippolito; )
STATISTICAL FORECASTING and SEASONALITY (M. E. Ippolito; 10-6-13) PART I OVERVIEW The following discussion expands upon exponential smoothing and seasonality as presented in Chapter 11, Forecasting, in
More informationNonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, Chi-square Statistic, and a Hybrid Method
Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, hi-square Statistic, and a Hybrid Method hris Ding a, ao Li b and Wei Peng b a Lawrence Berkeley National Laboratory,
More informationFast Coordinate Descent methods for Non-Negative Matrix Factorization
Fast Coordinate Descent methods for Non-Negative Matrix Factorization Inderjit S. Dhillon University of Texas at Austin SIAM Conference on Applied Linear Algebra Valencia, Spain June 19, 2012 Joint work
More informationDetermine the trend for time series data
Extra Online Questions Determine the trend for time series data Covers AS 90641 (Statistics and Modelling 3.1) Scholarship Statistics and Modelling Chapter 1 Essent ial exam notes Time series 1. The value
More information2018 Annual Review of Availability Assessment Hours
2018 Annual Review of Availability Assessment Hours Amber Motley Manager, Short Term Forecasting Clyde Loutan Principal, Renewable Energy Integration Karl Meeusen Senior Advisor, Infrastructure & Regulatory
More informationNonnegative Matrix Factorization
Nonnegative Matrix Factorization Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationEBS IT Meeting July 2016
EBS IT Meeting 18 19 July 2016 Conference Call Details Conference call: UK Numbers Tel: 0808 238 9819 or Tel: 0207 950 1251 Participant code: 4834 7876... Join online meeting https://meet.nationalgrid.com/antonio.delcastillozas/hq507d31
More information1. Ignoring case, extract all unique words from the entire set of documents.
CS 378 Introduction to Data Mining Spring 29 Lecture 2 Lecturer: Inderjit Dhillon Date: Jan. 27th, 29 Keywords: Vector space model, Latent Semantic Indexing(LSI), SVD 1 Vector Space Model The basic idea
More informationExamination of Initialization Techniques for Nonnegative Matrix Factorization
Georgia State University ScholarWorks @ Georgia State University Mathematics Theses Department of Mathematics and Statistics 11-21-2008 Examination of Initialization Techniques for Nonnegative Matrix Factorization
More informationUSING SINGULAR VALUE DECOMPOSITION (SVD) AS A SOLUTION FOR SEARCH RESULT CLUSTERING
POZNAN UNIVE RSIY OF E CHNOLOGY ACADE MIC JOURNALS No. 80 Electrical Engineering 2014 Hussam D. ABDULLA* Abdella S. ABDELRAHMAN* Vaclav SNASEL* USING SINGULAR VALUE DECOMPOSIION (SVD) AS A SOLUION FOR
More informationFactor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE Artificial Intelligence Grad Project Dr. Debasis Mitra
Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE 5290 - Artificial Intelligence Grad Project Dr. Debasis Mitra Group 6 Taher Patanwala Zubin Kadva Factor Analysis (FA) 1. Introduction Factor
More informationWHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and Rainfall For Selected Arizona Cities
WHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and 2001-2002 Rainfall For Selected Arizona Cities Phoenix Tucson Flagstaff Avg. 2001-2002 Avg. 2001-2002 Avg. 2001-2002 October 0.7 0.0
More informationGroup Sparse Non-negative Matrix Factorization for Multi-Manifold Learning
LIU, LU, GU: GROUP SPARSE NMF FOR MULTI-MANIFOLD LEARNING 1 Group Sparse Non-negative Matrix Factorization for Multi-Manifold Learning Xiangyang Liu 1,2 liuxy@sjtu.edu.cn Hongtao Lu 1 htlu@sjtu.edu.cn
More informationMachine Learning. Principal Components Analysis. Le Song. CSE6740/CS7641/ISYE6740, Fall 2012
Machine Learning CSE6740/CS7641/ISYE6740, Fall 2012 Principal Components Analysis Le Song Lecture 22, Nov 13, 2012 Based on slides from Eric Xing, CMU Reading: Chap 12.1, CB book 1 2 Factor or Component
More informationAn Optimization-based Approach to Decentralized Assignability
2016 American Control Conference (ACC) Boston Marriott Copley Place July 6-8, 2016 Boston, MA, USA An Optimization-based Approach to Decentralized Assignability Alborz Alavian and Michael Rotkowitz Abstract
More informationENGINE SERIAL NUMBERS
ENGINE SERIAL NUMBERS The engine number was also the serial number of the car. Engines were numbered when they were completed, and for the most part went into a chassis within a day or so. However, some
More informationNonnegative Matrix Factorization with Toeplitz Penalty
Journal of Informatics and Mathematical Sciences Vol. 10, Nos. 1 & 2, pp. 201 215, 2018 ISSN 0975-5748 (online); 0974-875X (print) Published by RGN Publications http://dx.doi.org/10.26713/jims.v10i1-2.851
More informationResearch Article Comparative Features Extraction Techniques for Electrocardiogram Images Regression
Research Journal of Applied Sciences, Engineering and Technology (): 6, 7 DOI:.96/rjaset..6 ISSN: -79; e-issn: -767 7 Maxwell Scientific Organization Corp. Submitted: September 8, 6 Accepted: November,
More informationTechnical note on seasonal adjustment for Capital goods imports
Technical note on seasonal adjustment for Capital goods imports July 1, 2013 Contents 1 Capital goods imports 2 1.1 Additive versus multiplicative seasonality..................... 2 2 Steps in the seasonal
More information2003 Water Year Wrap-Up and Look Ahead
2003 Water Year Wrap-Up and Look Ahead Nolan Doesken Colorado Climate Center Prepared by Odie Bliss http://ccc.atmos.colostate.edu Colorado Average Annual Precipitation Map South Platte Average Precipitation
More informationLinear Algebra Methods for Data Mining
Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 PCA, NMF Linear Algebra Methods for Data Mining, Spring 2007, University of Helsinki Summary: PCA PCA is SVD
More informationMonthly Magnetic Bulletin
BRITISH GEOLOGICAL SURVEY Ascension Island Observatory Monthly Magnetic Bulletin December 2008 08/12/AS Crown copyright; Ordnance Survey ASCENSION ISLAND OBSERVATORY MAGNETIC DATA 1. Introduction Ascension
More informationGTR # VLTs GTR/VLT/Day %Δ:
MARYLAND CASINOS: MONTHLY REVENUES TOTAL REVENUE, GROSS TERMINAL REVENUE, WIN/UNIT/DAY, TABLE DATA, AND MARKET SHARE CENTER FOR GAMING RESEARCH, DECEMBER 2017 Executive Summary Since its 2010 casino debut,
More informationNonnegative Matrix Factorization with Orthogonality Constraints
Nonnegative Matrix Factorization with Orthogonality Constraints Jiho Yoo and Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology (POSTECH) Pohang, Republic
More informationRegularized NNLS Algorithms for Nonnegative Matrix Factorization with Application to Text Document Clustering
Regularized NNLS Algorithms for Nonnegative Matrix Factorization with Application to Text Document Clustering Rafal Zdunek Abstract. Nonnegative Matrix Factorization (NMF) has recently received much attention
More informationManning & Schuetze, FSNLP (c) 1999,2000
558 15 Topics in Information Retrieval (15.10) y 4 3 2 1 0 0 1 2 3 4 5 6 7 8 Figure 15.7 An example of linear regression. The line y = 0.25x + 1 is the best least-squares fit for the four points (1,1),
More informationWinter Season Resource Adequacy Analysis Status Report
Winter Season Resource Adequacy Analysis Status Report Tom Falin Director Resource Adequacy Planning Markets & Reliability Committee October 26, 2017 Winter Risk Winter Season Resource Adequacy and Capacity
More informationYear 9 Science Work Rate Calendar Term
Year Science Work Rate Calendar Term 0 that weekly contact is maintained with each teacher through email, Blackboard, phone, return of work, etc. in addition to the return of work outlined below. topics
More informationVariable Latent Semantic Indexing
Variable Latent Semantic Indexing Prabhakar Raghavan Yahoo! Research Sunnyvale, CA November 2005 Joint work with A. Dasgupta, R. Kumar, A. Tomkins. Yahoo! Research. Outline 1 Introduction 2 Background
More informationNon-negative Matrix Factorization: Algorithms, Extensions and Applications
Non-negative Matrix Factorization: Algorithms, Extensions and Applications Emmanouil Benetos www.soi.city.ac.uk/ sbbj660/ March 2013 Emmanouil Benetos Non-negative Matrix Factorization March 2013 1 / 25
More informationRank Selection in Low-rank Matrix Approximations: A Study of Cross-Validation for NMFs
Rank Selection in Low-rank Matrix Approximations: A Study of Cross-Validation for NMFs Bhargav Kanagal Department of Computer Science University of Maryland College Park, MD 277 bhargav@cs.umd.edu Vikas
More information* Matrix Factorization and Recommendation Systems
Matrix Factorization and Recommendation Systems Originally presented at HLF Workshop on Matrix Factorization with Loren Anderson (University of Minnesota Twin Cities) on 25 th September, 2017 15 th March,
More informationManning & Schuetze, FSNLP, (c)
page 554 554 15 Topics in Information Retrieval co-occurrence Latent Semantic Indexing Term 1 Term 2 Term 3 Term 4 Query user interface Document 1 user interface HCI interaction Document 2 HCI interaction
More informationYACT (Yet Another Climate Tool)? The SPI Explorer
YACT (Yet Another Climate Tool)? The SPI Explorer Mike Crimmins Assoc. Professor/Extension Specialist Dept. of Soil, Water, & Environmental Science The University of Arizona Yes, another climate tool for
More informationQuick Introduction to Nonnegative Matrix Factorization
Quick Introduction to Nonnegative Matrix Factorization Norm Matloff University of California at Davis 1 The Goal Given an u v matrix A with nonnegative elements, we wish to find nonnegative, rank-k matrices
More informationLecture Notes 10: Matrix Factorization
Optimization-based data analysis Fall 207 Lecture Notes 0: Matrix Factorization Low-rank models. Rank- model Consider the problem of modeling a quantity y[i, j] that depends on two indices i and j. To
More informationTornado Hazard Risk Analysis: A Report for Rutherford County Emergency Management Agency
Tornado Hazard Risk Analysis: A Report for Rutherford County Emergency Management Agency by Middle Tennessee State University Faculty Lisa Bloomer, Curtis Church, James Henry, Ahmad Khansari, Tom Nolan,
More informationSunspot Index and Long-term Solar Observations World Data Center supported by the ICSU - WDS
Sunspot Index and Long-term Solar Observations World Data Center supported by the ICSU - WDS 2016 n 7 WARNING OF MAJOR DATA CHANGE Over the past 4 years a community effort has been carried out to revise
More informationCorn Basis Information By Tennessee Crop Reporting District
UT EXTENSION THE UNIVERSITY OF TENNESSEE INSTITUTE OF AGRICULTURE AE 05-13 Corn Basis Information By Tennessee Crop Reporting District 1994-2003 Delton C. Gerloff, Professor The University of Tennessee
More informationSimultaneous Discovery of Common and Discriminative Topics via Joint Nonnegative Matrix Factorization
Simultaneous Discovery of Common and Discriminative Topics via Joint Nonnegative Matrix Factorization Hannah Kim 1, Jaegul Choo 2, Jingu Kim 3, Chandan K. Reddy 4, Haesun Park 1 Georgia Tech 1, Korea University
More informationRecommendation Systems
Recommendation Systems Popularity Recommendation Systems Predicting user responses to options Offering news articles based on users interests Offering suggestions on what the user might like to buy/consume
More informationMultivariate Regression Model Results
Updated: August, 0 Page of Multivariate Regression Model Results 4 5 6 7 8 This exhibit provides the results of the load model forecast discussed in Schedule. Included is the forecast of short term system
More informationLecture Prepared By: Mohammad Kamrul Arefin Lecturer, School of Business, North South University
Lecture 15 20 Prepared By: Mohammad Kamrul Arefin Lecturer, School of Business, North South University Modeling for Time Series Forecasting Forecasting is a necessary input to planning, whether in business,
More information2018 Technical Standards and Safety Authority and Ministry of Advanced Education & Skills Development Examination Schedule
Ministry of Advanced Education & Skills Development ination Schedule Last Updated: June 7, 2018 Document Uncontrolled if Printed ination Schedule Preface: The following sets out and identifies the 2018
More informationMatrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang
Matrix Factorization & Latent Semantic Analysis Review Yize Li, Lanbo Zhang Overview SVD in Latent Semantic Indexing Non-negative Matrix Factorization Probabilistic Latent Semantic Indexing Vector Space
More informationBUSI 460 Suggested Answers to Selected Review and Discussion Questions Lesson 7
BUSI 460 Suggested Answers to Selected Review and Discussion Questions Lesson 7 1. The definitions follow: (a) Time series: Time series data, also known as a data series, consists of observations on a
More informationThe 2010/11 drought in the Horn of Africa: Monitoring and forecasts using ECMWF products
The 2010/11 drought in the Horn of Africa: Monitoring and forecasts using ECMWF products Emanuel Dutra Fredrik Wetterhall Florian Pappenberger Souhail Boussetta Gianpaolo Balsamo Linus Magnusson Slide
More informationMLCC 2018 Variable Selection and Sparsity. Lorenzo Rosasco UNIGE-MIT-IIT
MLCC 2018 Variable Selection and Sparsity Lorenzo Rosasco UNIGE-MIT-IIT Outline Variable Selection Subset Selection Greedy Methods: (Orthogonal) Matching Pursuit Convex Relaxation: LASSO & Elastic Net
More informationCalculations Equation of Time. EQUATION OF TIME = apparent solar time - mean solar time
Calculations Equation of Time APPARENT SOLAR TIME is the time that is shown on sundials. A MEAN SOLAR DAY is a constant 24 hours every day of the year. Apparent solar days are measured from noon one day
More informationNonnegative Matrix Factorization Clustering on Multiple Manifolds
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Nonnegative Matrix Factorization Clustering on Multiple Manifolds Bin Shen, Luo Si Department of Computer Science,
More informationTechnical note on seasonal adjustment for M0
Technical note on seasonal adjustment for M0 July 1, 2013 Contents 1 M0 2 2 Steps in the seasonal adjustment procedure 3 2.1 Pre-adjustment analysis............................... 3 2.2 Seasonal adjustment.................................
More informationReconstruction of Reflectance Spectra Using Robust Nonnegative Matrix Factorization
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 54, NO. 9, SEPTEMBER 2006 3637 Reconstruction of Reflectance Spectra Using Robust Nonnegative Matrix Factorization A. Ben Hamza and David J. Brady Abstract
More informationPRELIMINARY DRAFT FOR DISCUSSION PURPOSES
Memorandum To: David Thompson From: John Haapala CC: Dan McDonald Bob Montgomery Date: February 24, 2003 File #: 1003551 Re: Lake Wenatchee Historic Water Levels, Operation Model, and Flood Operation This
More informationSemantics with Dense Vectors. Reference: D. Jurafsky and J. Martin, Speech and Language Processing
Semantics with Dense Vectors Reference: D. Jurafsky and J. Martin, Speech and Language Processing 1 Semantics with Dense Vectors We saw how to represent a word as a sparse vector with dimensions corresponding
More informationCompressed Sensing and Neural Networks
and Jan Vybíral (Charles University & Czech Technical University Prague, Czech Republic) NOMAD Summer Berlin, September 25-29, 2017 1 / 31 Outline Lasso & Introduction Notation Training the network Applications
More informationarxiv: v1 [stat.ml] 23 Dec 2015
k-means Clustering Is Matrix Factorization Christian Bauckhage arxiv:151.07548v1 [stat.ml] 3 Dec 015 B-IT, University of Bonn, Bonn, Germany Fraunhofer IAIS, Sankt Augustin, Germany http://mmprec.iais.fraunhofer.de/bauckhage.html
More informationOn the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering
On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering Chris Ding Xiaofeng He Horst D. Simon Abstract Current nonnegative matrix factorization (NMF) deals with X = FG T type. We
More informationData Mining and Matrices
Data Mining and Matrices 05 Semi-Discrete Decomposition Rainer Gemulla, Pauli Miettinen May 16, 2013 Outline 1 Hunting the Bump 2 Semi-Discrete Decomposition 3 The Algorithm 4 Applications SDD alone SVD
More information