Text Mining Approaches for Surveillance


1 Text Mining Approaches for Surveillance
Document Space Workshop, IPAM/UCLA
Michael W. Berry and Murray Browne, Department of Computer Science, UTK
January 23

2 Outline
- Enron Background
- Electronic Mail Surveillance
- Conclusions and References

3 Enron Background
- Collection
- Historical Events

6 Enron Background: Collection
- By-product of the FERC investigation of Enron (originally contained 15 million messages).
- This study used the improved corpus known as the Enron set, which was edited by Dr. William Cohen at CMU.
- This set had over 500,000 messages. The majority were sent in the 1999 to 2001 timeframe.


10 Enron Background: Historical Events
- Ongoing, problematic development of the Dabhol Power Company (DPC) in the Indian state of Maharashtra.
- Deregulation of the California energy industry, which led to rolling electricity blackouts in the summer of 2000 (and subsequent investigations).
- Revelation of Enron's deceptive business and accounting practices, which led to the abrupt collapse of the energy colossus in October 2001; Enron filed for bankruptcy in December.

11 Outline
- Motivation
- Underlying Optimization Problem
- MM Method (Lee and Seung)
- Enforcing Statistical Sparsity
- Hybrid NMF Approach

14 Motivation: NMF Origins
- NMF (Nonnegative Matrix Factorization) can be used to approximate high-dimensional data having nonnegative components.
- Lee and Seung (1999) demonstrated its use as a sum-by-parts representation of image data in order to both identify and classify image features.
- [Xu et al., 2003] demonstrated how NMF-based indexing could outperform SVD-based Latent Semantic Indexing (LSI) for some information retrieval tasks.

15 Motivation: NMF for Image Processing
[Figure: sparse NMF versus dense SVD bases, contrasting $A_i \approx W H_i$ (NMF) with $A_i \approx U \Sigma V_i$ (SVD); Lee and Seung (1999)]

16 Motivation: NMF for Text Mining (Medlars)
[Figure: four bar-chart panels titled "Highest Weighted Terms in Basis Vector" for $W_{*1}$, $W_{*2}$, $W_{*5}$, and $W_{*6}$, each plotting term against weight. The weights are not preserved; the four ten-term lists, in the order they appear, are:]
- ventricular, aortic, septal, left, defect, regurgitation, ventricle, valve, cardiac, pressure
- oxygen, flow, pressure, blood, cerebral, hypothermia, fluid, venous, arterial, perfusion
- children, child, autistic, speech, group, early, visual, anxiety, emotional, autism
- kidney, marrow, dna, cells, nephrectomy, unilateral, lymphocytes, bone, thymidine, rats
Interpretable NMF feature vectors; Langville et al. (2006)

20 Underlying Optimization Problem: Derivation
- Given an $m \times n$ term-by-message (sparse) matrix $X$.
- Compute two reduced-dimension matrices $W, H$ so that $X \approx WH$; $W$ is $m \times r$ and $H$ is $r \times n$, with $r \ll n$.
- Optimization problem: $\min_{W,H} \|X - WH\|_F^2$, subject to $W_{ij} \ge 0$ and $H_{ij} \ge 0$, $\forall i, j$.
- General approach: construct initial estimates for $W$ and $H$ and then improve them via alternating iterations.

22 MM Method (Lee and Seung): Multiplicative Method (MM)
Multiplicative update rules for $W$ and $H$ (Lee and Seung, 1999):
1. Initialize $W$ and $H$ with non-negative values, and scale the columns of $W$ to unit norm.
2. Iterate for each $c$, $j$, and $i$ until convergence or after $k$ iterations:
   2.1 $H_{cj} \leftarrow H_{cj} \dfrac{(W^T X)_{cj}}{(W^T W H)_{cj} + \epsilon}$
   2.2 $W_{ic} \leftarrow W_{ic} \dfrac{(X H^T)_{ic}}{(W H H^T)_{ic} + \epsilon}$
   2.3 Scale the columns of $W$ to unit norm.
Setting $\epsilon = 10^{-9}$ will suffice [Shahnaz et al., 2006].
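The update rules above can be sketched in NumPy as follows. This is only an illustrative sketch, not the authors' implementation: the function name `nmf_mm`, the random initialization, the fixed iteration count, and the use of a dense array are all assumptions made here for brevity (the slides work with a sparse $X$).

```python
import numpy as np

def nmf_mm(X, r, n_iter=100, eps=1e-9, seed=0):
    """Multiplicative-update NMF sketch: X (m x n, nonnegative) ~ W (m x r) @ H (r x n)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Step 1: nonnegative random start (an assumption), unit-norm columns of W.
    W = rng.random((m, r))
    H = rng.random((r, n))
    W /= np.linalg.norm(W, axis=0, keepdims=True)
    for _ in range(n_iter):
        # Step 2.1: H_cj <- H_cj (W^T X)_cj / ((W^T W H)_cj + eps)
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        # Step 2.2: W_ic <- W_ic (X H^T)_ic / ((W H H^T)_ic + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # Step 2.3: rescale the columns of W to unit norm.
        W /= np.linalg.norm(W, axis=0, keepdims=True) + eps
    return W, H
```

Because both factors start nonnegative and every update multiplies by a nonnegative ratio, $W$ and $H$ stay nonnegative without any explicit projection.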

26 MM Method (Lee and Seung): Normalization, Complexity, and Convergence
- Important to normalize $X$ initially and the basis matrix $W$ at each iteration.
- When optimizing on a unit hypersphere, the column (or feature) vectors of $W$, denoted by $W_k$, are effectively mapped to the surface of the hypersphere by repeated normalization.
- MM implementation of NMF requires $O(rmn)$ operations per iteration; Lee and Seung (1999) proved that $\|X - WH\|_F^2$ is monotonically non-increasing with MM.
- From a nonlinear optimization perspective, MM/NMF can be considered a diagonally-scaled gradient descent method.
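The diagonally-scaled gradient descent reading can be made explicit with a short, standard derivation. It is not taken from the slides: it uses the same symbols, drops $\epsilon$, and adds a factor of $\tfrac{1}{2}$ to the objective purely for convenience (that factor only rescales the step size).

```latex
f(W,H) = \tfrac{1}{2}\,\|X - WH\|_F^2, \qquad
\nabla_H f = W^T W H - W^T X

% gradient step with an entrywise (diagonal) step size \eta_{cj}
H_{cj} \leftarrow H_{cj} - \eta_{cj}\,\bigl[(W^T W H)_{cj} - (W^T X)_{cj}\bigr]

% choosing \eta_{cj} = H_{cj} / (W^T W H)_{cj} recovers the multiplicative rule
H_{cj} \leftarrow H_{cj}\,\frac{(W^T X)_{cj}}{(W^T W H)_{cj}}
```

The same argument with $\nabla_W f = W H H^T - X H^T$ and step size $W_{ic} / (W H H^T)_{ic}$ gives the update for $W$.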

29 Enforcing Statistical Sparsity: Hoyer's Method
- From neural network applications, Hoyer (2002) enforced statistical sparsity for the weight matrix $H$ in order to enhance the parts-based data representations in the matrix $W$.
- Mu et al. (2003) suggested a regularization approach to achieve statistical sparsity in the matrix $H$: point count regularization; penalize the number of nonzeros in $H$ rather than $\sum_{ij} H_{ij}$.
- Goal of increased sparsity: better representation of parts or features spanned by the corpus ($X$) [Shahnaz et al., 2006].

32 Hybrid NMF Approach: GD-CLS Hybrid Approach
- First use MM to compute an approximation to $W$ for each iteration: a gradient descent (GD) optimization step.
- Then compute the weight matrix $H$ using a constrained least squares (CLS) model to penalize non-smoothness (i.e., non-sparsity) in $H$, a common Tikhonov regularization technique used in image processing (Prasad et al., 2003).
- Convergence to a non-stationary point evidenced (but no formal proof given to date).

35 Hybrid NMF Approach: GD-CLS Algorithm
1. Initialize $W$ and $H$ with non-negative values, and scale the columns of $W$ to unit norm.
2. Iterate until convergence or after $k$ iterations:
   2.1 $W_{ic} \leftarrow W_{ic} \dfrac{(X H^T)_{ic}}{(W H H^T)_{ic} + \epsilon}$, for $c$ and $i$
   2.2 Rescale the columns of $W$ to unit norm.
   2.3 Solve the constrained least squares problem $\min_{H_j} \{ \|X_j - W H_j\|_2^2 + \lambda \|H_j\|_2^2 \}$, where the subscript $j$ denotes the $j$th column, for $j = 1, \ldots, n$.
Any negative values in $H_j$ are set to zero. The parameter $\lambda$ is a regularization value that is used to balance the reduction of the metric $\|X_j - W H_j\|_2^2$ with enforcement of smoothness and sparsity in $H$ [Shahnaz et al., 2006].
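The GD-CLS steps above can be sketched as follows. Treat it as one plausible reading rather than the authors' code: the function name `gd_cls`, the random start, and the dense arrays are assumptions, and step 2.3 is solved here for all columns at once through the regularized normal equations $(W^T W + \lambda I) H = W^T X$, which is the closed-form minimizer of the penalized least squares problem before negative entries are zeroed.

```python
import numpy as np

def gd_cls(X, r, lam=0.1, n_iter=100, eps=1e-9, seed=0):
    """GD-CLS sketch: multiplicative update for W, regularized least squares for H."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Step 1: nonnegative random start (an assumption), unit-norm columns of W.
    W = rng.random((m, r))
    H = rng.random((r, n))
    W /= np.linalg.norm(W, axis=0, keepdims=True)
    for _ in range(n_iter):
        # Step 2.1: multiplicative (gradient descent) update of W.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # Step 2.2: rescale the columns of W to unit norm.
        W /= np.linalg.norm(W, axis=0, keepdims=True) + eps
        # Step 2.3: solve (W^T W + lam I) H = W^T X for every column of H at once.
        H = np.linalg.solve(W.T @ W + lam * np.eye(r), W.T @ X)
        # Negative entries of H are set to zero.
        np.maximum(H, 0.0, out=H)
    return W, H
```

Larger values of `lam` shrink the entries of `H` more strongly, trading reconstruction error for smoother weights.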

36 Electronic Mail Surveillance
- Message Parsing
- Term Weighting
- GD-CLS Benchmarks
- Clustering and Topic Extraction
- Topic Tracking (Through Time)
- Smoothing Effects Comparison

37 Electronic Mail Surveillance: Message Parsing (inbox Collection)
Parsed the inbox folder of all 150 accounts (users) via GTP (General Text Parser); a 495-term stoplist was used, and extracted terms must appear in more than one message and more than once globally.

39 Electronic Mail Surveillance: Message Parsing (private Collection)
Parsed all mail directories (of all 150 accounts) with the exception of all documents, calendar, contacts, deleted items, discussion threads, inbox, notes inbox, sent, sent items, and sent mail; a 495-term stoplist was used, and extracted terms must appear in more than one message and more than once globally.

Distribution of messages sent in the year 2001:

Month  Msgs   Terms     Month  Msgs   Terms
Jan    3,621  17,888    Jul    3,077  17,617
Feb    2,804  16,958    Aug    2,828  16,417
Mar    3,525  20,305    Sep    2,330  15,405
Apr    4,273  24,010    Oct    2,821  20,995
May    4,261  24,335    Nov    2,204  18,693
Jun    4,324  18,599    Dec    1,489  8,

41 Electronic Mail Surveillance: Term Weighting Schemes
For the $m \times n$ term-by-message matrix $X = [x_{ij}]$, define $x_{ij} = l_{ij}\, g_i\, d_j$, where $l_{ij}$ is the local weight for term $i$ occurring in message $j$, $g_i$ is the global weight for term $i$ in the subcollection, and $d_j$ is a document normalization factor (set $d_j = 1$).

Schemes used in parsing the inbox and private subcollections:
- txx: local weight Term Frequency, $l_{ij} = f_{ij}$; global weight None, $g_i = 1$.
- lex: local weight Logarithmic, $l_{ij} = \log(1 + f_{ij})$; global weight Entropy, $g_i = 1 + \left(\sum_j p_{ij} \log p_{ij}\right) / \log n$, where $p_{ij} = f_{ij} / \sum_j f_{ij}$.
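As an illustration of the lex (log-entropy) scheme, the sketch below weights a small dense frequency matrix. The function name `log_entropy` is hypothetical, and two choices are assumptions because the slides do not state them: the natural logarithm is used throughout, and $0 \log 0$ is treated as 0. The sketch also assumes more than one message ($n > 1$) so that $\log n$ is nonzero.

```python
import numpy as np

def log_entropy(F):
    """Log-entropy weighting sketch. F: m x n raw term frequencies (terms x messages)."""
    F = np.asarray(F, dtype=float)
    n = F.shape[1]
    # p_ij = f_ij / sum_j f_ij (rows with no occurrences stay zero)
    row_sums = F.sum(axis=1, keepdims=True)
    P = np.divide(F, row_sums, out=np.zeros_like(F), where=row_sums > 0)
    # sum_j p_ij log p_ij, with 0 log 0 taken as 0 (assumption)
    plogp = np.zeros_like(P)
    mask = P > 0
    plogp[mask] = P[mask] * np.log(P[mask])
    # Global weight g_i = 1 + (sum_j p_ij log p_ij) / log n
    g = 1.0 + plogp.sum(axis=1) / np.log(n)
    # x_ij = l_ij * g_i with l_ij = log(1 + f_ij) and d_j = 1
    return np.log1p(F) * g[:, None]
```

A term spread evenly over every message gets global weight 0, while a term confined to a single message keeps global weight 1, which is the behavior the entropy factor is meant to produce.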

42 Electronic Mail Surveillance: GD-CLS Benchmarks (Computational Complexity)
Rank-50 NMF ($X \approx WH$) computed on a 450MHz (Dual) UltraSPARC-II processor using 100 iterations.
[Table: columns Mail Collection, Messages, Dictionary Terms, $\lambda$, Time (sec.); rows for the inbox and private collections; the numeric entries are not preserved.]

43 Electronic Mail Surveillance: Clustering and Topic Extraction (private with Log-Entropy Weighting)
Identify rows of $H$ from $X \approx WH$, or $H_k$, with $\lambda = 0.1$; $r = 50$ feature vectors ($W_k$) generated by GD-CLS.
[Table: columns Feature Index (k), Cluster Size, Topic Description, Dominant Terms; the index and size values are not preserved.]
- California: ca, cpuc, gov, socalgas, sempra, org, sce, gmssr, aelaw, ci
- Louise Kitchen named top woman by Fortune: evp, fortune, britain, woman, ceo, avon, fiorina, cfo, hewlett, packard
- Fantasy football: game, wr, qb, play, rb, season, injury, updated, fantasy, image
(Cluster size = number of $H_k$ elements greater than row max/10)
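The cluster-size rule quoted on this slide (count the entries of row $H_k$ that exceed one tenth of that row's maximum) is easy to state in code. The helper name `cluster_sizes` and the `frac` parameter are illustrative assumptions; `frac=0.1` corresponds to the row max/10 threshold.

```python
import numpy as np

def cluster_sizes(H, frac=0.1):
    """For each row H_k, count the messages whose weight exceeds frac * max(H_k)."""
    H = np.asarray(H, dtype=float)
    row_max = H.max(axis=1, keepdims=True)
    return (H > frac * row_max).sum(axis=1)
```

For example, a row `[1.0, 0.05, 0.2, 0.0]` has threshold 0.1, so two messages count toward that feature's cluster.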

44 Electronic Mail Surveillance: Clustering and Topic Extraction (private with Log-Entropy Weighting)
Additional topic clusters of significant size.
[Table: columns Feature Index (k), Cluster Size, Topic Description, Dominant Terms; the index and size values are not preserved.]
- Texas longhorn football newsletter: UT, orange, longhorn[s], texas, true, truorange, recruiting, oklahoma, defensive
- Enron collapse: partnership[s], fastow, shares, sec, stock, shareholder, investors, equity, lay
- …s about India: dabhol, dpc, india, mseb, maharashtra, indian, lenders, delhi, foreign, minister

45 Electronic Mail Surveillance: Topic Tracking (Through Time)
[Figure: 2001 topics tracked by GD-CLS; $r = 50$ features, lex term weighting, $\lambda$ = …]

47 Electronic Mail Surveillance: Smoothing Effects Comparison (Two Penalty Term Formulation)
Introduce smoothing on $W_k$ (feature vectors) in addition to $H_k$:
$\min_{W,H} \{ \|X - WH\|_F^2 + \alpha \|W\|_F^2 + \beta \|H\|_F^2 \}$, where $\|\cdot\|_F$ is the Frobenius norm.
Constrained NMF (CNMF) iteration [Piper et al., 2004]:
$H_{cj} \leftarrow H_{cj} \dfrac{(W^T X)_{cj} - \beta H_{cj}}{(W^T W H)_{cj} + \epsilon}$
$W_{ic} \leftarrow W_{ic} \dfrac{(X H^T)_{ic} - \alpha W_{ic}}{(W H H^T)_{ic} + \epsilon}$
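A minimal sketch of the CNMF iteration follows, under stated assumptions. The function name `cnmf`, the random start, and the fixed iteration count are illustrative. One detail is added here and is not in the slides: the numerators are clipped at zero, because subtracting $\beta H_{cj}$ or $\alpha W_{ic}$ can make them negative and the factors are meant to stay nonnegative.

```python
import numpy as np

def cnmf(X, r, alpha=0.0, beta=0.0, n_iter=100, eps=1e-9, seed=0):
    """Constrained NMF sketch with Frobenius penalties alpha*||W||^2 and beta*||H||^2."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(n_iter):
        # H_cj <- H_cj ((W^T X)_cj - beta H_cj) / ((W^T W H)_cj + eps)
        # Clipping at zero is an assumption to keep H nonnegative.
        H *= np.maximum(W.T @ X - beta * H, 0.0) / (W.T @ W @ H + eps)
        # W_ic <- W_ic ((X H^T)_ic - alpha W_ic) / ((W H H^T)_ic + eps)
        W *= np.maximum(X @ H.T - alpha * W, 0.0) / (W @ H @ H.T + eps)
    return W, H
```

With `alpha = beta = 0` the loop reduces to the plain multiplicative updates (without column rescaling), so the penalties can be switched on one at a time to compare their effect.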

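48 Electronic Mail Surveillance: Smoothing Effects Comparison (Term Distribution in Feature Vectors)
[Table: columns Terms, Wt, Lambda, Alpha, Topics; the numeric entries are not preserved. Terms in source order, with the topic label in parentheses where one follows the term:]
Blackouts (Cal); Stocks (Collapse); UT (Texasfoot); Chronicle; Indian (India); Fastow (Collapse); Gas; CFO (Kitchen); Californians (Cal); Solar; Partnerships (Collapse); Workers; Maharashtra (India); Mseb (India); Beach; Ljm (Collapse); Tues; IPPS (Cal); Rebates; Ljm (Collapse)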

49 Electronic Mail Surveillance: Smoothing Effects Comparison (British National Corpus (BNC) Noun Conservation)
- In collaboration with P. Keila and D. Skillicorn (Queens Univ.).
- 289,695-message subset (all mail folders, not just private).
- Smoothing solely applied to the NMF $W$ matrix ($\alpha$ = 0.001, 0.01, 0.1, 0.25, 0.50, 0.75, 1.00 with $\beta = 0$).
- Log-entropy term weighting applied to the term-by-message matrix $X$.
- Monitor the top ten nouns for each feature vector (ranked by descending component values) and extract those appearing in two or more features; topics assigned manually.
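The monitoring step in the last bullet (take the top ten terms of each feature vector and keep those that occur in two or more features) might look like the sketch below. The function name `shared_top_terms` and its parameters are hypothetical, and the sketch assumes the rows of `W` have already been restricted to nouns; the BNC lookup itself is not shown.

```python
from collections import Counter

import numpy as np

def shared_top_terms(W, vocab, top=10, min_features=2):
    """Return terms that rank in the top `top` entries of at least `min_features` columns of W."""
    W = np.asarray(W, dtype=float)
    counts = Counter()
    for k in range(W.shape[1]):
        # Indices of the largest components of feature vector W_k, in descending order.
        top_idx = np.argsort(W[:, k])[::-1][:top]
        counts.update(vocab[i] for i in top_idx)
    return {term: c for term, c in counts.items() if c >= min_features}
```

The returned dictionary maps each shared term to the number of features in which it ranks highly, which is the quantity one would then label with topics by hand.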

50 Electronic Mail Surveillance: Smoothing Effects Comparison (BNC Noun Distribution in Feature Vectors)
[Table: columns Noun, GF, Entropy, Alpha, Topic; the numeric entries are not preserved. Nouns in source order, with the topic label in parentheses where one follows the noun:]
Waxman (Downfall); Lieberman (Downfall); Scandal (Downfall); Nominee(s); Barone (Downfall); MEADE (Downfall); Fichera (California blackout); Prabhu (India-strong); Tata (India-weak); Rupee(s) (India-strong); Soybean(s); Rushing (Football - college); Dlrs; Janus (India-weak); BSES (India-weak); Caracas; Escondido (California/Blackout); Promoters (Energy/Scottish); Aramco (India-weak); DOORMAN (Bawdy/Real Estate)

51 Conclusions and References
- Conclusions
- Future Work
- References

55 Conclusions and References: Conclusions
- The GD-CLS algorithm can effectively produce a parts-based approximation $X \approx WH$ of a sparse term-by-message matrix $X$.
- Smoothing on the features matrix ($W$), as opposed to the weight matrix $H$, forces more reuse of higher weighted terms.
- Surveillance systems based on GD-CLS could be used to monitor discussions without the need to isolate or perhaps incriminate individual employees.
- Potential applications include the monitoring/tracking of company morale, employee feedback to policy decisions, and extracurricular activities.

59 Conclusions and References: Future Work
- Further work is needed in determining the effects of alternative term weighting schemes (for $X$) and choices of control parameters ($\alpha$, $\beta$, $\lambda$) on the quality of the basis vectors $W_k$.
- How does document (or message) clustering change with different column ranks ($r$) in the matrix $W$?
- Use MailMiner and similar text mining software to produce a topic-annotated Enron subset for the public domain.
- Explore use of NMF for automated gene classification; Semantic Gene Organizer (K. Heinrich, PhD Thesis 2006).

60 Conclusions and References: For Further Reading
- F. Shahnaz, M.W. Berry, V.P. Pauca, and R.J. Plemmons. Document Clustering Using Nonnegative Matrix Factorization. Information Processing & Management 42(2), 2006.
- J. Piper, P. Pauca, R. Plemmons, and M. Giffin. Object Characterization from Spectral Data using Independent Component Analysis and Information Theory. Proc. AMOS Technical Conference, Maui, HI, September.
- W. Xu, X. Liu, and Y. Gong. Document-Clustering based on Non-Negative Matrix Factorization. Proceedings of SIGIR '03, July 28 - August 1, Toronto, CA.

61 Conclusions and References
SDM06 Text Mining Workshop, Hyatt Regency Bethesda, Bethesda, Maryland, April 22, 2006. To be held in conjunction with the Sixth SIAM International Conference on Data Mining (SDM 2006) and with SIAM's Link Analysis, Counterterrorism Security Workshop, which is also being held on April 22.


More information

Non-Negative Matrix Factorization

Non-Negative Matrix Factorization Chapter 3 Non-Negative Matrix Factorization Part : Introduction & computation Motivating NMF Skillicorn chapter 8; Berry et al. (27) DMM, summer 27 2 Reminder A T U Σ V T T Σ, V U 2 Σ 2,2 V 2.8.6.6.3.6.5.3.6.3.6.4.3.6.4.3.3.4.5.3.5.8.3.8.3.3.5

More information

Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing

Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing 1 Zhong-Yuan Zhang, 2 Chris Ding, 3 Jie Tang *1, Corresponding Author School of Statistics,

More information

DOZENALS. A project promoting base 12 counting and measuring. Ideas and designs by DSA member (#342) and board member, Timothy F. Travis.

DOZENALS. A project promoting base 12 counting and measuring. Ideas and designs by DSA member (#342) and board member, Timothy F. Travis. R AENBO DOZENALS A project promoting base 12 counting and measuring. Ideas and designs by DSA member (#342) and board member Timothy F. Travis. I became aware as a teenager of base twelve numbering from

More information

Dimension Reduction and Iterative Consensus Clustering

Dimension Reduction and Iterative Consensus Clustering Dimension Reduction and Iterative Consensus Clustering Southeastern Clustering and Ranking Workshop August 24, 2009 Dimension Reduction and Iterative 1 Document Clustering Geometry of the SVD Centered

More information

Four Basic Steps for Creating an Effective Demand Forecasting Process

Four Basic Steps for Creating an Effective Demand Forecasting Process Four Basic Steps for Creating an Effective Demand Forecasting Process Presented by Eric Stellwagen President & Cofounder Business Forecast Systems, Inc. estellwagen@forecastpro.com Business Forecast Systems,

More information

STATISTICAL FORECASTING and SEASONALITY (M. E. Ippolito; )

STATISTICAL FORECASTING and SEASONALITY (M. E. Ippolito; ) STATISTICAL FORECASTING and SEASONALITY (M. E. Ippolito; 10-6-13) PART I OVERVIEW The following discussion expands upon exponential smoothing and seasonality as presented in Chapter 11, Forecasting, in

More information

Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, Chi-square Statistic, and a Hybrid Method

Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, Chi-square Statistic, and a Hybrid Method Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, hi-square Statistic, and a Hybrid Method hris Ding a, ao Li b and Wei Peng b a Lawrence Berkeley National Laboratory,

More information

Fast Coordinate Descent methods for Non-Negative Matrix Factorization

Fast Coordinate Descent methods for Non-Negative Matrix Factorization Fast Coordinate Descent methods for Non-Negative Matrix Factorization Inderjit S. Dhillon University of Texas at Austin SIAM Conference on Applied Linear Algebra Valencia, Spain June 19, 2012 Joint work

More information

Determine the trend for time series data

Determine the trend for time series data Extra Online Questions Determine the trend for time series data Covers AS 90641 (Statistics and Modelling 3.1) Scholarship Statistics and Modelling Chapter 1 Essent ial exam notes Time series 1. The value

More information

2018 Annual Review of Availability Assessment Hours

2018 Annual Review of Availability Assessment Hours 2018 Annual Review of Availability Assessment Hours Amber Motley Manager, Short Term Forecasting Clyde Loutan Principal, Renewable Energy Integration Karl Meeusen Senior Advisor, Infrastructure & Regulatory

More information

Nonnegative Matrix Factorization

Nonnegative Matrix Factorization Nonnegative Matrix Factorization Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr

More information

EBS IT Meeting July 2016

EBS IT Meeting July 2016 EBS IT Meeting 18 19 July 2016 Conference Call Details Conference call: UK Numbers Tel: 0808 238 9819 or Tel: 0207 950 1251 Participant code: 4834 7876... Join online meeting https://meet.nationalgrid.com/antonio.delcastillozas/hq507d31

More information

1. Ignoring case, extract all unique words from the entire set of documents.

1. Ignoring case, extract all unique words from the entire set of documents. CS 378 Introduction to Data Mining Spring 29 Lecture 2 Lecturer: Inderjit Dhillon Date: Jan. 27th, 29 Keywords: Vector space model, Latent Semantic Indexing(LSI), SVD 1 Vector Space Model The basic idea

More information

Examination of Initialization Techniques for Nonnegative Matrix Factorization

Examination of Initialization Techniques for Nonnegative Matrix Factorization Georgia State University ScholarWorks @ Georgia State University Mathematics Theses Department of Mathematics and Statistics 11-21-2008 Examination of Initialization Techniques for Nonnegative Matrix Factorization

More information

USING SINGULAR VALUE DECOMPOSITION (SVD) AS A SOLUTION FOR SEARCH RESULT CLUSTERING

USING SINGULAR VALUE DECOMPOSITION (SVD) AS A SOLUTION FOR SEARCH RESULT CLUSTERING POZNAN UNIVE RSIY OF E CHNOLOGY ACADE MIC JOURNALS No. 80 Electrical Engineering 2014 Hussam D. ABDULLA* Abdella S. ABDELRAHMAN* Vaclav SNASEL* USING SINGULAR VALUE DECOMPOSIION (SVD) AS A SOLUION FOR

More information

Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE Artificial Intelligence Grad Project Dr. Debasis Mitra

Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE Artificial Intelligence Grad Project Dr. Debasis Mitra Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE 5290 - Artificial Intelligence Grad Project Dr. Debasis Mitra Group 6 Taher Patanwala Zubin Kadva Factor Analysis (FA) 1. Introduction Factor

More information

WHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and Rainfall For Selected Arizona Cities

WHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and Rainfall For Selected Arizona Cities WHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and 2001-2002 Rainfall For Selected Arizona Cities Phoenix Tucson Flagstaff Avg. 2001-2002 Avg. 2001-2002 Avg. 2001-2002 October 0.7 0.0

More information

Group Sparse Non-negative Matrix Factorization for Multi-Manifold Learning

Group Sparse Non-negative Matrix Factorization for Multi-Manifold Learning LIU, LU, GU: GROUP SPARSE NMF FOR MULTI-MANIFOLD LEARNING 1 Group Sparse Non-negative Matrix Factorization for Multi-Manifold Learning Xiangyang Liu 1,2 liuxy@sjtu.edu.cn Hongtao Lu 1 htlu@sjtu.edu.cn

More information

Machine Learning. Principal Components Analysis. Le Song. CSE6740/CS7641/ISYE6740, Fall 2012

Machine Learning. Principal Components Analysis. Le Song. CSE6740/CS7641/ISYE6740, Fall 2012 Machine Learning CSE6740/CS7641/ISYE6740, Fall 2012 Principal Components Analysis Le Song Lecture 22, Nov 13, 2012 Based on slides from Eric Xing, CMU Reading: Chap 12.1, CB book 1 2 Factor or Component

More information

An Optimization-based Approach to Decentralized Assignability

An Optimization-based Approach to Decentralized Assignability 2016 American Control Conference (ACC) Boston Marriott Copley Place July 6-8, 2016 Boston, MA, USA An Optimization-based Approach to Decentralized Assignability Alborz Alavian and Michael Rotkowitz Abstract

More information

ENGINE SERIAL NUMBERS

ENGINE SERIAL NUMBERS ENGINE SERIAL NUMBERS The engine number was also the serial number of the car. Engines were numbered when they were completed, and for the most part went into a chassis within a day or so. However, some

More information

Nonnegative Matrix Factorization with Toeplitz Penalty

Nonnegative Matrix Factorization with Toeplitz Penalty Journal of Informatics and Mathematical Sciences Vol. 10, Nos. 1 & 2, pp. 201 215, 2018 ISSN 0975-5748 (online); 0974-875X (print) Published by RGN Publications http://dx.doi.org/10.26713/jims.v10i1-2.851

More information

Research Article Comparative Features Extraction Techniques for Electrocardiogram Images Regression

Research Article Comparative Features Extraction Techniques for Electrocardiogram Images Regression Research Journal of Applied Sciences, Engineering and Technology (): 6, 7 DOI:.96/rjaset..6 ISSN: -79; e-issn: -767 7 Maxwell Scientific Organization Corp. Submitted: September 8, 6 Accepted: November,

More information

Technical note on seasonal adjustment for Capital goods imports

Technical note on seasonal adjustment for Capital goods imports Technical note on seasonal adjustment for Capital goods imports July 1, 2013 Contents 1 Capital goods imports 2 1.1 Additive versus multiplicative seasonality..................... 2 2 Steps in the seasonal

More information

2003 Water Year Wrap-Up and Look Ahead

2003 Water Year Wrap-Up and Look Ahead 2003 Water Year Wrap-Up and Look Ahead Nolan Doesken Colorado Climate Center Prepared by Odie Bliss http://ccc.atmos.colostate.edu Colorado Average Annual Precipitation Map South Platte Average Precipitation

More information

Linear Algebra Methods for Data Mining

Linear Algebra Methods for Data Mining Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 PCA, NMF Linear Algebra Methods for Data Mining, Spring 2007, University of Helsinki Summary: PCA PCA is SVD

More information

Monthly Magnetic Bulletin

Monthly Magnetic Bulletin BRITISH GEOLOGICAL SURVEY Ascension Island Observatory Monthly Magnetic Bulletin December 2008 08/12/AS Crown copyright; Ordnance Survey ASCENSION ISLAND OBSERVATORY MAGNETIC DATA 1. Introduction Ascension

More information

GTR # VLTs GTR/VLT/Day %Δ:

GTR # VLTs GTR/VLT/Day %Δ: MARYLAND CASINOS: MONTHLY REVENUES TOTAL REVENUE, GROSS TERMINAL REVENUE, WIN/UNIT/DAY, TABLE DATA, AND MARKET SHARE CENTER FOR GAMING RESEARCH, DECEMBER 2017 Executive Summary Since its 2010 casino debut,

More information

Nonnegative Matrix Factorization with Orthogonality Constraints

Nonnegative Matrix Factorization with Orthogonality Constraints Nonnegative Matrix Factorization with Orthogonality Constraints Jiho Yoo and Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology (POSTECH) Pohang, Republic

More information

Regularized NNLS Algorithms for Nonnegative Matrix Factorization with Application to Text Document Clustering

Regularized NNLS Algorithms for Nonnegative Matrix Factorization with Application to Text Document Clustering Regularized NNLS Algorithms for Nonnegative Matrix Factorization with Application to Text Document Clustering Rafal Zdunek Abstract. Nonnegative Matrix Factorization (NMF) has recently received much attention

More information

Manning & Schuetze, FSNLP (c) 1999,2000

Manning & Schuetze, FSNLP (c) 1999,2000 558 15 Topics in Information Retrieval (15.10) y 4 3 2 1 0 0 1 2 3 4 5 6 7 8 Figure 15.7 An example of linear regression. The line y = 0.25x + 1 is the best least-squares fit for the four points (1,1),

More information

Winter Season Resource Adequacy Analysis Status Report

Winter Season Resource Adequacy Analysis Status Report Winter Season Resource Adequacy Analysis Status Report Tom Falin Director Resource Adequacy Planning Markets & Reliability Committee October 26, 2017 Winter Risk Winter Season Resource Adequacy and Capacity

More information

Year 9 Science Work Rate Calendar Term

Year 9 Science Work Rate Calendar Term Year Science Work Rate Calendar Term 0 that weekly contact is maintained with each teacher through email, Blackboard, phone, return of work, etc. in addition to the return of work outlined below. topics

More information

Variable Latent Semantic Indexing

Variable Latent Semantic Indexing Variable Latent Semantic Indexing Prabhakar Raghavan Yahoo! Research Sunnyvale, CA November 2005 Joint work with A. Dasgupta, R. Kumar, A. Tomkins. Yahoo! Research. Outline 1 Introduction 2 Background

More information

Non-negative Matrix Factorization: Algorithms, Extensions and Applications

Non-negative Matrix Factorization: Algorithms, Extensions and Applications Non-negative Matrix Factorization: Algorithms, Extensions and Applications Emmanouil Benetos www.soi.city.ac.uk/ sbbj660/ March 2013 Emmanouil Benetos Non-negative Matrix Factorization March 2013 1 / 25

More information

Rank Selection in Low-rank Matrix Approximations: A Study of Cross-Validation for NMFs

Rank Selection in Low-rank Matrix Approximations: A Study of Cross-Validation for NMFs Rank Selection in Low-rank Matrix Approximations: A Study of Cross-Validation for NMFs Bhargav Kanagal Department of Computer Science University of Maryland College Park, MD 277 bhargav@cs.umd.edu Vikas

More information

* Matrix Factorization and Recommendation Systems

* Matrix Factorization and Recommendation Systems Matrix Factorization and Recommendation Systems Originally presented at HLF Workshop on Matrix Factorization with Loren Anderson (University of Minnesota Twin Cities) on 25 th September, 2017 15 th March,

More information

Manning & Schuetze, FSNLP, (c)

Manning & Schuetze, FSNLP, (c) page 554 554 15 Topics in Information Retrieval co-occurrence Latent Semantic Indexing Term 1 Term 2 Term 3 Term 4 Query user interface Document 1 user interface HCI interaction Document 2 HCI interaction

More information

YACT (Yet Another Climate Tool)? The SPI Explorer

YACT (Yet Another Climate Tool)? The SPI Explorer YACT (Yet Another Climate Tool)? The SPI Explorer Mike Crimmins Assoc. Professor/Extension Specialist Dept. of Soil, Water, & Environmental Science The University of Arizona Yes, another climate tool for

More information

Quick Introduction to Nonnegative Matrix Factorization

Quick Introduction to Nonnegative Matrix Factorization Quick Introduction to Nonnegative Matrix Factorization Norm Matloff University of California at Davis 1 The Goal Given an u v matrix A with nonnegative elements, we wish to find nonnegative, rank-k matrices

More information

Lecture Notes 10: Matrix Factorization

Lecture Notes 10: Matrix Factorization Optimization-based data analysis Fall 207 Lecture Notes 0: Matrix Factorization Low-rank models. Rank- model Consider the problem of modeling a quantity y[i, j] that depends on two indices i and j. To

More information

Tornado Hazard Risk Analysis: A Report for Rutherford County Emergency Management Agency

Tornado Hazard Risk Analysis: A Report for Rutherford County Emergency Management Agency Tornado Hazard Risk Analysis: A Report for Rutherford County Emergency Management Agency by Middle Tennessee State University Faculty Lisa Bloomer, Curtis Church, James Henry, Ahmad Khansari, Tom Nolan,

More information

Sunspot Index and Long-term Solar Observations World Data Center supported by the ICSU - WDS

Sunspot Index and Long-term Solar Observations World Data Center supported by the ICSU - WDS Sunspot Index and Long-term Solar Observations World Data Center supported by the ICSU - WDS 2016 n 7 WARNING OF MAJOR DATA CHANGE Over the past 4 years a community effort has been carried out to revise

More information

Corn Basis Information By Tennessee Crop Reporting District

Corn Basis Information By Tennessee Crop Reporting District UT EXTENSION THE UNIVERSITY OF TENNESSEE INSTITUTE OF AGRICULTURE AE 05-13 Corn Basis Information By Tennessee Crop Reporting District 1994-2003 Delton C. Gerloff, Professor The University of Tennessee

More information

Simultaneous Discovery of Common and Discriminative Topics via Joint Nonnegative Matrix Factorization

Simultaneous Discovery of Common and Discriminative Topics via Joint Nonnegative Matrix Factorization Simultaneous Discovery of Common and Discriminative Topics via Joint Nonnegative Matrix Factorization Hannah Kim 1, Jaegul Choo 2, Jingu Kim 3, Chandan K. Reddy 4, Haesun Park 1 Georgia Tech 1, Korea University

More information

Recommendation Systems

Recommendation Systems Recommendation Systems Popularity Recommendation Systems Predicting user responses to options Offering news articles based on users interests Offering suggestions on what the user might like to buy/consume

More information

Multivariate Regression Model Results

Multivariate Regression Model Results Updated: August, 0 Page of Multivariate Regression Model Results 4 5 6 7 8 This exhibit provides the results of the load model forecast discussed in Schedule. Included is the forecast of short term system

More information

Lecture Prepared By: Mohammad Kamrul Arefin Lecturer, School of Business, North South University

Lecture Prepared By: Mohammad Kamrul Arefin Lecturer, School of Business, North South University Lecture 15 20 Prepared By: Mohammad Kamrul Arefin Lecturer, School of Business, North South University Modeling for Time Series Forecasting Forecasting is a necessary input to planning, whether in business,

More information

2018 Technical Standards and Safety Authority and Ministry of Advanced Education & Skills Development Examination Schedule

2018 Technical Standards and Safety Authority and Ministry of Advanced Education & Skills Development Examination Schedule Ministry of Advanced Education & Skills Development ination Schedule Last Updated: June 7, 2018 Document Uncontrolled if Printed ination Schedule Preface: The following sets out and identifies the 2018

More information

Matrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang

Matrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang Matrix Factorization & Latent Semantic Analysis Review Yize Li, Lanbo Zhang Overview SVD in Latent Semantic Indexing Non-negative Matrix Factorization Probabilistic Latent Semantic Indexing Vector Space

More information

BUSI 460 Suggested Answers to Selected Review and Discussion Questions Lesson 7

BUSI 460 Suggested Answers to Selected Review and Discussion Questions Lesson 7 BUSI 460 Suggested Answers to Selected Review and Discussion Questions Lesson 7 1. The definitions follow: (a) Time series: Time series data, also known as a data series, consists of observations on a

More information

The 2010/11 drought in the Horn of Africa: Monitoring and forecasts using ECMWF products

The 2010/11 drought in the Horn of Africa: Monitoring and forecasts using ECMWF products The 2010/11 drought in the Horn of Africa: Monitoring and forecasts using ECMWF products Emanuel Dutra Fredrik Wetterhall Florian Pappenberger Souhail Boussetta Gianpaolo Balsamo Linus Magnusson Slide

More information

MLCC 2018 Variable Selection and Sparsity. Lorenzo Rosasco UNIGE-MIT-IIT

MLCC 2018 Variable Selection and Sparsity. Lorenzo Rosasco UNIGE-MIT-IIT MLCC 2018 Variable Selection and Sparsity Lorenzo Rosasco UNIGE-MIT-IIT Outline Variable Selection Subset Selection Greedy Methods: (Orthogonal) Matching Pursuit Convex Relaxation: LASSO & Elastic Net

More information

Calculations Equation of Time. EQUATION OF TIME = apparent solar time - mean solar time

Calculations Equation of Time. EQUATION OF TIME = apparent solar time - mean solar time Calculations Equation of Time APPARENT SOLAR TIME is the time that is shown on sundials. A MEAN SOLAR DAY is a constant 24 hours every day of the year. Apparent solar days are measured from noon one day

More information

Nonnegative Matrix Factorization Clustering on Multiple Manifolds

Nonnegative Matrix Factorization Clustering on Multiple Manifolds Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Nonnegative Matrix Factorization Clustering on Multiple Manifolds Bin Shen, Luo Si Department of Computer Science,

More information

Technical note on seasonal adjustment for M0

Technical note on seasonal adjustment for M0 Technical note on seasonal adjustment for M0 July 1, 2013 Contents 1 M0 2 2 Steps in the seasonal adjustment procedure 3 2.1 Pre-adjustment analysis............................... 3 2.2 Seasonal adjustment.................................

More information

Reconstruction of Reflectance Spectra Using Robust Nonnegative Matrix Factorization

Reconstruction of Reflectance Spectra Using Robust Nonnegative Matrix Factorization IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 54, NO. 9, SEPTEMBER 2006 3637 Reconstruction of Reflectance Spectra Using Robust Nonnegative Matrix Factorization A. Ben Hamza and David J. Brady Abstract

More information

PRELIMINARY DRAFT FOR DISCUSSION PURPOSES

PRELIMINARY DRAFT FOR DISCUSSION PURPOSES Memorandum To: David Thompson From: John Haapala CC: Dan McDonald Bob Montgomery Date: February 24, 2003 File #: 1003551 Re: Lake Wenatchee Historic Water Levels, Operation Model, and Flood Operation This

More information

Semantics with Dense Vectors. Reference: D. Jurafsky and J. Martin, Speech and Language Processing

Semantics with Dense Vectors. Reference: D. Jurafsky and J. Martin, Speech and Language Processing Semantics with Dense Vectors Reference: D. Jurafsky and J. Martin, Speech and Language Processing 1 Semantics with Dense Vectors We saw how to represent a word as a sparse vector with dimensions corresponding

More information

Compressed Sensing and Neural Networks

Compressed Sensing and Neural Networks and Jan Vybíral (Charles University & Czech Technical University Prague, Czech Republic) NOMAD Summer Berlin, September 25-29, 2017 1 / 31 Outline Lasso & Introduction Notation Training the network Applications

More information

arxiv: v1 [stat.ml] 23 Dec 2015

arxiv: v1 [stat.ml] 23 Dec 2015 k-means Clustering Is Matrix Factorization Christian Bauckhage arxiv:151.07548v1 [stat.ml] 3 Dec 015 B-IT, University of Bonn, Bonn, Germany Fraunhofer IAIS, Sankt Augustin, Germany http://mmprec.iais.fraunhofer.de/bauckhage.html

More information

On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering

On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering Chris Ding Xiaofeng He Horst D. Simon Abstract Current nonnegative matrix factorization (NMF) deals with X = FG T type. We

More information

Data Mining and Matrices

Data Mining and Matrices Data Mining and Matrices 05 Semi-Discrete Decomposition Rainer Gemulla, Pauli Miettinen May 16, 2013 Outline 1 Hunting the Bump 2 Semi-Discrete Decomposition 3 The Algorithm 4 Applications SDD alone SVD

More information