Vertex Nomination via Attributed Random Dot Product Graphs

1 Vertex Nomination via Attributed Random Dot Product Graphs

Carey E. Priebe, Department of Applied Mathematics & Statistics, Johns Hopkins University

58th Session of the International Statistical Institute, August 21-26, 2011, Dublin, Ireland

With David J. Marchette and Glen A. Coppersmith

2 Vertex Nomination

[Figure: schematic of the $\kappa_{n,m,m',r,p,s}$ model: identified set $M'$, candidate set $V \setminus M'$, with red-red pairs connected according to $s$ and all other pairs according to $p$.]

3 Framework I

A graph is a pair $G = (V, E)$ of vertices $V = \{1, \dots, n\}$ and edges $E \subseteq V^{(2)}$, where $V^{(2)}$ denotes the set of unordered pairs of vertices. We define an attributed graph $G^a = (V, E^a, \varphi_V, \varphi_E)$ as follows. Let $\Phi_V = \{1, \dots, K_V\}$ and $\Phi_E = \{0, 1, \dots, K_E\}$, and consider vertex attribution function $\varphi_V : V \to \Phi_V$ and edge attribution function $\varphi_E : V^{(2)} \to \Phi_E$. These functions attribute each vertex (resp. edge) of $G^a$ with an element of $\Phi_V$ (resp. $\Phi_E$). The edge set $E^a$ for $G^a$ is given by $E^a = \{vw : \varphi_E(vw) > 0\}$; we write $v \sim w$ to mean that $vw \in E^a$. For simplicity, we will assume that $K_E = K_V = 2$, so that there are two distinguished attributes for each of the vertices and edges.

4 Framework II

For the vertex nomination problem, we write $M = \{v : \varphi_V(v) = 1\}$; this is the collection of vertices of particular interest. We observe not $G^a = (V, E^a, \varphi_V, \varphi_E)$ but $G^o = (V, E^a, \varphi'_V, \varphi_E)$, where $\varphi'_V : V \to \Phi_V \cup \{0\}$ with $\varphi'_V(v) = \varphi_V(v) = 1$ for $v \in M' \subseteq M$ and $\varphi'_V(v) = 0$ for $v \in V \setminus M'$. Thus we observe a subset $M'$, of cardinality $m' = |M'|$, of the vertices of particular interest, and we wish to infer others; that is, we wish to nominate vertices from $V \setminus M'$ which are, we hope, in $M \setminus M'$.

5 Vertex Nomination

Our model is a $\kappa$ graph with $n = |V|$ vertices, $m = |M|$ of which have attribute red and $n - m = |V \setminus M|$ of which have attribute green. We observe attributes for only the $m' = |M'|$ identified-set vertices (filled circles), with the remaining $n - m' = |V \setminus M'|$ candidate-set vertices (open circles) having occluded attributes. Edges with attribute green are of the not-of-interest topic (2), and red edges are of the interesting topic (1). Pairs of red vertices (regardless of occlusion) are connected according to probability distribution over topics $s$, while pairs of vertices where at least one is labeled green are connected according to $p$. The vertex nomination task is to select one vertex from the candidate set $V \setminus M'$ (the vertices with occluded attributes, shown here as open circles) that is in $M \setminus M'$ (truly red) with high probability.

6 Vertex Nomination

[Figure: the $\kappa_{n,m,m',r,p,s}$ schematic again, here annotated with the sets $M$, $M'$, $M \setminus M'$, $V \setminus M$, $V \setminus M'$, and $V$, and the connection distributions $s$ and $p$.]

7 Methods I

Random dot product graph (RDPG) models are a special case of latent position models. These latent position models posit a social space associated with the vertices, where the relative positions in this social space determine the relationship (edge) probabilities. We will first give the basic definitions, then discuss estimation in the model. The basic idea is that each vertex $v$ has associated with it a vector $x_v \in \mathbb{R}^d$, and these vectors determine the edge probabilities of the random graph via $P[v \sim w] = x_v^T x_w$. (The vectors must satisfy $0 \le x_v^T x_w \le 1$.) We denote by $X$ the $n \times d$ matrix whose $v$th row corresponds to $x_v$.
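
To fix ideas, here is a minimal numpy sketch (mine, not from the slides) of sampling such a graph from a given latent position matrix $X$; the function name `sample_rdpg` is hypothetical.

```python
import numpy as np

def sample_rdpg(X, rng=None):
    """Sample an undirected RDPG: P[v ~ w] = x_v^T x_w (no self-loops)."""
    rng = np.random.default_rng(rng)
    P = X @ X.T                                    # pairwise edge probabilities
    n = P.shape[0]
    upper = np.triu(rng.random((n, n)) < P, k=1)   # one draw per unordered pair
    A = (upper | upper.T).astype(int)              # symmetrize; diagonal stays zero
    return A
```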

8 Methods II

Given a graph $G = (V, E)$ and a target dimensionality $d$ for the latent vectors, we can obtain an estimate $\hat{X}$ in a single step as follows:

1. Let $\tilde{A} = A + D$, where $D = \mathrm{diag}(\deg(v)/(n-1))$ is the diagonal matrix with the normalized vertex degrees on the diagonal.
2. Compute the eigenvectors $U$ and eigenvalues $\Lambda$ of $\tilde{A}$, and set all negative entries in $\Lambda$ to zero.
3. $\hat{X} = U_{1,\dots,d}\,\Lambda_{1,\dots,d}^{1/2}$.

This yields $\hat{X} = \arg\min_X \|\tilde{A} - XX^T\|_F$. Imputing the diagonal of $\tilde{A}$ allows for a one-step solution. The justification for the choice $\deg(v)/(n-1)$ is based on the observation that $E[\deg(v)] = \sum_{w \neq v} x_v^T x_w$; if all the vectors are the same ($x_w = x_v$ for all $w$) then $E[\deg(v)] = (n-1)\,x_v^T x_v$. Under the assumption that the latent vectors are random variables and are independent and identically distributed, $E[\deg(v)] \le (n-1)\,E[X_v^T X_v]$ by Cauchy-Schwarz.
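
The three steps above translate directly into a few lines of numpy. This sketch is my own, with the hypothetical name `rdpg_fit` standing in for the estimator $L(G, d)$ used later; it assumes a dense symmetric adjacency matrix.

```python
import numpy as np

def rdpg_fit(A, d):
    """One-step RDPG latent-position estimate (Steps 1-3 above)."""
    n = A.shape[0]
    A_tilde = A + np.diag(A.sum(axis=1) / (n - 1))   # Step 1: impute the diagonal
    evals, evecs = np.linalg.eigh(A_tilde)           # Step 2: eigendecompose
    evals = np.maximum(evals, 0.0)                   #         zero out negatives
    top = np.argsort(evals)[::-1][:d]                # keep the d largest eigenvalues
    return evecs[:, top] * np.sqrt(evals[top])       # Step 3: U_d Lambda_d^{1/2}
```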

9 Methods III

We extend the RDPG model to allow for edge attributes in a very natural way. The idea is to posit that edge existence and edge attributes are fundamentally tied. Intuitively, we are modeling the edges as if they were communications, with the attributes corresponding to topics, and the vectors in the RDPG model encoding the interest level of each individual in each of the topics. Let us consider the case where we allocate one dimension per edge attribute. Since $\Phi_E = \{0, 1, \dots, K\}$, we define $X$ to be an $n \times K$ matrix where each row $x_v$ satisfies $x_{vk} \ge 0$ for all $k$ and $\sum_k x_{vk} \le 1$. (This is sufficient, but not necessary, to guarantee that the vectors satisfy $0 \le x_v^T x_w \le 1$.) Then the attributed RDPG model ARDPG($X$) is given by

$$P[v \sim w] = x_v^T x_w \quad \text{and} \quad P[\varphi_E(vw) = k \mid v \sim w] = \frac{x_{vk}\, x_{wk}}{x_v^T x_w} \quad \text{for } k = 1, \dots, K.$$

10 Methods IV

Thus,

$$P[v \sim w,\ \varphi_E(vw) = k] = x_{vk}\, x_{wk}.$$

That is, the random variable $\varphi_E(vw)$ is distributed according to the discrete distribution

$$\varphi_E(vw) \sim \mathcal{D}\Big(\{0, 1, \dots, K\},\ \Big[1 - \sum_{k=1}^{K} x_{vk} x_{wk},\ x_{v1} x_{w1},\ \dots,\ x_{vK} x_{wK}\Big]\Big).$$
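
Putting slides 9 and 10 together, a hedged sketch (names my own) of sampling edges and attributes jointly from ARDPG($X$), drawing each pair's attribute from the discrete distribution $\mathcal{D}$ above:

```python
import numpy as np

def sample_ardpg(X, rng=None):
    """Sample edge attributes phi_E(vw) from ARDPG(X); 0 encodes 'no edge'."""
    rng = np.random.default_rng(rng)
    n, K = X.shape
    attr = np.zeros((n, n), dtype=int)
    for v in range(n):
        for w in range(v + 1, n):
            pk = X[v] * X[w]                               # P[v~w, topic k] = x_vk x_wk
            probs = np.concatenate(([1.0 - pk.sum()], pk)) # the D(...) distribution above
            attr[v, w] = attr[w, v] = rng.choice(K + 1, p=probs)
    A = (attr > 0).astype(int)                             # an edge iff attribute > 0
    return A, attr
```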

11 Methods V

If the matrix $X$ is itself random (for instance, if the individual vertex vectors, the rows of $X$, are obtained by first drawing $X_v \sim \mathrm{Dirichlet}(\alpha_v)$ with $\alpha_v \in \mathbb{R}_+^{K+1}$, so that the vectors $X_v$ satisfy the constraint $X_v^T X_w \in [0, 1]$), then the random variables $\varphi_E(vw)$ for ARDPG($X$) are not independent but are conditionally independent, given $X$. (In a slight abuse of standard notation, we will consider $X_v \sim \mathrm{Dirichlet}(\alpha_v)$ with $\alpha_v \in \mathbb{R}_+^{K+1}$ to be a length-$K$ random vector, with $X_{vk} \ge 0$ for all $k$ and $\sum_k X_{vk} \le 1$; that is, we drop the $(K+1)$st element of a standard Dirichlet, in which $X_{v(K+1)} = 1 - \sum_{k=1}^{K} X_{vk}$.) It is straightforward to extend this model to allow the vectors corresponding to each attribute to be multi-dimensional.

12 Methods VI

We modify the unattributed RDPG algorithm to produce an estimate of the vectors in the attributed case:

1. Let $L(G, d)$ denote the linear-algebra approach to fitting vectors of dimension $d$ to the unattributed graph $G$. Thus $L(G, d)$ is an $n \times d$ matrix $\hat{X}$ that is the estimate of the RDPG vectors (the parameters of the RDPG model) that produced $G$.
2. Given attributed graph $G^a$, for $k \in \{1, \dots, K\}$ let $G^a_k = (V, E^a_k)$, where $E^a_k = \{vw : \varphi_E(vw) = k\}$ is the subset of edges in $E^a$ with attribute $k$.
3. $\hat{x}_k = L(G^a_k, 1)$.
4. $\hat{X}^o = (\hat{x}_1, \dots, \hat{x}_K)$.

We can ensure that the resulting vectors are in the first orthant, although post-processing is necessary if we demand that the vectors be in the simplex (or the unit ball) so that $x_v^T x_w \in [0, 1]$. The simplicity of the algorithm is due to the decomposition in Step 2. NB: dependency is lost, as we process each $G^a_k$ independently.
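
A sketch of Steps 1-4, reusing the `rdpg_fit` sketch from Methods II as $L(G, d)$. The sign-resolution step is my choice rather than the slides': since each $G^a_k$ has a nonnegative adjacency matrix, its leading eigenvector can be taken entrywise nonnegative, so a sign flip suffices to land in the first orthant.

```python
import numpy as np

def ardpg_fit(attr, K):
    """Fit one latent dimension per edge attribute (Steps 1-4 above)."""
    cols = []
    for k in range(1, K + 1):
        A_k = (attr == k).astype(float)    # Step 2: the attribute-k subgraph G^a_k
        np.fill_diagonal(A_k, 0.0)
        x_k = rdpg_fit(A_k, 1).ravel()     # Step 3: L(G^a_k, 1)
        if x_k.sum() < 0:                  # resolve the arbitrary eigenvector sign
            x_k = -x_k                     # so the fitted vector is nonnegative
        cols.append(x_k)
    return np.column_stack(cols)           # Step 4: Xhat^o is n x K
```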

13 Methods VII

Given an observed attributed graph $G^o$ with only some of the vertex attributes observed, the vertex nomination procedure we propose is as follows. (1) Fit an attributed RDPG, obtaining $\hat{X}^o$; notice that this estimate does not use vertex attributes. (2) Using the subset $M'$ of vertices with observed vertex attributes, rank the candidate vertices $V \setminus M'$ according to their relationship to $M'$. Having recovered latent-space representations for the vertices in (1), the inferential step (2) may be performed in numerous ways; we present a few simple approaches in the experiments below.
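
One simple instance of step (2) is the nearest-neighbor count $N(v)$ used in the experiments below; this brute-force sketch (names my own) is illustrative only.

```python
import numpy as np

def nominate_by_knn(Xhat, observed, k):
    """Rank candidates V \\ M' by N(v): the count of observed vertices of
    interest (indices in `observed`) among v's k nearest latent-space neighbors."""
    n = Xhat.shape[0]
    candidates = np.setdiff1d(np.arange(n), observed)
    D = np.linalg.norm(Xhat[:, None, :] - Xhat[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)                   # a vertex is not its own neighbor
    scores = np.array([np.isin(np.argsort(D[v])[:k], observed).sum()
                       for v in candidates])
    return candidates[np.argsort(-scores)]        # best-first nomination list
```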

14 Simulation Model I

We define an attributed RDPG model $\kappa(n, \pi_{V \setminus M}, m, \pi_M, r)$ for the vertex nomination task as follows. Let $\pi_{V \setminus M}$ and $\pi_M$ be in the standard (unit) $K$-simplex $\Delta_K \subset \mathbb{R}^{K+1}$; that is, they are $(K+1)$-dimensional probability vectors. Let $n > m > 0$ and $r > 0$ be given. Consider

$$X_{V \setminus M} \overset{iid}{\sim} \mathrm{Dirichlet}(r\pi_{V \setminus M} + 1) \quad \text{and} \quad X_M \overset{iid}{\sim} \mathrm{Dirichlet}(r\pi_M + 1)$$

to be $(n - m) \times K$ and $m \times K$ matrices, respectively, with iid rows. Write $X = (X_{V \setminus M}, X_M)$ as the $n \times K$ matrix of (random) vertex vectors. Then $\kappa(n, \pi_{V \setminus M}, m, \pi_M, r) = \mathrm{ARDPG}(X)$ is our attributed random dot product graph model, with each dimension corresponding to an edge attribute as described above. Again, the extension to multi-dimensional vertex vectors for each attribute is straightforward.
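
A sketch of drawing the random vertex-vector matrix $X$ for $\kappa(n, \pi_{V \setminus M}, m, \pi_M, r)$, assuming numpy's Dirichlet sampler and the drop-the-last-coordinate convention from Methods V; the helper name is hypothetical.

```python
import numpy as np

def draw_kappa_X(n, pi_vm, m, pi_m, r, rng=None):
    """Draw the n x K latent matrix for kappa(n, pi_{V\\M}, m, pi_M, r):
    rows iid Dirichlet(r*pi + 1), with the (K+1)st coordinate dropped."""
    rng = np.random.default_rng(rng)
    X_vm = rng.dirichlet(r * np.asarray(pi_vm) + 1, size=n - m)  # rows for V \ M
    X_m = rng.dirichlet(r * np.asarray(pi_m) + 1, size=m)        # rows for M
    return np.vstack([X_vm, X_m])[:, :-1]       # drop the redundant last coordinate
```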

15 Simulation Model II

Given $G^a \sim \kappa(n, \pi_{V \setminus M}, m, \pi_M, r)$, and for $0 < m' < m$, the observed attributed graph $G^o \sim \kappa(n, \pi_{V \setminus M}, m, \pi_M, r; m')$ involves selecting a random subset $M' \subset M$: all $n - m$ vertex attributes for $V \setminus M$ are missing, and $m - m'$ of the vertex attributes for $M$ are missing completely at random. For our simulation experiments, we consider $\pi_{V \setminus M} = [0.2, 0.2, 0.6]^T$ and $\pi_M = [q, 0.2, 0.8 - q]^T$ with $q$ ranging from 0.2 (no signal) to 0.8. The parameter $r$ controls the variability of the Dirichlet random vectors; $r = 0$ gives a uniform distribution on the simplex (no signal) and $r \to \infty$ yields point mass (no variability); we use $r = 100$. We consider $m = 10$ and $m' = 5$, and consider two cases for $n$ (100 and 250).

16 Simulation Performance Criteria

Tabular simulation results are presented, based on 1000 Monte Carlo replicates. We obtain the estimate $\hat{X}^o$ of the latent vertex vector matrix as described above, then rank the candidate vertices $V \setminus M'$ and evaluate based on (1) $p = P[v \in M \setminus M']$, where $v$ is the top-ranked candidate vertex, and (2) the Normalized Sum of Reciprocal Ranks, given by

$$\mathrm{NSRR} = \sum_{v \in M \setminus M'} \frac{1}{\mathrm{rank}(v)} \left( \sum_{i=1}^{m - m'} \frac{1}{i} \right)^{-1}.$$

Our candidate vertex rankings, based on the latent space estimate $\hat{X}^o$, are given by (1) the number $N(v)$ of observed signal vertices $M'$ amongst the $m$ nearest neighbors of $v$, and (2) the posteriors $\rho(v)$ from a linear discriminant analysis classifier.
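
For concreteness, the NSRR criterion as a plain-Python helper (my own, not from the slides) over a best-first nomination list:

```python
def nsrr(nomination_list, true_interest):
    """Normalized Sum of Reciprocal Ranks: nomination_list is best-first,
    true_interest is M \\ M', so the normalizer is sum_{i=1}^{m-m'} 1/i."""
    rank = {v: i + 1 for i, v in enumerate(nomination_list)}
    srr = sum(1.0 / rank[v] for v in true_interest)
    best = sum(1.0 / i for i in range(1, len(true_interest) + 1))
    return srr / best    # equals 1 when all of M \ M' top the list
```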

17 Simulation Results

[Table: $p$ and NSRR under both the $N(v)$ and $\rho(v)$ rankings, for $n = 100$ and $n = 250$, as a function of $q$; numerical entries omitted in transcription.]

Vertex nomination results as a function of $q$, based on 1000 Monte Carlo replicates. As expected, the performance improves as $q$ increases. Note that in this case the performance improves as $n$ increases. While the latent vertex vector estimation improves as $n$ increases, larger $n$ (for fixed $m, m'$) results in a harder vertex nomination problem. This trade-off deserves further study.

18 Enron Results

Vertex nomination results on two Enron graphs, $G^a_{T_0}$ (Sep-Feb, left) and $G^a_{T_1}$ (May-Sep, right). The symbols correspond to $M'$ (squares) and $V \setminus M'$ (circles). A two-sample Wilcoxon rank sum test on the ranks $r_v$ of the nearest element of $M'$ to $v$ yields $p < 0.01$ for $G^a_{T_1}$ and $p \approx 0.7$ for $G^a_{T_0}$: statistically significant signal ($M$ vs. $V \setminus M$) is identified for $T_1$ but not for $T_0$.

19 Conclusions & Discussion

[Figure: the $\kappa_{n,m,m',r,p,s}$ vertex nomination schematic, repeated from slide 2.]

20 Kronecker Quote

"The wealth of your practical experience with sane and interesting problems will give to mathematics a new direction and a new impetus."

Leopold Kronecker to Hermann von Helmholtz
