Latent SVD Models for Relational Data
|
|
- Philomena Ball
- 5 years ago
- Views:
Transcription
1 Latent SVD Models for Relational Data Peter Hoff Statistics, iostatistics and enter for Statistics and the Social Sciences University of Washington 1
2 Outline Part 1: Multiplicative factor models for network data 1 Relational data 2 Models via matrix representations 3 Models via exchangeability 4 Examples: (a) International conflict data (b) Protein-protein interaction data Part 2: Rank selection for the SVD model 1 Prior distributions on the Steifel manifold 2 A Gibbs sampling scheme for rank estimation 3 A simulation study 4 Example: Global service firm locations 2
3 Relational data Relational data consists of a set of units or nodes A, and a set of variables Y {y i,j } specific to pairs of nodes (i, j) A A Examples: International relations: Let A index the set of countries Let y i,j be the number of military disputes initiated by i with target j Needle-sharing network: Let A index a set of IV drug users Let y i,j indicate needle-sharing activity between i and j Protein-protein interactions: Let A index a sets of proteins Let y i,j measure the interaction between i and j usiness locations: Let A 1 be a set of banks and A 2 be a set of cities For (i, j) A 1 A 2, let y i,j indicate the presence of an office of bank i in city j 3
4 Inferential goals in the regression framework Let y i,j be the measurement on i j, and x i,j be a vector of explanatory variables If y i,j is binary, then maybe we have Y = 1 0 NA NA 1 NA A X = x 1,2 x 1,3 x 1,4 x 1,5 x 2,1 x 2,3 x 2,4 x 2,5 x 3,1 x 3,2 x 3,4 x 3,5 x 4,1 x 4,2 x 4,3 x 4,5 x 5,1 x 5,2 x 5,3 x 5,4 1 A onsider a basic generalized linear model log odds(y i,j = 1) = β x i,j A model can provide a measure of the association between X and Y: ˆβ, se( ˆβ) predictions of missing or future observations: Pr(y 1,4 = 1 Y, X) 4
5 How might node heterogeneity affect network structure? 5
6 Latent variable models Deviations from ordinary logistic regression can be represented with log odds(y i,j = 1) = β x i,j + z i,j A simple latent variable model might include row and column effects: z i,j = a i + b j log odds(y i,j = 1) = β x i,j + a i + b j a i and b j induce across-node heterogeneity that is additive on the log-odds scale Inclusion of these effects in the model can dramatically improve within-sample model fit (measured by R 2, likelihood ratio, I, etc ); out-of-sample predictive performance (measured by cross-validation) ut this model only captures heterogeneity of outdegree/indegree, and can t represent more complicated structure, such as clustering, transitivity, etc 6
7 The singular value decomposition Z = M + E M represents systematic patterns in the effects and E represents noise Every M has a representation of the form M = UDV where, in the case m n, Recall, U is an m n matrix with orthonormal columns; V is an n n matrix with orthonormal columns; D is an n n diagonal matrix, with diagonal elements {d 1,, d n } typically taken to be a decreasing sequence of non-negative numbers The squared elements of the diagonal of D are the eigenvalues of M M and the columns of V are the corresponding eigenvectors The matrix U can be obtained from the first n eigenvectors of MM The number of non-zero elements of D is the rank of M Writing the row vectors as {u 1,, u m }, {v 1,, v n }, m i,j = u i Dv j 7
8 Multiplicative effects Interpretation: log odds(y i,j = 1) = β x i,j + u idv j + ɛ i,j Think of {u 1,, u m }, {v 1,, v n } as vectors of latent nodal attributes: u idv j = K d k u i,k v j,k k=1 In general, a latent variable model relating X to Y is For example, some potential models are g(e[y i,j ]) = β x i,j + u idv j + ɛ i,j If y i,j is binary, If y i,j is count data, If y i,j is continuous, log odds (y i,j = 1) = β x i,j + u idv j + ɛ i,j log E[y i,j ] = β x i,j + u idv j + ɛ i,j E[y i,j ] = β x i,j + u idv j + ɛ i,j Estimation: Given D, V, the predictor is linear in {u 1,, u m } This bilinear structure can be exploited (EM, Gibbs sampling, variational methods) 8
9 Alternative justification via exchangeability X represents known information about the nodes Z represents any additional patterns In this case we might be willing to use a model in which {z i,j } d = {z gi,hj } for all permutations g and h This is a type of exchangeability for arrays, sometimes called weak exchangeability Theorem (Aldous, Hoover): Let {z i,j } be a weakly exchangeable array Then {z i,j } d = {z i,j }, where z i,j = f(µ, u i, v j, ɛ i,j ) and µ, {u i }, {v j }, {ɛ i,j } are all independent random variables 9
10 International onflict Network, years of international relations data (thanks to Mike Ward and Xun ao) y i,j = indicator of a militarized disputes initiated by i with target j; x i,j an 8-dimensional covariate vector containing an intercept and 1 population initiator 2 population target 3 polity score initiator 4 polity score target 5 polity score initiator polity score target 6 log distance 7 number of shared intergovernmental organizations Model: y i,j are independent binary random variables with log-mean log E[y i,j ] = β x i,j + u idv j + ɛ i,j 10
11 International conflict network, : Parameter estimates log distance polity initiator intergov org polity target polity interaction log pop target log pop initiator regression coefficient 11
12 AFG ANG ARG AUL AH NG OT UI AN AO HA HN OS U YP DOM DR EGY FRN GHA GR GUI HON IND INS IRN IRQ ISR ITA JOR JPN KEN LR LES LI MAL MYA NI NIG NIR NTH OMA PAK PHI PRK QAT ROK RWA SAF SAL SAU SEN SIE SRI SUD SWA SYR TAW TAZ THI TOG TUR UAE UGA UKG USA VEN AFG AL ANG ARG AUL AH EL EN NG UI AM AN DI HA HL HN OL ON OS U YP DR EGY FRN GHA GN GR GUI GUY HAI HON IND INS IRN IRQ ISR ITA JOR JPN LR LES LI MAA MLI MON MOR MYA MZM NAM NI NIG NIR NTH OMA PAK PHI PNG PRK QAT ROK RWA SAF SAL SAU SEN SIE SIN SPN SUD SYR TAW TAZ THI TOG TRI TUR UAE UGA UKG USA VEN YEM ZAM ZIM 12
13 Analyzing undirected graphs In many applications, y i,j = y j,i by design log odds(y i,j = y j,i = 1) = β x i,j + z i,j As before, write Z = M + E, with all matrices symmetric All such M have an eigenvalue decomposition This suggests a model of the form M = UΛU m i,j = u iλu j log odds(y i,j = y j,i = 1) = β x i,j + u iλu j + ɛ i,j Justification via exchangeability: Let Z be a symmetric array (with an undefined diagonal), such that {z i,j } = d {z gi,gj } Then {z i,j } = d {zi,j }, where z i,j = f(µ, u i, u j, ɛ i,j ) µ, {u i }, {ɛ i,j } are independent random variables, and f(, u i, u j, ) = f(, u j, u i, ) 13
14 Protein-protein interaction network Interactions among 270 proteins in E coli (utland, 2005) 1 Data: ( ) i<j y i,j 001 Model: log odds(y i,j = 1) = µ + u i Λu j + ɛ i,j (analysis by Ryan ressler) 14
15 Positive Negative acca acc accd acpp aid alas b1731 b2097 b2341 b2342 b2496 bola cada cbpa clpa clpp clpx cob cspa csp csp cspd cspe dead dnaa dna dnae dnaj dnak dnaq eno faba fabi fabz fba fdhd flda ftse ftsz gada gad gida grea gre grpe gyra gyr hepa hfq hima hns hol hol hold hole hsca hsdm hsdr hslu hslv hupa hup hyp hypd ibpa inf iscs iscu kdsa ldc lig lpxd lyss lysu malt metk mopa mop mre muk muke mura narg narh narj nary ndk nfi nusa nusg pan par pept pfl phes phet pnp pola ppa prsa pssa pst pur pyka rec recd rece recj recq rfad rfb rhl rne rnha rnpa rpld rpoa rpo rpo rpod rpoh rpon rpos rpoz seca sec sel smp spot srm ssb suca suc tdcd tgt thdf topa top tpia tsf tufa tuf usg uvr vac yacg yacl ybcj ybdq ybez ybjx ycby ycea yce ycff ycil yeaz yedo yfc yfg yfid yfif ygdp ygfa yggh ygjd yhbu yhby yhhp yien yjef yjfq ylea ynh ynhd ynhe acc aas fab pls acps fabf fabg glmu ybg yiid ybbn dnan ydhd rho rpl rpli rplv rps rps rpsg narz clp cspg rpsm rpl rple rplm rpsa rpsd rpse rpla rplx rpsj rpst cspi ygif rpls rplu rpsf rpsh rpsi rpsn rpso rpsp rpsr dna dnax hola reca ent sapa b1742 ffh rplf gcpe ubix fusa cafa hrpa lon tig hlpa himd b1410 b1808 pria malp hyfg hyce ibp rpll gaty mukf nei inf usha sbc rplo rplt b2372 qor manx ycb rpm recg rplq yhir rplr yibl rplw yhbv fucu yab rpsl 15
16 RNA polymerase Negative rpoa gre b2372 usg inf grea rpo manx rpos rpon rpo rpoz yacl rpod nusg pola b1731 hepa nusa Positive 16
17 Prediction experiment for international conflict data # of links found K=0 K=1 K=2 K=3 fraction predicted to be links true links true non links # of pairs checked p^ 17
18 Prediction experiment for protein interaction data Latent dimension 2 interactions found trials 18
19 Data analysis with the singular value decomposition Many data analysis procedures for matrix-valued data Y are related to the SVD Given a model of the form Y = M + E the SVD provides Interpretation: y i,j = u i Dv j + ɛ i,j, u i and v j are the ith, jth rows of U, V Estimation: Applications: ˆMK = Û[,1:K] ˆD [1:K,1:K] ˆV [,1:K] if M is assumed to be of rank K biplots (Gabriel 1971, Gower and Hand 1996) reduced-rank interaction models (Gabriel 1978, 1998) analysis of relational data (Harshman et al, 1982) Factor analysis, image processing, data reduction, ut: How to select K? Given K, is ˆM K a good estimator? (E[Y Y] = M M + mσ 2 I ) ayesian work on factor analysis models: Rajan and Rayner (1997), Minka (2000), Lopes and West (2004) 19
20 A toy example Y = UDV + E = X d k U [,k] V [,k] + E singular values
21 A toy example Y = UDV + E = X d k U [,k] V [,k] + E singular values
22 SVD Model Let Y be an m n data matrix (suppose m n) onsider a model of the form Y = UDV + E where K {0,, n} ; U V K,m, V V K,n ; D = diag{d 1,, d K }; E is a matrix of independent normal noise Given a prior distribution on {U, D, V} we would like to calculate p({u 1,, u K }, {v 1,, v K }, {d 1,, d K } Y, K) p(k Y) 22
23 Prior distribution for U 1 z 1 uniform[m-sphere], set U [,1] = z 1 ; 2 z 2 uniform[(m 1)-sphere], set U [,2] = Null(U [,1] )z 2 ; K z K uniform [(m K + 1)-sphere], U [,K] = Null(U [,1],, U [,K 1] )z K The resulting distribution for U is the uniform (invariant) distribution on the Steifel manifold, and is exchangeable under row and column permutations Prior conditional distributions: (U [,j] U [, j] ) d = Null(U [,1],, U [,j 1], U [,j+1],, U [,K] )z j, z j uniform [(m K + 1)-sphere] 23
24 Posterior distribution for U Recall UDV = K k=1 d ku [,k] V [,k], so so Y UDV = (Y U [, j] D [ j, j] V [, j]) d j U [,j] V [,j] E j d j U [,j] V [,j] Y UDV 2 = E j d j U [,j] V [,j] 2 = E j 2 2d j U [,j] E j V [,j] + d j U [,j] V [,j] 2 = E j 2 2d j U [,j] E j V [,j] + d 2 j p(y U, V, D, φ) = (2πφ) mn/2 exp{ 1 2 φ E j 2 + φd j U [,j] E j V [,j] 1 2 φd2 j} p(u [,j] Y, U [, j], V, D) p(u [,j] U [, j] ) exp{φd j U [,j] E j V [,j] } (U [,j] U [, j], Y, V, D) d = Null(U [,1],, U [,j 1], U [,j+1],, U [,K] )z j N j z j, p(z j ) exp{z j µ}, µ = φd j N je j V [,j] 24
25 Evaluating the dimension It can all be done by comparing M 1 : Y = duv + E M 0 : Y = E To decide whether or not to add a dimension, we need to calculate p(m 1 Y) p(m 0 Y) = p(m 1) p(y M 1 ) p(m 0 ) p(y M 0 ) The numerator of the ayes factor is the hard part We need to integrate the following wrt the prior distribution: p(y M 1, u, v, d) = (2π) nm/2 exp{ 1 2 φ Y 2 /2 + φdu Yv φd 2 /2} = p(y M 0 ) exp{ φd 2 /2} exp{u [dφy]v} omputing E[e u Av ] over u Sm and v S n is difficult but doable 25
26 Simulation study For each of (m, n) {(10, 10), (100, 10), (100, 100)}, 100 datasets were generated as follows: U uniform(v 5,m ), V uniform(v 5,n ) ; D = diag{d 1,, d 5 }, {d 1,, d 5 } iid uniform( 1 2 µ mn, 3 2 µ mn) Y = UDV + E, where E is an m n matrix of standard normal noise where µ mn = m + n + 2 mn This choice was not arbitrary: The largest eigenvalue of E is approximately (Edelman, 1988) λ max (m + n + 2 mn) 26
27 Simulation study E[p(K Y)] m=10, n=10 p(k^ ) Density E[p(K Y)] m=100, n=10 p(k^ ) Density E[p(K Y)] m=100, n=100 p(k^ ) Density K K^ ASE ASE LS 27
28 Final Example Data on 46 global service firms and 55 cities, obtained from the Globalization and World ities study group ( y i,j = 1 if firm j has an office in city i and y i,j = 0 otherwise log odds(y i,j = 1) = β + u i Dv j + e i,j K p(k Y) log p(y θ^) scan K K 28
29 ity effects u2 ^ Kuala Lumpur Taipei Manila Seoul Sydney uenos Aires Mexico Johannesburg ity angkok Shanghai eijing Geneva Los Angeles Zurich aracas Santiago Toronto Jakarta Hong Kong Tokyo Singapore hicago Houston Amsterdam Frankfurt Melbourne MontrealSan Francisco Osaka Sao Paulo oston Milan Dusseldorf Istanbul Miami openhagen Madrid New York arcelona Hamburg Munich Stockholm Paris London udapest Atlanta Prague Warsaw Dallas Rome erlin Minneapolis Moscow Washington,D russels ^ u 1 29
30 Summary and future directions Summary: SVD models are a natural way to represent patterns in relational data; The SVD structure can be incorporated into a variety of model forms; MA on model rank can be done Future Directions: Extensions to higher dimensional arrays; Dynamic network inference (times series models for Y, U, D, V); Restrict latent characteristics to be orthogonal to design vectors 30
Latent SVD Models for Relational Data
Latent SVD Models for Relational Data Peter Hoff Statistics, iostatistics and enter for Statistics and the Social Sciences University of Washington 1 Outline 1 Relational data 2 Exchangeability for arrays
More informationLatent Factor Models for Relational Data
Latent Factor Models for Relational Data Peter Hoff Statistics, Biostatistics and Center for Statistics and the Social Sciences University of Washington 1 Outline Part 1: Multiplicative factor models for
More informationLatent Factor Models for Relational Data
Latent Factor Models for Relational Data Peter Hoff Statistics, Biostatistics and Center for Statistics and the Social Sciences University of Washington Outline Part 1: Multiplicative factor models for
More informationLatent Factor Models for Relational Data
Latent Factor Models for Relational Data Peter Hoff Statistics, Biostatistics and Center for Statistics and the Social Sciences University of Washington Outline Part 1: Multiplicative factor models for
More informationHierarchical models for multiway data
Hierarchical models for multiway data Peter Hoff Statistics, Biostatistics and the CSSS University of Washington Array-valued data y i,j,k = jth variable of ith subject under condition k (psychometrics).
More informationMean and covariance models for relational arrays
Mean and covariance models for relational arrays Peter Hoff Statistics, Biostatistics and the CSSS University of Washington Outline Introduction and examples Models for multiway mean structure Models for
More informationHigher order patterns via factor models
1/39 Higher order patterns via factor models 567 Statistical analysis of social networks Peter Hoff Statistics, University of Washington 2/39 Conflict data Y
More informationProbability models for multiway data
Probability models for multiway data Peter Hoff Statistics, Biostatistics and the CSSS University of Washington Outline Introduction and examples Hierarchical models for multiway factors Deep interactions
More informationSimulation of the matrix Bingham-von Mises-Fisher distribution, with applications to multivariate and relational data
Simulation of the matrix Bingham-von Mises-Fisher distribution, with applications to multivariate and relational data Peter D. Hoff 1 Technical Report no. 526 Department of Statistics University of Washington
More informationModel averaging and dimension selection for the singular value decomposition
Model averaging and dimension selection for the singular value decomposition Peter D. Hoff Technical Report no. 494 Department of Statistics University of Washington January 10, 2006 Abstract Many multivariate
More informationModel averaging and dimension selection for the singular value decomposition
Model averaging and dimension selection for the singular value decomposition Peter D. Hoff December 9, 2005 Abstract Many multivariate data reduction techniques for an m n matrix Y involve the model Y
More informationarxiv:math/ v1 [math.st] 1 Sep 2006
Model averaging and dimension selection for the singular value decomposition arxiv:math/0609042v1 [math.st] 1 Sep 2006 Peter D. Hoff February 2, 2008 Abstract Many multivariate data analysis techniques
More informationRandom Effects Models for Network Data
Random Effects Models for Network Data Peter D. Hoff 1 Working Paper no. 28 Center for Statistics and the Social Sciences University of Washington Seattle, WA 98195-4320 January 14, 2003 1 Department of
More informationPorous Europe: European Cities in Global Urban Arenas
Porous Europe: European Cities in Global Urban Arenas Ben Derudder & Peter Taylor Globalization and World Cities study group and network (GaWC, http://www.lboro.ac.uk/gawc) Rationale Three basic observations:
More informationRobust Detection of Hierarchical Communities from Escherichia coli Gene Expression Data
Robust Detection of Hierarchical Communities from Escherichia coli Gene Expression Data Santiago Treviño III 1,2, Yudong Sun 1,2, Tim F. Cooper 3, Kevin E. Bassler 1,2 * 1 Department of Physics, University
More informationMultilinear tensor regression for longitudinal relational data
Multilinear tensor regression for longitudinal relational data Peter D. Hoff 1 Working Paper no. 149 Center for Statistics and the Social Sciences University of Washington Seattle, WA 98195-4320 November
More informationProperties of Matrices and Operations on Matrices
Properties of Matrices and Operations on Matrices A common data structure for statistical analysis is a rectangular array or matris. Rows represent individual observational units, or just observations,
More informationLecture 6: Methods for high-dimensional problems
Lecture 6: Methods for high-dimensional problems Hector Corrada Bravo and Rafael A. Irizarry March, 2010 In this Section we will discuss methods where data lies on high-dimensional spaces. In particular,
More informationModeling homophily and stochastic equivalence in symmetric relational data
Modeling homophily and stochastic equivalence in symmetric relational data Peter D. Hoff Departments of Statistics and Biostatistics University of Washington Seattle, WA 98195-4322. hoff@stat.washington.edu
More informationLinear Algebra (Review) Volker Tresp 2018
Linear Algebra (Review) Volker Tresp 2018 1 Vectors k, M, N are scalars A one-dimensional array c is a column vector. Thus in two dimensions, ( ) c1 c = c 2 c i is the i-th component of c c T = (c 1, c
More informationMid-Sized Cities: Territorial Capital of Europe
Mid-Sized Cities: Territorial Capital of Europe IUFA Conference 2009 Mid-size cities and the knowledge economy Bologna, Italy 13-17 June 2009 Contents 1. Metropolitan fever 2. Les Villes d`europe 3. Territorial
More informationLinear Algebra Review. Vectors
Linear Algebra Review 9/4/7 Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa (UCSD) Cogsci 8F Linear Algebra review Vectors
More informationLinear algebra for computational statistics
University of Seoul May 3, 2018 Vector and Matrix Notation Denote 2-dimensional data array (n p matrix) by X. Denote the element in the ith row and the jth column of X by x ij or (X) ij. Denote by X j
More informationDyadic data analysis with amen
Dyadic data analysis with amen Peter D. Hoff May 23, 2017 Abstract Dyadic data on pairs of objects, such as relational or social network data, often exhibit strong statistical dependencies. Certain types
More informationUrbanisation & Economic Geography (2)
GEOG1101: Introduction to Economic Geography Tuesday 20 October, 2009 Urbanisation & Economic Geography (2) Kevon Rhiney Department of Geography and Geology University of the West Indies, Mona Lecture
More informationReview of Linear Algebra
Review of Linear Algebra Dr Gerhard Roth COMP 40A Winter 05 Version Linear algebra Is an important area of mathematics It is the basis of computer vision Is very widely taught, and there are many resources
More informationThe Singular Value Decomposition
The Singular Value Decomposition Philippe B. Laval KSU Fall 2015 Philippe B. Laval (KSU) SVD Fall 2015 1 / 13 Review of Key Concepts We review some key definitions and results about matrices that will
More informationENGG5781 Matrix Analysis and Computations Lecture 8: QR Decomposition
ENGG5781 Matrix Analysis and Computations Lecture 8: QR Decomposition Wing-Kin (Ken) Ma 2017 2018 Term 2 Department of Electronic Engineering The Chinese University of Hong Kong Lecture 8: QR Decomposition
More informationSteven J. Leon University of Massachusetts, Dartmouth
INSTRUCTOR S SOLUTIONS MANUAL LINEAR ALGEBRA WITH APPLICATIONS NINTH EDITION Steven J. Leon University of Massachusetts, Dartmouth Boston Columbus Indianapolis New York San Francisco Amsterdam Cape Town
More informationLinear Algebra (Review) Volker Tresp 2017
Linear Algebra (Review) Volker Tresp 2017 1 Vectors k is a scalar (a number) c is a column vector. Thus in two dimensions, c = ( c1 c 2 ) (Advanced: More precisely, a vector is defined in a vector space.
More informationCS 143 Linear Algebra Review
CS 143 Linear Algebra Review Stefan Roth September 29, 2003 Introductory Remarks This review does not aim at mathematical rigor very much, but instead at ease of understanding and conciseness. Please see
More informationGlobal City By: System Administrator On: 2013-05-11 00:59 (29667 Reads) Global City or world city World Cities Are Influential on a Global Scale: Economically powerful integrated into the world economy
More informationMachine Learning. B. Unsupervised Learning B.2 Dimensionality Reduction. Lars Schmidt-Thieme, Nicolas Schilling
Machine Learning B. Unsupervised Learning B.2 Dimensionality Reduction Lars Schmidt-Thieme, Nicolas Schilling Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University
More informationFundamentals of Matrices
Maschinelles Lernen II Fundamentals of Matrices Christoph Sawade/Niels Landwehr/Blaine Nelson Tobias Scheffer Matrix Examples Recap: Data Linear Model: f i x = w i T x Let X = x x n be the data matrix
More informationUnsupervised Machine Learning and Data Mining. DS 5230 / DS Fall Lecture 7. Jan-Willem van de Meent
Unsupervised Machine Learning and Data Mining DS 5230 / DS 4420 - Fall 2018 Lecture 7 Jan-Willem van de Meent DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Dimensionality Reduction Goal:
More informationCollege Algebra. Third Edition. Concepts Through Functions. Michael Sullivan. Michael Sullivan, III. Chicago State University. Joliet Junior College
College Algebra Concepts Through Functions Third Edition Michael Sullivan Chicago State University Michael Sullivan, III Joliet Junior College PEARSON Boston Columbus Indianapolis New York San Francisco
More informationDimensionality Reduction: PCA. Nicholas Ruozzi University of Texas at Dallas
Dimensionality Reduction: PCA Nicholas Ruozzi University of Texas at Dallas Eigenvalues λ is an eigenvalue of a matrix A R n n if the linear system Ax = λx has at least one non-zero solution If Ax = λx
More informationHierarchical multilinear models for multiway data
Hierarchical multilinear models for multiway data Peter D. Hoff 1 Technical Report no. 564 Department of Statistics University of Washington Seattle, WA 98195-4322. December 28, 2009 1 Departments of Statistics
More informationMulti-level Models: Idea
Review of 140.656 Review Introduction to multi-level models The two-stage normal-normal model Two-stage linear models with random effects Three-stage linear models Two-stage logistic regression with random
More informationPositive Definite Matrix
1/29 Chia-Ping Chen Professor Department of Computer Science and Engineering National Sun Yat-sen University Linear Algebra Positive Definite, Negative Definite, Indefinite 2/29 Pure Quadratic Function
More information(a) If A is a 3 by 4 matrix, what does this tell us about its nullspace? Solution: dim N(A) 1, since rank(a) 3. Ax =
. (5 points) (a) If A is a 3 by 4 matrix, what does this tell us about its nullspace? dim N(A), since rank(a) 3. (b) If we also know that Ax = has no solution, what do we know about the rank of A? C(A)
More informationReview of similarity transformation and Singular Value Decomposition
Review of similarity transformation and Singular Value Decomposition Nasser M Abbasi Applied Mathematics Department, California State University, Fullerton July 8 7 page compiled on June 9, 5 at 9:5pm
More informationSampling and incomplete network data
1/58 Sampling and incomplete network data 567 Statistical analysis of social networks Peter Hoff Statistics, University of Washington 2/58 Network sampling methods It is sometimes difficult to obtain a
More informationj=1 u 1jv 1j. 1/ 2 Lemma 1. An orthogonal set of vectors must be linearly independent.
Lecture Notes: Orthogonal and Symmetric Matrices Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong taoyf@cse.cuhk.edu.hk Orthogonal Matrix Definition. Let u = [u
More informationDATA MINING LECTURE 8. Dimensionality Reduction PCA -- SVD
DATA MINING LECTURE 8 Dimensionality Reduction PCA -- SVD The curse of dimensionality Real data usually have thousands, or millions of dimensions E.g., web documents, where the dimensionality is the vocabulary
More informationProbabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationEquivariant and scale-free Tucker decomposition models
Equivariant and scale-free Tucker decomposition models Peter David Hoff 1 Working Paper no. 142 Center for Statistics and the Social Sciences University of Washington Seattle, WA 98195-4320 December 31,
More informationMaximum Likelihood Estimation
Maximum Likelihood Estimation Merlise Clyde STA721 Linear Models Duke University August 31, 2017 Outline Topics Likelihood Function Projections Maximum Likelihood Estimates Readings: Christensen Chapter
More informationPCA and admixture models
PCA and admixture models CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar, Alkes Price PCA and admixture models 1 / 57 Announcements HW1
More informationQuick Tour of Linear Algebra and Graph Theory
Quick Tour of Linear Algebra and Graph Theory CS224w: Social and Information Network Analysis Fall 2012 Yu Wayne Wu Based on Borja Pelato s version in Fall 2011 Matrices and Vectors Matrix: A rectangular
More informationWeighted Low Rank Approximations
Weighted Low Rank Approximations Nathan Srebro and Tommi Jaakkola Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Weighted Low Rank Approximations What is
More informationLecture 9: Low Rank Approximation
CSE 521: Design and Analysis of Algorithms I Fall 2018 Lecture 9: Low Rank Approximation Lecturer: Shayan Oveis Gharan February 8th Scribe: Jun Qi Disclaimer: These notes have not been subjected to the
More informationAppendix A: Matrices
Appendix A: Matrices A matrix is a rectangular array of numbers Such arrays have rows and columns The numbers of rows and columns are referred to as the dimensions of a matrix A matrix with, say, 5 rows
More informationIntroduction to Machine Learning
10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what
More informationDS-GA 1002 Lecture notes 10 November 23, Linear models
DS-GA 2 Lecture notes November 23, 2 Linear functions Linear models A linear model encodes the assumption that two quantities are linearly related. Mathematically, this is characterized using linear functions.
More informationIntroduction PCA classic Generative models Beyond and summary. PCA, ICA and beyond
PCA, ICA and beyond Summer School on Manifold Learning in Image and Signal Analysis, August 17-21, 2009, Hven Technical University of Denmark (DTU) & University of Copenhagen (KU) August 18, 2009 Motivation
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 4 for Applied Multivariate Analysis Outline 1 Eigen values and eigen vectors Characteristic equation Some properties of eigendecompositions
More informationData Mining Techniques
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 12 Jan-Willem van de Meent (credit: Yijun Zhao, Percy Liang) DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Linear Dimensionality
More informationScalable Gaussian process models on matrices and tensors
Scalable Gaussian process models on matrices and tensors Alan Qi CS & Statistics Purdue University Joint work with F. Yan, Z. Xu, S. Zhe, and IBM Research! Models for graph and multiway data Model Algorithm
More informationEigenvalues and diagonalization
Eigenvalues and diagonalization Patrick Breheny November 15 Patrick Breheny BST 764: Applied Statistical Modeling 1/20 Introduction The next topic in our course, principal components analysis, revolves
More informationGraphical Model Selection
May 6, 2013 Trevor Hastie, Stanford Statistics 1 Graphical Model Selection Trevor Hastie Stanford University joint work with Jerome Friedman, Rob Tibshirani, Rahul Mazumder and Jason Lee May 6, 2013 Trevor
More informationEMPIRICAL POLITICAL ANALYSIS
Eighth Edition SUB Hamburg A/533757 EMPIRICAL POLITICAL ANALYSIS QUANTITATIVE AND QUALITATIVE RESEARCH METHODS Craig Leonard Brians Virginia Polytechnic and State University Lars Willnat Indiana University
More informationSingular Value Decomposition and Principal Component Analysis (PCA) I
Singular Value Decomposition and Principal Component Analysis (PCA) I Prof Ned Wingreen MOL 40/50 Microarray review Data per array: 0000 genes, I (green) i,i (red) i 000 000+ data points! The expression
More informationQuick Tour of Linear Algebra and Graph Theory
Quick Tour of Linear Algebra and Graph Theory CS224W: Social and Information Network Analysis Fall 2014 David Hallac Based on Peter Lofgren, Yu Wayne Wu, and Borja Pelato s previous versions Matrices and
More informationTotal Least Squares Approach in Regression Methods
WDS'08 Proceedings of Contributed Papers, Part I, 88 93, 2008. ISBN 978-80-7378-065-4 MATFYZPRESS Total Least Squares Approach in Regression Methods M. Pešta Charles University, Faculty of Mathematics
More informationlinearly indepedent eigenvectors as the multiplicity of the root, but in general there may be no more than one. For further discussion, assume matrice
3. Eigenvalues and Eigenvectors, Spectral Representation 3.. Eigenvalues and Eigenvectors A vector ' is eigenvector of a matrix K, if K' is parallel to ' and ' 6, i.e., K' k' k is the eigenvalue. If is
More informationDimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining
Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Combinations of features Given a data matrix X n p with p fairly large, it can
More informationIntroduction to Econometrics
Introduction to Econometrics T H I R D E D I T I O N Global Edition James H. Stock Harvard University Mark W. Watson Princeton University Boston Columbus Indianapolis New York San Francisco Upper Saddle
More informationMLES & Multivariate Normal Theory
Merlise Clyde September 6, 2016 Outline Expectations of Quadratic Forms Distribution Linear Transformations Distribution of estimates under normality Properties of MLE s Recap Ŷ = ˆµ is an unbiased estimate
More informationUNIT 6: The singular value decomposition.
UNIT 6: The singular value decomposition. María Barbero Liñán Universidad Carlos III de Madrid Bachelor in Statistics and Business Mathematical methods II 2011-2012 A square matrix is symmetric if A T
More information2012 Global Cities Index and Emerging Cities Outlook
2012 Global Cities Index and Emerging Cities Outlook New York, London, Paris, and Tokyo remain today s leading cities, but an analysis of key trends in emerging cities suggests that Beijing and Shanghai
More informationRestricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model
Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives
More informationStatistics 202: Data Mining. c Jonathan Taylor. Week 2 Based in part on slides from textbook, slides of Susan Holmes. October 3, / 1
Week 2 Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Part I Other datatypes, preprocessing 2 / 1 Other datatypes Document data You might start with a collection of
More informationPart I. Other datatypes, preprocessing. Other datatypes. Other datatypes. Week 2 Based in part on slides from textbook, slides of Susan Holmes
Week 2 Based in part on slides from textbook, slides of Susan Holmes Part I Other datatypes, preprocessing October 3, 2012 1 / 1 2 / 1 Other datatypes Other datatypes Document data You might start with
More informationCS 340 Lec. 6: Linear Dimensionality Reduction
CS 340 Lec. 6: Linear Dimensionality Reduction AD January 2011 AD () January 2011 1 / 46 Linear Dimensionality Reduction Introduction & Motivation Brief Review of Linear Algebra Principal Component Analysis
More informationA Hierarchical Perspective on Lee-Carter Models
A Hierarchical Perspective on Lee-Carter Models Paul Eilers Leiden University Medical Centre L-C Workshop, Edinburgh 24 The vantage point Previous presentation: Iain Currie led you upward From Glen Gumbel
More informationA note on multiple imputation for general purpose estimation
A note on multiple imputation for general purpose estimation Shu Yang Jae Kwang Kim SSC meeting June 16, 2015 Shu Yang, Jae Kwang Kim Multiple Imputation June 16, 2015 1 / 32 Introduction Basic Setup Assume
More informationSupplemental Information. Exploiting Natural Fluctuations. to Identify Kinetic Mechanisms. in Sparsely Characterized Systems
Cell Systems, Volume Supplemental Information Exploiting Natural Fluctuations to Identify Kinetic Mechanisms in Sparsely Characterized Systems Andreas Hilfinger, Thomas M. Norman, and Johan Paulsson Exploiting
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject
More informationAssignment 2 (Sol.) Introduction to Machine Learning Prof. B. Ravindran
Assignment 2 (Sol.) Introduction to Machine Learning Prof. B. Ravindran 1. Let A m n be a matrix of real numbers. The matrix AA T has an eigenvector x with eigenvalue b. Then the eigenvector y of A T A
More informationEssentials of College Algebra
Essentials of College Algebra For these Global Editions, the editorial team at Pearson has collaborated with educators across the world to address a wide range of subjects and requirements, equipping students
More informationCitizen Centric Cities. The Sustainable Cities Index 2018 Europe
Citizen Centric Cities The Sustainable Cities Index 2018 Europe Foreword The 2018 edition of Arcadis Sustainable Cities Index (SCI) explores city sustainability from the perspective of the citizen. We
More information14 Singular Value Decomposition
14 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More informationLinear Algebra. Session 12
Linear Algebra. Session 12 Dr. Marco A Roque Sol 08/01/2017 Example 12.1 Find the constant function that is the least squares fit to the following data x 0 1 2 3 f(x) 1 0 1 2 Solution c = 1 c = 0 f (x)
More informationBoundary. DIFFERENTIAL EQUATIONS with Fourier Series and. Value Problems APPLIED PARTIAL. Fifth Edition. Richard Haberman PEARSON
APPLIED PARTIAL DIFFERENTIAL EQUATIONS with Fourier Series and Boundary Value Problems Fifth Edition Richard Haberman Southern Methodist University PEARSON Boston Columbus Indianapolis New York San Francisco
More information8.1 Concentration inequality for Gaussian random matrix (cont d)
MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration
More informationStatistical Inference on Large Contingency Tables: Convergence, Testability, Stability. COMPSTAT 2010 Paris, August 23, 2010
Statistical Inference on Large Contingency Tables: Convergence, Testability, Stability Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu COMPSTAT
More informationThe International-Trade Network: Gravity Equations and Topological Properties
The International-Trade Network: Gravity Equations and Topological Properties Giorgio Fagiolo Sant Anna School of Advanced Studies Laboratory of Economics and Management Piazza Martiri della Libertà 33
More informationIntroduction to Economic Geography
Introduction to Economic Geography Globalization, Uneven Development and Place 2nd edition Danny MacKinnon and Andrew Cumbers Harlow, England London New York Boston San Francisco Toronto Sydney Singapore
More informationLinear Methods in Data Mining
Why Methods? linear methods are well understood, simple and elegant; algorithms based on linear methods are widespread: data mining, computer vision, graphics, pattern recognition; excellent general software
More informationExample Linear Algebra Competency Test
Example Linear Algebra Competency Test The 4 questions below are a combination of True or False, multiple choice, fill in the blank, and computations involving matrices and vectors. In the latter case,
More informationPractical Linear Algebra: A Geometry Toolbox
Practical Linear Algebra: A Geometry Toolbox Third edition Chapter 12: Gauss for Linear Systems Gerald Farin & Dianne Hansford CRC Press, Taylor & Francis Group, An A K Peters Book www.farinhansford.com/books/pla
More informationLecture 2: Linear Algebra Review
EE 227A: Convex Optimization and Applications January 19 Lecture 2: Linear Algebra Review Lecturer: Mert Pilanci Reading assignment: Appendix C of BV. Sections 2-6 of the web textbook 1 2.1 Vectors 2.1.1
More informationAMS-207: Bayesian Statistics
Linear Regression How does a quantity y, vary as a function of another quantity, or vector of quantities x? We are interested in p(y θ, x) under a model in which n observations (x i, y i ) are exchangeable.
More informationDifferential Equations
Differential Equations Theory, Technique, and Practice George F. Simmons and Steven G. Krantz Higher Education Boston Burr Ridge, IL Dubuque, IA Madison, Wl New York San Francisco St. Louis Bangkok Bogota
More informationLecture 4: Principal Component Analysis and Linear Dimension Reduction
Lecture 4: Principal Component Analysis and Linear Dimension Reduction Advanced Applied Multivariate Analysis STAT 2221, Fall 2013 Sungkyu Jung Department of Statistics University of Pittsburgh E-mail:
More informationELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018
ELE 538B: Mathematics of High-Dimensional Data Spectral methods Yuxin Chen Princeton University, Fall 2018 Outline A motivating application: graph clustering Distance and angles between two subspaces Eigen-space
More informationVector Space Models. wine_spectral.r
Vector Space Models 137 wine_spectral.r Latent Semantic Analysis Problem with words Even a small vocabulary as in wine example is challenging LSA Reduce number of columns of DTM by principal components
More informationBIOINFORMATICS Datamining #1
BIOINFORMATICS Datamining #1 Mark Gerstein, Yale University gersteinlab.org/courses/452 1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Large-scale Datamining Gene Expression Representing Data in
More informationFunctional Analysis Review
Outline 9.520: Statistical Learning Theory and Applications February 8, 2010 Outline 1 2 3 4 Vector Space Outline A vector space is a set V with binary operations +: V V V and : R V V such that for all
More information