How to exploit network properties to improve learning in relational domains
1 How to exploit network properties to improve learning in relational domains. Jennifer Neville, Departments of Computer Science and Statistics, Purdue University (joint work with Brian Gallagher, Timothy La Fond, Sebastian Moreno, Joseph Pfeiffer and Rongjing Xiang)
2 Relational network classification examples: predict organizational roles from communication patterns (networks); predict protein function from interaction patterns (gene/protein networks); predict paper topics from properties of cited papers (scientific networks); predict content changes from properties of hyperlinked pages (World Wide Web); predict personal preferences from characteristics of friends (social networks); predict group effectiveness from communication patterns (organizational networks).
3 Network data is: heterogeneous and interdependent, partially observed/labeled, dynamic and/or non-stationary, and often drawn from a single network; thus many traditional ML methods developed for i.i.d. data do not apply.
4 Machine learning 101. Data representation: chosen. Knowledge representation: chosen; it defines the model space. Generic form is: $y = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_0$
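5 Machine learning 101. The knowledge representation defines the model space, which is combined with a chosen objective function, e.g., squared loss: $L_{sq}(D) = \frac{1}{N_D} \sum_{i=1}^{N_D} (f(x_i) - y_i)^2$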
6 Machine learning 101. The model space and objective function are combined with a search algorithm (e.g., optimization). Learning identifies the model that optimizes the objective function on the training data. The model is then applied for prediction on new data from the same distribution.
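The three ingredients on these slides (a linear model space, a squared-loss objective, and a search procedure) can be put together in a few lines. The following is a minimal sketch, not code from the talk: the synthetic data, the learning rate, and the function names `squared_loss` and `fit_linear` are illustrative assumptions, and plain gradient descent stands in for whatever optimizer one would actually use.

```python
import numpy as np

def squared_loss(w, b, X, y):
    """Mean squared loss (1/N) * sum_i (f(x_i) - y_i)^2 for the linear model f(x) = x @ w + b."""
    residual = X @ w + b - y
    return float(np.mean(residual ** 2))

def fit_linear(X, y, lr=0.1, steps=2000):
    """Search step: plain gradient descent over the linear model space (an assumed choice)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        residual = X @ w + b - y
        w -= lr * (2.0 / n) * (X.T @ residual)   # gradient of the loss w.r.t. w
        b -= lr * (2.0 / n) * residual.sum()     # gradient of the loss w.r.t. b
    return w, b

# Hypothetical i.i.d. training data generated from a known linear rule (no noise).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.5, -2.0]) + 0.5

w, b = fit_linear(X, y)
print(w, b, squared_loss(w, b, X, y))
```

Because the targets are noiseless, the fitted coefficients should approach the generating ones; with real data the same loop would stop at the least-squares fit instead.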
7 Relational learning: the same machine learning components (data representation, knowledge representation, objective function, search algorithm), with relational data as the data representation and relational models as the knowledge representation. Relational data: social networks, scientific networks, gene/protein networks, organizational networks, World Wide Web. [Figure: example relational model over Firm, Broker (Bk), Branch (Bn), and Disclosure entities, with attributes including Size, Year, Type, Region, Area, Layoffs, On Watchlist, Is Problem, Problem In Past, and Has Business.]
8 There has been a great deal of work on templated graphical model representations for relational data: RBNs, PRMs, RMNs, IHRMs, MLNs, DAPER, GMNs, RDNs. Since the model representation is also graphical, we need to distinguish data networks from model networks.
9 Data network
10 Gender? Married? Politics? Religion? Data network
11 Data representation: attributed network. Relational learning task: e.g., predict political views based on a user's intrinsic attributes and the political views of friends. Estimate the joint distribution $P(Y \mid \{X\}^n, G)$ or the conditional distribution $P(Y_i \mid X_i, X_R, Y_R)$. Note we often have only a single network for learning.
12 Define structure of graphical model. Relational template: (Politics_i, Politics_j), (Politics_i, Gender_i), (Politics_i, Married_i), (Politics_i, Religion_i).
13 Relational template abstracted into a model template: $(Y_i, Y_j)$ plus one $(Y_i, X_i)$ pair for each of the three attributes.
14 Model template ($(Y_i, Y_j)$ and the three $(Y_i, X_i)$ pairs) + data network.
15 Knowledge representation: model network (graphical model). [Figure: rolled-out model network linking each label node $Y_i$ to its attribute nodes $X_i$ and to the labels of its neighbors in the data network.]
16 Objective function and search (e.g., convex optimization). Learn model parameters from a fully labeled network: $P(\mathbf{y}_G \mid \mathbf{x}_G) = \frac{1}{Z(\theta, \mathbf{x}_G)} \prod_{T \in \mathcal{T}} \prod_{C \in C(T(G))} \Phi_T(\mathbf{x}_C, \mathbf{y}_C; \theta_T)$. [Figure: rolled-out model network.]
17 Apply model to make predictions in another network drawn from the same distribution. [Figure: model template + test network, rolled out into a model network over the test nodes.]
18 Collective classification uses the full joint, rolled-out model for inference, but labeled nodes impact the final model structure. [Figure: rolled-out model network with labeled nodes marked.]
19 Collective classification uses the full joint, rolled-out model for inference, but labeled nodes impact the final model structure. The structure of rolled-out relational graphical models is determined by the structure of the underlying data network, including the location and availability of labels. This can impact the performance of learning and inference methods via the representation, objective function, and search algorithm. [Figure: rolled-out model network with labeled nodes marked.]
20 Networks are much, much larger in practice
21 Finding: Representation. The implicit assumption is that nodes of the same type should be identically distributed, but many relational representations cannot ensure this holds for varying graph structures.
22 I.I.D. assumption revisited. Current relational models do not impose the same marginal invariance condition that is assumed for IID models, which can impair generalization: $p(y_A \mid x_A) \neq p(y_E \mid x_E)$ due to varying graph structure. [Figure: data network with nodes A, B, C, D, E, F.] The Markov relational network representation does not allow us to explicitly specify the form of the marginal probability distributions, thus it is difficult to impose any equality constraints on the marginals.
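A small numerical check makes the marginal-invariance point concrete. The sketch below is an illustration under stated assumptions, not the model from the talk: it uses a pairwise Markov network with hypothetical weights `w_node` and `w_pair`, a made-up edge list over the six nodes A to F, and identical attribute values for every node, then computes exact marginals by enumeration.

```python
import itertools
import math

def marginals(nodes, edges, x, w_node=0.5, w_pair=1.0):
    """Exact P(y_v = +1 | x) in a pairwise Markov network, by enumerating all labelings."""
    index = {v: k for k, v in enumerate(nodes)}
    totals = {v: 0.0 for v in nodes}
    z = 0.0  # partition function
    for y in itertools.product([-1, 1], repeat=len(nodes)):
        # Node potentials tie each label to its attribute; edge potentials reward agreement.
        score = sum(w_node * x[v] * y[index[v]] for v in nodes)
        score += sum(w_pair * y[index[a]] * y[index[b]] for a, b in edges)
        weight = math.exp(score)
        z += weight
        for v in nodes:
            if y[index[v]] == 1:
                totals[v] += weight
    return {v: totals[v] / z for v in nodes}

nodes = list("ABCDEF")
edges = [("A", "B"), ("E", "C"), ("E", "D"), ("E", "F")]  # assumed structure: A has 1 link, E has 3
x = {v: 1.0 for v in nodes}                               # every node has the same attribute value

with_links = marginals(nodes, edges, x)
without_links = marginals(nodes, edges, x, w_pair=0.0)
print(with_links["A"], with_links["E"])        # differ although x_A == x_E
print(without_links["A"], without_links["E"])  # equal once the relational term is switched off
```

With the relational weight set to zero the model reduces to an IID classifier and both marginals coincide; with it switched on, the higher-degree node E gets a different marginal from A even though their attributes are identical.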
23 Is there an alternative approach? Goal: combine the marginal invariance advantages of IID models with the ability to model relational dependence, and incorporate node attributes in a general way (similar to IID classifiers). Idea: apply copulas to combine marginal models with dependence structure. Copula theory: can construct an n-dimensional vector of arbitrary marginals while preserving the desired dependence structure. With $(t_1, t_2, \dots, t_n)$ jointly distributed as $\Phi$, set $z_i = F_i^{(-1)}(\Phi_i(t_i))$, so that each $z_i$ is marginally distributed as $F_i$.
24 Let's start with a reformulation of IID classifiers. General form of probabilistic binary classification (e.g., logistic regression): $p(y_i = 1) = F(\eta(x_i))$. Now view $F$ as the CDF of a distribution symmetric around 0 to obtain a latent variable formulation: $p(z_i = z \mid x_i = x) = f(z - \eta(x_i))$, $y_i = \mathrm{sign}(z_i)$, where $z$ is a continuous variable capturing random effects that are not present in $x$, and $f$ is the PDF corresponding to $F$. In IID models, the random effect for each instance is independent, and thus can be integrated out. When links among instances are observed, the correlations among their class labels can be modeled through dependence among the $z$'s. Key question: how to model the dependence among the $z$'s while preserving the marginals?
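The equivalence between the usual logistic form and the latent-variable form can be checked by simulation. This is a minimal sketch under assumptions: $F$ is taken to be the standard logistic CDF, the symbol eta denotes the attribute-dependent term of the classifier, and the value 0.8 is an arbitrary stand-in for it.

```python
import numpy as np

def logistic_cdf(a):
    """Standard logistic CDF F(a), symmetric around 0."""
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
eta = 0.8                                     # hypothetical value of the attribute term for one instance
eps = rng.logistic(loc=0.0, scale=1.0, size=200_000)  # random effect with density f
z = eta + eps                                 # latent variable, so p(z | x) = f(z - eta)
y = np.sign(z)                                # classification rule y = sign(z)

empirical = float(np.mean(y == 1))            # Monte Carlo estimate of P(y = 1)
analytic = float(logistic_cdf(eta))           # F(eta), the usual logistic-regression probability
print(empirical, analytic)
```

The two numbers agree up to sampling error because P(z > 0) = 1 - F(-eta) = F(eta) when the distribution is symmetric around 0.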
25 Copula Latent Markov Network (CLMN): from IID classifiers to CLMN. The CLMN model: sample $t$ from the desired joint dependency, $(t_1, t_2, \dots, t_n) \sim \Phi$. Apply the marginal transformation to obtain the latent variable $z$: $z_i = F_i^{(-1)}(\Phi_i(t_i))$. The marginal $\Phi_i$ transforms $t_i$ to a uniform $[0, 1]$ r.v. $u_i$; the quasi-inverse of the CDF $F_i$ is used to obtain $z_i$ from $u_i$; attributes moderate the corresponding pdf $f_i$. Classification: $y_i = \mathrm{sign}(z_i)$.
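The generative steps on this slide can be sketched with a Gaussian copula. Everything specific below is an assumption made for illustration rather than the published implementation: the precision matrix (identity plus graph Laplacian), the logistic marginal $F_i$ centered at an attribute term `eta`, the four-node star graph, and the function name `sample_clmn_labels`.

```python
import math
import numpy as np

def sample_clmn_labels(adjacency, eta, n_samples, rng):
    """Sample labels via t ~ Gaussian, u_i = Phi_i(t_i), z_i = F_i^{-1}(u_i), y_i = sign(z_i)."""
    n = adjacency.shape[0]
    # Assumed Gaussian Markov network: precision = I + graph Laplacian.
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    cov = np.linalg.inv(np.eye(n) + laplacian)
    t = rng.multivariate_normal(np.zeros(n), cov, size=n_samples)
    # Marginal Phi_i: t_i ~ N(0, cov_ii), so u_i = Phi_i(t_i) is uniform on [0, 1].
    std = np.sqrt(np.diag(cov))
    u = 0.5 * (1.0 + np.vectorize(math.erf)(t / (std * math.sqrt(2.0))))
    u = np.clip(u, 1e-12, 1.0 - 1e-12)        # guard the logit against 0 and 1
    # Quasi-inverse of a logistic CDF centered at eta_i: z_i is marginally logistic(eta_i, 1).
    z = eta + np.log(u / (1.0 - u))
    return np.sign(z)

# Star graph: node 0 has three neighbors, nodes 1..3 have one each.
adjacency = np.array([[0, 1, 1, 1],
                      [1, 0, 0, 0],
                      [1, 0, 0, 0],
                      [1, 0, 0, 0]], dtype=float)
eta = np.full(4, 0.8)                         # identical attribute term for every node
rng = np.random.default_rng(0)
labels = sample_clmn_labels(adjacency, eta, 100_000, rng)

freq = (labels == 1).mean(axis=0)             # per-node P(y_i = +1)
corr = np.corrcoef(labels[:, 0], labels[:, 1])[0, 1]
print(freq, corr)
```

In this sketch every node ends up with the same marginal probability regardless of its degree, while linked nodes remain positively correlated, which is the combination the copula construction is meant to deliver.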
26 Copula Latent Markov Network (Xiang and N., WSDM '13). CLMN implementation: Gaussian Markov network for the dependence model, logistic regression for the marginal model. Estimation: first, learn the marginal model as if instances were IID; next, learn the dependence model conditioned on the marginal model (but the GMN has no parameters to learn). Inference: conditional inference in copulas has not previously been considered for large-scale networks. For efficient inference, we developed a message passing algorithm based on EP.
27 Experimental results. [Figure: comparison of CLMN, SocDim, RMN, LR, and GMN on the Facebook, Gene, and IMDB datasets.] Key idea: ensuring that nodes with varying graph structure have identical marginals improves learning.
28 Finding: Search. The graph+attribute space is too large to sample thoroughly, but efficient generative graph models can be exploited to search more effectively.
29 How to efficiently generate attributed graph samples from the underlying joint distribution $P(X, Y, G)$? The space is O( V +V p ), so effective sampling from the joint is difficult.
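30 Naive sampling approach: assume independence between graph and attributes. $P_E(\mathbf{X}, \mathbf{E} \mid \Theta_E, \Theta_X) = P_E(\mathbf{E} \mid \Theta_E)\, P(\mathbf{X} \mid \Theta_X)$, where the first factor is the graph model and the second is the attribute model.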
31 Problem with naive approach: although graph structure can be captured by generative graph models, naive pairing with attribute samples does not capture relational correlation. [Figure: attribute value combinations across edges, original vs. sampled.]
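32 Solution: use the graph model to propose edges, but sample conditional on node attribute values. $P_E(\mathbf{X}, \mathbf{E} \mid \Theta_E, \Theta_X) = P_E(\mathbf{E} \mid \mathbf{X}, \Theta_E, \Theta_X)\, P(\mathbf{X} \mid \Theta_E, \Theta_X)$, where the first factor is the graph model and the second is the attribute model. Use an accept-reject process to sample edges conditioned on the attributes.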
33 Exploit an efficient generative graph model as the proposal distribution to search effectively. What to use as acceptance probabilities? The ratio of observed probabilities in the original data to the sampled probabilities resulting from the naive approach. This corresponds to rejection sampling. Proposing distribution: $P_E(E_{ij} = 1 \mid \Theta_E)$. True distribution: $P_o(E_{ij} = 1 \mid f(x_i, x_j), \Theta_E, \Theta_X)$. [Figure: attribute value combinations, original vs. sampled.]
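One way to turn the "ratio of observed to sampled probabilities" into acceptance probabilities is sketched below. It is an illustration under assumptions, not the paper's code: edges are summarized by the unordered pair of endpoint attribute values, the same attribute assignment is reused for the original and the naively sampled graph, and the ratios are rescaled so the largest is 1 (the usual rejection-sampling normalization). The name `acceptance_ratios` and the toy edge lists are hypothetical.

```python
from collections import Counter

def acceptance_ratios(original_edges, naive_edges, attrs):
    """Observed / naively-sampled frequency of each attribute combination, scaled to max 1."""
    def edge_type_freq(edges):
        # An edge's type is the unordered pair of its endpoints' attribute values.
        counts = Counter(tuple(sorted((attrs[i], attrs[j]))) for i, j in edges)
        total = sum(counts.values())
        return {k: c / total for k, c in counts.items()}

    observed = edge_type_freq(original_edges)
    sampled = edge_type_freq(naive_edges)
    # Only combinations the proposal actually produces can be accepted or rejected.
    ratios = {k: observed.get(k, 0.0) / sampled[k] for k in sampled}
    top = max(ratios.values())
    return {k: r / top for k, r in ratios.items()}

attrs = {0: "a", 1: "a", 2: "b", 3: "b"}
original_edges = [(0, 1), (2, 3), (0, 2)]       # toy observed graph
naive_edges = [(0, 2), (0, 3), (1, 2), (0, 1)]  # toy graph from the naive approach

ratios = acceptance_ratios(original_edges, naive_edges, attrs)
print(ratios)
```

Here the naive sample over-produces mixed ("a", "b") edges relative to the original graph, so that combination receives a lower acceptance probability than ("a", "a").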
34 Attributed graph models (Pfeiffer, La Fond, Moreno, N. & Gallagher, WWW '14). Steps: learn attribute and graph model; generate graph with naive approach; compute acceptance ratios; sample attributes; then, while there are not enough edges, draw (vi, vj) from Q (the model), draw U ~ Uniform(0, 1), and if U < A(xi, xj) put (vi, vj) into the edges; return edges. [Figure: possible edges among nodes a to i, grouped by attribute value combinations.]
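The accept-reject loop on this slide can be sketched in Python as follows. This is a hedged illustration: the uniform random-pair proposal stands in for a real structural graph model (such as FCL, TCL, or KPGM), the acceptance table is made up, and the names `sample_agm_edges` and `propose_uniform_pair` are hypothetical.

```python
import random

def sample_agm_edges(attrs, propose_edge, acceptance, n_edges, rng):
    """Draw candidate edges from a proposal model Q and keep each with probability A(x_i, x_j)."""
    edges = set()
    while len(edges) < n_edges:
        vi, vj = propose_edge(rng)              # draw (vi, vj) from Q
        if vi == vj:
            continue                            # skip self-loops
        key = tuple(sorted((attrs[vi], attrs[vj])))
        if rng.random() < acceptance.get(key, 0.0):   # U ~ Uniform(0, 1); accept if U < A
            edges.add((min(vi, vj), max(vi, vj)))
    return edges

# Toy setup: 20 nodes, half with attribute "a" and half with "b".
attrs = {v: ("a" if v < 10 else "b") for v in range(20)}

def propose_uniform_pair(rng):
    """Stand-in proposal: a uniformly random node pair (not a real graph model)."""
    return rng.randrange(20), rng.randrange(20)

# Hypothetical acceptance probabilities favoring same-attribute edges.
acceptance = {("a", "a"): 1.0, ("b", "b"): 1.0, ("a", "b"): 0.1}

rng = random.Random(0)
edges = sample_agm_edges(attrs, propose_uniform_pair, acceptance, 40, rng)
same = sum(attrs[i] == attrs[j] for i, j in edges) / len(edges)
print(len(edges), same)
```

The proposal alone would produce same-attribute and mixed edges in roughly equal numbers; after the accept-reject step most retained edges connect nodes with the same attribute, which is the relational correlation the naive pairing fails to capture.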
35 Theorem: AGM samples from the joint distribution of edges and attributes, $P(E_{ij} = 1 \mid f(x_i, x_j), \Theta_E, \Theta_F)\, P(x_i, x_j \mid \Theta_X)$. Corollary: the expected AGM degree equals the expected degree of the structural graph model.
36 Empirical results on Facebook data. [Figure: correlation of political views across edges and graph characteristics for Facebook, the AGM variants (AGM-FCL, AGM-TCL, AGM-KPGM), and the graph models without AGM (FCL, TCL, KPGM).] AGM captures attribute correlation. AGM preserves the characteristics of the graph model. Key idea: statistical models of graphs can be exploited to improve sampling from the full joint $P_E(\mathbf{E}, \mathbf{X} \mid \Theta_E, \Theta_X)$.
37 Relational learning: data representation, knowledge representation, objective function, search algorithm. Representations affect our ability to enforce invariance assumptions. Conventional objective functions do not behave as expected in partially labeled networks (not in this talk). Simpler (graph) models can be used to statistically prune the search space.
38 Conclusion. Relational models have been shown to significantly improve predictions through the use of joint modeling and collective inference. But since the (rolled-out) model structure depends on the structure of the underlying data network, we need to understand how the data graph affects model/algorithm characteristics in order to better exploit relational information for learning and prediction. A careful consideration of the interactions between data representation, knowledge representation, objective function, and search algorithm will improve our understanding of the mechanisms that impact performance, and this will form the foundation for improved algorithms and methodology.
39 Thanks to (including alumni): Hoda Eldardiry, Rongjing Xiang, Chris Mayfield, Karthik Nagaraj, Umang Sharan, Sebastian Moreno, Nesreen Ahmed, Hyokun Yun, Suvidha Kancharla, Tao Wang, Timothy La Fond, Joel Pfeiffer, Ellen Lai, Pablo Granda, Hogun Park
40 Questions?
CIS 50: Machine Learning Spring 08: Lecture 4 Least Squares Regression Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may not cover all the
More informationFinal Overview. Introduction to ML. Marek Petrik 4/25/2017
Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,
More informationGaussian discriminant analysis Naive Bayes
DM825 Introduction to Machine Learning Lecture 7 Gaussian discriminant analysis Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark Outline 1. is 2. Multi-variate
More informationLeast Squares Regression
E0 70 Machine Learning Lecture 4 Jan 7, 03) Least Squares Regression Lecturer: Shivani Agarwal Disclaimer: These notes are a brief summary of the topics covered in the lecture. They are not a substitute
More informationAd Placement Strategies
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox 2014 Emily Fox January
More informationRepresentation. Stefano Ermon, Aditya Grover. Stanford University. Lecture 2
Representation Stefano Ermon, Aditya Grover Stanford University Lecture 2 Stefano Ermon, Aditya Grover (AI Lab) Deep Generative Models Lecture 2 1 / 32 Learning a generative model We are given a training
More informationBayesian Networks Representation
Bayesian Networks Representation Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University March 19 th, 2007 Handwriting recognition Character recognition, e.g., kernel SVMs a c z rr r r
More informationProbabilistic Time Series Classification
Probabilistic Time Series Classification Y. Cem Sübakan Boğaziçi University 25.06.2013 Y. Cem Sübakan (Boğaziçi University) M.Sc. Thesis Defense 25.06.2013 1 / 54 Problem Statement The goal is to assign
More informationChapter 16. Structured Probabilistic Models for Deep Learning
Peng et al.: Deep Learning and Practice 1 Chapter 16 Structured Probabilistic Models for Deep Learning Peng et al.: Deep Learning and Practice 2 Structured Probabilistic Models way of using graphs to describe
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationIntroduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak
Introduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak 1 Introduction. Random variables During the course we are interested in reasoning about considered phenomenon. In other words,
More informationContent-based Recommendation
Content-based Recommendation Suthee Chaidaroon June 13, 2016 Contents 1 Introduction 1 1.1 Matrix Factorization......................... 2 2 slda 2 2.1 Model................................. 3 3 flda 3
More information11 : Gaussian Graphic Models and Ising Models
10-708: Probabilistic Graphical Models 10-708, Spring 2017 11 : Gaussian Graphic Models and Ising Models Lecturer: Bryon Aragam Scribes: Chao-Ming Yen 1 Introduction Different from previous maximum likelihood
More informationLearning Bayesian network : Given structure and completely observed data
Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution
More informationAlternative Parameterizations of Markov Networks. Sargur Srihari
Alternative Parameterizations of Markov Networks Sargur srihari@cedar.buffalo.edu 1 Topics Three types of parameterization 1. Gibbs Parameterization 2. Factor Graphs 3. Log-linear Models with Energy functions
More informationBasic Sampling Methods
Basic Sampling Methods Sargur Srihari srihari@cedar.buffalo.edu 1 1. Motivation Topics Intractability in ML How sampling can help 2. Ancestral Sampling Using BNs 3. Transforming a Uniform Distribution
More informationModeling and measuring training information in a network. SML 2014 Jan Ramon
Modeling and measuring training information in a network SML 2014 Jan Ramon Contents Modeling example dependencies Problem of measuring training information Contributions Conclusion Networked examples
More informationEE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 18
EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 18 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory May 31, 2012 Andre Tkacenko
More informationConditional Random Field
Introduction Linear-Chain General Specific Implementations Conclusions Corso di Elaborazione del Linguaggio Naturale Pisa, May, 2011 Introduction Linear-Chain General Specific Implementations Conclusions
More information