Aggregating Ordinal Labels from Crowds by Minimax Conditional Entropy. Denny Zhou, Qiang Liu, John Platt, Chris Meek
1 Aggregating Ordinal Labels from Crowds by Minimax Conditional Entropy. Denny Zhou, Qiang Liu, John Platt, Chris Meek
3 Crowds vs experts labeling: strength. Time saving; money saving; big labeled data. More data beats cleverer algorithms.
4 Crowds vs experts labeling: weakness. Garbage in, garbage out: crowdsourced labels may be highly noisy.
5 Non-experts, redundant labels: M O O O O O O M O M O M M M M M. Orange (O) vs. Mandarin (M).
7 Observed worker labels x_ij (worker i, item j) and unobserved true labels y_j:
          Item 1   Item 2   ...   Item j
Worker 1  x_11     x_12     ...   x_1j
Worker 2  x_21     x_22     ...   x_2j
...
Worker i  x_i1     x_i2     ...   x_ij
8 Roadmap: from multiclass to ordinal. 1. Develop a method to aggregate general multiclass labels. 2. Adapt the general method to ordinal labels.
9 Examples of multiclass labeling: image categorization, speech recognition.
10 Two fundamental concepts: the empirical count of wrong/correct labels and the expected number of wrong/correct labels, defined with respect to the worker label distribution and the true label distribution.
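The slide's formulas for these two counts are not preserved, so the following is only a minimal sketch of one natural reading: the empirical count weights each observed worker label by the probability of each true class, and the expected count replaces the observed label with the model's label distribution. The function names, the 0-based class indices, and the dictionary layout are assumptions made for illustration.

```python
def empirical_counts(worker_labels, true_dist, num_classes):
    """counts[c][k] = sum over items of Q(y_j = c) * 1[x_ij = k] for one worker.

    worker_labels: dict item -> observed label k (0-based), for a single worker.
    true_dist: dict item -> list of probabilities over the true classes.
    """
    counts = [[0.0] * num_classes for _ in range(num_classes)]
    for item, k in worker_labels.items():
        for c in range(num_classes):
            counts[c][k] += true_dist[item][c]
    return counts


def expected_counts(items, true_dist, label_dist, num_classes):
    """counts[c][k] = sum over items of Q(y_j = c) * P(x_ij = k | y_j = c).

    label_dist: dict item -> matrix, label_dist[item][c][k] is the modeled
    probability that this worker answers k when the true class is c.
    """
    counts = [[0.0] * num_classes for _ in range(num_classes)]
    for item in items:
        for c in range(num_classes):
            for k in range(num_classes):
                counts[c][k] += true_dist[item][c] * label_dist[item][c][k]
    return counts


# Toy data: two items whose true class is 0 with certainty.
labels = {"a": 0, "b": 1}
q = {"a": [1.0, 0.0], "b": [1.0, 0.0]}
print(empirical_counts(labels, q, 2))
```

The diagonal of each matrix counts correct labels and the off-diagonal entries count wrong ones; a constraint of the kind the slides mention would ask the two matrices to agree.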
11 Multiclass maximum conditional entropy: given the true labels, estimate the worker label distribution by maximizing conditional entropy, subject to worker constraints and item constraints.
12 Multiclass minimax conditional entropy: jointly estimate the worker label distribution and the true label distribution by minimax conditional entropy, subject to worker constraints and item constraints.
13 Lagrangian dual: constraints.
14 Probabilistic labeling model: by optimization theory, the dual problem leads to a model with a normalization factor, a worker ability term, and an item difficulty term.
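The model equation itself did not survive on this slide. As a rough sketch, assuming the usual exponential-family form that maximum-entropy duals produce, the probability of a worker label could be a softmax over the sum of a worker-ability score and an item-difficulty score, with the normalization factor in the denominator. The names `sigma_i` and `tau_j` and the matrix layout are illustrative assumptions, not taken from the slide.

```python
import math


def worker_label_distribution(sigma_i, tau_j, true_class):
    """Distribution over worker labels k given the true class.

    Assumed form: P(x_ij = k | y_j = c) = exp(sigma_i[c][k] + tau_j[c][k]) / Z,
    where sigma_i is worker i's ability matrix, tau_j is item j's difficulty
    matrix, and Z is the normalization factor.
    """
    scores = [s + t for s, t in zip(sigma_i[true_class], tau_j[true_class])]
    shift = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(v - shift) for v in scores]
    z = sum(weights)
    return [w / z for w in weights]


# A worker who tends to answer correctly, on an item with no extra difficulty.
sigma = [[2.0, 0.0], [0.0, 2.0]]
tau = [[0.0, 0.0], [0.0, 0.0]]
print(worker_label_distribution(sigma, tau, true_class=0))
```

Under this assumed form, larger diagonal entries in the ability matrix make the correct label more probable, while the item matrix shifts the same scores for every worker who labels that item.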
15 Dual problem: 1. This only generates deterministic labels. 2. Equivalent to maximizing the complete likelihood.
16 Roadmap: from multiclass to ordinal. 1. Develop a method to aggregate general multiclass labels. 2. Adapt the general method to ordinal labels.
17 An example of ordinal labeling: search results rated Perfect (1), Excellent (2), Good (3), Fair (4), or Bad (5).
18 To proceed to ordinal labels: formulate assumptions that are specific to ordinal labeling, and coincide with the previous multiclass method in the case of binary labeling.
19 Our assumption for ordinal labeling: adjacency confusability. Adjacent labels are likely to be confused; distant labels are unlikely to be confused.
20 Formulating this assumption through pairwise comparison: the true label and the worker label are compared indirectly, each against a reference label.
21 Ordinal minimax conditional entropy: jointly estimate the worker label distribution and the true label distribution by minimax conditional entropy, subject to worker constraints and item constraints, where Δ and ∇ each take on values < or ≥.
22 Ordinal minimax conditional entropy: jointly estimate the worker label distribution and the true label distribution, subject to worker constraints and item constraints that compare the true label and the worker label with a reference label.
23 Ordinal minimax conditional entropy: the same formulation; comparing the true label and the worker label with a reference label is the difference from the multiclass case.
24 Explaining the ordinal constraints: for example, let Δ = < and ∇ = ≥; the constraints then count mistakes in the ordinal sense.
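To make "counting mistakes in the ordinal sense" concrete, here is a small sketch that takes the first relation as < on the true label and assumes the second is ≥ on the worker label, both measured against a reference label. The function name, the argument layout, and the toy ratings are hypothetical.

```python
import operator


def ordinal_count(true_labels, worker_labels, reference, true_rel, worker_rel):
    """Count items where true_rel(true, reference) and worker_rel(worker, reference) both hold.

    true_labels: dict item -> true rating.
    worker_labels: dict item -> rating given by one worker.
    true_rel, worker_rel: comparison functions such as operator.lt or operator.ge.
    """
    return sum(
        1
        for item, x in worker_labels.items()
        if true_rel(true_labels[item], reference) and worker_rel(x, reference)
    )


true_labels = {"q1": 2, "q2": 4, "q3": 1}
worker_labels = {"q1": 3, "q2": 4, "q3": 1}

# Items whose true rating is below 3 but which the worker rated 3 or higher.
print(ordinal_count(true_labels, worker_labels, 3, operator.lt, operator.ge))
```

Only q1 qualifies here: its true rating (2) is below the reference while the worker's rating (3) is not. Sweeping the reference label and the two relations gives one count per combination, which is how adjacent confusions end up weighing less than distant ones.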
25 Probabilistic rating model: by the KKT conditions, the dual problem leads to a model with structured worker ability and item difficulty terms.
26 Regularization. Two goals: 1. Prevent overfitting. 2. Fix the deterministic label issue so that probabilistic labels are generated.
27 Regularized minimax conditional entropy: jointly estimate the worker label distribution and the true label distribution by minimax conditional entropy plus regularization terms, subject to worker constraints and item constraints.
28 Regularized minimax conditional entropy: jointly estimate the worker label distribution and the true label distribution, subject to worker constraints and item constraints.
29 Dual problem: 1. This generates probabilistic labels. 2. Equivalent to maximizing the marginal likelihood.
30 Choosing regularization parameters: cross-validation with 5 or 10 folds; random split; compare the likelihood of worker labels. Ground truth labels are not needed for cross-validation!
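The cross-validation recipe can be sketched as follows, under the assumption that the random split is taken over individual observed worker labels. The `predict_prob` callable stands in for whatever fitted model is being evaluated and is hypothetical, as are the function names and the toy observations.

```python
import math
import random


def split_folds(observations, num_folds, seed=0):
    """Randomly partition observed (worker, item, label) triples into folds."""
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[f::num_folds] for f in range(num_folds)]


def held_out_log_likelihood(fold, predict_prob):
    """Average log-probability the model assigns to held-out worker labels.

    predict_prob(worker, item, label) is a hypothetical callable returning the
    fitted model's probability of that worker giving that label.
    """
    return sum(math.log(predict_prob(w, j, x)) for w, j, x in fold) / len(fold)


observations = [("w1", "q1", 2), ("w2", "q1", 3), ("w1", "q2", 5),
                ("w3", "q2", 4), ("w2", "q3", 1), ("w3", "q3", 1)]
folds = split_folds(observations, num_folds=3)


def uniform(worker, item, label):
    """Placeholder model: five equally likely ratings."""
    return 1.0 / 5


print([round(held_out_log_likelihood(f, uniform), 3) for f in folds])
```

In practice the model would be refit on the remaining folds for each candidate regularization setting, and the setting with the highest held-out likelihood kept. Only worker labels are scored, which is why no ground truth is required.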
31 Experiments: evaluation metrics. L0 error, L1 error, L2 error.
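The slide's formulas for the three metrics are missing. A plausible reading, given here only as an assumption, is the misclassification rate (L0), the mean absolute difference between ratings (L1), and the root mean squared difference (L2); whether the slide's L2 takes the square root cannot be recovered from the transcription.

```python
import math


def l0_error(truth, pred):
    """Fraction of items whose predicted rating differs from the true rating."""
    return sum(t != p for t, p in zip(truth, pred)) / len(truth)


def l1_error(truth, pred):
    """Mean absolute difference between predicted and true ratings."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


def l2_error(truth, pred):
    """Root mean squared difference (the square root is an assumption)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


truth = [1, 2, 3, 4]
pred = [1, 3, 3, 2]
print(l0_error(truth, pred), l1_error(truth, pred), l2_error(truth, pred))
```

L0 treats every mistake alike, whereas L1 and L2 penalize a rating that is two steps off more than one that is a single step off, which is what makes them relevant for ordinal labels.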
32 Experiments: baselines. Compare regularized minimax conditional entropy to: majority voting; the Dawid-Skene method (1979; see also its Bayesian version in Raykar et al. 2010, Liu et al. 2012, Chen et al. 2013); latent trait analysis (Andrich 1978, Masters 1982, Uebersax and Grove 1993, Mineiro 2011).
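Majority voting, the simplest of these baselines, can be sketched in a few lines. The tie-breaking rule (smallest label wins) and the input layout are assumptions chosen for illustration; the slides do not say how ties were handled.

```python
from collections import Counter


def majority_vote(labels_by_item):
    """Return the most frequent worker label per item; ties go to the smallest label."""
    result = {}
    for item, labels in labels_by_item.items():
        counts = Counter(labels)
        top = max(counts.values())
        result[item] = min(label for label, n in counts.items() if n == top)
    return result


votes = {"fruit_1": ["O", "O", "M"], "fruit_2": ["M", "M", "O", "M"]}
print(majority_vote(votes))
```

Majority voting weights every worker equally and ignores item difficulty, which is the gap the model-based methods in the list try to close.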
33 Web search data: search results rated Perfect (1), Excellent (2), Good (3), Fair (4), or Bad (5).
34 Web search data. Some facts about the data: 2665 query-url pairs and a relevance rating scale from 1 to 5; non-expert workers (count missing in the transcription) with an average error rate of 63%; each query-url pair is judged by 6 workers; true labels are created via consensus from 9 experts; dataset created by Gabriella Kazai of Microsoft.
35 Web search data. [Table: L0, L1, and L2 error for majority vote, Dawid & Skene, latent trait, entropy multiclass, and entropy ordinal.]
36 Probabilistic labels vs error rates. [Figure: L0, L1, and L2 error by label probability interval: (0, 0.5), (0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 0.9), (0.9, 1).]
37 Price prediction data, price ranges mapped to ordinal labels: $0-$50 (1), $51-$100 (2), $101-$250 (3), $251-$500 (4), $501-$1000 (5), $1001-$2000 (6), $2001-$… (7).
38 Price prediction data. Some facts about the data: 80 household items collected from stores like Amazon and Costco; prices predicted by 155 students of UC Irvine; average error rate 69%, and systematically biased; dataset created by Mark Steyvers of UC Irvine.
39 Price prediction data. [Table: L0, L1, and L2 error for majority vote, Dawid & Skene, latent trait, entropy multiclass, and entropy ordinal.]
40 Summary: the minimax conditional entropy principle for crowdsourcing; the adjacency confusability assumption in ordinal labeling; an ordinal labeling model with structured confusion matrices.