Preference-Based Rank Elicitation using Statistical Models: The Case of Mallows
|
|
- Alyson Gregory
- 5 years ago
- Views:
Transcription
1 Preference-Based Rank Elicitation using Statistical Models: The Case of Mallows Robert Busa-Fekete 1 Eyke Huellermeier 2 Balzs Szrnyi 3 1 MTA-SZTE Research Group on Artificial Intelligence, Tisza Lajos krt. 103., H-6720 Szeged, Hungary 2 Department of Computer Science, University of Paderborn, Warburger Str. 100, Paderborn, Germany 3 INRIA Lille - Nord Europe, SequeL project, 40 avenue Halley, Villeneuve dascq, France ICML 2014 June 24, 2014 (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
2 Outline 1 Preference-based stochastic multi-armed bandit (PB-MAB) 2 PB-MAB Framework with Statistical Models over Ranking 3 Concrete implementations with Mallows model 4 Numerical Experiments (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
3 Motivation Exploiting revealed preferences to learn a ranking Example: Crowdsourcing Amazon Mechanical Turk Widely-used platform in Natural Language Processing (NLP) to annotate training database Machine Translation: for each English sentence, there are given many possible translations Goal: either to find a ranking which reflects to the quality of the translations, or the best translation The annotators might be asked in terms of simple questions: Is translation A better than translation B? (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
4 Preference-based stochastic multi-armed bandit setup Give M items/arms: A = {a 1,..., a M } Items/arms can only be compared in a pairwise manner In a time step t, the (online) learning algorithm selects a pair of items (i t, j t ) to be compared feedback Pairwise probability for items a i and a j : p i,j = P(a i a j ) = E[I{a i a j }] (1) follows a fixed probabilistic distribution If p i,j > 1/2, then item a i is preferred to item a j If p i,j < 1/2, then item a j is preferred to item a i (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
5 Goal of the online learner (decision maker/agent) Find the best item (with high probability) Find a ranking over item (with high probability) Minimize the number of pairwise comparisons (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
6 Modeling assumptions Elicit a ranking based on probabilistic (noisy) feedback Establish a connection to statistical models of rank data Full ranking (r 1,..., r i,..., r j,..., r M ) P(. ) Observation I{r i < r j } P(. ) is a parametric probability distribution over the set of ranking S M Making inference about P based on sampled pairwise comparisons (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
7 Modeling assumptions Pairwise probabilities can be written as p i,j = P(a i a j ) = where L(r j > r i ) = {r S M r j > r i } r L(r j >r i ) P(r ) (2) Implicitly, we assume certain regularity properites on P induced by P(. ) Pairwise probability cannot be arbitrary (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
8 Modeling assumptions Strong Stochastic Transitivity For any triplet of arms such that a i a j a k, p i,k max(p i,j, p j,k ) If we know a i a j and a j a k then a i a k Nice regularity property Reduce sample complexity (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
9 Online learning framework (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
10 How to make the setup complete (MPI) Find the most preferred item (or arm) a i defined as i = arg max 1 i M E [I{r i = 1}] (3) r P(. ) (MPR) Find the most probable ranking r given the ranking model P(. ) r = arg max P(r ) (4) r S M All goals are meant to be achieved with probability at least 1 δ Based as few pairwise comparisons as possible (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
11 Mallows φ-model The probability of observing a ranking r is r = ( r 1,..., r M ) is the center ranking P(r φ, r) = 1 Z(φ) φd(r, r) (5) d(.,.) is the Kendall s rank distance defined as d(r, r) = 1 i<j M φ (0, 1] is the spread parameter φ = 1 uniform distribution φ 0 P( r φ, r) 1 (becomes more peaky) Z(φ) is the normalisation factor I{(r i r j )( r i r j ) < 0} (6) (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
12 Find the most preferred item for Mallows model (MPI) i = arg max 1 i M E [I{r i = 1}] (7) r P(. φ, r) Most preferred item (MPI) is the one for which r i = 1 The center ranking determines a total order on the set of items such that if r i < r j then p i,j > 1/2, and r i > r j then p i,j < 1/2 (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
13 Find the most preferred item for Mallows model (MPI) i = arg max 1 i M E [I{r i = 1}] (7) r P(. φ, r) Most preferred item (MPI) is the one for which r i = 1 The center ranking determines a total order on the set of items such that if r i < r j then p i,j > 1/2, and r i > r j then p i,j < 1/2 MALLOWSMPI(δ) 1 Pick a random item a i 2 Pick another item, say a j, which has not selected yet, if there is no such, then break 3 Compare a i and a j until 1/2 / [ˆp i,j c i,j, ˆp i,j + c i,j ], where 1 c i,j = log 4n2 i,j M 2n i,j δ 4 if 1/2 < ˆp i,j c i,j, then keep a i goto 2, otherwise keep a j (8) (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
14 Find the most probable ranking for Mallows model (MPR) Most probable ranking is the center ranking 1 r = arg max P(r φ, r) = arg max φd(r, r) (9) r S M r S M Z(φ) The center ranking determines a total order on the set of items such that if r i < r j then p i,j > 1/2, and r i > r j then p i,j < 1/2 (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
15 Find the most probable ranking for Mallows model (MPR) MALLOWSMPR(δ) Follow the Merge sort strategy If the sorting algorithm compares two distinct items, say a i and a j, then compare them until 1/2 / [ˆp i,j c i,j, ˆp i,j + c i,j ], where 1 c i,j = log 4n2 i,j C M 2n i,j δ Worst case performance of merge sort algorithm: C M = M log 2 M M + 1 (Theorem 1, Flajolet & Golin (1994)) (10) (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
16 Sample complexity MallowsMPI(δ) O( M ρ 2 log M ), (11) δρ where ρ = 1 φ 1+φ lim φ 0 1/ρ 2 = 1 (more peaky distribution easier task) lim φ 1 1/ρ 2 = (more uniform harder task) MALLOWSMPR(δ) O( M log 2 M ρ 2 log M log 2 M ) (12) δρ (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
17 Numerical experiments for identifying Most Preferred Item (MPI) Verify that if the model assumptions are valid, the purposed algorithm is efficient. If Mallows model is assumed best arm = most preferred item Beat the mean (in PAC setting) [Yue and Joachims, 2011] Interleaved Filter [Yue et al., 2012] Compare the algorithms in terms of sample complexity Sample complexity: number of pairwise comparison take prior to the termination 100 repetitions The confidence parameter δ was set to 0.05, and thus, the accuracy was significantly higher than 0.95 in every case. (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
18 Numerical experiments for identifying MPI (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
19 Numerical experiments for identifying Most Probable Ranking (MPR) Most probable ranking = center ranking r = arg max P(r φ, r) (13) r S M Parameter estimation method for Mallows which can handle incomplete ranking [Cheng et al., 2009] Validated on datasets that consist of pairwise comparisons Assessed the accuracy of the estimator for center ranking on datasets with various size. (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
20 Numerical experiments for identifying MPR Solid Line: Parameter estimation method by Cheng et al. [2009] Dashed vertical lines: Merge sort-based MPR algorithm (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
21 Conclusion The purposed algorithms are efficient, if the modeling assumption holds Minimize sample complexity Guarantee a certain level of confidence (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
22 The End (UCLA) Ranking Elicitation ICML 2014 June 24, / 21
Preference-Based Rank Elicitation using Statistical Models: The Case of Mallows
: The Case of Mallows Róbert Busa-Fekete BUSAROBI@INF.U-SZEGED.HU Eyke Hüllermeier EYKE@UPB.DE Balázs Szörényi,3 SZORENYI@INF.U-SZEGED.HU MTA-SZTE Research Group on Artificial Intelligence, Tisza Lajos
More informationPreference-based Reinforcement Learning
Preference-based Reinforcement Learning Róbert Busa-Fekete 1,2, Weiwei Cheng 1, Eyke Hüllermeier 1, Balázs Szörényi 2,3, Paul Weng 3 1 Computational Intelligence Group, Philipps-Universität Marburg 2 Research
More informationOnline Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach
Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach Balázs Szörényi Technion, Haifa, Israel / MTA-SZTE Research Group on Artificial Intelligence, Hungary szorenyibalazs@gmail.com Róbert
More informationRelative Upper Confidence Bound for the K-Armed Dueling Bandit Problem
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem Masrour Zoghi 1, Shimon Whiteson 1, Rémi Munos 2 and Maarten de Rijke 1 University of Amsterdam 1 ; INRIA Lille / MSR-NE 2 June 24,
More informationPreference-based Reinforcement Learning: Evolutionary Direct Policy Search using a Preference-based Racing Algorithm
Preference-based Reinforcement Learning: Evolutionary Direct Policy Search using a Preference-based Racing Algorithm Róbert Busa-Fekete Balázs Szörényi Paul Weng Weiwei Cheng Eyke Hüllermeier Abstract
More informationCrowdsourcing & Optimal Budget Allocation in Crowd Labeling
Crowdsourcing & Optimal Budget Allocation in Crowd Labeling Madhav Mohandas, Richard Zhu, Vincent Zhuang May 5, 2016 Table of Contents 1. Intro to Crowdsourcing 2. The Problem 3. Knowledge Gradient Algorithm
More informationBasics of reinforcement learning
Basics of reinforcement learning Lucian Buşoniu TMLSS, 20 July 2018 Main idea of reinforcement learning (RL) Learn a sequential decision policy to optimize the cumulative performance of an unknown system
More informationKnowledge Compilation and Machine Learning A New Frontier Adnan Darwiche Computer Science Department UCLA with Arthur Choi and Guy Van den Broeck
Knowledge Compilation and Machine Learning A New Frontier Adnan Darwiche Computer Science Department UCLA with Arthur Choi and Guy Van den Broeck The Problem Logic (L) Knowledge Representation (K) Probability
More informationUnsupervised Learning with Permuted Data
Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University
More informationIntroduction to Logistic Regression
Introduction to Logistic Regression Guy Lebanon Binary Classification Binary classification is the most basic task in machine learning, and yet the most frequent. Binary classifiers often serve as the
More information1 [15 points] Search Strategies
Probabilistic Foundations of Artificial Intelligence Final Exam Date: 29 January 2013 Time limit: 120 minutes Number of pages: 12 You can use the back of the pages if you run out of space. strictly forbidden.
More informationTop-k Selection based on Adaptive Sampling of Noisy Preferences
Top-k Selection based on Adaptive Sampling of Noisy Preferences Róbert Busa-Fekete, busarobi@inf.u-szeged.hu Balázs Szörényi, szorenyi@inf.u-szeged.hu Paul Weng paul.weng@lip.fr Weiwei Cheng cheng@mathematik.uni-marburg.de
More informationAlireza Shafaei. Machine Learning Reading Group The University of British Columbia Summer 2017
s s Machine Learning Reading Group The University of British Columbia Summer 2017 (OCO) Convex 1/29 Outline (OCO) Convex Stochastic Bernoulli s (OCO) Convex 2/29 At each iteration t, the player chooses
More informationArtificial Intelligence Bayes Nets: Independence
Artificial Intelligence Bayes Nets: Independence Instructors: David Suter and Qince Li Course Delivered @ Harbin Institute of Technology [Many slides adapted from those created by Dan Klein and Pieter
More information1 [15 points] Frequent Itemsets Generation With Map-Reduce
Data Mining Learning from Large Data Sets Final Exam Date: 15 August 2013 Time limit: 120 minutes Number of pages: 11 Maximum score: 100 points You can use the back of the pages if you run out of space.
More informationA parametric approach to Bayesian optimization with pairwise comparisons
A parametric approach to Bayesian optimization with pairwise comparisons Marco Co Eindhoven University of Technology m.g.h.co@tue.nl Bert de Vries Eindhoven University of Technology and GN Hearing bdevries@ieee.org
More informationBeat the Mean Bandit
Yisong Yue H. John Heinz III College, Carnegie Mellon University, Pittsburgh, PA, USA Thorsten Joachims Department of Computer Science, Cornell University, Ithaca, NY, USA yisongyue@cmu.edu tj@cs.cornell.edu
More informationA Simple Algorithm for Multilabel Ranking
A Simple Algorithm for Multilabel Ranking Krzysztof Dembczyński 1 Wojciech Kot lowski 1 Eyke Hüllermeier 2 1 Intelligent Decision Support Systems Laboratory (IDSS), Poznań University of Technology, Poland
More informationThe Multi-Armed Bandit Problem
The Multi-Armed Bandit Problem Electrical and Computer Engineering December 7, 2013 Outline 1 2 Mathematical 3 Algorithm Upper Confidence Bound Algorithm A/B Testing Exploration vs. Exploitation Scientist
More informationStochastic Structured Prediction under Bandit Feedback
Stochastic Structured Prediction under Bandit Feedback Artem Sokolov,, Julia Kreutzer, Christopher Lo,, Stefan Riezler, Computational Linguistics & IWR, Heidelberg University, Germany {sokolov,kreutzer,riezler}@cl.uni-heidelberg.de
More informationThe Multi-Arm Bandit Framework
The Multi-Arm Bandit Framework A. LAZARIC (SequeL Team @INRIA-Lille) ENS Cachan - Master 2 MVA SequeL INRIA Lille MVA-RL Course In This Lecture A. LAZARIC Reinforcement Learning Algorithms Oct 29th, 2013-2/94
More informationDriving Semantic Parsing from the World s Response
Driving Semantic Parsing from the World s Response James Clarke, Dan Goldwasser, Ming-Wei Chang, Dan Roth Cognitive Computation Group University of Illinois at Urbana-Champaign CoNLL 2010 Clarke, Goldwasser,
More informationQualitative Multi-Armed Bandits: A Quantile-Based Approach
Qualitative Multi-Armed Bandits: A Quantile-Based Approach Balazs Szorenyi, Róbert Busa-Fekete, Paul Weng, Eyke Hüllermeier To cite this version: Balazs Szorenyi, Róbert Busa-Fekete, Paul Weng, Eyke Hüllermeier.
More informationOPTIMIZING SEARCH ENGINES USING CLICKTHROUGH DATA. Paper By: Thorsten Joachims (Cornell University) Presented By: Roy Levin (Technion)
OPTIMIZING SEARCH ENGINES USING CLICKTHROUGH DATA Paper By: Thorsten Joachims (Cornell University) Presented By: Roy Levin (Technion) Outline The idea The model Learning a ranking function Experimental
More informationBinary Classification, Multi-label Classification and Ranking: A Decision-theoretic Approach
Binary Classification, Multi-label Classification and Ranking: A Decision-theoretic Approach Krzysztof Dembczyński and Wojciech Kot lowski Intelligent Decision Support Systems Laboratory (IDSS) Poznań
More informationInterventions. Online Learning from User Interactions through Interventions. Interactive Learning Systems
Online Learning from Interactions through Interventions CS 7792 - Fall 2016 horsten Joachims Department of Computer Science & Department of Information Science Cornell University Y. Yue, J. Broder, R.
More informationCS 5522: Artificial Intelligence II
CS 5522: Artificial Intelligence II Bayes Nets: Independence Instructor: Alan Ritter Ohio State University [These slides were adapted from CS188 Intro to AI at UC Berkeley. All materials available at http://ai.berkeley.edu.]
More informationCOMS 4721: Machine Learning for Data Science Lecture 20, 4/11/2017
COMS 4721: Machine Learning for Data Science Lecture 20, 4/11/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University SEQUENTIAL DATA So far, when thinking
More informationLearning Classification with Auxiliary Probabilistic Information Quang Nguyen Hamed Valizadegan Milos Hauskrecht
Learning Classification with Auxiliary Probabilistic Information Quang Nguyen Hamed Valizadegan Milos Hauskrecht Computer Science Department University of Pittsburgh Outline Introduction Learning with
More informationLearning Multiple Tasks in Parallel with a Shared Annotator
Learning Multiple Tasks in Parallel with a Shared Annotator Haim Cohen Department of Electrical Engeneering The Technion Israel Institute of Technology Haifa, 3000 Israel hcohen@tx.technion.ac.il Koby
More informationEfficient learning by implicit exploration in bandit problems with side observations
Efficient learning by implicit exploration in bandit problems with side observations Tomáš Kocák, Gergely Neu, Michal Valko, Rémi Munos SequeL team, INRIA Lille - Nord Europe, France SequeL INRIA Lille
More informationBayesian Active Learning With Basis Functions
Bayesian Active Learning With Basis Functions Ilya O. Ryzhov Warren B. Powell Operations Research and Financial Engineering Princeton University Princeton, NJ 08544, USA IEEE ADPRL April 13, 2011 1 / 29
More informationA Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views Presenter: Yao Zhou joint work with: Jingrui He - 1 - Roadmap Motivation Proposed framework: M2VW Experimental results Conclusion
More informationThe Benefits of a Model of Annotation
The Benefits of a Model of Annotation Rebecca J. Passonneau and Bob Carpenter Columbia University Center for Computational Learning Systems Department of Statistics LAW VII, August 2013 Conventional Approach
More informationMulti-task Linear Bandits
Multi-task Linear Bandits Marta Soare Alessandro Lazaric Ouais Alsharif Joelle Pineau INRIA Lille - Nord Europe INRIA Lille - Nord Europe McGill University & Google McGill University NIPS2014 Workshop
More informationTsinghua Machine Learning Guest Lecture, June 9,
Tsinghua Machine Learning Guest Lecture, June 9, 2015 1 Lecture Outline Introduction: motivations and definitions for online learning Multi-armed bandit: canonical example of online learning Combinatorial
More informationThe K-armed Dueling Bandits Problem
The K-armed Dueling Bandits Problem Yisong Yue, Josef Broder, Robert Kleinberg, Thorsten Joachims Abstract We study a partial-information online-learning problem where actions are restricted to noisy comparisons
More informationProf. Dr. Ann Nowé. Artificial Intelligence Lab ai.vub.ac.be
REINFORCEMENT LEARNING AN INTRODUCTION Prof. Dr. Ann Nowé Artificial Intelligence Lab ai.vub.ac.be REINFORCEMENT LEARNING WHAT IS IT? What is it? Learning from interaction Learning about, from, and while
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationBayesian Networks BY: MOHAMAD ALSABBAGH
Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional
More informationA Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits
A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits Pratik Gajane Tanguy Urvoy Fabrice Clérot Orange-labs, Lannion, France PRATIK.GAJANE@ORANGE.COM TANGUY.URVOY@ORANGE.COM
More informationMulti-armed bandit models: a tutorial
Multi-armed bandit models: a tutorial CERMICS seminar, March 30th, 2016 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions)
More informationHierarchical Boosting and Filter Generation
January 29, 2007 Plan Combining Classifiers Boosting Neural Network Structure of AdaBoost Image processing Hierarchical Boosting Hierarchical Structure Filters Combining Classifiers Combining Classifiers
More informationACS Introduction to NLP Lecture 2: Part of Speech (POS) Tagging
ACS Introduction to NLP Lecture 2: Part of Speech (POS) Tagging Stephen Clark Natural Language and Information Processing (NLIP) Group sc609@cam.ac.uk The POS Tagging Problem 2 England NNP s POS fencers
More informationError Limiting Reductions Between Classification Tasks
Error Limiting Reductions Between Classification Tasks Keywords: supervised learning, reductions, classification Abstract We introduce a reduction-based model for analyzing supervised learning tasks. We
More informationStatistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening
: The Case of Rank-Dependent Coarsening Mohsen Ahmadi Fahandar 1 Eyke Hüllermeier 1 Inés Couso 2 Abstract We consider the problem of statistical inference for ranking data, specifically rank aggregation,
More informationAn Optimal Bidimensional Multi Armed Bandit Auction for Multi unit Procurement
An Optimal Bidimensional Multi Armed Bandit Auction for Multi unit Procurement Satyanath Bhat Joint work with: Shweta Jain, Sujit Gujar, Y. Narahari Department of Computer Science and Automation, Indian
More informationKnowledge Discovery in Data: Overview. Naïve Bayesian Classification. .. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar..
Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar Knowledge Discovery in Data: Naïve Bayes Overview Naïve Bayes methodology refers to a probabilistic approach to information discovery
More informationChap 2: Classical models for information retrieval
Chap 2: Classical models for information retrieval Jean-Pierre Chevallet & Philippe Mulhem LIG-MRIM Sept 2016 Jean-Pierre Chevallet & Philippe Mulhem Models of IR 1 / 81 Outline Basic IR Models 1 Basic
More informationBandit models: a tutorial
Gdt COS, December 3rd, 2015 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions) Bandit game: a each round t, an agent chooses
More informationSQL-Rank: A Listwise Approach to Collaborative Ranking
SQL-Rank: A Listwise Approach to Collaborative Ranking Liwei Wu Depts of Statistics and Computer Science UC Davis ICML 18, Stockholm, Sweden July 10-15, 2017 Joint work with Cho-Jui Hsieh and James Sharpnack
More informationOnline learning with noisy side observations
Online learning with noisy side observations Tomáš Kocák Gergely Neu Michal Valko Inria Lille - Nord Europe, France DTIC, Universitat Pompeu Fabra, Barcelona, Spain Inria Lille - Nord Europe, France SequeL
More informationDiscriminative Training. March 4, 2014
Discriminative Training March 4, 2014 Noisy Channels Again p(e) source English Noisy Channels Again p(e) p(g e) source English German Noisy Channels Again p(e) p(g e) source English German decoder e =
More informationPolyhedral Outer Approximations with Application to Natural Language Parsing
Polyhedral Outer Approximations with Application to Natural Language Parsing André F. T. Martins 1,2 Noah A. Smith 1 Eric P. Xing 1 1 Language Technologies Institute School of Computer Science Carnegie
More informationBandit View on Continuous Stochastic Optimization
Bandit View on Continuous Stochastic Optimization Sébastien Bubeck 1 joint work with Rémi Munos 1 & Gilles Stoltz 2 & Csaba Szepesvari 3 1 INRIA Lille, SequeL team 2 CNRS/ENS/HEC 3 University of Alberta
More informationMDP Preliminaries. Nan Jiang. February 10, 2019
MDP Preliminaries Nan Jiang February 10, 2019 1 Markov Decision Processes In reinforcement learning, the interactions between the agent and the environment are often described by a Markov Decision Process
More informationSubmodular Functions Properties Algorithms Machine Learning
Submodular Functions Properties Algorithms Machine Learning Rémi Gilleron Inria Lille - Nord Europe & LIFL & Univ Lille Jan. 12 revised Aug. 14 Rémi Gilleron (Mostrare) Submodular Functions Jan. 12 revised
More informationSparse Linear Contextual Bandits via Relevance Vector Machines
Sparse Linear Contextual Bandits via Relevance Vector Machines Davis Gilton and Rebecca Willett Electrical and Computer Engineering University of Wisconsin-Madison Madison, WI 53706 Email: gilton@wisc.edu,
More informationLecture 5 Neural models for NLP
CS546: Machine Learning in NLP (Spring 2018) http://courses.engr.illinois.edu/cs546/ Lecture 5 Neural models for NLP Julia Hockenmaier juliahmr@illinois.edu 3324 Siebel Center Office hours: Tue/Thu 2pm-3pm
More informationModeling User s Cognitive Dynamics in Information Access and Retrieval using Quantum Probability (ESR-6)
Modeling User s Cognitive Dynamics in Information Access and Retrieval using Quantum Probability (ESR-6) SAGAR UPRETY THE OPEN UNIVERSITY, UK QUARTZ Mid-term Review Meeting, Padova, Italy, 07/12/2018 Self-Introduction
More informationLarge-Scale Matrix Factorization with Distributed Stochastic Gradient Descent
Large-Scale Matrix Factorization with Distributed Stochastic Gradient Descent KDD 2011 Rainer Gemulla, Peter J. Haas, Erik Nijkamp and Yannis Sismanis Presenter: Jiawen Yao Dept. CSE, UT Arlington 1 1
More informationSum-Product Networks: A New Deep Architecture
Sum-Product Networks: A New Deep Architecture Pedro Domingos Dept. Computer Science & Eng. University of Washington Joint work with Hoifung Poon 1 Graphical Models: Challenges Bayesian Network Markov Network
More informationFast boosting using adversarial bandits
Róbert Busa-Fekete 1,2 Balázs Kégl 1 1 LAL/LRI, University of Paris-Sud, CNRS, 91898 Orsay, France busarobi@gmail.com balazs.kegl@gmail.com 2 Research Group on Artificial Intelligence of the Hungarian
More informationSelecting the State-Representation in Reinforcement Learning
Selecting the State-Representation in Reinforcement Learning Odalric-Ambrym Maillard INRIA Lille - Nord Europe odalricambrym.maillard@gmail.com Rémi Munos INRIA Lille - Nord Europe remi.munos@inria.fr
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationNew bounds on the price of bandit feedback for mistake-bounded online multiclass learning
Journal of Machine Learning Research 1 8, 2017 Algorithmic Learning Theory 2017 New bounds on the price of bandit feedback for mistake-bounded online multiclass learning Philip M. Long Google, 1600 Amphitheatre
More informationLearning to Rank and Quadratic Assignment
Learning to Rank and Quadratic Assignment Thomas Mensink TVPA - XRCE & LEAR - INRIA Grenoble, France Jakob Verbeek LEAR Team INRIA Rhône-Alpes Grenoble, France Abstract Tiberio Caetano Machine Learning
More informationBayes Nets: Independence
Bayes Nets: Independence [These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.] Bayes Nets A Bayes
More informationOn the Problem of Error Propagation in Classifier Chains for Multi-Label Classification
On the Problem of Error Propagation in Classifier Chains for Multi-Label Classification Robin Senge, Juan José del Coz and Eyke Hüllermeier Draft version of a paper to appear in: L. Schmidt-Thieme and
More informationLecture 21: Spectral Learning for Graphical Models
10-708: Probabilistic Graphical Models 10-708, Spring 2016 Lecture 21: Spectral Learning for Graphical Models Lecturer: Eric P. Xing Scribes: Maruan Al-Shedivat, Wei-Cheng Chang, Frederick Liu 1 Motivation
More informationApprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning
Apprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning Nicolas Thome Prenom.Nom@cnam.fr http://cedric.cnam.fr/vertigo/cours/ml2/ Département Informatique Conservatoire
More informationActive Learning and Optimized Information Gathering
Active Learning and Optimized Information Gathering Lecture 7 Learning Theory CS 101.2 Andreas Krause Announcements Project proposal: Due tomorrow 1/27 Homework 1: Due Thursday 1/29 Any time is ok. Office
More informationEvaluation of multi armed bandit algorithms and empirical algorithm
Acta Technica 62, No. 2B/2017, 639 656 c 2017 Institute of Thermomechanics CAS, v.v.i. Evaluation of multi armed bandit algorithms and empirical algorithm Zhang Hong 2,3, Cao Xiushan 1, Pu Qiumei 1,4 Abstract.
More informationTnT Part of Speech Tagger
TnT Part of Speech Tagger By Thorsten Brants Presented By Arghya Roy Chaudhuri Kevin Patel Satyam July 29, 2014 1 / 31 Outline 1 Why Then? Why Now? 2 Underlying Model Other technicalities 3 Evaluation
More informationOptimal Teaching for Online Perceptrons
Optimal Teaching for Online Perceptrons Xuezhou Zhang Hrag Gorune Ohannessian Ayon Sen Scott Alfeld Xiaojin Zhu Department of Computer Sciences, University of Wisconsin Madison {zhangxz1123, gorune, ayonsn,
More informationOn Pairwise Naive Bayes Classifiers
On Pairwise Naive Bayes Classifiers Jan-Nikolas Sulzmann 1, Johannes Fürnkranz 1, and Eyke Hüllermeier 2 1 Department of Computer Science, TU Darmstadt Hochschulstrasse 10, D-64289 Darmstadt, Germany {sulzmann,juffi}@ke.informatik.tu-darmstadt.de
More informationNotation. Pattern Recognition II. Michal Haindl. Outline - PR Basic Concepts. Pattern Recognition Notions
Notation S pattern space X feature vector X = [x 1,...,x l ] l = dim{x} number of features X feature space K number of classes ω i class indicator Ω = {ω 1,...,ω K } g(x) discriminant function H decision
More informationClassification of Ordinal Data Using Neural Networks
Classification of Ordinal Data Using Neural Networks Joaquim Pinto da Costa and Jaime S. Cardoso 2 Faculdade Ciências Universidade Porto, Porto, Portugal jpcosta@fc.up.pt 2 Faculdade Engenharia Universidade
More informationCrowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons
2015 The University of Texas at Arlington. All Rights Reserved. Crowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons Abolfazl Asudeh, Gensheng Zhang, Naeemul Hassan, Chengkai Li, Gergely
More informationAdministration. CSCI567 Machine Learning (Fall 2018) Outline. Outline. HW5 is available, due on 11/18. Practice final will also be available soon.
Administration CSCI567 Machine Learning Fall 2018 Prof. Haipeng Luo U of Southern California Nov 7, 2018 HW5 is available, due on 11/18. Practice final will also be available soon. Remaining weeks: 11/14,
More informationLecture 2: Learning from Evaluative Feedback. or Bandit Problems
Lecture 2: Learning from Evaluative Feedback or Bandit Problems 1 Edward L. Thorndike (1874-1949) Puzzle Box 2 Learning by Trial-and-Error Law of Effect: Of several responses to the same situation, those
More informationActive Ranking from Pairwise Comparisons and when Parametric Assumptions Don t Help
Active Ranking from Pairwise Comparisons and when Parametric Assumptions Don t Help Reinhard Heckel Nihar B. Shah Kannan Ramchandran Martin J. Wainwright, Department of Statistics, and Department of Electrical
More informationLecture 6: Non-stochastic best arm identification
CSE599i: Online and Adaptive Machine Learning Winter 08 Lecturer: Kevin Jamieson Lecture 6: Non-stochastic best arm identification Scribes: Anran Wang, eibin Li, rian Chan, Shiqing Yu, Zhijin Zhou Disclaimer:
More informationCounterfactual Model for Learning Systems
Counterfactual Model for Learning Systems CS 7792 - Fall 28 Thorsten Joachims Department of Computer Science & Department of Information Science Cornell University Imbens, Rubin, Causal Inference for Statistical
More informationMachine Learning, Midterm Exam: Spring 2008 SOLUTIONS. Q Topic Max. Score Score. 1 Short answer questions 20.
10-601 Machine Learning, Midterm Exam: Spring 2008 Please put your name on this cover sheet If you need more room to work out your answer to a question, use the back of the page and clearly mark on the
More informationProduct rule. Chain rule
Probability Recap CS 188: Artificial Intelligence ayes Nets: Independence Conditional probability Product rule Chain rule, independent if and only if: and are conditionally independent given if and only
More informationScale-Invariance of Support Vector Machines based on the Triangular Kernel. Abstract
Scale-Invariance of Support Vector Machines based on the Triangular Kernel François Fleuret Hichem Sahbi IMEDIA Research Group INRIA Domaine de Voluceau 78150 Le Chesnay, France Abstract This paper focuses
More informationFantope Regularization in Metric Learning
Fantope Regularization in Metric Learning CVPR 2014 Marc T. Law (LIP6, UPMC), Nicolas Thome (LIP6 - UPMC Sorbonne Universités), Matthieu Cord (LIP6 - UPMC Sorbonne Universités), Paris, France Introduction
More informationTutorial: PART 1. Online Convex Optimization, A Game- Theoretic Approach to Learning.
Tutorial: PART 1 Online Convex Optimization, A Game- Theoretic Approach to Learning http://www.cs.princeton.edu/~ehazan/tutorial/tutorial.htm Elad Hazan Princeton University Satyen Kale Yahoo Research
More informationMachine Learning for Physicists Lecture 1
Machine Learning for Physicists Lecture 1 Summer 2017 University of Erlangen-Nuremberg Florian Marquardt (Image generated by a net with 20 hidden layers) OUTPUT INPUT (Picture: Wikimedia Commons) OUTPUT
More informationCS230: Lecture 8 Word2Vec applications + Recurrent Neural Networks with Attention
CS23: Lecture 8 Word2Vec applications + Recurrent Neural Networks with Attention Today s outline We will learn how to: I. Word Vector Representation i. Training - Generalize results with word vectors -
More informationLarge-scale Collaborative Ranking in Near-Linear Time
Large-scale Collaborative Ranking in Near-Linear Time Liwei Wu Depts of Statistics and Computer Science UC Davis KDD 17, Halifax, Canada August 13-17, 2017 Joint work with Cho-Jui Hsieh and James Sharpnack
More informationarxiv: v3 [cs.lg] 25 Aug 2017
Achieving Budget-optimality with Adaptive Schemes in Crowdsourcing Ashish Khetan and Sewoong Oh arxiv:602.0348v3 [cs.lg] 25 Aug 207 Abstract Crowdsourcing platforms provide marketplaces where task requesters
More information1-bit Matrix Completion. PAC-Bayes and Variational Approximation
: PAC-Bayes and Variational Approximation (with P. Alquier) PhD Supervisor: N. Chopin Bayes In Paris, 5 January 2017 (Happy New Year!) Various Topics covered Matrix Completion PAC-Bayesian Estimation Variational
More informationCS-E4830 Kernel Methods in Machine Learning
CS-E4830 Kernel Methods in Machine Learning Lecture 5: Multi-class and preference learning Juho Rousu 11. October, 2017 Juho Rousu 11. October, 2017 1 / 37 Agenda from now on: This week s theme: going
More informationMonte-Carlo Tree Search by. MCTS by Best Arm Identification
Monte-Carlo Tree Search by Best Arm Identification and Wouter M. Koolen Inria Lille SequeL team CWI Machine Learning Group Inria-CWI workshop Amsterdam, September 20th, 2017 Part of...... a new Associate
More informationStochastic Tube MPC with State Estimation
Proceedings of the 19th International Symposium on Mathematical Theory of Networks and Systems MTNS 2010 5 9 July, 2010 Budapest, Hungary Stochastic Tube MPC with State Estimation Mark Cannon, Qifeng Cheng,
More informationPei Wang( 王培 ) Temple University, Philadelphia, USA
Pei Wang( 王培 ) Temple University, Philadelphia, USA Artificial General Intelligence (AGI): a small research community in AI that believes Intelligence is a general-purpose capability Intelligence should
More informationArtificial Intelligence Markov Chains
Artificial Intelligence Markov Chains Stephan Dreiseitl FH Hagenberg Software Engineering & Interactive Media Stephan Dreiseitl (Hagenberg/SE/IM) Lecture 12: Markov Chains Artificial Intelligence SS2010
More informationDiscriminative Training
Discriminative Training February 19, 2013 Noisy Channels Again p(e) source English Noisy Channels Again p(e) p(g e) source English German Noisy Channels Again p(e) p(g e) source English German decoder
More information