Task Assignment in a Human-Autonomous

Similar documents
Budget-Optimal Task Allocation for Reliable Crowdsourcing Systems

Recipes for the Linear Analysis of EEG and applications

CHARACTERIZATION OF NONLINEAR NEURON RESPONSES

A Deep Interpretation of Classifier Chains

CS 6375 Machine Learning

Classification of Hand-Written Digits Using Scattering Convolutional Network

Machine Learning and Deep Learning! Vincent Lepetit!

Novel Quantization Strategies for Linear Prediction with Guarantees

Spectral Clustering on Handwritten Digits Database

Crowdsourcing & Optimal Budget Allocation in Crowd Labeling

Bayesian Networks Inference with Probabilistic Graphical Models

Utility Maximizing Routing to Data Centers

Learning with multiple models. Boosting.

Empirical Risk Minimization

An Introduction to Statistical and Probabilistic Linear Models

CSCI-567: Machine Learning (Spring 2019)

ILP Formulations for the Lazy Bureaucrat Problem

GAMINGRE 8/1/ of 7

Sparse Solutions of Systems of Equations and Sparse Modelling of Signals and Images

Pattern Recognition and Machine Learning. Learning and Evaluation of Pattern Recognition Processes

Learning with Noisy Labels. Kate Niehaus Reading group 11-Feb-2014

Probabilistic Time Series Classification

A Hybrid Method of CART and Artificial Neural Network for Short-term term Load Forecasting in Power Systems

Low-Density Parity-Check Codes

Classification Semi-supervised learning based on network. Speakers: Hanwen Wang, Xinxin Huang, and Zeyu Li CS Winter

Probabilistic Graphical Models for Image Analysis - Lecture 1

HMM and IOHMM Modeling of EEG Rhythms for Asynchronous BCI Systems

Linear Models in Machine Learning

Revenue Maximization in a Cloud Federation

Batch Mode Sparse Active Learning. Lixin Shi, Yuhang Zhao Tsinghua University

COMPUTATIONAL MODELING OF VISUAL PERCEPTION

Permuation Models meet Dawid-Skene: A Generalised Model for Crowdsourcing

Integer weight training by differential evolution algorithms

Machine Learning (CS 567) Lecture 2

An artificial chemical reaction optimization algorithm for. multiple-choice; knapsack problem.

Technical note on seasonal adjustment for M0

Jeff Howbert Introduction to Machine Learning Winter

Last updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

Iterative Laplacian Score for Feature Selection

Standard P300 scalp. Fz, Cz, Pz Posterior Electrodes: PO 7, PO 8, O Z

COURSE INTRODUCTION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception

Introduction to Course

Handling imprecise and uncertain class labels in classification and clustering

Brief Introduction of Machine Learning Techniques for Content Analysis

ECE Lecture 1. Terminology. Statistics are used much like a drunk uses a lamppost: for support, not illumination

Support Vector Machines: Maximum Margin Classifiers

MLE/MAP + Naïve Bayes

CHARACTERIZATION OF NONLINEAR NEURON RESPONSES

ECE521 week 3: 23/26 January 2017

ENGINE SERIAL NUMBERS

Microarray Data Analysis: Discovery

Naïve Bayes Introduction to Machine Learning. Matt Gormley Lecture 18 Oct. 31, 2018

FEATURE SELECTION COMBINED WITH RANDOM SUBSPACE ENSEMBLE FOR GENE EXPRESSION BASED DIAGNOSIS OF MALIGNANCIES

Lecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu

Hidden Markov Models

Classification and Support Vector Machine

MATH36061 Convex Optimization

Logistic regression and linear classifiers COMS 4771

WHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and Rainfall For Selected Arizona Cities

Integer Programming, Constraint Programming, and their Combination

Integer Linear Programming (ILP)

Analyzing the computational impact of individual MINLP solver components

IE598 Big Data Optimization Introduction

CSE 250a. Assignment Noisy-OR model. Out: Tue Oct 26 Due: Tue Nov 2

Unsupervised and Semisupervised Classification via Absolute Value Inequalities

COMS 4771 Lecture Course overview 2. Maximum likelihood estimation (review of some statistics)

A Note on Extended Formulations for Cardinality-based Sparsity

Active and Semi-supervised Kernel Classification

Adaptive Crowdsourcing via EM with Prior

Convex Optimization M2

CS534 Machine Learning - Spring Final Exam

Research Article Weather Forecasting Using Sliding Window Algorithm

CS6375: Machine Learning Gautam Kunapuli. Decision Trees

PRELIMINARY DRAFT FOR DISCUSSION PURPOSES

CHAPTER 4 CRITICAL GROWTH SEASONS AND THE CRITICAL INFLOW PERIOD. The numbers of trawl and by bag seine samples collected by year over the study

Hidden Markov Models

CPSC 540: Machine Learning

Rick Ebert & Joseph Mazzarella For the NED Team. Big Data Task Force NASA, Ames Research Center 2016 September 28-30

Lars Schmidt-Thieme, Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Germany

GWAS V: Gaussian processes

Scenario Grouping and Decomposition Algorithms for Chance-constrained Programs

Fair Inference Through Semiparametric-Efficient Estimation Over Constraint-Specific Paths

The Pooling Problem - Applications, Complexity and Solution Methods

Multivariate statistical methods and data mining in particle physics

Statistical Methods for Data Mining

Forecasting. Copyright 2015 Pearson Education, Inc.

Uncovering the Latent Structures of Crowd Labeling

Bias-Variance Tradeoff

arxiv: v3 [cs.gt] 17 Jun 2015

NETS 412: Algorithmic Game Theory March 28 and 30, Lecture Approximation in Mechanism Design. X(v) = arg max v i (a)

Mathematical Formulation of Our Example

Published by ASX Settlement Pty Limited A.B.N Settlement Calendar for ASX Cash Market Products

Midterm Review CS 6375: Machine Learning. Vibhav Gogate The University of Texas at Dallas

The Knowledge Gradient for Sequential Decision Making with Stochastic Binary Feedbacks

Algorithmisches Lernen/Machine Learning

Dimensionality Reduction

MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti

Learning from the Wisdom of Crowds by Minimax Entropy. Denny Zhou, John Platt, Sumit Basu and Yi Mao Microsoft Research, Redmond, WA

In Centre, Online Classroom Live and Online Classroom Programme Prices

Machine Learning and Bayesian Inference. Unsupervised learning. Can we find regularity in data without the aid of labels?

Transcription:

in Addison Bohannon Applied Math, Statistics, & Scientific Computation Vernon Lawhern Army Laboratory Advisors: Brian Sadler Army Laboratory September 30, 2015

Outline 1 2 3

We want to design a human-autonomous system which can efficiently and accurately classify a database of images. We want to leverage computer vision technology, Brain-Computer Interface technology, and human agents. We have to address who classifies which images and when. We have to address how to combine the classifications from multiple agents for the same image.

Rapid Serial Visual Presentation (RSVP) Ries and Larkin [2013] Electroencephalogram (EEG) Brain-Computer Interface (BCI) presentation at high rate of speed (2-10 Hz) Visual oddball paradigm generates a neural signature

Cortically Coupled Computer Vision Sajda et al. [2010] Combine human understanding and computer speed to triage interesting images How to maximize the synergy? Triage in serial or parallel? Characterized serial triage

Human- OSD Autonomy Pilot Initiative, Army Laboratory

Relevant Work Repeated labeling using multiple noisy labelers, Ipeirotis et al. [2013] When data collection is more expensive than noisy labeling, repeated labeling improves meta-label quality Budget-optimal task allocation for reliable crowdsourcing systems, Karger et al. [2014] Minimize task assignments to achieve target reliability with homogeneous agents and tasks Random task assignment based on predetermined budget Belief propagation algorithm for data fusion Adaptive task assignment for crowdsourced classification, Ho et al. [2013] Minimize task assignments to achieve target reliability with heterogeneous agents and tasks Generalized assignment problem formulation Exploration to learn agent reliability and task value

An Iterative

Problem Statement 1 Assignment Problem How to optimally assign homogeneous binary classification tasks amongst diverse agents? 2 Problem How to dynamically combine multiple labels from noisy agents without supervised knowledge?

Generalized Assignment Problem (GAP) Morales and Romeijn [2004]; Kundakcioglu and Alizamir [2008] 1 2 f (x) = min x n i=1 j=1 m v ij x ij (1) n c ij x ij b j, j = 1,..., m i=1 m x ij = 1, i = 1,..., n j=1 3 x ij {0, 1} 4 v ij = g(r j, s j ) 0 n number of tasks m number of agents x ij assignment of task i to agent j v ij assignment value of task i to agent j c ij assignment cost of task i to agent j b j budget for agent j r j reliability of agent j s i classification confidence of task i

Generalized Assignment Problem (GAP) Morales and Romeijn [2004]; Kundakcioglu and Alizamir [2008] The GAP is a NP Hard binary integer linear program with two primary classes of solution methods: Exact Branch and Bound Algorithm* Branch and Price Algorithm Heuristic Relaxing integrality constraints Lagrangian relaxation of constraints Greedy methods* Meta-heuristics

Branch and Bound Algorithm Morales and Romeijn [2004]; Kundakcioglu and Alizamir [2008] The branch and bound algorithm is an exact method which attempts to exhaustively search the feasible solution space. Equipped with an appropriate bounding heuristic, it can eliminate subspaces of the solution space from the search. It has three components: 1 Bounding function Heuristic 2 Branching strategy Serial 3 Searching Strategy Best first Depth first Breadth first

Branch and Bound Algorithm Morales and Romeijn [2004]; Kundakcioglu and Alizamir [2008]; Clausen [1999] Data: S {0, 1} m n, g : S R Result: x opt {0, 1} m n I := ; LB(p 0 ) := g(p 0 ); B := {(p 0, LB(p 0 ))} ; while B do Select p B; B := B\{p}; branch on p for 1,..., k ; for i = 1,..., k do LB(p i ) := g(p i ) ; if LB(p i ) < I then if LB(p i ) = f (X) then I = f (X); x opt = X; go to end; else B := B {(p i, LB(p i ))}; end end end end

Bounding Function: Lagrangian Relaxation Morales and Romeijn [2004]; Fisher [2004] We dualize the capacity constraints of our GAP 1 L(λ) = min x n i=1 j=1 m v ij x ij + m x ij = 1, i = 1,..., n j=1 2 x ij {0, 1} 3 λ j 0, j = 1,..., m m λ j j=1 i=1 which satisfies the following inequality: n (c ij x ij b j ) (2) L(λ) min x f (x) (3)

Problem Given m agents each making binary decisions on all n tasks, how do you infer the right decision? arg min f n P(f (ŷ i1,..., ŷ im ) y i ) i=1 1 f : {ŷ ij } m j=1 ȳ i 2 ŷ ij, ȳ i, y i { 1, 1} (4) Figure: Classification Outcomes, A = [a 1 a m ] T ŷ ij label for task i from agent j ȳ i label for task i from fusion y i true label for task i

Approaches Ipeirotis et al. [2013]; Parisi et al. [2014] Unsupervised Majority vote Mean A priori knowledge Supervised Linear Regression Relative performance Semi-Supervised Active/proactive Learning Exploration-exploitation

Spectral Meta-Learner Parisi et al. [2014] Given the sample covariance matrix over the fully-classified classification outcomes, A FC A, ˆQ = 1 m 1 m (a j ā) T (a j ā), j=1 it can be shown that { 1 µ 2 j i = j Q ij = lim ˆQij = n (1 b 2 )(2π i 1)(2π j 1) o.w. where b = P(y i = 1) P(y i = 1) is the class imbalance and π i = P(ȳ i =1 y i =1)+P(ȳ i = 1 y i = 1) 2 is the balanced accuracy.

Spectral Meta-Learner Parisi et al. [2014] This implies: Q λvv T and further that v j (2π j 1)! This leads us to a maximum likelihood estimate for data fusion of multiple labels from noisy agents the so-called Spectral Meta-Learner (SML): ȳ MLE i m = sign ŷ ij v j. (5) j=1

Resources Hardware Dell Desktop Computer (100 GB RAM, 16 Processors, 3 GHz) Software MATLAB Databases Office Object Database Translational Neuroscience Branch, Army Laboratories Office Object Database RSVP Collection Translational Neuroscience Branch, Army Laboratories

Validation On-line simulation of six agents of predetermined reliability and cost Compare against MATLAB integer programming application On-line simulation of two BCI agents and three computer vision agents Actual results from experiments on Office Object Database Compare (speed and accuracy) against full-assignment off-line analysis

Schedule (with Milestones*) Preparation ( - 15 OCT) Proposal (15 OCT)* Late 1st Semester (15 OCT - 4 DEC) Implement branch and bound algorithm (6 NOV)* Implement greedy search algorithm Compare performance of assignment algorithms Mid-year Review (4 DEC)* Early 2nd Semester (25 JAN - 11 MAR) Build agent classes Integrate all components into a system (26 FEB)* Validation (11 MAR)* Late 2nd Semester (21 MAR - 1 MAY) Performance analysis (15 APR)* Final Presentation and Results (6 MAY)*

Deliverables Software Fusion module Assignment module Agent classes Executive script Analysis Performance analysis and implications for human-autonomous systems Computational complexity of system

Jens Clausen. Branch and bound algorithms-principles and examples. 1999. Marshall L. Fisher. The Lagrangian Relaxation Method for Solving Integer Programming Problems. Management Science, 50(12_supplement):1861 1871, December 2004. Chien-ju Ho, Shahin Jabbari, and Jennifer Wortman Vaughan. Adaptive for Crowdsourced Classification. Proceedings of the 30th International Conference on Machine Learning, 28, 2013. Panagiotis G. Ipeirotis, Foster Provost, Victor S. Sheng, and Jing Wang. Repeated labeling using multiple noisy labelers. Data Mining and Knowledge Discovery, 28(2):402 441, March 2013. David R. Karger, Sewoong Oh, and Devavrat Shah. Budget-Optimal Allocation for Reliable Crowdsourcing s. Operations, 62(1):1 24, February 2014. O. Erhun Kundakcioglu and Saed Alizamir. Generalized assignment problem Generalized Assignment Problem. In Christodoulos A. Floudas and Panos M. Pardalos, editors, Encyclopedia of Optimization, pages 1153 1162. Springer US, 2008.

Dolores Romero Morales and H. Edwin Romeijn. The Generalized Assignment Problem and Extensions. In Ding-Zhu Du and Panos M. Pardalos, editors, Handbook of Combinatorial Optimization, pages 259 311. Springer US, 2004. Fabio Parisi, Francesco Strino, Boaz Nadler, and Yuval Kluger. Ranking and combining multiple predictors without labeled data. Proceedings of the National Academy of Sciences, 111(4):1253 1258, January 2014. Anthony J. Ries and Gabriella B. Larkin. Stimulus and Response-Locked P3 Activity in a Dynamic Rapid Serial Visual Presentation (RSVP). Technical report, January 2013. P. Sajda, E. Pohlmeyer, Jun Wang, L.C. Parra, C. Christoforou, J. Dmochowski, B. Hanna, C. Bahlmann, M.K. Singh, and Shih-Fu Chang. In a Blink of an Eye and a Switch of a Transistor: Cortically Coupled Computer Vision. Proceedings of the IEEE, 98(3):462 478, March 2010.