Crowdsourcing Multi-Label Classification. Jonathan Bragg University of Washington
Transcription
1 Crowdsourcing Multi-Label Classification Jonathan Bragg University of Washington
2 Collaborators Dan Weld University of Washington Mausam University of Washington → IIT Delhi
3 Overview What is multi-label classification? Simple threshold approaches Probabilistic approaches Choosing which questions to ask
4 Overview What is multi-label classification? Simple threshold approaches Probabilistic approaches Choosing which questions to ask
5 Standard classification Data Classes A B C Dai et al. 2013; Kamar, Hacker, and Horvitz 2012; Raykar et al. 2010; Sheng, Provost, and Ipeirotis 2008; Wauthier and Jordan 2011; Welinder et al. 2010; Whitehill et al. 2009
6 Multi-label classification Data Classes label label label label label label label label label label
7 Social tagging
8 Overview What is multi-label classification? Simple threshold approaches Probabilistic approaches Choosing which questions to ask
9 Overview What is multi-label classification? Simple threshold approaches Lossless stopping One-away heuristic Probabilistic approaches Choosing which questions to ask
10 Sample problem location building person car artist architect animal tiger fish athlete
11 A naïve approach tiger? animal? True True False False person? True False
12 A naïve approach threshold T animal? True False k votes ≥ T? Yes No accept reject Majority voting: T=k/2
13 Lossless stopping k=5 T=3 animal? True False T true? accept
14 Lossless stopping k=5 T=3 animal? True False T true? accept; k-T+1 false? reject. Ask two people to vote and only ask a third if the first two disagree - TurKit [Little et al. 2009]
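The lossless stopping rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `lossless_decision` and its return shape are assumptions. The point is that once T true votes or k-T+1 false votes have arrived, the remaining votes cannot change the thresholded outcome, so stopping early gives the same answer as collecting all k.

```python
from typing import Iterable, Optional, Tuple


def lossless_decision(votes: Iterable[bool], k: int = 5, t: int = 3) -> Tuple[Optional[bool], int]:
    """Hypothetical helper: consume votes one at a time and stop as soon as
    the outcome of a k-vote, threshold-t decision is already determined.

    Returns (decision, votes_used); decision is None if the supplied votes
    ran out before the outcome was determined.
    """
    n_true = n_false = 0
    for vote in votes:
        if vote:
            n_true += 1
        else:
            n_false += 1
        if n_true >= t:                 # threshold reached: accept
            return True, n_true + n_false
        if n_false >= k - t + 1:        # threshold now unreachable: reject
            return False, n_true + n_false
    return None, n_true + n_false


print(lossless_decision([True, True, True, False, False]))  # stops after 3 votes
```

With k=5 and T=3, three agreeing votes in either direction settle the question, which is where the labor savings come from.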
15 One-away heuristic k=5 T=3 animal? True False T-1 true, 0 false: accept; 0 true, k-T false: reject
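One reading of the one-away heuristic is sketched below, under the assumption that it adds two early exits to lossless stopping: accept when the votes are unanimous and one short of the threshold (T-1 true, 0 false), and reject when they are unanimously false and one short of the lossless rejection count (0 true, k-T false). The function name `one_away_decision` is hypothetical, and the exact rule used in the experiments may differ.

```python
from typing import Iterable, Optional, Tuple


def one_away_decision(votes: Iterable[bool], k: int = 5, t: int = 3) -> Tuple[Optional[bool], int]:
    """Hypothetical sketch: lossless stopping plus 'one-away' early exits.

    Returns (decision, votes_used); decision is None if votes run out first.
    """
    n_true = n_false = 0
    for vote in votes:
        if vote:
            n_true += 1
        else:
            n_false += 1
        used = n_true + n_false
        # Lossless rules: the k-vote outcome is already determined.
        if n_true >= t:
            return True, used
        if n_false >= k - t + 1:
            return False, used
        # One-away rules (assumed reading): unanimous and one vote short.
        if n_false == 0 and n_true >= t - 1:
            return True, used
        if n_true == 0 and n_false >= k - t:
            return False, used
    return None, n_true + n_false


print(one_away_decision([True, True]))    # unanimous, one short of T: accept
print(one_away_decision([False, False]))  # unanimous false: reject
```

Unlike lossless stopping, this can disagree with the full k-vote outcome, which is why the deck describes it as introducing a little error.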
16 An application: taxonomies
17 Cascade [Chilton et al. 2013] the crowd Generate Labels Assign Labels to Data Infer Global Taxonomy multi-label classification naïve cost: data × labels × votes
18 An experiment 100 Entities: Brad Pitt, Kenny G, Washington DC, Martha Stewart, Whidbey Island, The Boston Globe, Honda Accord, Shanghai. 33 Labels: person, actor, director, vehicle, architect, car, city, location, island, country. Fine-grained entity tags [Ling and Weld 2012]
19 Labor reduction One-away saves 58% Lossless saves 56%
20 Summary Threshold approaches Lossless stopping (no error) One-away heuristic (little error) Reductions in labor over 50%
21 Overview What is multi-label classification? Simple threshold approaches Lossless stopping One-away heuristic Probabilistic approaches Choosing which questions to ask
22 Overview What is multi-label classification? Simple threshold approaches Lossless stopping One-away heuristic Probabilistic approaches Independent Multi-label naïve Bayes (MLNB) Choosing which questions to ask
23 A simple probabilistic model Independent animal? P(animal) True False F T T F T P(animal = True) = 0.04 P(animal = True | animal? F) = P(animal = True | animal? F T) =
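The posterior values on this slide did not survive transcription, but the update itself can be sketched. The example below assumes a simple noise model that the slide does not state: every worker answers correctly with the same probability `accuracy`, independently given the true label. The function name `posterior_true` and the value 0.8 are illustrative assumptions; only the prior 0.04 comes from the slide.

```python
from typing import Sequence


def posterior_true(prior: float, votes: Sequence[bool], accuracy: float = 0.8) -> float:
    """P(label = True | votes) by Bayes' rule for a single, independent label.

    Assumes (hypothetically) symmetric worker noise: each vote matches the
    true label with probability `accuracy`.
    """
    like_true = 1.0   # P(votes | label = True)
    like_false = 1.0  # P(votes | label = False)
    for vote in votes:
        like_true *= accuracy if vote else 1.0 - accuracy
        like_false *= (1.0 - accuracy) if vote else accuracy
    numerator = prior * like_true
    return numerator / (numerator + (1.0 - prior) * like_false)


prior = 0.04  # P(animal = True) from the slide
for observed in ([], [False], [False, True]):
    print(observed, posterior_true(prior, observed))
```

Under this noise model a False vote pushes the belief below the prior and a True vote pushes it above, with each label updated in isolation from the others.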
24 Are labels independent? animal? person? tiger? P(animal = True) = 0.04 P(animal = True | person = True) = P(animal = True | tiger = True) =
25 Modeling label co-occurrence P(person) P(animal) P(tiger)
26 Modeling label co-occurrence Multi-label naïve Bayes (MLNB) P(animal) P(person | animal) P(tiger | animal) |labels| trees P(animal = True | person? T) P(animal = True)
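A rough sketch of inference in one MLNB tree follows, assuming the structure suggested by the slide: the target label (here animal) is the root, every other label depends on it through P(other | target), and votes are noisy observations of each label. The function names, the conditional probabilities, and the symmetric worker `accuracy` are all hypothetical; the real model's parameters are learned and are not given in the deck.

```python
from typing import Dict, Sequence, Tuple


def vote_likelihood(votes: Sequence[bool], label_value: bool, accuracy: float) -> float:
    """P(votes | label = label_value) under assumed symmetric worker noise."""
    p = 1.0
    for vote in votes:
        p *= accuracy if vote == label_value else 1.0 - accuracy
    return p


def mlnb_posterior(
    target: str,
    prior: float,
    cond: Dict[str, Tuple[float, float]],
    votes: Dict[str, Sequence[bool]],
    accuracy: float = 0.8,
) -> float:
    """P(target = True | all votes) for one naive-Bayes tree rooted at target.

    cond[other] = (P(other=True | target=True), P(other=True | target=False)).
    """
    score = {}
    for value in (True, False):
        s = prior if value else 1.0 - prior
        s *= vote_likelihood(votes.get(target, []), value, accuracy)
        for other, (p_if_true, p_if_false) in cond.items():
            p_other = p_if_true if value else p_if_false
            other_votes = votes.get(other, [])
            # Marginalize over the unobserved value of the other label.
            s *= (p_other * vote_likelihood(other_votes, True, accuracy)
                  + (1.0 - p_other) * vote_likelihood(other_votes, False, accuracy))
        score[value] = s
    return score[True] / (score[True] + score[False])


# Illustrative numbers only: animals are rarely persons and sometimes tigers.
cond = {"person": (0.01, 0.10), "tiger": (0.50, 0.001)}
print(mlnb_posterior("animal", 0.04, cond, {"person": [True]}))  # below 0.04
print(mlnb_posterior("animal", 0.04, cond, {"tiger": [True]}))   # above 0.04
```

Each tree costs time linear in the number of labels, and there is one tree per label, which is consistent with the quadratic per-item inference cost quoted in the model comparison.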
27 Model comparison. Inference speed (per item): Independent O(|labels|), MLNB O(|labels|^2). Number of parameters: Independent O(|labels|), MLNB O(|labels|^2)
28 Overview What is multi-label classification? Simple threshold approaches Lossless stopping One-away heuristic Probabilistic approaches Independent Multi-label naïve Bayes Choosing which questions to ask
29 Overview What is multi-label classification? Simple threshold approaches Lossless stopping One-away heuristic Probabilistic approaches Independent Multi-label naïve Bayes Choosing which questions to ask
30 Choosing which questions to ask Decision policy: How do we choose the next label? Compute heuristic Select best label Observe vote
31 Choosing which questions to ask Round-robin policy (e.g., Cascade) Votes animal person tiger
32 A probabilistic approach Greedy policy Most uncertain heuristic Votes animal person tiger P = 0.04 P = 0.09 P = 0.02
33 A probabilistic approach Greedy policy Most uncertain heuristic Votes False animal person tiger P = 0.04 P = 0.09 P = 0.02
34 A probabilistic approach Greedy policy Most uncertain heuristic Votes False animal person tiger P = 0.04 P = 0.01 P = 0.02
35 A probabilistic approach Greedy policy Most uncertain heuristic Votes False animal person tiger P = 0.05 P = 0.01 P = 0.03
36 A probabilistic approach Greedy policy Most uncertain heuristic Votes True False animal person tiger P = 0.05 P = 0.01 P = 0.03
37 A probabilistic approach Greedy policy Most uncertain heuristic Votes True False animal person tiger P = 0.7 P = 0.01 P = 0.03
38 A probabilistic approach Greedy policy Most uncertain heuristic Votes True False animal person tiger P = 0.7 P = P = 0.1
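The selection step in this slide sequence can be sketched as picking the label whose current belief is closest to 0.5. The function name `most_uncertain` is an assumption; the belief values are the ones shown on the slides.

```python
from typing import Dict


def most_uncertain(beliefs: Dict[str, float]) -> str:
    """Return the label whose probability of being True is closest to 0.5."""
    return min(beliefs, key=lambda label: abs(beliefs[label] - 0.5))


# Beliefs before any vote, then after a False vote on the first pick.
print(most_uncertain({"animal": 0.04, "person": 0.09, "tiger": 0.02}))
print(most_uncertain({"animal": 0.05, "person": 0.01, "tiger": 0.03}))
```

On these numbers the heuristic asks about person first and animal second, which appears to match the order in which votes arrive on the slides.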
39 A probabilistic approach Votes Greedy policy Most uncertain heuristic Information gain heuristic True False Performance guarantees animal person tiger P = 0.7 P = P = 0.1
40 Information gain heuristic Entropy = uncertainty of labels 1 High 1 Low
41 Information gain heuristic Information gain = expected reduction in entropy (uncertainty) tiger? Higher (better)
42 Information gain heuristic Information gain = expected reduction in entropy (uncertainty) In our models, these computations are local and inexpensive Guaranteed (1-1/e) ≈ 63% of optimal Sensor selection [Krause and Guestrin 2005]
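Information gain for a single question can be sketched as the current entropy minus the expected entropy after seeing the vote. The example restricts itself to one binary label with an assumed symmetric worker `accuracy` strictly between 0.5 and 1; in the MLNB model the entropy would be taken over more than one label, which this sketch does not attempt. All function names are hypothetical.

```python
import math


def entropy(p: float) -> float:
    """Binary entropy in bits of a belief p = P(label = True)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def update(p: float, vote: bool, accuracy: float) -> float:
    """Posterior belief after one vote (assumed symmetric worker noise)."""
    like_true = accuracy if vote else 1.0 - accuracy
    like_false = (1.0 - accuracy) if vote else accuracy
    return p * like_true / (p * like_true + (1.0 - p) * like_false)


def information_gain(p: float, accuracy: float = 0.8) -> float:
    """Expected reduction in entropy from asking one more worker."""
    p_vote_true = p * accuracy + (1.0 - p) * (1.0 - accuracy)
    expected_after = (p_vote_true * entropy(update(p, True, accuracy))
                      + (1.0 - p_vote_true) * entropy(update(p, False, accuracy)))
    return entropy(p) - expected_after


for belief in (0.04, 0.5, 0.7):
    print(belief, information_gain(belief))
```

In this single-label setting the gain is largest at p = 0.5, so it ranks questions the same way as the most-uncertain heuristic; the two can differ once a vote on one label also informs beliefs about others.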
43 Choosing which questions to ask Greedy policy Information gain Compute heuristic Select best label Observe vote Compute posterior beliefs
44 Probabilistic results [Plot: mean F-score vs. number of votes per item, for MLNB, Independent, and Cascade (Threshold)]
45 Probabilistic results [Plot: mean F-score vs. number of votes per item, for MLNB, Independent, and Cascade (Threshold)]
46 Probabilistic results Over 90% reduction in labor [Plot: mean F-score vs. number of votes per item, for MLNB, Independent, and Cascade (Threshold)]
47 Batching
48 A simple approximation Single-label: 1. Rank labels by information gain 2. Select best label 3. Observe vote. Batched k-best: 1. Rank labels by information gain 2. Select top k labels 3. Observe k votes
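The batched k-best step can be sketched as a single ranking followed by a slice. The function name `top_k_labels` and the gain values are made up for illustration; single-label selection is the special case k=1.

```python
from typing import Dict, List


def top_k_labels(gains: Dict[str, float], k: int) -> List[str]:
    """Rank labels by information gain and return the k best."""
    return sorted(gains, key=lambda label: gains[label], reverse=True)[:k]


# Hypothetical information-gain scores for one item.
gains = {"animal": 0.21, "person": 0.05, "tiger": 0.12, "car": 0.01}
print(top_k_labels(gains, 1))  # single-label policy
print(top_k_labels(gains, 3))  # batched k-best with k=3
```

The approximation is that the gains are computed once per batch rather than recomputed after every vote, so later picks in a batch ignore what the earlier votes would have revealed.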
49 Single-label Batching results Batches of size k=7
50 Summary Threshold approaches Lossless stopping (no error) One-away heuristic (little error) Reductions in labor over 50% Probabilistic approaches Reductions in labor over 90% Theoretical guarantees Batching (little additional error) For details, see [Bragg et al. 2013] to appear at HCOMP