Knowledge Transfer with Interactive Learning of Semantic Relationships

1 Knowledge Transfer with Interactive Learning of Semantic Relationships. Feb. 15, 2016. Jonghyun Choi, Sung Ju Hwang, Leonid Sigal and Larry S. Davis. University of Maryland Institute for Advanced Computer Studies (UMIACS); Comcast Labs, DC; Ulsan National Institute of Science and Technology (UNIST); Disney Research, Pittsburgh

2 We Can Recognize This Image courtesy by

3 We Can Recognize This Too Image courtesy by

4 And We Can Easily Infer What This Is [1] J. Feldman, The structure of perceptual categories, Journal of Mathematical Psychology 1997 [2] J. Tenenbaum, Bayesian modeling of human concept learning, NIPS 1999 [3] T. Tommasi et al, Safety in Numbers: Learning Categories from Few Examples with Multi Model Knowledge Transfer, CVPR 2010 [4] Qi et al, Towards Cross-Category Knowledge Propagation for Learning Visual Concepts, CVPR 2011 [5] B. Lake, R. Salakhutdinov, J. Tenenbaum, Human-level concept learning through probabilistic program induction, Science 2015 Image courtesy by

8 Knowing Some Categories Allows Recognition of Others. If we know these categories well (anchor categories), we may recognize a bird-dog better (target categories) through their relationship [1]. Image courtesy of Google. [1] Tommasi, T.; Orabona, F.; and Caputo, B. Safety in Numbers: Learning Categories from Few Examples with Multi Model Knowledge Transfer. CVPR 2010

10 Learning Semantic Relationship in Metric. Goal: improve the classification accuracy of a rarely seen object category (the target category) in a metric learning framework

13 What Relationship? Relative similarity between a target category and two anchor categories: "A is more similar to B than C", e.g. Chimpanzee (target) is more similar to Gorilla than Deer (anchors). How to obtain the relationships?

14 Creating Relational Knowledge Base is expensive
- Cubically many (N^3) relationships among N categories
- Difficult to answer: relative similarity can be ambiguous
- Not every relationship is equally useful: some are more useful, some are less useful or useless
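
To make the cost argument concrete, here is a minimal sketch that enumerates candidate queries of the form "is target t more similar to anchor a1 than to anchor a2?". It assumes queries are restricted to one target and an ordered pair of distinct anchors (so a question and its reverse are counted separately); the function names are hypothetical and not from the talk.

```python
from itertools import permutations

def candidate_queries(targets, anchors):
    """Enumerate (t, a1, a2): 'is t more similar to a1 than to a2?'."""
    return [(t, a1, a2)
            for t in targets
            for a1, a2 in permutations(anchors, 2)]

def count_candidate_queries(num_targets, num_anchors):
    """Closed-form count of the enumeration above."""
    return num_targets * num_anchors * (num_anchors - 1)

queries = candidate_queries(
    ["chimpanzee", "collie"],
    ["gorilla", "deer", "giraffe", "dalmatian"],
)
small_count = len(queries)                      # 2 targets, 4 anchors
large_count = count_candidate_queries(10, 40)   # 10 targets, 40 anchors
print(small_count, large_count)
```

Even with only 10 targets and 40 anchors, the count runs into the thousands, which is why asking every question is impractical.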

20 Ambiguous Relationship. Who is more similar to Ironman? Power ranger? Tony Stark? Image courtesy of Disney and Toei Company

23 Not Every Relationship is Equally Useful for Classification. Classifier for prettiness: - Is Snow-white prettier than the Queen rather than the Seven Dwarfs? - Is Snow-white prettier than Rapunzel rather than Cinderella? Image courtesy of Disney

25 Our Approach: Ask A Few Useful Questions. Ask the most useful questions through interaction

26 We Propose To learn a semantic space for target categories under constraints on semantic distance from the anchor categories. Target category: a category with few samples whose classification we want to improve. Anchor category: a category with more samples from which knowledge is transferred.

27 We Propose To learn a semantic space for target categories with constraints of semantic distance from the anchor categories Example: - Chimpanzee is more similar to Gorilla than Deer? - Collie is more similar to Dalmatian than Giraffe?

34 Learning Classification Embeddings [Figure: training images of anchor categories and of target categories are each mapped by W into the embedding space. Anchor categories: Gorilla $u_{a_1}$, Deer $u_{a_2}$, Giraffe $u_{a_3}$, Dalmatian $u_{a_4}$. Target categories: Chimpanzee $u_{t_1}$, Collie $u_{t_2}$.]

36 Refine Classification Embedding by Relational Semantics [Figure: embedding space with anchor categories Gorilla $u_{a_1}$, Deer $u_{a_2}$, Giraffe $u_{a_3}$, Dalmatian $u_{a_4}$ and target categories Chimpanzee $u_{t_1}$, Collie $u_{t_2}$.] Queries: - Is Chimpanzee more similar to Gorilla than Deer? - Is Collie more similar to Dalmatian than Giraffe?

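43 Learning Classification Embedding by Relational Semantics Feedback [Figure: embedding space with anchor categories Gorilla $u_{a_1}$, Deer $u_{a_2}$, Giraffe $u_{a_3}$, Dalmatian $u_{a_4}$ and target categories Chimpanzee $u_{t_1}$, Collie $u_{t_2}$.] Queries: - Is Chimpanzee more similar to Gorilla than Deer? - Is Collie more similar to Dalmatian than Giraffe? Feedback becomes distance constraints on the embeddings: $\|u_{t_1} - u_{a_1}\| < \|u_{t_1} - u_{a_2}\|$, $\|u_{t_2} - u_{a_3}\| < \|u_{t_2} - u_{a_4}\|$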

45 Relationship Transferred Metric Learning (RTML). Embed both visual features and label entities - First on the anchor categories only, as a large margin embedding [1]:
$$\min_{W_A, U_A} \sum_{i=1}^{N_A} \sum_{c \in \mathcal{C}_A} L(W_A, x_i, u_c) + \lambda_1 \|W_A\|_F^2 + \lambda_2 \|U_A\|_F^2,$$
$$\text{s.t. } L(W_A, x_i, u_c) = \max\left(\|W_A x_i - u_{y_i}\|_2^2 - \|W_A x_i - u_c\|_2^2 + 1,\ 0\right), \quad \forall i,\ \forall c \neq y_i$$
W: subspace mapper, U: label embeddings
[1] K. Weinberger and L. Saul, Distance Metric Learning for Large Margin Nearest Neighbor Classification, JMLR 2009
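
As an illustration of the anchor-stage objective, the following NumPy sketch evaluates a large margin embedding loss with Frobenius regularizers and classifies by nearest label prototype. It is a sketch under assumptions, not the authors' implementation: a unit margin, regularization weights `lam1` and `lam2`, and the function names are all chosen here for illustration.

```python
import numpy as np

def squared_distances(W, U, X):
    """Squared distance from each projected sample W x_i to each prototype u_c."""
    Z = X @ W.T                                   # (n, k) projected features
    return ((Z[:, None, :] - U[None, :, :]) ** 2).sum(axis=2)   # (n, C)

def lme_objective(W, U, X, y, lam1=0.1, lam2=0.1, margin=1.0):
    """Hinge loss over all wrong classes plus Frobenius regularizers.

    X: (n, d) features, y: (n,) integer labels,
    W: (k, d) subspace mapper, U: (C, k) label prototypes (one row per class).
    margin=1.0 is an assumption of this sketch.
    """
    d2 = squared_distances(W, U, X)
    idx = np.arange(len(y))
    d_true = d2[idx, y][:, None]                  # distance to own prototype
    hinge = np.maximum(d_true - d2 + margin, 0.0)
    hinge[idx, y] = 0.0                           # skip the c == y_i terms
    return hinge.sum() + lam1 * np.sum(W ** 2) + lam2 * np.sum(U ** 2)

def predict(W, U, X):
    """Assign each sample to its nearest label prototype."""
    return squared_distances(W, U, X).argmin(axis=1)

# Toy data (made up): two well-separated classes in 2-D.
W = np.eye(2)
U = np.array([[0.0, 0.0], [10.0, 0.0]])
X = np.array([[0.0, 0.0], [10.0, 0.0]])
loss_good = lme_objective(W, U, X, np.array([0, 1]), lam1=0.0, lam2=0.0)
loss_bad = lme_objective(W, U, X, np.array([1, 0]), lam1=0.0, lam2=0.0)
labels = predict(W, U, X)
print(loss_good, loss_bad, labels)
```

The loss is zero when every sample is closer to its own prototype than to any other by at least the margin, and grows when labels and prototypes disagree.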

49 Relationship Transferred Metric Learning (RTML). Learn target category label prototypes and update W:
$$\min_{W_T, U_T} \sum_{i=1}^{N_T} \sum_{c \in \mathcal{C}_T} L(W_T, x_i, u_c) + \lambda_1 \|W_T\|_F^2 + \lambda_2 \|U\|_F^2 + \lambda_3 \|W_T - W_A\|_F^2 + \sum_j \Omega(R_j, U),$$
$$\text{s.t. } L(W_T, x_i, u_c) = \max\left(\|W_T x_i - u_{y_i}\|_2^2 - \|W_T x_i - u_c\|_2^2 + 1,\ 0\right), \quad \forall i,\ \forall c \neq y_i,\ R_j \in \mathcal{R}$$
- The loss term is the large margin embedding [1] on the target categories
- The $\lambda_3$ term keeps $W_T$ bound to $W_A$
- The $\Omega$ term enforces the semantic relations
W: subspace mapper, U: label embeddings, $\Omega$: semantic relational constraints
[1] K. Weinberger and L. Saul, Distance Metric Learning for Large Margin Nearest Neighbor Classification, JMLR 2009

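50 Semantic Relational Constraints [Figure: target prototype $u_t$ (Chimpanzee) with anchor prototypes $u_{a_1}$ (Gorilla) and $u_{a_2}$ (Deer).] $u_t$ should be closer to $u_{a_1}$ than to $u_{a_2}$:
$$\|u_{a_1} - u_t\|_2^2 < \|u_{a_2} - u_t\|_2^2 \iff 1 < \frac{\|u_{a_2} - u_t\|_2^2}{\|u_{a_1} - u_t\|_2^2}$$
$$\min_U \max\left(1 - \frac{\|u_{a_2} - u_t\|_2^2}{\|u_{a_1} - u_t\|_2^2},\ 0\right)$$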

52 Semantic Relational Constraints. $u_t$ should be closer to $u_{a_1}$ than to $u_{a_2}$:
$$\min_U \max\left(1 - \frac{\|u_{a_2} - u_t\|_2^2}{\|u_{a_1} - u_t\|_2^2},\ 0\right)$$
This is neither convex nor differentiable, so relax it in the manner of [1]:
$$\sum_j \Omega(R_j, U) = \gamma_1\, h\left(\|u_{a_1} - u_t\|_2^2 - \|u_{a_2} - u_t\|_2^2\right)$$
The objective function now becomes convex.
[1] Hwang et al., Analogy-preserving semantic embedding for visual object categorization, ICML 2013
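
A minimal sketch of a relaxed relational penalty follows. It assumes a plain hinge on the difference of squared distances with anchor prototypes held fixed; the exact relaxation and its weights in the talk may differ, and the function names and step size are hypothetical. Under this assumption the difference is affine in $u_t$, so the hinge is convex in the target prototype.

```python
import numpy as np

def relational_penalty(U, triplets, margin=0.0):
    """Sum of hinge penalties over triplets (t, a1, a2).

    Each triplet asks that row U[t] be closer to U[a1] than to U[a2].
    """
    total = 0.0
    for t, a1, a2 in triplets:
        d1 = np.sum((U[a1] - U[t]) ** 2)
        d2 = np.sum((U[a2] - U[t]) ** 2)
        total += max(d1 - d2 + margin, 0.0)
    return total

def target_subgradient(U, triplets, margin=0.0):
    """Subgradient of the penalty w.r.t. target rows only (anchors fixed)."""
    G = np.zeros_like(U)
    for t, a1, a2 in triplets:
        d1 = np.sum((U[a1] - U[t]) ** 2)
        d2 = np.sum((U[a2] - U[t]) ** 2)
        if d1 - d2 + margin > 0:
            # d/du_t (d1 - d2) = 2(u_t - u_a1) - 2(u_t - u_a2) = 2(u_a2 - u_a1)
            G[t] += 2.0 * (U[a2] - U[a1])
    return G

# Toy prototypes (made up): row 0 = gorilla, row 1 = deer, row 2 = chimpanzee.
U = np.array([[0.0, 0.0], [4.0, 0.0], [3.0, 0.0]])
triplets = [(2, 0, 1)]                 # chimpanzee closer to gorilla than deer
penalty_before = relational_penalty(U, triplets)
U[2] -= 0.25 * target_subgradient(U, triplets)[2]   # one subgradient step
penalty_after = relational_penalty(U, triplets)
print(penalty_before, penalty_after)
```

In the toy run the chimpanzee prototype starts nearer to deer, and a single step moves it to the gorilla side, after which the constraint is satisfied.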

53 Questions in What Order? Ask the ones that improve accuracy the most - Score the questions (over target and anchor pairs) by expected accuracy improvement. How to obtain the scores:
1. No scores - Random (baseline)
2. Training accuracy (entropy)
3. Accuracy improvement on a validation set
4. Linear regression: predict the accuracy improvement on the validation set
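
The third scoring option (accuracy improvement on a validation set) can be sketched as below. This is an illustration only: `evaluate` stands for a hypothetical routine that retrains with a given constraint list and returns validation accuracy, and the toy gains are made-up numbers, not results from the talk.

```python
def rank_queries(candidates, accepted, evaluate):
    """Rank candidate queries by validation-accuracy gain when added alone."""
    base = evaluate(accepted)
    scored = [(evaluate(accepted + [q]) - base, q) for q in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored

# Made-up gains standing in for "retrain and measure on a validation set".
TOY_GAIN = {
    ("persian cat", "fox", "blue whale"): 0.03,
    ("persian cat", "dalmatian", "beaver"): 0.01,
    ("persian cat", "horse", "grizzly bear"): -0.02,
}

def toy_evaluate(constraints):
    """Hypothetical evaluator: base accuracy plus the gain of each constraint."""
    return 0.25 + sum(TOY_GAIN[c] for c in constraints)

ranked = rank_queries(list(TOY_GAIN), [], toy_evaluate)
best_gain, best_query = ranked[0]
print(best_query)
```

Scoring every candidate this way requires one retraining per question, which is presumably what motivates the cheaper regression-based predictor listed as option 4.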

61 Experiments: Datasets
Animals with Attributes (AwA) [1]
- 50 animal classes (30,475 images)
- 10 target classes (2, 5 or 10 training images per class)
- 40 anchor classes (30 training images per class)
ImageNet-50 [2]
- 50 object classes (70,380 images)
- 10 target classes (2, 5 or 10 training images per class)
- 40 anchor classes (30 training images per class)
[1] C. H. Lampert, H. Nickisch, and S. Harmeling. "Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer". CVPR, 2009
[2] S. J. Hwang, Analogy-preserving Semantic Embedding for Visual Object Categorization, ICML 2013

66 Questions Generated At Iterations
Top ranked query for persian-cat at each iteration
Iteration | Positively answered query at its highest rank
1 | fox - persian cat < blue whale - persian cat
2 | grizzly bear - persian cat < horse - persian cat
3 | dalmatian - persian cat < beaver - persian cat
4 | dalmatian - persian cat < german shepherd - persian cat
As interactions continue, the top ranked query becomes semantically more meaningful.

67 Qualitative Results
Persian Cat: nearest neighbors in anchor classes
Method | Nearest neighbors
RTML (Ours, iter 3) | Wolf, Siamese-Cat, Fox, Weasel, Hamster
RTML (Ours, iter 2) | Wolf, Siamese-Cat, Hamster, Chihuahua, Weasel
RTML (Ours, iter 1) | Wolf, Hamster, Siamese-Cat, Fox, Weasel
LME-Transfer | Blue-Whale, Dolphin, Hamster, Siamese-Cat, Wolf
LME | Blue-Whale, Dolphin, Hamster, Siamese-Cat, Wolf
As iterations proceed, nearest neighbors become more semantically meaningful
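
A nearest-neighbor listing of this kind can be produced from the learned label embeddings roughly as follows. The sketch assumes Euclidean distance between prototypes; the coordinates are made up for illustration and the function name is hypothetical.

```python
import numpy as np

def nearest_anchors(u_target, anchor_embeddings, anchor_names, k=5):
    """Return the names of the k anchor prototypes closest to a target prototype."""
    d2 = np.sum((anchor_embeddings - u_target) ** 2, axis=1)
    order = np.argsort(d2)[:k]
    return [anchor_names[i] for i in order]

# Made-up 2-D prototypes, only to show the call.
anchor_names = ["wolf", "siamese cat", "blue whale", "hamster"]
anchor_embeddings = np.array([
    [1.0, 0.0],    # wolf
    [0.0, 2.0],    # siamese cat
    [9.0, 9.0],    # blue whale
    [3.0, 0.0],    # hamster
])
u_persian_cat = np.array([0.0, 0.0])
neighbors = nearest_anchors(u_persian_cat, anchor_embeddings, anchor_names, k=3)
print(neighbors)
```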

78 Reduced Number of Questions. Answering better questions at every iteration improves accuracy faster. [Plot: batch-mode question generation vs. interactive question generation.] When to stop? When accuracy starts to decrease in two consecutive iterations.
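
The stopping rule on this slide can be sketched as a small check on the accuracy history. This assumes the monitored quantity is accuracy measured after each interaction round (for example on a validation set); the function name and the `patience` parameter are hypothetical.

```python
def should_stop(accuracy_history, patience=2):
    """True if accuracy decreased in each of the last `patience` iterations."""
    if len(accuracy_history) < patience + 1:
        return False
    recent = accuracy_history[-(patience + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))

still_improving = should_stop([0.24, 0.25, 0.26])
two_drops = should_stop([0.24, 0.26, 0.25, 0.245])
one_drop_then_rise = should_stop([0.24, 0.26, 0.25, 0.255])
print(still_improving, two_drops, one_drop_then_rise)
```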

80 By Different Query Selection Criteria
Classification Accuracy (%) for Comparing Quality of Scoring Function (three #samples/class settings; rows are marked on the slide as Baseline or Proposed)
Animals with Attributes
Method | #samples/class
LME | 22.51± | ± | ±1.33
LME-Transfer | 24.59± | ± | ±1.67
Random | 24.75± | ± | ±1.66
Entropy | 24.96± | ± | ±1.91
Active-Regression | 25.43± | ± | ±0.88
Active | 26.62± | ± | ±1.33
Interactive | 27.24± | ± | ±1.60
Interactive-UB | 28.57± | ± | ±1.83
ImageNet-50
Method | #samples/class
LME | 23.20± | ± | ±1.62
LME-Transfer | 23.47± | ± | ±1.03
Random | 24.23± | ± | ±2.26
Entropy | 24.60± | ± | ±0.99
Active-Regression | 23.34± | ± | ±0.89
Active | 24.35± | ± | ±1.01
Interactive | 24.95± | ± | ±1.01
Interactive-UB | 25.15± | ± | ±1.53

81 Summary
- We propose an efficient, interactive strategy for collecting category relationship semantics
- Embedding semantic regularization into the classification model makes the model semantically more meaningful over iterations
- Classification accuracy improves with a small number of human verifications

82 Q/A Thank you!


More information

Machine Learning for Signal Processing Bayes Classification and Regression

Machine Learning for Signal Processing Bayes Classification and Regression Machine Learning for Signal Processing Bayes Classification and Regression Instructor: Bhiksha Raj 11755/18797 1 Recap: KNN A very effective and simple way of performing classification Simple model: For

More information

Machine Learning for Structured Prediction

Machine Learning for Structured Prediction Machine Learning for Structured Prediction Grzegorz Chrupa la National Centre for Language Technology School of Computing Dublin City University NCLT Seminar Grzegorz Chrupa la (DCU) Machine Learning for

More information

Machine Learning for Signal Processing Bayes Classification

Machine Learning for Signal Processing Bayes Classification Machine Learning for Signal Processing Bayes Classification Class 16. 24 Oct 2017 Instructor: Bhiksha Raj - Abelino Jimenez 11755/18797 1 Recap: KNN A very effective and simple way of performing classification

More information

Kernel Methods and Support Vector Machines

Kernel Methods and Support Vector Machines Kernel Methods and Support Vector Machines Oliver Schulte - CMPT 726 Bishop PRML Ch. 6 Support Vector Machines Defining Characteristics Like logistic regression, good for continuous input features, discrete

More information

Being Bayesian About Network Structure:

Being Bayesian About Network Structure: Being Bayesian About Network Structure: A Bayesian Approach to Structure Discovery in Bayesian Networks Nir Friedman and Daphne Koller Machine Learning, 2003 Presented by XianXing Zhang Duke University

More information

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH Hoang Trang 1, Tran Hoang Loc 1 1 Ho Chi Minh City University of Technology-VNU HCM, Ho Chi

More information

Deep Convolutional Neural Networks for Pairwise Causality

Deep Convolutional Neural Networks for Pairwise Causality Deep Convolutional Neural Networks for Pairwise Causality Karamjit Singh, Garima Gupta, Lovekesh Vig, Gautam Shroff, and Puneet Agarwal TCS Research, Delhi Tata Consultancy Services Ltd. {karamjit.singh,

More information

Online Videos FERPA. Sign waiver or sit on the sides or in the back. Off camera question time before and after lecture. Questions?

Online Videos FERPA. Sign waiver or sit on the sides or in the back. Off camera question time before and after lecture. Questions? Online Videos FERPA Sign waiver or sit on the sides or in the back Off camera question time before and after lecture Questions? Lecture 1, Slide 1 CS224d Deep NLP Lecture 4: Word Window Classification

More information

Learning Predictive Categories Using Lifted Relational Neural Networks

Learning Predictive Categories Using Lifted Relational Neural Networks Learning Predictive Categories Using Lifted Relational Neural Networks Gustav Šourek1, Suresh Manandhar 2, Filip Železný1, Steven Schockaert 3, and Ondřej Kuželka 3 1 Czech Technical University, Prague,

More information

Probabilistic Models for Sequence Labeling

Probabilistic Models for Sequence Labeling Probabilistic Models for Sequence Labeling Besnik Fetahu June 9, 2011 Besnik Fetahu () Probabilistic Models for Sequence Labeling June 9, 2011 1 / 26 Background & Motivation Problem introduction Generative

More information

Multi-label Active Learning with Auxiliary Learner

Multi-label Active Learning with Auxiliary Learner Multi-label Active Learning with Auxiliary Learner Chen-Wei Hung and Hsuan-Tien Lin National Taiwan University November 15, 2011 C.-W. Hung & H.-T. Lin (NTU) Multi-label AL w/ Auxiliary Learner 11/15/2011

More information

Expectation propagation as a way of life

Expectation propagation as a way of life Expectation propagation as a way of life Yingzhen Li Department of Engineering Feb. 2014 Yingzhen Li (Department of Engineering) Expectation propagation as a way of life Feb. 2014 1 / 9 Reference This

More information

Lecture 3: Pattern Classification. Pattern classification

Lecture 3: Pattern Classification. Pattern classification EE E68: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mitures and

More information

Information Extraction from Text

Information Extraction from Text Information Extraction from Text Jing Jiang Chapter 2 from Mining Text Data (2012) Presented by Andrew Landgraf, September 13, 2013 1 What is Information Extraction? Goal is to discover structured information

More information

Global Optimality in Matrix and Tensor Factorization, Deep Learning & Beyond

Global Optimality in Matrix and Tensor Factorization, Deep Learning & Beyond Global Optimality in Matrix and Tensor Factorization, Deep Learning & Beyond Ben Haeffele and René Vidal Center for Imaging Science Mathematical Institute for Data Science Johns Hopkins University This

More information

Undirected Graphical Models

Undirected Graphical Models Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional

More information

Text Mining. Dr. Yanjun Li. Associate Professor. Department of Computer and Information Sciences Fordham University

Text Mining. Dr. Yanjun Li. Associate Professor. Department of Computer and Information Sciences Fordham University Text Mining Dr. Yanjun Li Associate Professor Department of Computer and Information Sciences Fordham University Outline Introduction: Data Mining Part One: Text Mining Part Two: Preprocessing Text Data

More information

Introduction to Neural Networks

Introduction to Neural Networks CUONG TUAN NGUYEN SEIJI HOTTA MASAKI NAKAGAWA Tokyo University of Agriculture and Technology Copyright by Nguyen, Hotta and Nakagawa 1 Pattern classification Which category of an input? Example: Character

More information

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata Principles of Pattern Recognition C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata e-mail: murthy@isical.ac.in Pattern Recognition Measurement Space > Feature Space >Decision

More information

Global Scene Representations. Tilke Judd

Global Scene Representations. Tilke Judd Global Scene Representations Tilke Judd Papers Oliva and Torralba [2001] Fei Fei and Perona [2005] Labzebnik, Schmid and Ponce [2006] Commonalities Goal: Recognize natural scene categories Extract features

More information

Concept Learning. Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University.

Concept Learning. Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University. Concept Learning Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University References: 1. Tom M. Mitchell, Machine Learning, Chapter 2 2. Tom M. Mitchell s

More information

Neural Networks. David Rosenberg. July 26, New York University. David Rosenberg (New York University) DS-GA 1003 July 26, / 35

Neural Networks. David Rosenberg. July 26, New York University. David Rosenberg (New York University) DS-GA 1003 July 26, / 35 Neural Networks David Rosenberg New York University July 26, 2017 David Rosenberg (New York University) DS-GA 1003 July 26, 2017 1 / 35 Neural Networks Overview Objectives What are neural networks? How

More information

Part-of-Speech Tagging + Neural Networks 3: Word Embeddings CS 287

Part-of-Speech Tagging + Neural Networks 3: Word Embeddings CS 287 Part-of-Speech Tagging + Neural Networks 3: Word Embeddings CS 287 Review: Neural Networks One-layer multi-layer perceptron architecture, NN MLP1 (x) = g(xw 1 + b 1 )W 2 + b 2 xw + b; perceptron x is the

More information

Loss Functions and Optimization. Lecture 3-1

Loss Functions and Optimization. Lecture 3-1 Lecture 3: Loss Functions and Optimization Lecture 3-1 Administrative Assignment 1 is released: http://cs231n.github.io/assignments2017/assignment1/ Due Thursday April 20, 11:59pm on Canvas (Extending

More information

Final Exam, Machine Learning, Spring 2009

Final Exam, Machine Learning, Spring 2009 Name: Andrew ID: Final Exam, 10701 Machine Learning, Spring 2009 - The exam is open-book, open-notes, no electronics other than calculators. - The maximum possible score on this exam is 100. You have 3

More information

Appendix 1 PROOFS 1.1 PROOF OF LEMMA 1

Appendix 1 PROOFS 1.1 PROOF OF LEMMA 1 Appendix 1 PROOFS In this section, we provide proofs for all the lemmas and theorems in the main paper. We always assume that a class-conditional extension of the Classification Noise Process (CNP) (Angluin

More information

ECS289: Scalable Machine Learning

ECS289: Scalable Machine Learning ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 27, 2015 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass

More information

Riemannian Metric Learning for Symmetric Positive Definite Matrices

Riemannian Metric Learning for Symmetric Positive Definite Matrices CMSC 88J: Linear Subspaces and Manifolds for Computer Vision and Machine Learning Riemannian Metric Learning for Symmetric Positive Definite Matrices Raviteja Vemulapalli Guide: Professor David W. Jacobs

More information

Learning from the Wisdom of Crowds by Minimax Entropy. Denny Zhou, John Platt, Sumit Basu and Yi Mao Microsoft Research, Redmond, WA

Learning from the Wisdom of Crowds by Minimax Entropy. Denny Zhou, John Platt, Sumit Basu and Yi Mao Microsoft Research, Redmond, WA Learning from the Wisdom of Crowds by Minimax Entropy Denny Zhou, John Platt, Sumit Basu and Yi Mao Microsoft Research, Redmond, WA Outline 1. Introduction 2. Minimax entropy principle 3. Future work and

More information

Active Learning with Cross-class Similarity Transfer

Active Learning with Cross-class Similarity Transfer Active Learning with Cross-class Similarity Transfer Yuchen Guo, Guiguang Ding, Yue Gao, Jungong Han Tsinghua National Laboratory for Information Science and Technology (TNList School of Software, Tsinghua

More information

MIRA, SVM, k-nn. Lirong Xia

MIRA, SVM, k-nn. Lirong Xia MIRA, SVM, k-nn Lirong Xia Linear Classifiers (perceptrons) Inputs are feature values Each feature has a weight Sum is the activation activation w If the activation is: Positive: output +1 Negative, output

More information

Support Vector Machine. Industrial AI Lab.

Support Vector Machine. Industrial AI Lab. Support Vector Machine Industrial AI Lab. Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories / classes Binary: 2 different

More information

Structure Learning in Sequential Data

Structure Learning in Sequential Data Structure Learning in Sequential Data Liam Stewart liam@cs.toronto.edu Richard Zemel zemel@cs.toronto.edu 2005.09.19 Motivation. Cau, R. Kuiper, and W.-P. de Roever. Formalising Dijkstra's development

More information

Feature Engineering, Model Evaluations

Feature Engineering, Model Evaluations Feature Engineering, Model Evaluations Giri Iyengar Cornell University gi43@cornell.edu Feb 5, 2018 Giri Iyengar (Cornell Tech) Feature Engineering Feb 5, 2018 1 / 35 Overview 1 ETL 2 Feature Engineering

More information

Apprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning

Apprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning Apprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning Nicolas Thome Prenom.Nom@cnam.fr http://cedric.cnam.fr/vertigo/cours/ml2/ Département Informatique Conservatoire

More information

Lecture 3: Pattern Classification

Lecture 3: Pattern Classification EE E6820: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 1 2 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mixtures

More information

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore

More information

Domain adaptation for deep learning

Domain adaptation for deep learning What you saw is not what you get Domain adaptation for deep learning Kate Saenko Successes of Deep Learning in AI A Learning Advance in Artificial Intelligence Rivals Human Abilities Deep Learning for

More information

Feedback Loop between High Level Semantics and Low Level Vision

Feedback Loop between High Level Semantics and Low Level Vision Feedback Loop between High Level Semantics and Low Level Vision Varun K. Nagaraja Vlad I. Morariu Larry S. Davis {varun,morariu,lsd}@umiacs.umd.edu University of Maryland, College Park, MD, USA. Abstract.

More information

CENG 793. On Machine Learning and Optimization. Sinan Kalkan

CENG 793. On Machine Learning and Optimization. Sinan Kalkan CENG 793 On Machine Learning and Optimization Sinan Kalkan 2 Now Introduction to ML Problem definition Classes of approaches K-NN Support Vector Machines Softmax classification / logistic regression Parzen

More information

Examples are not Enough, Learn to Criticize! Criticism for Interpretability

Examples are not Enough, Learn to Criticize! Criticism for Interpretability Examples are not Enough, Learn to Criticize! Criticism for Interpretability Been Kim, Rajiv Khanna, Oluwasanmi Koyejo Wittawat Jitkrittum Gatsby Machine Learning Journal Club 16 Jan 2017 1/20 Summary Examples

More information

Collaborative Filtering. Radek Pelánek

Collaborative Filtering. Radek Pelánek Collaborative Filtering Radek Pelánek 2017 Notes on Lecture the most technical lecture of the course includes some scary looking math, but typically with intuitive interpretation use of standard machine

More information

Laconic: Label Consistency for Image Categorization

Laconic: Label Consistency for Image Categorization 1 Laconic: Label Consistency for Image Categorization Samy Bengio, Google with Jeff Dean, Eugene Ie, Dumitru Erhan, Quoc Le, Andrew Rabinovich, Jon Shlens, and Yoram Singer 2 Motivation WHAT IS THE OCCLUDED

More information

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted

More information

Learning with Noisy Labels. Kate Niehaus Reading group 11-Feb-2014

Learning with Noisy Labels. Kate Niehaus Reading group 11-Feb-2014 Learning with Noisy Labels Kate Niehaus Reading group 11-Feb-2014 Outline Motivations Generative model approach: Lawrence, N. & Scho lkopf, B. Estimating a Kernel Fisher Discriminant in the Presence of

More information

Logistic Regression Logistic

Logistic Regression Logistic Case Study 1: Estimating Click Probabilities L2 Regularization for Logistic Regression Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Carlos Guestrin January 10 th,

More information

ECE 661: Homework 10 Fall 2014

ECE 661: Homework 10 Fall 2014 ECE 661: Homework 10 Fall 2014 This homework consists of the following two parts: (1) Face recognition with PCA and LDA for dimensionality reduction and the nearest-neighborhood rule for classification;

More information

LINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception

LINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception LINEAR MODELS FOR CLASSIFICATION Classification: Problem Statement 2 In regression, we are modeling the relationship between a continuous input variable x and a continuous target variable t. In classification,

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

Composite Quantization for Approximate Nearest Neighbor Search

Composite Quantization for Approximate Nearest Neighbor Search Composite Quantization for Approximate Nearest Neighbor Search Jingdong Wang Lead Researcher Microsoft Research http://research.microsoft.com/~jingdw ICML 104, joint work with my interns Ting Zhang from

More information

LOCALITY PRESERVING HASHING. Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA

LOCALITY PRESERVING HASHING. Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA LOCALITY PRESERVING HASHING Yi-Hsuan Tsai Ming-Hsuan Yang Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA ABSTRACT The spectral hashing algorithm relaxes

More information

Nonparametric Methods Lecture 5

Nonparametric Methods Lecture 5 Nonparametric Methods Lecture 5 Jason Corso SUNY at Buffalo 17 Feb. 29 J. Corso (SUNY at Buffalo) Nonparametric Methods Lecture 5 17 Feb. 29 1 / 49 Nonparametric Methods Lecture 5 Overview Previously,

More information

Machine Learning for NLP

Machine Learning for NLP Machine Learning for NLP Linear Models Joakim Nivre Uppsala University Department of Linguistics and Philology Slides adapted from Ryan McDonald, Google Research Machine Learning for NLP 1(26) Outline

More information

Lecture 14: Deep Generative Learning

Lecture 14: Deep Generative Learning Generative Modeling CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 14: Deep Generative Learning Density estimation Reconstructing probability density function using samples Bohyung Han

More information

Statistical NLP for the Web Log Linear Models, MEMM, Conditional Random Fields

Statistical NLP for the Web Log Linear Models, MEMM, Conditional Random Fields Statistical NLP for the Web Log Linear Models, MEMM, Conditional Random Fields Sameer Maskey Week 13, Nov 28, 2012 1 Announcements Next lecture is the last lecture Wrap up of the semester 2 Final Project

More information

Support Vector Machines. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Support Vector Machines. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Support Vector Machines CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 A Linearly Separable Problem Consider the binary classification

More information

ECS289: Scalable Machine Learning

ECS289: Scalable Machine Learning ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 18, 2016 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass

More information

Neural networks and optimization

Neural networks and optimization Neural networks and optimization Nicolas Le Roux Criteo 18/05/15 Nicolas Le Roux (Criteo) Neural networks and optimization 18/05/15 1 / 85 1 Introduction 2 Deep networks 3 Optimization 4 Convolutional

More information

Machine Learning for NLP

Machine Learning for NLP Machine Learning for NLP Uppsala University Department of Linguistics and Philology Slides borrowed from Ryan McDonald, Google Research Machine Learning for NLP 1(50) Introduction Linear Classifiers Classifiers

More information

A Simple Exponential Family Framework for Zero-Shot Learning

A Simple Exponential Family Framework for Zero-Shot Learning A Simple Exponential Family Framework for Zero-Shot Learning Vinay Kumar Verma and Piyush Rai Dept. of Computer Science & Engineering, IIT Kanpur, India {vkverma,piyush}@cse.iitk.ac.in Abstract. We present

More information

FINAL: CS 6375 (Machine Learning) Fall 2014

FINAL: CS 6375 (Machine Learning) Fall 2014 FINAL: CS 6375 (Machine Learning) Fall 2014 The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run out of room for

More information

Generalized Zero-Shot Learning with Deep Calibration Network

Generalized Zero-Shot Learning with Deep Calibration Network Generalized Zero-Shot Learning with Deep Calibration Network Shichen Liu, Mingsheng Long, Jianmin Wang, and Michael I.Jordan School of Software, Tsinghua University, China KLiss, MOE; BNRist; Research

More information

Mirror Descent for Metric Learning. Gautam Kunapuli Jude W. Shavlik

Mirror Descent for Metric Learning. Gautam Kunapuli Jude W. Shavlik Mirror Descent for Metric Learning Gautam Kunapuli Jude W. Shavlik And what do we have here? We have a metric learning algorithm that uses composite mirror descent (COMID): Unifying framework for metric

More information

Analyzing the Performance of Multilayer Neural Networks for Object Recognition

Analyzing the Performance of Multilayer Neural Networks for Object Recognition Analyzing the Performance of Multilayer Neural Networks for Object Recognition Pulkit Agrawal, Ross Girshick, Jitendra Malik {pulkitag,rbg,malik}@eecs.berkeley.edu University of California Berkeley Supplementary

More information

Iterative Laplacian Score for Feature Selection

Iterative Laplacian Score for Feature Selection Iterative Laplacian Score for Feature Selection Linling Zhu, Linsong Miao, and Daoqiang Zhang College of Computer Science and echnology, Nanjing University of Aeronautics and Astronautics, Nanjing 2006,

More information

Kernel Density Topic Models: Visual Topics Without Visual Words

Kernel Density Topic Models: Visual Topics Without Visual Words Kernel Density Topic Models: Visual Topics Without Visual Words Konstantinos Rematas K.U. Leuven ESAT-iMinds krematas@esat.kuleuven.be Mario Fritz Max Planck Institute for Informatics mfrtiz@mpi-inf.mpg.de

More information

Large-Margin Thresholded Ensembles for Ordinal Regression

Large-Margin Thresholded Ensembles for Ordinal Regression Large-Margin Thresholded Ensembles for Ordinal Regression Hsuan-Tien Lin and Ling Li Learning Systems Group, California Institute of Technology, U.S.A. Conf. on Algorithmic Learning Theory, October 9,

More information

Anticipating Visual Representations from Unlabeled Data. Carl Vondrick, Hamed Pirsiavash, Antonio Torralba

Anticipating Visual Representations from Unlabeled Data. Carl Vondrick, Hamed Pirsiavash, Antonio Torralba Anticipating Visual Representations from Unlabeled Data Carl Vondrick, Hamed Pirsiavash, Antonio Torralba Overview Problem Key Insight Methods Experiments Problem: Predict future actions and objects Image

More information

Machine Learning Lecture Notes

Machine Learning Lecture Notes Machine Learning Lecture Notes Predrag Radivojac January 25, 205 Basic Principles of Parameter Estimation In probabilistic modeling, we are typically presented with a set of observations and the objective

More information

Predictive analysis on Multivariate, Time Series datasets using Shapelets

Predictive analysis on Multivariate, Time Series datasets using Shapelets 1 Predictive analysis on Multivariate, Time Series datasets using Shapelets Hemal Thakkar Department of Computer Science, Stanford University hemal@stanford.edu hemal.tt@gmail.com Abstract Multivariate,

More information

Lecture: Face Recognition

Lecture: Face Recognition Lecture: Face Recognition Juan Carlos Niebles and Ranjay Krishna Stanford Vision and Learning Lab Lecture 12-1 What we will learn today Introduction to face recognition The Eigenfaces Algorithm Linear

More information

Kronecker Decomposition for Image Classification

Kronecker Decomposition for Image Classification university of innsbruck institute of computer science intelligent and interactive systems Kronecker Decomposition for Image Classification Sabrina Fontanella 1,2, Antonio Rodríguez-Sánchez 1, Justus Piater

More information