Generalized Zero-Shot Learning with Deep Calibration Network

1 Generalized Zero-Shot Learning with Deep Calibration Network
Shichen Liu, Mingsheng Long, Jianmin Wang, and Michael I. Jordan
School of Software, Tsinghua University, China; KLiss, MOE; BNRist; Research Center for Big Data, Tsinghua University, China; University of California, Berkeley, Berkeley, USA
Presented by Youngnam Kim, Machine Learning Group, Department of Computer Science and Engineering, Pohang University of Science and Technology

2 Preliminary: Class semantic representation
Class semantic representations carry information about a class, such as hand-labeled attribute vectors or text descriptions.
Figure: Attribute vectors on the AWA dataset (Xian et al., 2017)

3 Preliminary: Class semantic representation
Figure: Text descriptions on the CUB dataset (Anonymous, 2018)

4 Preliminary: Zero-shot learning
Seen class dataset $D_s = \{(x_i^{(s)}, y_i^{(s)})\}_{i=1}^{N_s}$, where the $i$-th example's label is $y_i^{(s)} \in \{1, \dots, C_s\}$, and semantic representations $S_s = \{s_i^{(s)}\}_{i=1}^{C_s}$.
Unseen class dataset $D_u = \{x_i^{(u)}\}_{i=1}^{N_u}$ and semantic representations $S_u = \{s_i^{(u)}\}_{i=1}^{C_u}$.
Here $N_s$ is the number of seen class examples, $C_s$ is the number of seen classes, $N_u$ is the number of unseen class examples, and $C_u$ is the number of unseen classes. $S_s$ and $S_u$ are disjoint.

5 Preliminary: Zero-shot learning
Train a model $(\phi, \psi)$ using the seen class dataset $D_s$ and semantic representations $S_s$.
Define $f_c(x) = \mathrm{sim}(\phi(x), \psi(s_c^{(u)}))$.
Prediction: $y_i^{(u)} = \arg\max_c f_c(x_i^{(u)})$.
Some methods also use the unseen class semantic representations $S_u$ (Liu et al., 2018) or even the unseen class examples $D_u$ (Zhao et al., 2018) during training.

6 Preliminary: Generalized zero-shot learning
Standard zero-shot learning: predict a sample's label over unseen classes only.
Generalized zero-shot learning: predict a sample's label over both seen and unseen classes.
For all semantic representations $S = \{s_i\}_{i=1}^{C_s + C_u}$, define $f_c(x) = \mathrm{sim}(\phi(x), \psi(s_c))$ and predict $y_i = \arg\max_c f_c(x_i)$.
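The only difference between the two settings is the set of candidate classes passed to the argmax. The following is a minimal sketch of that difference; the scores, the class indices, and the helper name `argmax_over` are made up for illustration and do not come from the paper.

```python
def argmax_over(scores, candidate_classes):
    """Return the candidate class index with the highest score f_c(x)."""
    return max(candidate_classes, key=lambda c: scores[c])

# Hypothetical scores f_c(x) for one test sample over 5 classes.
# Assumption: indices 0-2 are seen classes, indices 3-4 are unseen classes.
scores = [0.70, 0.20, 0.10, 0.60, 0.30]
seen_classes = [0, 1, 2]
unseen_classes = [3, 4]

# Standard zero-shot learning: search over unseen classes only.
zsl_prediction = argmax_over(scores, unseen_classes)

# Generalized zero-shot learning: search over seen and unseen classes.
gzsl_prediction = argmax_over(scores, seen_classes + unseen_classes)
```

With these toy scores the standard setting selects unseen class 3, while the generalized setting selects seen class 0 because its score is higher.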

7 Motivation
Deep learning models tend to overfit to seen-class examples and become overconfident on them (predicted probability close to 1).
The model's predictions become uncertain when unseen classes are introduced at test time.
Overconfidence on seen-class samples and uncertainty on unseen-class samples both hurt zero-shot learning accuracy.

8 Motivation

9 Prediction function
Embedding of a sample $x_i$: $\phi(x_i) \in \mathbb{R}^k$.
Embedding of a semantic representation $s_c$: $\psi(s_c) \in \mathbb{R}^k$.
Define $f_c(x_i) = \mathrm{sim}(\phi(x_i), \psi(s_c))$, where $\mathrm{sim}$ is a similarity measure such as the inner product or cosine similarity.
Prediction: $y_i = \arg\max_c f_c(x_i)$.
$\phi$ is a CNN (e.g. GoogLeNet-v2, ResNet-101) and $\psi$ is an MLP.
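As an illustration of the prediction function, the sketch below scores a sample embedding against each class embedding with cosine similarity and returns the best class. The embeddings are hand-written toy vectors standing in for the outputs of $\phi$ and $\psi$; no CNN or MLP is involved, and the function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def predict(sample_embedding, class_embeddings):
    """Return (argmax_c f_c(x), list of scores f_c(x))."""
    scores = [cosine_similarity(sample_embedding, e) for e in class_embeddings]
    best = max(range(len(scores)), key=lambda c: scores[c])
    return best, scores

# Toy stand-ins for phi(x) and psi(s_c) in R^3 (made-up values).
phi_x = [0.9, 0.1, 0.2]
psi_s = [
    [1.0, 0.0, 0.0],  # class 0
    [0.0, 1.0, 0.0],  # class 1
    [0.0, 0.0, 1.0],  # class 2
]

predicted_class, scores = predict(phi_x, psi_s)
```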

10 Loss function
Sample $x$'s class probability $q$ over seen classes, where $\tau$ is the temperature:
$$q(y = c \mid x) = \frac{\exp(f_c(x)/\tau)}{\sum_{c'=1}^{C_s} \exp(f_{c'}(x)/\tau)} \quad (1)$$
Let $p(y = c \mid x)$ be the ground-truth class probability. The cross-entropy loss $L$ is
$$L = \mathbb{E}_x\left[\mathbb{E}_{y \mid x \sim p}\left[-\log q(y \mid x)\right]\right] \quad (2)$$
Using $\tau < 1$ to mitigate the overconfidence problem on seen-class samples.
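A minimal sketch of Eq. (1) and the per-sample term of Eq. (2), assuming a one-hot ground-truth distribution so that the inner expectation reduces to the negative log-probability of the true class. The scores and temperature values are made up, and subtracting the maximum before exponentiating is a standard numerical-stability step, not something stated on the slide.

```python
import math

def softmax_with_temperature(scores, tau):
    """Eq. (1): softmax of scores f_c(x) divided by temperature tau."""
    scaled = [s / tau for s in scores]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, label, tau):
    """Per-sample cross-entropy, assuming a one-hot ground truth at `label`."""
    q = softmax_with_temperature(scores, tau)
    return -math.log(q[label])

# Hypothetical similarity scores over three seen classes.
scores = [1.0, 0.5, -0.2]
q_low_tau = softmax_with_temperature(scores, tau=0.5)
q_high_tau = softmax_with_temperature(scores, tau=2.0)
loss = cross_entropy(scores, label=0, tau=0.5)
```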

11 Multi-class hinge loss
Most zero-shot learning methods use the multi-class hinge loss, where $\Delta(y_i, c)$ is 0 when $y_i$ equals $c$ and 1 otherwise:
$$\sum_{i=1}^{N_s} \sum_{c=1}^{C_s} \max\left(0,\ \Delta(y_i, c) + f_c(x_i) - f_{y_i}(x_i)\right) \quad (3)$$
If $f_{y_i}(x_i) - f_c(x_i) < \Delta(y_i, c)$, then the corresponding term is active and training minimizes
$$\min_{\phi,\psi} \left[f_c(x_i) - f_{y_i}(x_i)\right] \quad (4)$$
This paper shows that the cross-entropy loss gives better zero-shot classification accuracy than the multi-class hinge loss.
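For comparison with the cross-entropy loss, here is a small sketch of the inner sum of the multi-class hinge loss for a single sample, with a margin of 1 for every wrong class and 0 for the true class. The scores are made up for illustration.

```python
def multiclass_hinge_loss(scores, label):
    """Sum over classes c of max(0, margin(label, c) + f_c(x) - f_label(x))."""
    loss = 0.0
    for c, score in enumerate(scores):
        margin = 0.0 if c == label else 1.0
        loss += max(0.0, margin + score - scores[label])
    return loss

# Hypothetical scores f_c(x) for one sample whose true class is 0.
# Class 1 is beaten by more than the margin, class 2 is not.
example_loss = multiclass_hinge_loss([2.0, 0.5, 1.5], label=0)
```

Only class 2 violates the margin here (1 + 1.5 - 2.0 = 0.5), so it is the only class that contributes to the loss.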

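12 Uncertainty calibration
Sample $x$'s class probability $q^{(u)}$ over unseen classes $S_u$, where $f_c^{(u)}(x)$ is the similarity between the embedding of unseen class $c$ and the embedding of sample $x$:
$$q^{(u)}(y = c \mid x) = \frac{\exp(f_c^{(u)}(x)/\tau)}{\sum_{c'=1}^{C_u} \exp(f_{c'}^{(u)}(x)/\tau)} \quad (5)$$
Entropy loss $H$:
$$H = \mathbb{E}_x\left[\mathbb{E}_{y \mid x \sim q^{(u)}}\left[-\log q^{(u)}(y \mid x)\right]\right] \quad (6)$$
Total loss function of DCN:
$$\min_{\phi,\psi}\ L + \lambda H + \gamma\,\Omega(\phi, \psi) \quad (7)$$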
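The sketch below illustrates the entropy term for a single sample and how the three terms of the total objective are combined. It assumes the regularizer value is supplied as a plain number; the scores, the weights, and the function names are hypothetical and not taken from the paper.

```python
import math

def softmax_with_temperature(scores, tau):
    """Softmax of scores divided by temperature tau."""
    scaled = [s / tau for s in scores]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy -sum_c q_c log q_c (terms with q_c = 0 contribute 0)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def dcn_objective(cross_entropy_loss, entropy_loss, regularizer, lam, gamma):
    """Combine the three terms: L + lambda * H + gamma * Omega."""
    return cross_entropy_loss + lam * entropy_loss + gamma * regularizer

# Hypothetical unseen-class scores for two samples.
uncertain = softmax_with_temperature([0.0, 0.0, 0.0, 0.0], tau=1.0)
confident = softmax_with_temperature([5.0, 0.0, 0.0, 0.0], tau=1.0)

h_uncertain = entropy(uncertain)  # maximal for a uniform distribution
h_confident = entropy(confident)  # smaller for a peaked distribution
total = dcn_objective(1.0, h_confident, regularizer=2.0, lam=0.1, gamma=0.01)
```

Minimizing the entropy term pushes the distribution over unseen classes toward the peaked case rather than the uniform one.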

13 Experiments: Datasets
Animals with Attributes (AwA): coarse-grained and medium-scale.
Caltech-UCSD-Birds (CUB): fine-grained and medium-scale.
SUN Attribute (SUN): fine-grained and medium-scale.
Attribute Pascal and Yahoo (aPY): coarse-grained and small-scale.

14 Experiments

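15 Evaluation protocol
Per-class classification accuracy:
$$ACC_{C} = \frac{1}{|C|} \sum_{c \in C} \frac{\#\,\text{correctly predicted samples in class } c}{\#\,\text{samples in class } c} \quad (8)$$
Generalized zero-shot learning (harmonic mean of seen and unseen accuracy):
$$ACC_{H} = \frac{2 \cdot ACC_{unseen} \cdot ACC_{seen}}{ACC_{unseen} + ACC_{seen}} \quad (9)$$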
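Both metrics are simple to compute from label lists. The sketch below averages accuracy per class, so that large classes do not dominate, and then takes the harmonic mean of the seen and unseen accuracies. The labels and accuracy values are made up for illustration.

```python
def per_class_accuracy(y_true, y_pred):
    """Mean over classes of (correct predictions in class / samples in class)."""
    total = {}
    correct = {}
    for t, p in zip(y_true, y_pred):
        total[t] = total.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if t == p else 0)
    return sum(correct[c] / total[c] for c in total) / len(total)

def harmonic_mean(acc_unseen, acc_seen):
    """Harmonic mean of unseen and seen accuracy (0 if both are 0)."""
    if acc_unseen + acc_seen == 0:
        return 0.0
    return 2 * acc_unseen * acc_seen / (acc_unseen + acc_seen)

# Toy labels: class 0 has three samples (two correct), class 1 has one (correct).
acc = per_class_accuracy([0, 0, 0, 1], [0, 0, 1, 1])

# Toy accuracies: the harmonic mean is pulled toward the lower value.
acc_h = harmonic_mean(acc_unseen=0.5, acc_seen=1.0)
```

The harmonic mean stays low unless both accuracies are high, which is why it is used to penalize models that do well only on seen classes.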

16 Experimental results
DCN w/o ET: DCN without entropy loss and temperature calibration.
DCN w/o E: DCN without entropy loss.

17 Experimental results

18 Experimental results

19 Analysis
Temperature calibration mitigates the overconfidence problem.

20 Other zero-shot learning papers
Stacked semantic-guided attention model for fine-grained zero-shot learning (Yu et al., 2018)
Domain-invariant projection learning for zero-shot recognition (Zhao et al., 2018)
Feature generating networks for zero-shot learning (Xian et al., 2018)
Correction networks: meta-learning for zero-shot learning (Anonymous, 2018)

21 References
Anonymous. Correction networks: Meta-learning for zero-shot learning.
S. Liu, M. Long, J. Wang, and M. Jordan. Generalized zero-shot learning with deep calibration network. NIPS.
Y. Xian, B. Schiele, and Z. Akata. Zero-shot learning - the good, the bad and the ugly. arXiv preprint.
Y. Xian, T. Lorenz, B. Schiele, and Z. Akata. Feature generating networks for zero-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Y. Yu, Z. Ji, Y. Fu, J. Guo, Y. Pang, and Z. Zhang. Stacked semantic-guided attention model for fine-grained zero-shot learning. arXiv preprint.
A. Zhao, M. Ding, J. Guan, Z. Lu, T. Xiang, and J.-R. Wen. Domain-invariant projection learning for zero-shot recognition. arXiv preprint, 2018.
