TPN Paper Notes

Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning

Abstract:

This paper proposes Transductive Propagation Network (TPN), a novel meta-learning framework for transductive inference that classifies the entire test set at once to alleviate the low-data problem. Specifically, the authors propose to learn to propagate labels from labeled instances to unlabeled test instances by learning a graph construction module that exploits the manifold structure of the data. TPN jointly learns the parameters of the feature embedding and the graph construction in an end-to-end manner.

The authors validate TPN on multiple benchmark datasets, on which it largely outperforms existing few-shot learning approaches and achieves state-of-the-art results.

Contributions:

  • The first to model transductive inference explicitly in few-shot learning

Although Reptile experimented with a transductive setting, it only shares information between test examples through batch normalization, rather than directly modeling transductive inference.

  • TPN

Feature Embedding

The network consists of four convolutional blocks, where each block begins with a 2D convolutional layer with a 3×3 kernel and 64 filters. Each convolutional layer is followed by a batch-normalization layer, a ReLU nonlinearity, and a 2×2 max-pooling layer. The same embedding function \(f_\varphi\) is used for both the support set \(S\) and the query set \(Q\).
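The four-block embedding described above can be sketched in PyTorch as follows (class and variable names are my own; the 84×84 input size matches the common miniImageNet setup but is an assumption here):

```python
import torch
import torch.nn as nn

# Each block: 3x3 conv (64 filters) -> batch norm -> ReLU -> 2x2 max-pool.
def conv_block(in_channels: int, out_channels: int = 64) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class EmbeddingNet(nn.Module):
    """f_phi: shared by both the support set S and the query set Q."""
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            conv_block(in_channels),
            conv_block(64),
            conv_block(64),
            conv_block(64),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(x).flatten(start_dim=1)

# An 84x84 input is pooled four times: 84 -> 42 -> 21 -> 10 -> 5,
# giving a 64 * 5 * 5 = 1600-dimensional embedding.
f_phi = EmbeddingNet()
z = f_phi(torch.randn(2, 3, 84, 84))
```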

Graph Construction

A Gaussian similarity function is used to compute the edge weights:

\[W_{i j}=\exp \left(-\frac{1}{2} d\left(\frac{f_{\varphi}\left(\mathbf{x}_{\mathbf{i}}\right)}{\sigma_{i}}, \frac{f_{\varphi}\left(\mathbf{x}_{\mathbf{j}}\right)}{\sigma_{j}}\right)\right) \]
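A minimal NumPy sketch of the similarity above, taking \(d\) to be the squared Euclidean distance. In TPN the per-example length scales \(\sigma_i\) are produced by a learned graph-construction network; here they are fixed constants purely for illustration:

```python
import numpy as np

# W_ij = exp(-1/2 * d(f(x_i)/sigma_i, f(x_j)/sigma_j)),
# with d = squared Euclidean distance and fixed sigma (illustrative only).
def gaussian_similarity(embeddings: np.ndarray, sigma: np.ndarray) -> np.ndarray:
    scaled = embeddings / sigma[:, None]                       # f(x_i) / sigma_i
    sq_dist = ((scaled[:, None, :] - scaled[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dist)

emb = np.random.randn(5, 8)        # 5 instances, 8-dim embeddings
W = gaussian_similarity(emb, np.ones(5))
```

Note that \(W\) is symmetric with a unit diagonal; in the paper only the largest entries per row are kept to form a sparse k-nearest-neighbor graph.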

Label Propagation

\[F^{*}=(I-\alpha S)^{-1} Y\]

where \(\alpha \in (0,1)\) controls the amount of propagated information, \(F^{*}\) denotes the predicted labels, \(S\) denotes the normalized weight matrix, and \(Y\) denotes the initial label matrix.
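The closed-form propagation can be sketched as below, using the standard symmetric normalization \(S = D^{-1/2} W D^{-1/2}\); the toy 3-node graph and 2-class labels are my own illustration:

```python
import numpy as np

# F* = (I - alpha * S)^{-1} Y, with S the symmetrically normalized weights.
# Y holds one-hot labels for labeled (support) rows and zeros for query rows.
def label_propagation(W: np.ndarray, Y: np.ndarray, alpha: float = 0.99) -> np.ndarray:
    W = W.copy()
    np.fill_diagonal(W, 0.0)                         # no self-loops
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return np.linalg.solve(np.eye(len(W)) - alpha * S, Y)

# Toy graph: nodes 0 and 1 are labeled (classes 0 and 1);
# node 2 is unlabeled but strongly connected to node 0.
W = np.array([[0.0, 0.1, 0.9],
              [0.1, 0.0, 0.1],
              [0.9, 0.1, 0.0]])
Y = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
F = label_propagation(W, Y)        # node 2's score for class 0 exceeds class 1
```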

Loss

\[P\left(\tilde{y}_{i}=j \mid \mathbf{x}_{i}\right)=\frac{\exp \left(F_{i j}^{*}\right)}{\sum_{j=1}^{N} \exp \left(F_{i j}^{*}\right)}\]

\[J(\varphi, \phi)=\sum_{i=1}^{N \times K+T} \sum_{j=1}^{N}-\mathbb{I}\left(y_{i}=j\right) \log \left(P\left(\tilde{y}_{i}=j \mid \mathbf{x}_{i}\right)\right)\]

where \(\tilde{y}_i\) denotes the final predicted label for the \(i\)-th instance in the union of the support and query sets, \(F^{*}_{ij}\) denotes the \(j\)-th component of the predicted label from label propagation, \(y_i\) is the ground-truth label of \(\mathbf{x}_i\), and \(\mathbb{I}(b)\) is an indicator function: \(\mathbb{I}(b)=1\) if \(b\) is true and 0 otherwise.
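The loss above is a row-wise softmax over the propagated scores followed by cross-entropy; a minimal NumPy sketch (toy scores and labels are my own):

```python
import numpy as np

# Softmax over F* per instance, then sum of negative log-likelihoods
# of the ground-truth classes over all N*K + T instances.
def tpn_loss(F_star: np.ndarray, labels: np.ndarray) -> float:
    z = F_star - F_star.max(axis=1, keepdims=True)   # shift for numerical stability
    P = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return float(-np.log(P[np.arange(len(labels)), labels]).sum())

# Toy scores for 3 instances over N = 2 classes, true labels [0, 1, 0].
F_star = np.array([[2.0, 0.5],
                   [0.1, 1.5],
                   [1.0, 1.0]])
loss = tpn_loss(F_star, np.array([0, 1, 0]))
```

In the actual framework the gradient of this loss flows back through the label propagation and the graph construction into both parameter sets \(\varphi\) and \(\phi\), which is what makes the training end-to-end.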

posted @ 2021-10-06 01:19  SethDeng