Cross-stitch Networks

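After going over some basics of Multi-task Learning the other day, I recently started reading some of the classic papers in the area. Below are my notes on the cross-stitch unit.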

1. Motivation

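A multi-task network has two parts: task-specific representations and shared representations. Traditionally, these two parts are designed by trying out the possible network architectures and picking the best one, which is very inefficient. Motivated by this, the paper proposes a network that can automatically learn an optimal combination of shared and task-specific representations. It can be understood as a mathematical formulation of the earlier approach.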

2. Cross-stitch unit

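A cross-stitch unit linearly combines the activation maps of networks A and B at a given layer, using four learnable weights αAA, αAB, αBA, αBB. αAB and αBA are called αD, the different-task parameters; αAA and αBB are called αS, the same-task parameters. Changing their values moves a layer between the two extremes: setting αD to 0 makes the layer task-specific, while a larger αD makes the representation more shared.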
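As a rough illustration of the unit described above, the following Python sketch applies the four α weights element-wise to two activation maps: x̃A = αAA·xA + αAB·xB and x̃B = αBA·xA + αBB·xB. The function name cross_stitch, the flattening of activations into plain lists, and the sample values are assumptions made for this example; in a real network the α values are learned parameters, not fixed constants.

```python
def cross_stitch(x_a, x_b, alpha):
    """Linearly combine two equally sized activation lists with a 2x2 alpha matrix.

    alpha is ((a_AA, a_AB), (a_BA, a_BB)); the same four weights are
    applied at every position of the (flattened) activation maps.
    """
    (a_aa, a_ab), (a_ba, a_bb) = alpha
    out_a = [a_aa * xa + a_ab * xb for xa, xb in zip(x_a, x_b)]
    out_b = [a_ba * xa + a_bb * xb for xa, xb in zip(x_a, x_b)]
    return out_a, out_b


# Hypothetical activations from network A and network B at one layer.
x_a = [1.0, 2.0, 3.0]
x_b = [10.0, 20.0, 30.0]

# alpha_D = 0: each network only sees its own activations (task-specific).
task_specific = ((1.0, 0.0), (0.0, 1.0))
same_a, same_b = cross_stitch(x_a, x_b, task_specific)

# alpha_S = alpha_D = 0.5: both networks receive the same average (fully shared).
shared = ((0.5, 0.5), (0.5, 0.5))
mix_a, mix_b = cross_stitch(x_a, x_b, shared)
```

With αD = 0 the outputs equal the inputs, so the layer behaves as two independent networks; with equal αS and αD both networks continue from the same mixed representation.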

3. Design decisions for cross-stitching

3.1 Cross-stitch units initialization and learning rates: 

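α is initialized in the range [0, 1], which keeps the activations after the cross-stitch unit on the same order of magnitude as the activations before it.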
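One simple way to stay inside this range is a convex combination, αS + αD = 1, so that each output is a weighted average of the two inputs. The sketch below assumes that choice; the helper name init_alpha and the value 0.75 are hypothetical and are not taken from these notes.

```python
def init_alpha(alpha_s):
    """Build a 2x2 cross-stitch matrix with alpha_D = 1 - alpha_S.

    The convex-combination constraint is an assumption of this sketch;
    the notes only require every alpha to lie in [0, 1].
    """
    if not 0.0 <= alpha_s <= 1.0:
        raise ValueError("alpha_s must lie in [0, 1]")
    alpha_d = 1.0 - alpha_s
    return ((alpha_s, alpha_d), (alpha_d, alpha_s))


alpha = init_alpha(0.75)
row_sums = [sum(row) for row in alpha]  # each row sums to 1

# Because each row is a weighted average, every output stays between
# the two inputs, so the scale of the activations is preserved.
x_a, x_b = 2.0, 6.0
out_a = alpha[0][0] * x_a + alpha[0][1] * x_b
out_b = alpha[1][0] * x_a + alpha[1][1] * x_b
```

A larger alpha_s starts training closer to two separate task-specific networks, while a value near 0.5 starts closer to a fully shared representation.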

3.2 Network initialization: how should one initialize the networks A and B?

(Based on AlexNet)

1. Initialize networks A and B from networks that were trained on these tasks separately, or

2. Give both networks the same initialization and train them jointly.

4. Future work

1. Where in the network cross-stitch units should be used.

2. How their weights should be constrained.

 

posted on 2023-09-18 10:42  wkkh