03 2022 档案
摘要:VL-BERT: PRE-TRAINING OF GENERIC VISUALLINGUISTIC REPRESENTATIONS 2022-03-30 20:35:13 Paper: https://openreview.net/forum?id=SygXPaEYvH Code: https://
阅读全文
摘要:Unicoder-VL: A Universal Encoder for Vision and Language by Cross-Modal Pre-Training 2022-03-22 14:22:12 Paper: https://ojs.aaai.org/index.php/AAAI/ar
阅读全文
摘要:Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions 2022-03-20 17:34:51 Paper: https://arxiv.org/pdf/2010.12831.pdf Cod
阅读全文
摘要:Visualbert: A simple and performant baseline for vision and language 2022-03-20 15:19:04 Paper: https://arxiv.org/pdf/1908.03557 1. Background and Mot
阅读全文
摘要:Fusion of Detected Objects in Text for Visual Question Answering 2022-03-18 16:29:58 Paper: https://aclanthology.org/D19-1219/ Code: https://github.co
阅读全文
摘要:Align before Fuse: Vision and Language Representation Learning with Momentum Distillation 2022-03-18 10:04:06 Paper: https://proceedings.neurips.cc/pa
阅读全文
摘要:ActBERT: Learning Global-Local Video-Text Representations 2022-03-17 16:41:43 Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Zhu_ActBERT
阅读全文
摘要:12-in-1: Multi-Task Vision and Language Representation Learning 2022-03-17 09:45:41 Paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Lu_1
阅读全文
摘要:Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision 2022-03-16 21:02:21 Paper: http://proceedings.mlr.press/v139
阅读全文
该文被密码保护。