Multimodal Model - X-VLM - ByteDance
https://hub.baai.ac.cn/view/15120
https://mp.weixin.qq.com/s/U1pJ9TaijMVG0wawFulnww
Paper title:
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts
Paper link:
https://arxiv.org/abs/2111.08276
Code link:
https://github.com/zengyan-97/X-VLM