AHU-WangXiao - 博客园

[Pinned] Transformer in Computer Vision

Abstract: Transformer in Computer Vision 2020-12-03 19:18:25 Survey 1: A Survey on Visual Transformer, Kai Han, et al. [Paper] Survey 2: Transformers in Vision: (Read more)

posted @ 2020-12-03 19:45 AHU-WangXiao Views(2017) Comments(2) Recommendations(1)

[Pinned] Preparing

This post is password-protected. (Read more)

posted @ 2019-06-13 17:17 AHU-WangXiao Views(110) Comments(0) Recommendations(0)

[Pinned] Summary on deep learning framework --- PyTorch

Abstract: Summary on deep learning framework PyTorch Updated on 2018-07-22 21:25:42 import os os.environ["CUDA_VISIBLE_DEVICES"]="4" export CUDA_VISIBLE_DEVICES= (Read more)

posted @ 2017-08-13 16:07 AHU-WangXiao Views(6356) Comments(0) Recommendations(2)

[Pinned] Some Interesting Research Problems worth to Study

This post is password-protected. (Read more)

posted @ 2017-06-06 15:51 AHU-WangXiao Views(11) Comments(0) Recommendations(0)

[Pinned] LaTeX: Common Problems and Solutions

Abstract: LaTeX: Common Problems and Solutions 2017-04-10 22:05:48 [Resource downloads] 1). TeX Live 2021 download: https://mirrors.sjtug.sjtu.edu.cn/ctan/systems/texlive/Images/ 2). AweSome LaTex: (Read more)

posted @ 2017-04-10 22:07 AHU-WangXiao Views(12287) Comments(0) Recommendations(2)

[Pinned] Conclusions about Deep Learning with Python

Abstract: Conclusions about Deep Learning with Python file_path = '{}/{}_ep{:04d}.pth.tar'.format(directory, net_type, self.epoch) 0. Install the specific versi (Read more)

posted @ 2017-03-01 16:17 AHU-WangXiao Views(1582) Comments(0) Recommendations(0)

[Pinned] MATLAB Advanced Study Notes

Abstract: MATLAB Advanced Study Notes Error: Invalid MEX-file '/media/wangxiao/Acer/dataset/LDES/utility/mexfiles/mpolar.mexa64': /usr/local/MATLAB/R2017a/bin/glnxa64/../../s (Read more)

posted @ 2016-07-31 16:32 AHU-WangXiao Views(3418) Comments(0) Recommendations(0)

[Pinned] Something on Visual Tracking

This post is password-protected. (Read more)

posted @ 2016-06-27 17:28 AHU-WangXiao Views(25) Comments(0) Recommendations(0)

[Pinned] Caffe+CUDA8.0+CuDNNv5.1+OpenCV3.1+Ubuntu14.04: Setup References and Summary of Common Build Problems

Abstract: Ubuntu + Deep Learning (Caffe, PyTorch) setup references sudo apt install nvidia-cuda-toolkit pip install gpustat watch --color -n1 gpustat -cpu [Note]: the RTX (Read more)

posted @ 2016-04-13 10:11 AHU-WangXiao Views(28333) Comments(0) Recommendations(0)

October 4, 2022

VSCODE Remote Server Configuration --- Tutorial

This post is password-protected. (Read more)

posted @ 2022-10-04 10:31 AHU-WangXiao Views(34) Comments(0) Recommendations(0)

July 23, 2022

Weakly Alignment-Free RGBT Salient Object Detection With Deep Correlation Network

Abstract: Weakly Alignment-Free RGBT Salient Object Detection With Deep Correlation Network 2022-07-23 19:27:08 Paper: IEEE Xplore Full-Text PDF: 1. Background (Read more)

posted @ 2022-07-23 19:28 AHU-WangXiao Views(291) Comments(0) Recommendations(0)

July 16, 2022

Visual Prompt Tuning

Abstract: Visual Prompt Tuning 2022-07-16 19:13:50 Paper: [2203.12119] Visual Prompt Tuning (arxiv.org) Code: KMnP/vpt: 🔥 Visual Prompt Tuning [ECCV 2022] http (Read more)

posted @ 2022-07-16 20:40 AHU-WangXiao Views(1279) Comments(0) Recommendations(0)

July 2, 2022

ActionCLIP: A New Paradigm for Video Action Recognition

Abstract: ActionCLIP: A New Paradigm for Video Action Recognition 2022-07-02 17:38:37 Paper: 2109.08472.pdf (arxiv.org) Code: https://github.com/sallymmx/Action (Read more)

posted @ 2022-07-02 17:39 AHU-WangXiao Views(532) Comments(0) Recommendations(0)

June 25, 2022

opencv4.6.0 + rtx2070 + ubuntu16.04 install tutorial

Abstract: opencv4.6.0 + rtx2070 + ubuntu16.04 install tutorial ref-1: https://blog.csdn.net/qvodgg/article/details/108410549 ref-2: https://zhuanlan.zhihu.com/p (Read more)

posted @ 2022-06-25 10:37 AHU-WangXiao Views(219) Comments(0) Recommendations(0)

June 7, 2022

AEGNN: Asynchronous Event-based Graph Neural Networks

Abstract: AEGNN: Asynchronous Event-based Graph Neural Networks 2022-06-07 17:01:45 Paper: https://rpg.ifi.uzh.ch/docs/CVPR22_Schaefer.pdf Code: https://uzh-rpg (Read more)

posted @ 2022-06-07 17:03 AHU-WangXiao Views(179) Comments(0) Recommendations(0)

June 3, 2022

A Voxel Graph CNN for Object Classification with Event Cameras

This post is password-protected. (Read more)

posted @ 2022-06-03 20:29 AHU-WangXiao Views(2) Comments(0) Recommendations(0)

April 12, 2022

Event Transformer

This post is password-protected. (Read more)

posted @ 2022-04-12 20:50 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

April 6, 2022

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

This post is password-protected. (Read more)

posted @ 2022-04-06 14:46 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

April 2, 2022

A Roadmap for Big Model

This post is password-protected. (Read more)

posted @ 2022-04-02 10:41 AHU-WangXiao Views(1) Comments(0) Recommendations(0)

March 30, 2022

VL-BERT: PRE-TRAINING OF GENERIC VISUAL-LINGUISTIC REPRESENTATIONS

Abstract: VL-BERT: PRE-TRAINING OF GENERIC VISUAL-LINGUISTIC REPRESENTATIONS 2022-03-30 20:35:13 Paper: https://openreview.net/forum?id=SygXPaEYvH Code: https:// (Read more)

posted @ 2022-03-30 20:37 AHU-WangXiao Views(95) Comments(0) Recommendations(0)

March 22, 2022

Unicoder-VL: A Universal Encoder for Vision and Language by Cross-Modal Pre-Training

Abstract: Unicoder-VL: A Universal Encoder for Vision and Language by Cross-Modal Pre-Training 2022-03-22 14:22:12 Paper: https://ojs.aaai.org/index.php/AAAI/ar (Read more)

posted @ 2022-03-22 14:23 AHU-WangXiao Views(392) Comments(0) Recommendations(0)

March 20, 2022

U-VisualBERT --- Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions

Abstract: Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions 2022-03-20 17:34:51 Paper: https://arxiv.org/pdf/2010.12831.pdf Cod (Read more)

posted @ 2022-03-20 17:38 AHU-WangXiao Views(167) Comments(0) Recommendations(0)

VisualBERT --- A simple and performant baseline for vision and language

Abstract: VisualBERT: A simple and performant baseline for vision and language 2022-03-20 15:19:04 Paper: https://arxiv.org/pdf/1908.03557 1. Background and Mot (Read more)

posted @ 2022-03-20 15:27 AHU-WangXiao Views(478) Comments(0) Recommendations(0)

March 18, 2022

Fusion of Detected Objects in Text for Visual Question Answering

Abstract: Fusion of Detected Objects in Text for Visual Question Answering 2022-03-18 16:29:58 Paper: https://aclanthology.org/D19-1219/ Code: https://github.co (Read more)

posted @ 2022-03-18 16:31 AHU-WangXiao Views(120) Comments(0) Recommendations(0)

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

Abstract: Align before Fuse: Vision and Language Representation Learning with Momentum Distillation 2022-03-18 10:04:06 Paper: https://proceedings.neurips.cc/pa (Read more)

posted @ 2022-03-18 10:13 AHU-WangXiao Views(1327) Comments(0) Recommendations(0)

March 17, 2022

ActBERT: Learning Global-Local Video-Text Representations

Abstract: ActBERT: Learning Global-Local Video-Text Representations 2022-03-17 16:41:43 Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Zhu_ActBERT (Read more)

posted @ 2022-03-17 16:51 AHU-WangXiao Views(180) Comments(0) Recommendations(0)

12-in-1: Multi-Task Vision and Language Representation Learning

Abstract: 12-in-1: Multi-Task Vision and Language Representation Learning 2022-03-17 09:45:41 Paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Lu_1 (Read more)

posted @ 2022-03-17 14:28 AHU-WangXiao Views(342) Comments(0) Recommendations(0)

March 16, 2022

ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision

Abstract: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision 2022-03-16 21:02:21 Paper: http://proceedings.mlr.press/v139 (Read more)

posted @ 2022-03-16 21:20 AHU-WangXiao Views(721) Comments(0) Recommendations(0)

March 7, 2022

Connecting Vision and Language with Localized Narratives

This post is password-protected. (Read more)

posted @ 2022-03-07 19:55 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

December 19, 2021

RegionCLIP: Region-based Language-Image Pretraining

This post is password-protected. (Read more)

posted @ 2021-12-19 19:16 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

December 18, 2021

Capsule-based Object Tracking with Natural Language Specification

Abstract: Capsule-based Object Tracking with Natural Language Specification 2021-12-18 19:28:39 Paper: https://dl.acm.org/doi/abs/10.1145/3474085.3475349 1. Bac (Read more)

posted @ 2021-12-18 19:31 AHU-WangXiao Views(248) Comments(0) Recommendations(0)

November 25, 2021

CLIP: Learning Transferable Visual Models From Natural Language Supervision

Abstract: CLIP: Learning Transferable Visual Models From Natural Language Supervision 2021-11-25 21:29:02 Paper: https://arxiv.org/pdf/2103.00020.pdf Code: http (Read more)

posted @ 2021-11-25 21:30 AHU-WangXiao Views(205) Comments(0) Recommendations(0)

November 2, 2021

Liquid Time-constant Networks

This post is password-protected. (Read more)

posted @ 2021-11-02 18:42 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

October 28, 2021

VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding

This post is password-protected. (Read more)

posted @ 2021-10-28 17:43 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

October 20, 2021

LEARNING TO PROMPT FOR VISION-LANGUAGE MODELS

This post is password-protected. (Read more)

posted @ 2021-10-20 22:02 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

September 28, 2021

CPT: COLORFUL PROMPT TUNING FOR PRE-TRAINED VISION-LANGUAGE MODELS

Abstract: CPT: COLORFUL PROMPT TUNING FOR PRE-TRAINED VISION-LANGUAGE MODELS 2021-09-28 11:41:22 Paper: https://arxiv.org/pdf/2109.11797.pdf Other blog: https:/ (Read more)

posted @ 2021-09-28 11:43 AHU-WangXiao Views(1122) Comments(0) Recommendations(0)

September 12, 2021

EventPoint: Self-Supervised Local Descriptor Learning for Event Cameras

This post is password-protected. (Read more)

posted @ 2021-09-12 11:16 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

August 31, 2021

Temporal-wise Attention Spiking Neural Networks for Event Streams Classification

This post is password-protected. (Read more)

posted @ 2021-08-31 10:36 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

August 30, 2021

Representation Learning for Event-based Visuomotor Policies

Abstract: Representation Learning for Event-based Visuomotor Policies 2021-08-30 10:15:07 Paper: https://arxiv.org/pdf/2103.00806.pdf Code: https://github.com/m (Read more)

posted @ 2021-08-30 10:19 AHU-WangXiao Views(168) Comments(0) Recommendations(0)

August 18, 2021

Oscar, VinVL Pre-training: Bugs and Solutions

This post is password-protected. (Read more)

posted @ 2021-08-18 09:38 AHU-WangXiao Views(3) Comments(0) Recommendations(0)

August 11, 2021

Multi-domain Collaborative Feature Representation for Robust Visual Object Tracking

This post is password-protected. (Read more)

posted @ 2021-08-11 20:31 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

August 10, 2021

BEIT: BERT Pre-Training of Image Transformers

This post is password-protected. (Read more)

posted @ 2021-08-10 21:02 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

August 7, 2021

TransReID: Transformer-based Object Re-Identification

This post is password-protected. (Read more)

posted @ 2021-08-07 20:31 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

August 4, 2021

Space-time Event Clouds for Gesture Recognition: from RGB Cameras to Event Cameras

This post is password-protected. (Read more)

posted @ 2021-08-04 21:45 AHU-WangXiao Views(0) Comments(0) Recommendations(0)

July 29, 2021

ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information

This post is password-protected. (Read more)

posted @ 2021-07-29 11:10 AHU-WangXiao Views(5) Comments(0) Recommendations(0)

July 22, 2021

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

Abstract: VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text 2021-07-22 08:54:20 Paper: https://arxiv.org/pdf/2104.11178. (Read more)

posted @ 2021-07-22 11:38 AHU-WangXiao Views(1208) Comments(0) Recommendations(0)

July 21, 2021

OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation

Abstract: OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation 2021-07-21 20:23:07 Paper: https://arxiv.org/pdf/2107.00249.pdf Code: No (Read more)

posted @ 2021-07-21 20:34 AHU-WangXiao Views(1175) Comments(0) Recommendations(0)

AST: Audio Spectrogram Transformer

Abstract: AST: Audio Spectrogram Transformer 2021-07-21 19:38:36 Paper: https://arxiv.org/pdf/2104.01778.pdf Code: https://github.com/YuanGongND/ast 1. Backgrou (Read more)

posted @ 2021-07-21 20:14 AHU-WangXiao Views(1670) Comments(0) Recommendations(0)

July 20, 2021

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

Abstract: Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts 2021-07-20 08:58:37 Paper: cvpr2021 Code: https://git (Read more)

posted @ 2021-07-20 09:50 AHU-WangXiao Views(415) Comments(0) Recommendations(0)

The Blog of Xiao Wang

Associate Professor, School of Computer Science and Technology, Anhui University, Email: xiaowang@ahu.edu.cn
