fariver

May 15, 2024

[Paper Reading] PETR: Position Embedding Transformation for Multi-View 3D Object Detection

Summary: PETR: Position Embedding Transformation for Multi-View 3D Object Detection

posted @ 2024-05-15 16:58 fariver Views(119) Comments(0) Recommendations(0)

May 14, 2024

[Paper Reading] BEVDet: High-Performance Multi-Camera 3D Object Detection in Bird-Eye-View

Summary: BEVDet: High-Performance Multi-Camera 3D Object Detection in Bird-Eye-View. Date: 21/12. Institution: PhiGo (鉴智机器人). TL;DR: A method that performs detection in BEV space; it builds a novel data augmentation method as well as an updated…

posted @ 2024-05-14 14:12 fariver Views(86) Comments(0) Recommendations(0)

May 7, 2024

[Paper Reading] OFT Orthographic Feature Transform for Monocular 3D Object Detection

Summary: OFT Orthographic Feature Transform for Monocular 3D Object Detection. Date: 18.11. Institution: …

posted @ 2024-05-07 21:22 fariver Views(49) Comments(0) Recommendations(0)

May 6, 2024

[Paper Reading] LSS: Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

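Summary: Title: Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D. Date: 20.08. Institution: NVIDIA. TL;DR: Late-fusion methods transform each camera's perception results into BEV space via the camera parameters…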

posted @ 2024-05-06 22:58 fariver Views(104) Comments(0) Recommendations(0)

April 28, 2024

[Paper Reading] DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

Summary: Title: DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries. Date: 21.10. Institution: MIT/CMU/Stanford. TL;DR: An end-to-end 3D object detection method built on Transformers; on the nuScenes autonomous…

posted @ 2024-04-28 14:09 fariver Views(61) Comments(0) Recommendations(0)

April 22, 2024

[Basics] DETR: End-to-End Object Detection with Transformers

Summary: Title: End-to-End Object Detection with Transformers. Date: 20.05. Institution: Facebook AI. TL;DR: The paper proposes a Transformer-based detector called DETR (Detection Transformer) that, unlike traditional detectors, requires no NMS or a…

posted @ 2024-04-22 22:01 fariver Views(51) Comments(0) Recommendations(0)

April 1, 2024

[Paper Reading] VQ-GAN: Taming Transformers for High-Resolution Image Synthesis

Summary: Title/link: [VQ-GAN](Taming Transformers for High-Resolution Image Synthesis). Date: CVPR 2021 oral, 21.06. Institution: Heidelberg Collaboratory for Image Processing, IWR…

posted @ 2024-04-01 23:08 fariver Views(282) Comments(0) Recommendations(0)

March 28, 2024

[Paper Reading] LVM: Sequential Modeling Enables Scalable Learning for Large Vision Models

Summary: LVM: Sequential Modeling Enables Scalable Learning for Large Vision Models

posted @ 2024-03-28 14:03 fariver Views(55) Comments(0) Recommendations(0)

March 27, 2024

[Paper Reading] KOSMOS: Language Is Not All You Need: Aligning Perception with Language Models

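Summary: Title: KOSMOS: Language Is Not All You Need: Aligning Perception with Language Models. Date: 23.05. Institution: Microsoft. TL;DR: A large language model that takes multimodal input, which the authors call a Multimodal Large Language Model (MLLM); it can…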

posted @ 2024-03-27 00:12 fariver Views(33) Comments(0) Recommendations(0)

March 26, 2024

[Paper Reading] VQ-VAE: Neural Discrete Representation Learning

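Summary: Title: VQ-VAE: Neural Discrete Representation Learning. Date: 17.11. Institution: Google. TL;DR: VQ stands for Vector Quantised; as the name suggests, the paper's biggest improvement over VAE is modeling the VAE's latent representation as discrete rather than continuous…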

posted @ 2024-03-26 00:12 fariver Views(210) Comments(0) Recommendations(0)
