Finding Task-Relevant Features for Few-Shot Learning by Category Traversal |
Edge-Labeling Graph Neural Network for Few-Shot Learning |
Generating Classification Weights With GNN Denoising Autoencoders for Few-Shot Learning |
Kervolutional Neural Networks |
Why ReLU Networks Yield High-Confidence Predictions Far Away From the Training Data and How to Mitigate the Problem |
On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions |
Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization |
Hardness-Aware Deep Metric Learning |
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation |
Learning Loss for Active Learning |
Striking the Right Balance With Uncertainty |
AutoAugment: Learning Augmentation Strategies From Data |
SDRSAC: Semidefinite-Based Randomized Approach for Robust Point Cloud Registration Without Correspondences |
BAD SLAM: Bundle Adjusted Direct RGB-D SLAM |
Revealing Scenes by Inverting Structure From Motion Reconstructions |
Strand-Accurate Multi-View Hair Capture |
DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation |
Pushing the Boundaries of View Extrapolation With Multiplane Images |
GA-Net: Guided Aggregation Net for End-To-End Stereo Matching |
Real-Time Self-Adaptive Deep Stereo |
LAF-Net: Locally Adaptive Fusion Networks for Stereo Confidence Estimation |
NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences |
Coordinate-Free Carlsson-Weinshall Duality and Relative Multi-View Geometry |
Deep Reinforcement Learning of Volume-Guided Progressive View Inpainting for 3D Point Scene Completion From a Single Depth Image |
Video Action Transformer Network |
Timeception for Complex Action Recognition |
STEP: Spatio-Temporal Progressive Learning for Video Action Detection |
Relational Action Forecasting |
Long-Term Feature Banks for Detailed Video Understanding |
Which Way Are You Going? Imitative Decision Learning for Path Forecasting in Dynamic Scenes |
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment |
MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation |
2.5D Visual Sound |
Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model |
Gaussian Temporal Awareness Networks for Action Localization |
Efficient Video Classification Using Fewer Frames |
Parsing R-CNN for Instance-Level Human Analysis |
Large Scale Incremental Learning |
TopNet: Structural Point Cloud Decoder |
Perceive Where to Focus: Learning Visibility-Aware Part-Level Features for Partial Person Re-Identification |
Meta-Transfer Learning for Few-Shot Learning |
Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation |
Deep RNN Framework for Visual Sequential Applications |
Graph-Based Global Reasoning Networks |
SSN: Learning Sparse Switchable Normalization via SparsestMax |
Spherical Fractal Convolutional Neural Networks for Point Cloud Recognition |
Learning to Generate Synthetic Data via Compositing |
Divide and Conquer the Embedding Space for Metric Learning |
Latent Space Autoregression for Novelty Detection |
Attending to Discriminative Certainty for Domain Adaptation |
Feature Denoising for Improving Adversarial Robustness |
Selective Kernel Networks |
On Implicit Filter Level Sparsity in Convolutional Neural Networks |
FlowNet3D: Learning Scene Flow in 3D Point Clouds |
Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks |
Co-Occurrent Features in Semantic Segmentation |
Bag of Tricks for Image Classification With Convolutional Neural Networks |
Learning Channel-Wise Interactions for Binary Convolutional Neural Networks |
Knowledge Adaptation for Efficient Semantic Segmentation |
Parametric Noise Injection: Trainable Randomness to Improve Deep Neural Network Robustness Against Adversarial Attack |
Invariance Matters: Exemplar Memory for Domain Adaptive Person Re-Identification |
Dissecting Person Re-Identification From the Viewpoint of Viewpoint |
Learning to Reduce Dual-Level Discrepancy for Infrared-Visible Person Re-Identification |
Progressive Feature Alignment for Unsupervised Domain Adaptation |
Feature-Level Frankenstein: Eliminating Variations for Discriminative Recognition |
Learning a Deep ConvNet for Multi-Label Classification With Partial Labels |
Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression |
Densely Semantically Aligned Person Re-Identification |
Generalising Fine-Grained Sketch-Based Image Retrieval |
Adapting Object Detectors via Selective Cross-Domain Alignment |
Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation |
Thinking Outside the Pool: Active Training Image Creation for Relative Attributes |
Generalizable Person Re-Identification by Domain-Invariant Mapping Network |
Visual Attention Consistency Under Image Transforms for Multi-Label Image Classification |
Re-Ranking via Metric Fusion for Object Retrieval and Person Re-Identification |
Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization |
Weakly Supervised Person Re-Identification |
PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud |
Automatic Adaptation of Object Detectors to New Domains Using Self-Training |
Deep Sketch-Shape Hashing With Segmented 3D Stochastic Viewing |
Generative Dual Adversarial Network for Generalized Zero-Shot Learning |
Query-Guided End-To-End Person Search |
Libra R-CNN: Towards Balanced Learning for Object Detection |
Learning a Unified Classifier Incrementally via Rebalancing |
Feature Selective Anchor-Free Module for Single-Shot Object Detection |
Bottom-Up Object Detection by Grouping Extreme and Center Points |
Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples |
SCOPS: Self-Supervised Co-Part Segmentation |
Unsupervised Moving Object Detection via Contextual Information Separation |
Pose2Seg: Detection Free Human Instance Segmentation |
DrivingStereo: A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios |
PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding |
A Dataset and Benchmark for Large-Scale Multi-Modal Face Anti-Spoofing |
Unsupervised Learning of Consensus Maximization for 3D Vision Problems |
VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People |
Structural Relational Reasoning of Point Clouds |
MVF-Net: Multi-View 3D Face Morphable Model Regression |
Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction |
Guided Stereo Matching |
Unsupervised Event-Based Learning of Optical Flow, Depth, and Egomotion |
Modeling Local Geometric Structure of 3D Point Clouds Using Geo-CNN |
3D Point Capsule Networks |
GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving |
Single-Image Piece-Wise Planar 3D Reconstruction via Associative Embedding |
3DN: 3D Deformation Network |
HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch Data Augmentation |
Deep Fitting Degree Scoring Network for Monocular 3D Object Detection |
Pushing the Envelope for RGB-Based Dense 3D Hand Pose Estimation via Neural Rendering |
Self-Supervised Learning of 3D Human Pose Using Multi-View Geometry |
FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation From a Single Image |
Dense 3D Face Decoding Over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders |
Does Learning Specific Features for Related Parts Help Human Pose Estimation? |
Linkage Based Face Clustering via Graph Convolution Network |
Towards High-Fidelity Nonlinear 3D Face Morphable Model |
RegularFace: Deep Face Recognition via Exclusive Regularization |
BridgeNet: A Continuity-Aware Probabilistic Network for Age Estimation |
GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction |
Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition With Multimodal Training |
Learning to Reconstruct People in Clothing From a Single RGB Camera |
Distilled Person Re-Identification: Towards a More Scalable System |
A Perceptual Prediction Framework for Self Supervised Event Segmentation |
COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis |
Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization |
An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition |
Graph Convolutional Label Noise Cleaner: Train a Plug-And-Play Action Classifier for Anomaly Detection |
MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment |
Less Is More: Learning Highlight Detection From Video Duration |
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition |
AdaFrame: Adaptive Frame Selection for Fast Video Recognition |
Spatio-Temporal Video Re-Localization by Warp LSTM |
Completeness Modeling and Context Separation for Weakly Supervised Temporal Action Localization |
Unsupervised Deep Tracking |
Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers |
Fast Online Object Tracking and Segmentation: A Unifying Approach |
Object Tracking by Reconstruction With View-Specific Discriminative Correlation Filters |
SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints |
Leveraging Shape Completion for 3D Siamese Tracking |
Target-Aware Deep Tracking |
Spatiotemporal CNN for Video Object Segmentation |
Towards Rich Feature Discovery With Class Activation Maps Augmentation for Person Re-Identification |
Wide-Context Semantic Image Extrapolation |
End-To-End Time-Lapse Video Synthesis From a Single Outdoor Image |
GIF2Video: Color Dequantization and Temporal Interpolation of GIF Images |
Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis |
Pluralistic Image Completion |
Salient Object Detection With Pyramid Attention and Salient Edges |
Latent Filter Scaling for Multimodal Unsupervised Image-To-Image Translation |
Attention-Aware Multi-Stroke Style Transfer |
Feedback Adversarial Learning: Spatial Feedback for Improving Generative Adversarial Networks |
Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting |
Example-Guided Style-Consistent Image Synthesis From Semantic Labeling |
MirrorGAN: Learning Text-To-Image Generation by Redescription |
Light Field Messaging With Deep Photographic Steganography |
Im2Pencil: Controllable Pencil Illustration From Photographs |
When Color Constancy Goes Wrong: Correcting Improperly White-Balanced Images |
Beyond Volumetric Albedo – A Surface Optimization Framework for Non-Line-Of-Sight Imaging |
Reflection Removal Using a Dual-Pixel Sensor |
Practical Coding Function Design for Time-Of-Flight Imaging |
Meta-SR: A Magnification-Arbitrary Network for Super-Resolution |
Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net |
Learning Attraction Field Representation for Robust Line Segment Detection |
Blind Super-Resolution With Iterative Kernel Correction |
Video Magnification in the Wild Using Fractional Anisotropy in Temporal Distribution |
Attentive Feedback Network for Boundary-Aware Salient Object Detection |
Heavy Rain Image Restoration: Integrating Physics Model and Conditional Adversarial Learning |
Learning to Calibrate Straight Lines for Fisheye Image Rectification |
Camera Lens Super-Resolution |
Frame-Consistent Recurrent Video Deraining With Dual-Level Flow |
Deep Plug-And-Play Super-Resolution for Arbitrary Blur Kernels |
Sea-Thru: A Method for Removing Water From Underwater Images |
Deep Network Interpolation for Continuous Imagery Effect Transition |
Spatially Variant Linear Representation Models for Joint Filtering |
Toward Convolutional Blind Denoising of Real Photographs |
Towards Real Scene Super-Resolution With Raw Images |
ODE-Inspired Network Design for Single Image Super-Resolution |
Blind Image Deblurring With Local Maximum Gradient Prior |
Attention-Guided Network for Ghost-Free High Dynamic Range Imaging |
Searching for a Robust Neural Architecture in Four GPU Hours |
Hierarchy Denoising Recursive Autoencoders for 3D Scene Layout Prediction |
Adaptively Connected Neural Networks |
CrDoCo: Pixel-Level Domain Transfer With Cross-Domain Consistency |
Temporal Cycle-Consistency Learning |
Predicting Future Frames Using Retrospective Cycle GAN |
Density Map Regression Guided Detection Network for RGB-D Crowd Counting and Localization |
TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning |
Learning Semantic Segmentation From Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach |
Attentive Single-Tasking of Multiple Tasks |
Deep Metric Learning to Rank |
End-To-End Multi-Task Learning With Attention |
Self-Supervised Learning via Conditional Motion Propagation |
Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence |
All About Structure: Adapting Structural Information Across Domains for Boosting Semantic Segmentation |
Iterative Reorganization With Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning |
Revisiting Self-Supervised Visual Representation Learning |
It’s Not About the Journey; It’s About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning |
Actively Seeking and Learning From Live Data |
Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing |
Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks |
Scene Graph Generation With External Knowledge and Image Reconstruction |
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval |
MUREL: Multimodal Relational Reasoning for Visual Question Answering |
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering |
Information Maximizing Visual Question Generation |
Learning to Detect Human-Object Interactions With Knowledge |
Learning Words by Drawing Images |
Factor Graph Attention |
Reducing Uncertainty in Undersampled MRI Reconstruction With Active Acquisition |
ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification |
ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape |
Collaborative Learning of Semi-Supervised Segmentation and Classification for Medical Images |
Biologically-Constrained Graphs for Global Connectomics Reconstruction |
P3SGD: Patient Privacy Preserving SGD for Regularizing Deep CNNs in Pathological Image Classification |
Elastic Boundary Projection for 3D Medical Image Segmentation |
SIXray: A Large-Scale Security Inspection X-Ray Benchmark for Prohibited Item Discovery in Overlapping Images |
Noise2Void - Learning Denoising From Single Noisy Images |
Joint Discriminative and Generative Learning for Person Re-Identification |
Unsupervised Person Re-Identification by Soft Multilabel Learning |
Learning Context Graph for Person Search |
Gradient Matching Generative Networks for Zero-Shot Learning |
Doodle to Search: Practical Zero-Shot Sketch-Based Image Retrieval |
Zero-Shot Task Transfer |
C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection |
Weakly Supervised Learning of Instance Segmentation With Inter-Pixel Relations |
Attention-Based Dropout Layer for Weakly Supervised Object Localization |
Domain Generalization by Solving Jigsaw Puzzles |
Transferrable Prototypical Networks for Unsupervised Domain Adaptation |
Blending-Target Domain Adaptation by Adversarial Meta-Adaptation Networks |
ELASTIC: Improving CNNs With Dynamic Scaling Policies |
ScratchDet: Training Single-Shot Object Detectors From Scratch |
SFNet: Learning Object-Aware Semantic Correspondence |
Deep Metric Learning Beyond Binary Supervision |
Learning to Cluster Faces on an Affinity Graph |
C2AE: Class Conditioned Auto-Encoder for Open-Set Recognition |
Shapes and Context: In-The-Wild Image Synthesis & Manipulation |
Semantics Disentangling for Text-To-Image Generation |
Semantic Image Synthesis With Spatially-Adaptive Normalization |
Progressive Pose Attention Transfer for Person Image Generation |
Unsupervised Person Image Generation With Semantic Parsing Transformation |
DeepView: View Synthesis With Learned Gradient Descent |
Animating Arbitrary Objects via Deep Motion Transfer |
Textured Neural Avatars |
IM-Net for High Resolution Video Frame Interpolation |
Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation |
Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation |
Geometry-Consistent Generative Adversarial Networks for One-Sided Unsupervised Domain Mapping |
DeepVoxels: Learning Persistent 3D Feature Embeddings |
Inverse Path Tracing for Joint Material and Lighting Estimation |
The Visual Centrifuge: Model-Free Layered Video Representations |
Label-Noise Robust Generative Adversarial Networks |
DLOW: Domain Flow for Adaptation and Generalization |
CollaGAN: Collaborative GAN for Missing Image Data Imputation |
d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding |
Taking a Closer Look at Domain Shift: Category-Level Adversaries for Semantics Consistent Domain Adaptation |
ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation |
ContextDesc: Local Descriptor Augmentation With Cross-Modality Context |
Large-Scale Long-Tailed Recognition in an Open World |
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations Rather Than Data |
SDC - Stacked Dilated Convolution: A Unified Descriptor Network for Dense Matching Tasks |
Learning Correspondence From the Cycle-Consistency of Time |
AE2-Nets: Autoencoder in Autoencoder Networks |
Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach |
Learning Spatial Common Sense With Geometry-Aware Recurrent Networks |
Structured Knowledge Distillation for Semantic Segmentation |
Scan2CAD: Learning CAD Model Alignment in RGB-D Scans |
Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation |
Tell Me Where I Am: Object-Level Scene Context Prediction |
Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation |
Supervised Fitting of Geometric Primitives to 3D Point Clouds |
Do Better ImageNet Models Transfer Better? |
Gotta Adapt 'Em All: Joint Pixel and Feature-Level Domain Adaptation for Recognition in the Wild |
Understanding the Disharmony Between Dropout and Batch Normalization by Variance Shift |
Circulant Binary Convolutional Networks: Enhancing the Performance of 1-Bit DCNNs With Circulant Back Propagation |
DeFusionNET: Defocus Blur Detection via Recurrently Fusing and Refining Multi-Scale Deep Features |
Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks |
Universal Domain Adaptation |
Improving Transferability of Adversarial Examples With Input Diversity |
Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition |
Hybrid-Attention Based Decoupled Metric Learning for Zero-Shot Image Retrieval |
Learning to Sample |
Few-Shot Learning via Saliency-Guided Hallucination of Samples |
Variational Convolutional Neural Network Pruning |
Towards Optimal Structured CNN Pruning via Generative Adversarial Learning |
Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression |
Fully Quantized Network for Object Detection |
MnasNet: Platform-Aware Neural Architecture Search for Mobile |
Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More |
K-Nearest Neighbors Hashing |
Learning RoI Transformer for Oriented Object Detection in Aerial Images |
Snapshot Distillation: Teacher-Student Optimization in One Generation |
Geometry-Aware Distillation for Indoor Semantic Segmentation |
LiveSketch: Query Perturbations for Guided Sketch-Based Visual Search |
Bounding Box Regression With Uncertainty for Accurate Object Detection |
OCGAN: One-Class Novelty Detection Using GANs With Constrained Latent Representations |
Learning Metrics From Teachers: Compact Networks for Image Embedding |
Activity Driven Weakly Supervised Object Detection |
Separate to Adapt: Open Set Domain Adaptation via Progressive Separation |
Layout-Graph Reasoning for Fashion Landmark Detection |
DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs |
Mind Your Neighbours: Image Annotation With Metadata Neighbourhood Graph Co-Attention Networks |
Region Proposal by Guided Anchoring |
Distant Supervised Centroid Shift: A Simple and Efficient Approach to Visual Domain Adaptation |
Learning to Transfer Examples for Partial Domain Adaptation |
Generalized Zero-Shot Recognition Based on Visually Semantic Embedding |
Towards Visual Feature Translation |
Amodal Instance Segmentation With KINS Dataset |
Global Second-Order Pooling Convolutional Networks |
Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up |
NetTailor: Tuning the Architecture, Not Just the Weights |
Learning-Based Sampling for Natural Image Matting |
Learning Unsupervised Video Object Segmentation Through Visual Attention |
4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks |
Pyramid Feature Attention Network for Saliency Detection |
Co-Saliency Detection via Mask-Guided Fully Convolutional Networks With Multi-Scale Label Smoothing |
SAIL-VOS: Semantic Amodal Instance Level Video Object Segmentation - A Synthetic Dataset and Baselines |
Learning Instance Activation Maps for Weakly Supervised Instance Segmentation |
Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation |
Box-Driven Class-Wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation |
Dual Attention Network for Scene Segmentation |
InverseRenderNet: Learning Single Image Inverse Rendering |
A Variational Auto-Encoder Model for Stochastic Point Processes |
Unifying Heterogeneous Classifiers With Distillation |
Assessment of Faster R-CNN in Man-Machine Collaborative Search |
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge |
NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction |
Spectral Metric for Dataset Complexity Assessment |
ADCrowdNet: An Attention-Injective Deformable Convolutional Network for Crowd Understanding |
VERI-Wild: A Large Dataset and a New Method for Vehicle Re-Identification in the Wild |
3D Local Features for Direct Pairwise Registration |
HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-Scale Point Clouds |
GPSfM: Global Projective SFM Using Algebraic Constraints on Multi-View Fundamental Matrices |
Group-Wise Correlation Stereo Network |
Multi-Level Context Ultra-Aggregation for Stereo Matching |
Large-Scale, Metric Structure From Motion for Unordered Light Fields |
Understanding the Limitations of CNN-Based Absolute Camera Pose Regression |
DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene From Sparse LiDAR Data and Single Color Image |
Modeling Point Clouds With Self-Attention and Gumbel Subset Sampling |
Learning With Batch-Wise Optimal Transport Loss for 3D Shape Recognition |
DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion |
Dense Depth Posterior (DDP) From Single Image and Sparse Range |
DuLa-Net: A Dual-Projection Network for Estimating Room Layouts From a Single RGB Panorama |
Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach |
Segmentation-Driven 6D Object Pose Estimation |
Exploiting Temporal Context for 3D Human Pose Estimation in the Wild |
What Do Single-View 3D Reconstruction Networks Learn? |
UniformFace: Learning Deep Equidistributed Representation for Face Recognition |
Semantic Graph Convolutional Networks for 3D Human Pose Regression |
Mask-Guided Portrait Editing With Conditional GANs |
Group Sampling for Scale Invariant Face Detection |
Joint Representation and Estimator Learning for Facial Action Unit Intensity Estimation |
Semantic Alignment: Finding Semantically Consistent Ground-Truth for Facial Landmark Detection |
LAEO-Net: Revisiting People Looking at Each Other in Videos |
Robust Facial Landmark Detection via Occlusion-Adaptive Deep Networks |
Learning Individual Styles of Conversational Gesture |
Face Anti-Spoofing: Model Matters, so Does Data |
Fast Human Pose Estimation |
Decorrelated Adversarial Learning for Age-Invariant Face Recognition |
Cross-Task Weakly Supervised Learning From Instructional Videos |
D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation |
Progressive Teacher-Student Learning for Early Action Prediction |
Social Relation Recognition From Videos via Multi-Scale Spatial-Temporal Reasoning |
MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation |
Transferable Interactiveness Knowledge for Human-Object Interaction Detection |
Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition |
Multi-Granularity Generator for Temporal Action Proposal |
Deep Rigid Instance Scene Flow |
See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks |
Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification |
SPM-Tracker: Series-Parallel Matching for Real-Time Visual Object Tracking |
Spatial Fusion GAN for Image Synthesis |
Text Guided Person Image Synthesis |
STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing |
Towards Instance-Level Image-To-Image Translation |
Dense Intrinsic Appearance Flow for Human Pose Transfer |
Depth-Aware Video Frame Interpolation |
Sliced Wasserstein Generative Models |
Deep Flow-Guided Video Inpainting |
Video Generation From Single Semantic Label Map |
Polarimetric Camera Calibration Using an LCD Monitor |
Fully Automatic Video Colorization With Self-Regularization and Diversity |
Zoom to Learn, Learn to Zoom |
Single Image Reflection Removal Beyond Linearity |
Learning to Separate Multiple Illuminants in a Single Image |
Shape Unicode: A Unified Shape Representation |
Robust Video Stabilization by Optimization in CNN Weight Space |
Learning Linear Transformations for Fast Image and Video Style Transfer |
Local Detection of Stereo Occlusion Boundaries |
Bi-Directional Cascade Network for Perceptual Edge Detection |
Single Image Deraining: A Comprehensive Benchmark Analysis |
Dynamic Scene Deblurring With Parameter Selective Sharing and Nested Skip Connections |
Events-To-Video: Bringing Modern Computer Vision to Event Cameras |
Feedback Network for Image Super-Resolution |
Semi-Supervised Transfer Learning for Image Rain Removal |
EventNet: Asynchronous Recursive Event Processing |
Recurrent Back-Projection Network for Video Super-Resolution |
Cascaded Partial Decoder for Fast and Accurate Salient Object Detection |
A Simple Pooling-Based Design for Real-Time Salient Object Detection |
Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection |
Progressive Image Deraining Networks: A Better and Simpler Baseline |
GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud |
Attentive Relational Networks for Mapping Images to Scene Graphs |
Relational Knowledge Distillation |
Compressing Convolutional Neural Networks via Factorized Convolutional Filters |
On the Intrinsic Dimensionality of Image Representations |
Part-Regularized Near-Duplicate Vehicle Re-Identification |
Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics |
Classification-Reconstruction Learning for Open-Set Recognition |
Emotion-Aware Human Attention Prediction |
Residual Regression With Semantic Prior for Crowd Counting |
Context-Reinforced Semantic Segmentation |
Adversarial Structure Matching for Structured Prediction Tasks |
Deep Spectral Clustering Using Dual Autoencoder Network |
Deep Asymmetric Metric Learning via Rich Relationship Mining |
Did It Change? Learning to Detect Point-Of-Interest Changes for Proactive Map Updates |
Associatively Segmenting Instances and Semantics in Point Clouds |
Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation |
Scene Categorization From Contours: Medial Axis Based Salience Measures |
Unsupervised Image Captioning |
Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables |
Cross-Modal Relationship Inference for Grounding Referring Expressions |
What’s to Know? Uncertainty as a Guide to Asking Goal-Oriented Questions |
Iterative Alignment Network for Continuous Sign Language Recognition |
Neural Sequential Phrase Grounding (SeqGROUND) |
CLEVR-Ref+: Diagnosing Visual Reasoning With Referring Expressions |
Describing Like Humans: On Diversity in Image Captioning |
MSCap: Multi-Style Image Captioning With Unpaired Stylized Text |
CRAVES: Controlling Robotic Arm With a Vision-Based Economic System |
Networks for Joint Affine and Non-Parametric Image Registration |
Learning Shape-Aware Embedding for Scene Text Detection |
Learning to Film From Professional Human Motion Videos |
Pay Attention! - Robustifying a Deep Visuomotor Policy Through Task-Focused Visual Attention |
Deep Blind Video Decaptioning by Temporal Aggregation and Recurrence |
Learning Video Representations From Correspondence Proposals |
SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks |
Sphere Generative Adversarial Network Based on Geometric Moment Matching |
Adversarial Attacks Beyond the Image Space |
Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks |
Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses |
A General and Adaptive Robust Loss Function |
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration |
Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss |
Not All Areas Are Equal: Transfer Learning for Semantic Segmentation via Hierarchical Region Selection |
Unsupervised Learning of Dense Shape Correspondence |
Unsupervised Visual Domain Adaptation: A Deep Max-Margin Gaussian Process Approach |
Balanced Self-Paced Learning for Generative Adversarial Clustering Network |
A Style-Based Generator Architecture for Generative Adversarial Networks |
Parallel Optimal Transport GAN |
3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans |
Causes and Corrections for Bimodal Multi-Path Scanning With Structured Light |
TextureNet: Consistent Local Parametrizations for Learning From High-Resolution Signals on Meshes |
PlaneRCNN: 3D Plane Detection and Reconstruction From a Single Image |
Occupancy Networks: Learning 3D Reconstruction in Function Space |
3D Shape Reconstruction From Images in the Frequency Domain |
SiCloPe: Silhouette-Based Clothed People |
Detailed Human Shape Estimation From a Single Image by Hierarchical Mesh Deformation |
Convolutional Mesh Regression for Single-Image Human Shape Reconstruction |
H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions |
Learning the Depths of Moving People by Watching Frozen People |
Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion |
A Skeleton-Bridged Deep Learning Approach for Generating Meshes of Complex Topologies From Single RGB Images |
Learning Structure-And-Motion-Aware Rolling Shutter Correction |
PVNet: Pixel-Wise Voting Network for 6DoF Pose Estimation |
SelFlow: Self-Supervised Learning of Optical Flow |
Taking a Deeper Look at the Inverse Compositional Algorithm |
Deeper and Wider Siamese Networks for Real-Time Visual Tracking |
Self-Supervised Adaptation of High-Fidelity Face Models for Monocular Performance Tracking |
Diverse Generation for Multi-Agent Sports Games |
Efficient Online Multi-Person 2D Pose Tracking With Recurrent Spatio-Temporal Affinity Fields |
GFrames: Gradient-Based Local Reference Frame for 3D Shape Matching |
Eliminating Exposure Bias and Metric Mismatch in Multiple Object Tracking |
Graph Convolutional Tracking |
ATOM: Accurate Tracking by Overlap Maximization |
Visual Tracking via Adaptive Spatially-Regularized Correlation Filters |
Deep Tree Learning for Zero-Shot Face Anti-Spoofing |
ArcFace: Additive Angular Margin Loss for Deep Face Recognition |
Learning Joint Gait Representation via Quintuplet Loss Minimization |
Gait Recognition via Disentangled Representation Learning |
Reversible GANs for Memory-Efficient Image-To-Image Translation |
Sensitive-Sample Fingerprinting of Deep Neural Networks |
Soft Labels for Ordinal Regression |
Local to Global Learning: Gradually Adding Classes for Training Deep Neural Networks |
What Does It Mean to Learn in Deep Networks? And, How Does One Detect Adversarial Attacks? |
Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning |
Adversarial Defense Through Network Profiling Based Path Extraction |
RENAS: Reinforced Evolutionary Neural Architecture Search |
Co-Occurrence Neural Network |
SpotTune: Transfer Learning Through Adaptive Fine-Tuning |
Signal-To-Noise Ratio: A Robust Distance Metric for Deep Metric Learning |
Detection Based Defense Against Adversarial Examples From the Steganalysis Point of View |
HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs |
Strike (With) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects |
Blind Geometric Distortion Correction on Images Through Deep Learning |
Instance-Level Meta Normalization |
Iterative Normalization: Beyond Standardization Towards Efficient Whitening |
On Learning Density Aware Embeddings |
Contrastive Adaptation Network for Unsupervised Domain Adaptation |
LP-3DCNN: Unveiling Local Phase in 3D Convolutional Neural Networks |
Attribute-Driven Feature Disentangling and Temporal Aggregation for Video Person Re-Identification |
Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit? |
Distilling Object Detectors With Fine-Grained Feature Imitation |
Centripetal SGD for Pruning Very Deep Convolutional Networks With Complicated Structure |
Knockoff Nets: Stealing Functionality of Black-Box Models |
Deep Embedding Learning With Discriminative Sampling Policy |
Hybrid Task Cascade for Instance Segmentation |
Multi-Task Self-Supervised Object Detection via Recycling of Bounding Box Annotations |
ClusterNet: Deep Hierarchical Cluster Network With Rigorously Rotation-Invariant Representation for Point Cloud Analysis |
Learning to Learn Relation for Important People Detection in Still Images |
Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition |
Multi-Similarity Loss With General Pair Weighting for Deep Metric Learning |
Domain-Symmetric Networks for Adversarial Domain Adaptation |
End-To-End Supervised Product Quantization for Image Search and Retrieval |
Learning to Learn From Noisy Labeled Data |
DSFD: Dual Shot Face Detector |
Label Propagation for Deep Semi-Supervised Learning |
Deep Global Generalized Gaussian Networks |
Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-Based Image Retrieval |
Context-Aware Crowd Counting |
Detect-To-Retrieve: Efficient Regional Aggregation for Image Search |
Towards Accurate One-Stage Object Detection With AP-Loss |
On Exploring Undetermined Relationships for Visual Relationship Detection |
Learning Without Memorizing |
Dynamic Recursive Neural Network |
Destruction and Construction Learning for Fine-Grained Image Recognition |
Distraction-Aware Shadow Detection |
Multi-Label Image Recognition With Graph Convolutional Networks |
High-Level Semantic Feature Detection: A New Perspective for Pedestrian Detection |
RepMet: Representative-Based Metric Learning for Classification and Few-Shot Object Detection |
Ranked List Loss for Deep Metric Learning |
CANet: Class-Agnostic Segmentation Networks With Iterative Refinement and Attentive Few-Shot Learning |
Precise Detection in Densely Packed Scenes |
KE-GAN: Knowledge Embedded Generative Adversarial Networks for Semi-Supervised Scene Parsing |
Fast User-Guided Video Object Segmentation by Interaction-And-Propagation Networks |
Fast Interactive Object Annotation With Curve-GCN |
FickleNet: Weakly and Semi-Supervised Semantic Image Segmentation Using Stochastic Inference |
RVOS: End-To-End Recurrent Network for Video Object Segmentation |
DeepFlux for Skeletons in the Wild |
Interactive Image Segmentation via Backpropagating Refinement Scheme |
Scene Parsing via Integrated Classification Model and Variance-Based Regularization |
RAVEN: A Dataset for Relational and Analogical Visual REasoNing |
Surface Reconstruction From Normals: A Robust DGP-Based Discontinuity Preservation Approach |
DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images |
Jumping Manifolds: Geometry Aware Dense Non-Rigid Structure From Motion |
LVIS: A Dataset for Large Vocabulary Instance Segmentation |
Fast Object Class Labelling via Speech |
LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking |
Creative Flow+ Dataset |
Weakly Supervised Open-Set Domain Adaptation by Dual-Domain Collaboration |
A Neurobiological Evaluation Metric for Neural Network Model Search |
Iterative Projection and Matching: Finding Structure-Preserving Representatives and Its Application to Computer Vision |
Efficient Multi-Domain Learning by Covariance Normalization |
Predicting Visible Image Differences Under Varying Display Brightness and Viewing Distance |
A Bayesian Perspective on the Deep Image Prior |
ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving |
Compressing Unknown Images With Product Quantizer for Efficient Zero-Shot Classification |
Self-Supervised Convolutional Subspace Clustering Network |
Multi-Scale Geometric Consistency Guided Multi-View Stereo |
Privacy Preserving Image-Based Localization |
SimulCap: Single-View Human Performance Capture With Cloth Simulation |
Hierarchical Deep Stereo Matching on High-Resolution Images |
Recurrent MVSNet for High-Resolution Multi-View Stereo Depth Inference |
Synthesizing 3D Shapes From Silhouette Image Collections Using Multi-Projection Generative Adversarial Networks |
The Perfect Match: 3D Point Cloud Matching With Smoothed Densities |
Recurrent Neural Network for (Un-)Supervised Learning of Monocular Video Visual Odometry and Depth |
PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing |
Scan2Mesh: From Unstructured Range Scans to 3D Meshes |
Unsupervised Domain Adaptation for ToF Data Denoising With Adversarial Learning |
Learning Independent Object Motion From Unlabelled Stereoscopic Videos |
Learning Single-Image Depth From Videos Using Quality Assessment Networks |
Learning 3D Human Dynamics From Video |
Lending Orientation to Neural Networks for Cross-View Geo-Localization |
Visual Localization by Learning Objects-Of-Interest Dense Match Regression |
Bilateral Cyclic Constraint and Adaptive Regularization for Unsupervised Monocular Depth Prediction |
Face Parsing With RoI Tanh-Warping |
Multi-Person Articulated Tracking With Spatial and Temporal Embeddings |
Multi-Person Pose Estimation With Enhanced Channel-Wise and Spatial Information |
A Compact Embedding for Facial Expression Similarity |
Deep High-Resolution Representation Learning for Human Pose Estimation |
Feature Transfer Learning for Face Recognition With Under-Represented Data |
Unsupervised 3D Pose Estimation With Geometric Self-Supervision |
Peeking Into the Future: Predicting Future Person Activities and Locations in Videos |
Re-Identification With Consistent Attentive Siamese Networks |
On the Continuity of Rotation Representations in Neural Networks |
Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation |
Inverse Discriminative Networks for Handwritten Signature Verification |
Led3D: A Lightweight and Efficient Deep Approach to Recognizing Low-Quality 3D Faces |
ROI Pooled Correlation Filters for Visual Tracking |
Deep Video Inpainting |
DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis |
Non-Adversarial Image Synthesis With Generative Latent Nearest Neighbors |
Mixture Density Generative Adversarial Networks |
SketchGAN: Joint Sketch Completion and Recognition With Generative Adversarial Network |
Foreground-Aware Image Inpainting |
Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-To-Image Translation |
Structure-Preserving Stereoscopic View Synthesis With Multi-Scale Adversarial Correlation Matching |
DynTypo: Example-Based Dynamic Text Effects Transfer |
Arbitrary Style Transfer With Style-Attentional Networks |
Typography With Decor: Intelligent Text Style Transfer |
RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion |
Photo Wake-Up: 3D Character Animation From a Single Photo |
DeepLight: Learning Illumination for Unconstrained Mobile Mixed Reality |
Iterative Residual CNNs for Burst Photography Applications |
Learning Implicit Fields for Generative Shape Modeling |
Reliable and Efficient Image Cropping: A Grid Anchor Based Approach |
Patch-Based Progressive 3D Point Set Upsampling |
An Iterative and Cooperative Top-Down and Bottom-Up Inference Network for Salient Object Detection |
Deep Stacked Hierarchical Multi-Patch Network for Image Deblurring |
Turn a Silicon Camera Into an InGaAs Camera |
Low-Rank Tensor Completion With a New Tensor Nuclear Norm Induced by Invertible Linear Transforms |
Joint Representative Selection and Feature Learning: A Semi-Supervised Approach |
The Domain Transform Solver |
CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection |
Phase-Only Image Based Kernel Estimation for Single Image Blind Deblurring |
Hierarchical Discrete Distribution Decomposition for Match Density Estimation |
FOCNet: A Fractional Optimal Control Network for Image Denoising |
Orthogonal Decomposition Network for Pixel-Wise Binary Classification |
Multi-Source Weak Supervision for Saliency Detection |
ComDefend: An Efficient Image Compression Model to Defend Adversarial Examples |
Combinatorial Persistency Criteria for Multicut and Max-Cut |
S4Net: Single Stage Salient-Instance Segmentation |
A Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem |
Polynomial Representation for Persistence Diagram |
Crowd Counting and Density Estimation by Trellis Encoder-Decoder Networks |
Cross-Atlas Convolution for Parameterization Invariant Learning on Textured Mesh Surface |
Deep Surface Normal Estimation With Hierarchical RGB-D Fusion |
Knowledge-Embedded Routing Network for Scene Graph Generation |
An End-To-End Network for Panoptic Segmentation |
Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models |
Marginalized Latent Semantic Encoder for Zero-Shot Learning |
Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation |
Unsupervised Embedding Learning via Invariant and Spreading Instance Feature |
AOGNets: Compositional Grammatical Architectures for Deep Learning |
A Robust Local Spectral Descriptor for Matching Non-Rigid Shapes With Incompatible Shape Structures |
Context and Attribute Grounded Dense Captioning |
Spot and Learn: A Maximum-Entropy Patch Sampler for Few-Shot Image Classification |
Interpreting CNNs via Decision Trees |
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning |
Deep Modular Co-Attention Networks for Visual Question Answering |
Synthesizing Environment-Aware Activities via Activity Sketches |
Self-Critical N-Step Training for Image Captioning |
Multi-Target Embodied Question Answering |
Visual Question Answering as Reading Comprehension |
StoryGAN: A Sequential Conditional GAN for Story Visualization |
Noise-Aware Unsupervised Deep Lidar-Stereo Fusion |
Versatile Multiple Choice Learning and Its Application to Vision Computing |
EV-Gait: Event-Based Robust Gait Recognition Using Dynamic Vision Sensors |
ToothNet: Automatic Tooth Instance Segmentation and Identification From Cone Beam CT Images |
Modularized Textual Grounding for Counterfactual Resilience |
L3-Net: Towards Learning Based LiDAR Localization for Autonomous Driving |
Panoptic Feature Pyramid Networks |
Mask Scoring R-CNN |
Reasoning-RCNN: Unifying Adaptive Global Reasoning Into Large-Scale Object Detection |
Cross-Modality Personalization for Retrieval |
Composing Text and Image for Image Retrieval - An Empirical Odyssey |
Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation |
Adaptive NMS: Refining Pedestrian Detection in a Crowd |
Point In, Box Out: Beyond Counting Persons in Crowds |
Locating Objects Without Bounding Boxes |
FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery |
Mutual Learning of Complementary Networks via Residual Correction for Improving Semi-Supervised Classification |
Sampling Techniques for Large-Scale Object Detection From Sparsely Annotated Objects |
Curls & Whey: Boosting Black-Box Adversarial Attacks |
Barrage of Random Transforms for Adversarially Robust Defense |
Aggregation Cross-Entropy for Sequence Recognition |
LaSO: Label-Set Operations Networks for Multi-Label Few-Shot Learning |
Few-Shot Learning With Localization in Realistic Settings |
AdaGraph: Unifying Predictive and Continuous Domain Adaptation Through Graphs |
Grounded Video Description |
Streamlined Dense Video Captioning |
Adversarial Inference for Multi-Sentence Video Description |
Unified Visual-Semantic Embeddings: Bridging Vision and Language With Structured Meaning Representations |
Learning to Compose Dynamic Tree Structures for Visual Contexts |
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation |
Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering |
Cycle-Consistency for Robust Visual Question Answering |
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception |
Reasoning Visual Dialogs With Structural and Partial Observations |
Recursive Visual Attention in Visual Dialog |
Two Body Problem: Collaborative Visual Task Completion |
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering |
Text2Scene: Generating Compositional Scenes From Textual Descriptions |
From Recognition to Cognition: Visual Commonsense Reasoning |
The Regretful Agent: Heuristic-Aided Navigation Through Progress Estimation |
Tactical Rewind: Self-Correction via Backtracking in Vision-And-Language Navigation |
Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning |
High Flux Passive Imaging With Single-Photon Sensors |
Photon-Flooded Single-Photon 3D Cameras |
Acoustic Non-Line-Of-Sight Imaging |
Steady-State Non-Line-Of-Sight Imaging |
A Theory of Fermat Paths for Non-Line-Of-Sight Shape Reconstruction |
End-To-End Projector Photometric Compensation |
Bringing a Blurry Frame Alive at High Frame-Rate With an Event Camera |
Bringing Alive Blurred Moments |
Learning to Synthesize Motion Blur |
Underexposed Photo Enhancement Using Deep Illumination Estimation |
Blind Visual Motif Removal From a Single Image |
Non-Local Meets Global: An Integrated Paradigm for Hyperspectral Denoising |
Neural Rerendering in the Wild |
GeoNet: Deep Geodesic Networks for Point Cloud Analysis |
MeshAdv: Adversarial Meshes for Visual Recognition |
Fast Spatially-Varying Indoor Lighting Estimation |
Neural Illumination: Lighting Prediction for Indoor Environments |
Deep Sky Modeling for Single Image Outdoor Lighting Estimation |
Bidirectional Learning for Domain Adaptation of Semantic Segmentation |
Enhanced Bayesian Compression via Deep Reinforcement Learning |
Strong-Weak Distribution Alignment for Adaptive Object Detection |
MFAS: Multimodal Fusion Architecture Search |
Disentangling Adversarial Robustness and Generalization |
ShieldNets: Defending Against Adversarial Attacks Using Probabilistic Adversarial Robustness |
Deeply-Supervised Knowledge Synergy |
Dual Residual Networks Leveraging the Potential of Paired Operations for Image Restoration |
Probabilistic End-To-End Noise Correction for Learning With Noisy Labels |
Attention-Guided Unified Network for Panoptic Segmentation |
NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection |
OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks |
Semantically Aligned Bias Reducing Zero Shot Learning |
Feature Space Perturbations Yield More Transferable Adversarial Examples |
IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction |
Accelerating Convolutional Neural Networks via Activation Map Compression |
Knowledge Distillation via Instance Relationship Graph |
PPGNet: Learning Point-Pair Graph for Line Segment Detection |
Building Detail-Sensitive Semantic Segmentation Networks With Polynomial Pooling |
Variational Bayesian Dropout With a Hierarchical Prior |
AANet: Attribute Attention Network for Person Re-Identifications |
Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction |
A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks |
PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet |
Few-Shot Adaptive Faster R-CNN |
VRSTC: Occlusion-Free Video Person Re-Identification |
Compact Feature Learning for Multi-Domain Image Classification |
Adaptive Transfer Network for Cross-Domain Person Re-Identification |
Large-Scale Few-Shot Learning: Knowledge Transfer With Class Hierarchy |
Moving Object Detection Under Discontinuous Change in Illumination Using Tensor Low-Rank and Invariant Sparse Decomposition |
Pedestrian Detection With Autoregressive Network Phases |
All You Need Is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification |
Stochastic Class-Based Hard Example Mining for Deep Metric Learning |
Revisiting Local Descriptor Based Image-To-Class Measure for Few-Shot Learning |
Towards Robust Curve Text Detection With Conditional Spatial Expansion |
Revisiting Perspective Information for Efficient Crowd Counting |
Towards Universal Object Detection by Domain Attention |
Ensemble Deep Manifold Similarity Learning Using Hard Proxies |
Quantization Networks |
RES-PCA: A Scalable Approach to Recovering Low-Rank Matrices |
Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks |
Efficient Featurized Image Pyramid Network for Single Shot Detector |
Multi-Task Multi-Sensor Fusion for 3D Object Detection |
Domain-Specific Batch Normalization for Unsupervised Domain Adaptation |
Grid R-CNN |
MetaCleaner: Learning to Hallucinate Clean Representations for Noisy-Labeled Visual Recognition |
Mapping, Localization and Path Planning for Image-Based Navigation Using Visual Features and Map |
Triply Supervised Decoder Networks for Joint Detection and Segmentation |
Leveraging the Invariant Side of Generative Zero-Shot Learning |
Exploring the Bounds of the Utility of Context for Object Detection |
A-CNN: Annularly Convolutional Neural Networks on Point Clouds |
DARNet: Deep Active Ray Network for Building Segmentation |
Point Cloud Oversegmentation With Graph-Structured Deep Metric Learning |
Graphonomy: Universal Human Parsing via Graph Transfer Learning |
Fitting Multiple Heterogeneous Models by Multi-Class Cascaded T-Linkage |
A Late Fusion CNN for Digital Matting |
BASNet: Boundary-Aware Salient Object Detection |
ZigZagNet: Fusing Top-Down and Bottom-Up Context for Object Segmentation |
Object Instance Annotation With Deep Extreme Level Set Evolution |
Leveraging Crowdsourced GPS Data for Road Extraction From Aerial Imagery |
Adaptive Pyramid Context Network for Semantic Segmentation |
Isospectralization, or How to Hear Shape, Style, and Correspondence |
Speech2Face: Learning the Face Behind a Voice |
Joint Manifold Diffusion for Combining Predictions on Decoupled Observations |
Audio Visual Scene-Aware Dialog |
Learning to Minify Photometric Stereo |
Reflective and Fluorescent Separation Under Narrow-Band Illumination |
Depth From a Polarisation + RGB Stereo Pair |
Rethinking the Evaluation of Video Summaries |
What Object Should I Use? - Task Driven Object Detection |
Triangulation Learning Network: From Monocular to Stereo 3D Object Detection |
Connecting the Dots: Learning Representations for Active Monocular Depth Estimation |
Learning Non-Volumetric Depth Fusion Using Successive Reprojections |
Stereo R-CNN Based 3D Object Detection for Autonomous Driving |
Hybrid Scene Compression for Visual Localization |
MMFace: A Multi-Metric Regression Network for Unconstrained Face Reconstruction |
3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis |
Single Image Depth Estimation Trained via Depth From Defocus Cues |
RGBD Based Dimensional Decomposition Residual Network for 3D Semantic Scene Completion |
Neural Scene Decomposition for Multi-Person Motion Capture |
Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition |
FA-RPN: Floating Region Proposals for Face Detection |
Bayesian Hierarchical Dynamic Model for Human Action Recognition |
Mixed Effects Neural Networks (MeNets) With Applications to Gaze Estimation |
3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training |
Learning to Regress 3D Face Shape and Expression From an Image Without 3D Supervision |
PoseFix: Model-Agnostic General Human Pose Refinement Network |
RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation |
Fast and Robust Multi-Person 3D Pose Estimation From Multiple Views |
Face-Focused Cross-Stream Network for Deception Detection in Videos |
Unequal-Training for Deep Face Recognition With Long-Tailed Noisy Data |
T-Net: Parametrizing Fully Convolutional Nets With a Single High-Order Tensor |
Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss |
Object-Centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video |
DDLSTM: Dual-Domain LSTM for Cross-Dataset Action Recognition |
The Pros and Cons: Rank-Aware Temporal Attention for Skill Determination in Long Videos |
Collaborative Spatiotemporal Feature Learning for Video Action Recognition |
MARS: Motion-Augmented RGB Stream for Action Recognition |
Convolutional Relational Machine for Group Activity Recognition |
Video Summarization by Learning From Unpaired Data |
Skeleton-Based Action Recognition With Directed Graph Neural Networks |
PA3D: Pose-Action 3D Machine for Video Recognition |
Deep Dual Relation Modeling for Egocentric Interaction Recognition |
MOTS: Multi-Object Tracking and Segmentation |
Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking |
PointFlowNet: Learning Representations for Rigid Motion Estimation From Point Clouds |
Listen to the Image |
Image Super-Resolution by Neural Texture Transfer |
Conditional Adversarial Generative Flow for Controllable Image Synthesis |
How to Make a Pizza: Learning a Compositional Layer-Based GAN Model |
TransGaGa: Geometry-Aware Unsupervised Image-To-Image Translation |
Depth-Attentional Features for Single-Image Rain Removal |
Hyperspectral Image Reconstruction Using a Deep Spatial-Spectral Prior |
LiFF: Light Field Features in Scale and Depth |
Deep Exemplar-Based Video Colorization |
On Finding Gray Pixels |
UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos |
Learning Transformation Synchronization |
D2-Net: A Trainable CNN for Joint Description and Detection of Local Features |
Recurrent Neural Networks With Intra-Frame Iterations for Video Deblurring |
Learning to Extract Flawless Slow Motion From Blurry Videos |
Natural and Realistic Single Image Super-Resolution With Explicit Natural Manifold Discrimination |
RF-Net: An End-To-End Image Matching Network Based on Receptive Field |
Fast Single Image Reflection Suppression via Convex Optimization |
A Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision |
Enhanced Pix2pix Dehazing Network |
Assessing Personally Perceived Image Quality via Image Features and Collaborative Filtering |
Single Image Reflection Removal Exploiting Misaligned Training Data and Network Enhancements |
Exploring Context and Visual Pattern of Relationship for Scene Graph Generation |
Learning From Synthetic Data for Crowd Counting in the Wild |
A Local Block Coordinate Descent Algorithm for the CSC Model |
Not Using the Car to See the Sidewalk – Quantifying and Controlling the Effects of Context in Classification and Segmentation |
Discovering Fair Representations in the Data Domain |
Actor-Critic Instance Segmentation |
Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders |
Semantic Projection Network for Zero- and Few-Label Semantic Segmentation |
GCAN: Graph Convolutional Adversarial Network for Unsupervised Domain Adaptation |
Seamless Scene Segmentation |
Unsupervised Image Matching and Object Discovery as Optimization |
Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs |
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions |
Towards VQA Models That Can Read |
Object-Aware Aggregation With Bidirectional Temporal Graph for Video Captioning |
Progressive Attention Memory Network for Movie Story Question Answering |
Memory-Attended Recurrent Network for Video Captioning |
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning |
Look Back and Predict Forward in Image Captioning |
Explainable and Explicit Visual Reasoning Over Scene Graphs |
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering |
Intention Oriented Image Captions With Guiding Objects |
Uncertainty Guided Multi-Scale Residual Learning-Using a Cycle Spinning CNN for Single Image De-Raining |
Toward Realistic Image Compositing With Adversarial Learning |
Cross-Classification Clustering: An Efficient Multi-Object Tracking Technique for 3-D Instance Segmentation in Connectomics |
Deep ChArUco: Dark ChArUco Marker Pose Estimation |
Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving |
Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions |
Metric Learning for Image Registration |
LO-Net: Deep Real-Time Lidar Odometry |
TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions |
World From Blur |
Topology Reconstruction of Tree-Like Structure in Images via Structural Similarity Measure and Dominant Set Clustering |
Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training |
Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning From Radiology Reports and Label Ontology |
Robust Histopathology Image Analysis: To Label or to Synthesize? |
Data Augmentation Using Learned Transformations for One-Shot Medical Image Segmentation |
Shifting More Attention to Video Salient Object Detection |
Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration |
Beyond Tracking: Selecting Memory and Refining Poses for Deep Visual Odometry |
Image Generation From Layout |
Multimodal Explanations by Predicting Counterfactuality in Videos |
Learning to Explain With Complemental Examples |
HAQ: Hardware-Aware Automated Quantization With Mixed Precision |
Content Authentication for Neural Imaging Pipelines: End-To-End Optimization of Photo Provenance in Complex Distribution Channels |
Inverse Procedural Modeling of Knitwear |
Estimating 3D Motion and Forces of Person-Object Interactions From Monocular Video |
DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds |
End-To-End Interpretable Neural Motion Planner |
Divergence Triangle for Joint Training of Generator Model, Energy-Based Model, and Inferential Model |
Image Deformation Meta-Networks for One-Shot Learning |
Online High Rank Matrix Completion |
Multispectral Imaging for Fine-Grained Recognition of Powders on Complex Backgrounds |
ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging |
Robust Subspace Clustering With Independent and Piecewise Identically Distributed Noise Modeling |
What Correspondences Reveal About Unknown Camera and Motion Models? |
Self-Calibrating Deep Photometric Stereo Networks |
Argoverse: 3D Tracking and Forecasting With Rich Maps |
Side Window Filtering |
Defense Against Adversarial Images Using Web-Scale Nearest-Neighbor Search |
Incremental Object Learning From Contiguous Views |
IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition |
CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification |
Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence |
UPSNet: A Unified Panoptic Segmentation Network |
JSIS3D: Joint Semantic-Instance Segmentation of 3D Point Clouds With Multi-Task Pointwise Networks and Multi-Value Conditional Random Fields |
Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth |
DeepCO3: Deep Instance Co-Segmentation by Co-Peak Search and Co-Saliency Detection |
Improving Semantic Segmentation via Video Propagation and Label Relaxation |
Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video |
Shape2Motion: Joint Analysis of Motion Parts and Attributes From 3D Shapes |
Semantic Correlation Promoted Shape-Variant Context for Segmentation |
Relation-Shape Convolutional Neural Network for Point Cloud Analysis |
Enhancing Diversity of Defocus Blur Detectors via Cross-Ensemble Network |
BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames |
Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-High Resolution Images |
Efficient Parameter-Free Clustering Using First Neighbor Relations |
Learning Personalized Modular Network Guided by Structured Knowledge |
A Generative Appearance Model for End-To-End Video Object Segmentation |
A Flexible Convolutional Solver for Fast Style Transfers |
Cross Domain Model Compression by Structurally Weight Sharing |
TraVeLGAN: Image-To-Image Translation by Transformation Vector Learning |
Deep Robust Subjective Visual Property Prediction in Crowdsourcing |
Transferable AutoML by Model Sharing Over Grouped Datasets |
Learning Not to Learn: Training Deep Neural Networks With Biased Data |
IRLAS: Inverse Reinforcement Learning for Architecture Search |
Learning for Single-Shot Confidence Calibration in Deep Neural Networks Through Stochastic Inferences |
Attention-Based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions |
Fully Learnable Group Convolution for Acceleration of Deep Neural Networks |
EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching From Scratch |
Deep Incremental Hashing Network for Efficient Image Retrieval |
Robustness via Curvature Regularization, and Vice Versa |
SparseFool: A Few Pixels Make a Big Difference |
Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks |
Structured Pruning of Neural Networks With Budget-Aware Regularization |
MBS: Macroblock Scaling for CNN Model Reduction |
Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells |
Generating 3D Adversarial Point Clouds |
Partial Order Pruning: For Best Speed/Accuracy Trade-Off in Neural Architecture Search |
Memory in Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity From Spatiotemporal Dynamics |
Variational Information Distillation for Knowledge Transfer |
You Look Twice: GaterNet for Dynamic Filter Selection in CNNs |
SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360° Images |
ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network |
Assisted Excitation of Activations: A Learning Technique to Improve Object Detectors |
Exploiting Edge Features for Graph Neural Networks |
Propagation Mechanism for Deep and Wide Neural Networks |
Catastrophic Child’s Play: Easy to Perform, Hard to Defend Adversarial Attacks |
Embedding Complementary Deep Networks for Image Classification |
Deep Multimodal Clustering for Unsupervised Audiovisual Learning |
Dense Classification and Implanting for Few-Shot Learning |
Class-Balanced Loss Based on Effective Number of Samples |
Discovering Visual Patterns in Art Collections With Spatially-Consistent Feature Learning |
Min-Max Statistical Alignment for Transfer Learning |
Spatial-Aware Graph Relation Network for Large-Scale Object Detection |
Deformable ConvNets V2: More Deformable, Better Results |
Interaction-And-Aggregation Network for Person Re-Identification |
Rare Event Detection Using Disentangled Representation Learning |
Shape Robust Text Detection With Progressive Scale Expansion Network |
Dual Encoding for Zero-Example Video Retrieval |
MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors |
Character Region Awareness for Text Detection |
Effective Aesthetics Prediction With Multi-Level Spatially Pooled Features |
Attentive Region Embedding Network for Zero-Shot Learning |
Explicit Spatial Encoding for Deep Local Descriptors |
Panoptic Segmentation |
You Reap What You Sow: Using Videos to Generate High Precision Object Proposals for Weakly-Supervised Object Detection |
Explore-Exploit Graph Traversal for Image Retrieval |
Dissimilarity Coefficient Based Weakly Supervised Object Detection |
Kernel Transformer Networks for Compact Spherical Convolution |
Object Detection With Location-Aware Deformable Convolution and Backward Attention Filtering |
Variational Prototyping-Encoder: One-Shot Learning With Prototypical Images |
Unsupervised Domain Adaptation Using Feature-Whitening and Consensus Loss |
FEELVOS: Fast End-To-End Embedding Learning for Video Object Segmentation |
PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation |
Learning Multi-Class Segmentations From Single-Class Datasets |
Convolutional Recurrent Network for Road Boundary Extraction |
DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation |
A Cross-Season Correspondence Dataset for Robust Semantic Segmentation |
ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features |
On Zero-Shot Recognition of Generic Objects |
Explicit Bias Discovery in Visual Question Answering Models |
REPAIR: Removing Representation Bias by Dataset Resampling |
Label Efficient Semi-Supervised Learning via Graph Filtering |
MVTec AD – A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection |
ABC: A Big CAD Model Dataset for Geometric Deep Learning |
Tightness-Aware Evaluation Protocol for Scene Text Detection |
PointConv: Deep Convolutional Networks on 3D Point Clouds |
Octree Guided CNN With Spherical Kernels for 3D Point Clouds |
VITAMIN-E: VIsual Tracking and MappINg With Extremely Dense Feature Points |
Conditional Single-View Shape Generation for Multi-View Stereo Reconstruction |
Learning to Adapt for Stereo |
3D Appearance Super-Resolution With Deep Learning |
Radial Distortion Triangulation |
Robust Point Cloud Based Reconstruction of Large-Scale Outdoor Scenes |
Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment |
Volumetric Capture of Humans With a Single RGBD Camera via Semi-Parametric Learning |
Joint Face Detection and Facial Motion Retargeting for Multiple Faces |
Monocular Depth Estimation Using Relative Depth Maps |
Unsupervised Primitive Discovery for Improved 3D Generative Modeling |
Learning to Explore Intrinsic Saliency for Stereoscopic Video |
Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on N-Spheres |
Refine and Distill: Exploiting Cycle-Inconsistency and Knowledge Distillation for Unsupervised Monocular Depth Estimation |
Learning View Priors for Single-View 3D Reconstruction |
Geometry-Aware Symmetric Domain Adaptation for Monocular Depth Estimation |
Learning Monocular Depth Estimation Infusing Traditional Stereo Knowledge |
SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception |
3D Guided Fine-Grained Face Manipulation |
Neuro-Inspired Eye Tracking With Eye Movement Dynamics |
Facial Emotion Distribution Learning by Exploiting Low-Rank Label Correlations Locally |
Unsupervised Face Normalization With Extreme Pose and Expression in the Wild |