Semantic Component Decomposition for Face Attribute Manipulation |
R3 Adversarial Network for Cross Model Face Recognition |
Disentangling Latent Hands for Image Synthesis and Pose Estimation |
Generating Multiple Hypotheses for 3D Human Pose Estimation With Mixture Density Network |
CrossInfoNet: Multi-Task Information Sharing Based Hand Pose Estimation |
P2SGrad: Refined Gradients for Optimizing Deep Face Models |
Action Recognition From Single Timestamp Supervision in Untrimmed Videos |
Time-Conditioned Action Anticipation in One Shot |
Dance With Flow: Two-In-One Stream Action Detection |
Representation Flow for Action Recognition |
LSTA: Long Short-Term Attention for Egocentric Action Recognition |
Learning Actor Relation Graphs for Group Activity Recognition |
A Structured Model for Action Detection |
Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition |
Object Discovery in Videos as Foreground Motion Clustering |
Towards Natural and Accurate Future Motion Prediction of Humans and Animals |
Automatic Face Aging in Videos via Deep Reinforcement Learning |
Multi-Adversarial Discriminative Deep Domain Generalization for Face Presentation Attack Detection |
A Content Transformation Block for Image Style Transfer |
BeautyGlow: On-Demand Makeup Transfer Framework With Reversible Generative Network |
Style Transfer by Relaxed Optimal Transport and Self-Similarity |
Inserting Videos Into Videos |
Learning Image and Video Compression Through Spatial-Temporal Energy Compaction |
Event-Based High Dynamic Range Image and Very High Frame Rate Video Generation Using Conditional Generative Adversarial Networks |
Enhancing TripleGAN for Semi-Supervised Conditional Instance Synthesis and Classification |
Capture, Learning, and Synthesis of 3D Speaking Styles |
Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds Using Convolutional Neural Networks |
Ray-Space Projection Model for Light Field Camera |
Deep Geometric Prior for Surface Reconstruction |
Analysis of Feature Visibility in Non-Line-Of-Sight Measurements |
Hyperspectral Imaging With Random Printed Mask |
All-Weather Deep Outdoor Lighting Estimation |
A Variational EM Framework With Adaptive Edge Selection for Blind Motion Deblurring |
Viewport Proposal CNN for 360° Video Quality Assessment |
Beyond Gradient Descent for Regularized Segmentation Losses |
MAGSAC: Marginalizing Sample Consensus |
Understanding and Visualizing Deep Visual Saliency Models |
Divergence Prior and Vessel-Tree Reconstruction |
Unsupervised Domain-Specific Deblurring via Disentangled Representations |
Douglas-Rachford Networks: Learning Both the Image Prior and Data Fidelity Terms for Blind Image Deconvolution |
Speed Invariant Time Surface for Learning to Detect Corner Points With Event-Based Cameras |
Training Deep Learning Based Image Denoisers From Undersampled Measurements Without Ground Truth and Without Image Prior |
A Variational Pan-Sharpening With Local Gradient Constraints |
F-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning |
Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation |
Graph Attention Convolution for Point Cloud Semantic Segmentation |
Normalized Diversification |
Learning to Localize Through Compressed Binary Maps |
A Parametric Top-View Representation of Complex Road Scenes |
Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction |
Superquadrics Revisited: Learning 3D Shape Parsing Beyond Cuboids |
Unsupervised Disentangling of Appearance and Geometry by Deformable Generator Network |
Self-Supervised Representation Learning by Rotation Feature Decoupling |
Weakly Supervised Deep Image Hashing Through Tag Embeddings |
Improved Road Connectivity by Joint Learning of Orientation and Segmentation |
Deep Supervised Cross-Modal Retrieval |
A Theoretically Sound Upper Bound on the Triplet Loss for Improving the Efficiency of Deep Distance Metric Learning |
Data Representation and Learning With Graph Diffusion-Embedding Networks |
Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph |
Image-Question-Answer Synergistic Network for Visual Dialog |
Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses |
Inverse Cooking: Recipe Generation From Food Images |
Adversarial Semantic Alignment for Improved Image Captions |
Answer Them All! Toward Universal Visual Question Answering Models |
Unsupervised Multi-Modal Neural Machine Translation |
Multi-Task Learning of Hierarchical Vision-Language Representation |
Cross-Modal Self-Attention Network for Referring Image Segmentation |
DuDoNet: Dual Domain Network for CT Metal Artifact Reduction |
Fast Spatio-Temporal Residual Network for Video Super-Resolution |
Complete the Look: Scene-Based Complementary Product Recommendation |
Selective Sensor Fusion for Neural Visual-Inertial Odometry |
Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes |
Learning Binary Code for Personalized Fashion Recommendation |
Attention Based Glaucoma Detection: A Large-Scale Database and CNN Model |
Privacy Protection in Street-View Panoramas Using Depth and Multi-View Imagery |
Grounding Human-To-Vehicle Advice for Self-Driving Vehicles |
Multi-Step Prediction of Occupancy Grid Maps With Recurrent Neural Networks |
Connecting Touch and Vision via Cross-Modal Prediction |
X2CT-GAN: Reconstructing CT From Biplanar X-Rays With Generative Adversarial Networks |
Practical Full Resolution Learned Lossless Image Compression |
Image-To-Image Translation via Group-Wise Deep Whitening-And-Coloring Transformation |
Max-Sliced Wasserstein Distance and Its Use for GANs |
Meta-Learning With Differentiable Convex Optimization |
RePr: Improved Training of Convolutional Filters |
Tangent-Normal Adversarial Regularization for Semi-Supervised Learning |
Auto-Encoding Scene Graphs for Image Captioning |
Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech |
Attention Branch Network: Learning of Attention Mechanism for Visual Explanation |
Cascaded Projection: End-To-End Network Compression and Acceleration |
DeepCaps: Going Deeper With Capsule Networks |
FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search |
APDrawingGAN: Generating Artistic Portrait Drawings From Face Photos With Hierarchical GANs |
Constrained Generative Adversarial Networks for Interactive Image Generation |
WarpGAN: Automatic Caricature Generation |
Explainability Methods for Graph Convolutional Neural Networks |
A Generative Adversarial Density Estimator |
SoDeep: A Sorting Deep Net to Learn Ranking Loss Surrogates |
High-Quality Face Capture Using Anatomical Muscles |
FML: Face Model Learning From Videos |
AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations |
3D Hand Shape and Pose Estimation From a Single RGB Image |
3D Hand Shape and Pose From Images in the Wild |
Self-Supervised 3D Hand Pose Estimation Through Training by Fitting |
CrowdPose: Efficient Crowded Scenes Pose Estimation and a New Benchmark |
Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in a Triadic Interaction |
HoloPose: Holistic 3D Human Reconstruction In-The-Wild |
Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation |
In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations |
Slim DensePose: Thrifty Learning From Sparse Annotations and Motion Cues |
Self-Supervised Representation Learning From Videos for Facial Action Unit Detection |
Combining 3D Morphable Models: A Large Scale Face-And-Head Model |
Boosting Local Shape Matching for Dense 3D Face Correspondence |
Unsupervised Part-Based Disentangling of Object Shape and Appearance |
Monocular Total Capture: Posing Face, Body, and Hands in the Wild |
Expressive Body Capture: 3D Hands, Face, and Body From a Single Image |
Neural RGB→D Sensing: Depth and Uncertainty From a Video Camera |
DAVANet: Stereo Deblurring With View Aggregation |
DVC: An End-To-End Deep Video Compression Framework |
SOSNet: Second Order Similarity Regularization for Local Descriptor Learning |
“Double-DIP”: Unsupervised Image Decomposition via Coupled Deep-Image-Priors |
Unprocessing Images for Learned Raw Denoising |
Residual Networks for Light Field Image Super-Resolution |
Modulating Image Restoration With Continual Levels via Adaptive Feature Modification Layers |
Second-Order Attention Network for Single Image Super-Resolution |
Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations |
Path-Invariant Map Networks |
FilterReg: Robust and Efficient Probabilistic Point-Set Registration Using Gaussian Filter and Twist Parameterization |
Probabilistic Permutation Synchronization Using the Riemannian Structure of the Birkhoff Polytope |
Lifting Vectorial Variational Problems: A Natural Formulation Based on Geometric Measure Theory and Discrete Exterior Calculus |
A Sufficient Condition for Convergences of Adam and RMSProp |
Guaranteed Matrix Completion Under Multiple Linear Transformations |
MAP Inference via Block-Coordinate Frank-Wolfe Algorithm |
A Convex Relaxation for Multi-Graph Matching |
Pixel-Adaptive Convolutional Neural Networks |
Single-Frame Regularization for Temporally Stable CNNs |
An End-To-End Network for Generating Social Relationship Graphs |
Meta-Learning Convolutional Neural Architectures for Multi-Target Concrete Defect Classification With the COncrete DEfect BRidge IMage Dataset |
ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model |
SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization |
Defending Against Adversarial Attacks by Randomized Diversification |
Rob-GAN: Generator, Discriminator, and Adversarial Attacker |
Learning From Noisy Labels by Regularized Estimation of Annotator Confusion |
Task-Free Continual Learning |
Importance Estimation for Neural Network Pruning |
Detecting Overfitting of Deep Generative Networks via Latent Recovery |
Coloring With Limited Data: Few-Shot Colorization via Memory Augmented Networks |
Characterizing and Avoiding Negative Transfer |
Building Efficient Deep Neural Networks With Unitary Group Convolutions |
Semi-Supervised Learning With Graph Learning-Convolutional Networks |
Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning |
AIRD: Adversarial Learning Framework for Image Repurposing Detection |
A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations |
Trust Region Based Adversarial Attack on Neural Networks |
PEPSI: Fast Image Inpainting With Parallel Decoding Network |
Model-Blind Video Denoising via Frame-To-Frame Training |
End-To-End Efficient Representation Learning via Cascading Combinatorial Optimization |
Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation |
ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation |
Regularizing Activation Distribution for Training Binarized Deep Networks |
Robustness Verification of Classification Deep Neural Networks via Linear Programming |
Additive Adversarial Learning for Unbiased Authentication |
Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network Using Truncated Gaussian Approximation |
Adversarial Defense by Stratified Convolutional Sparse Coding |
Exploring Object Relation in Mean Teacher for Cross-Domain Detection |
Hierarchical Disentanglement of Discriminative Latent Features for Zero-Shot Learning |
R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network |
Rethinking Knowledge Graph Propagation for Zero-Shot Learning |
Learning to Learn Image Classifiers With Visual Analogy |
Where’s Wally Now? Deep Generative and Discriminative Embeddings for Novelty Detection |
Weakly Supervised Image Classification Through Noise Regularization |
Data-Driven Neuron Allocation for Scale Aggregation Networks |
Graphical Contrastive Losses for Scene Graph Parsing |
Deep Transfer Learning for Multiple Class Novelty Detection |
QATM: Quality-Aware Template Matching for Deep Learning |
Retrieval-Augmented Convolutional Neural Networks Against Adversarial Examples |
Learning Cross-Modal Embeddings With Adversarial Networks for Cooking Recipes and Food Images |
FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network |
Weakly Supervised Video Moment Retrieval From Text Queries |
Content-Aware Multi-Level Guidance for Interactive Instance Segmentation |
Greedy Structure Learning of Hierarchical Compositional Models |
Interactive Full Image Segmentation by Considering All Regions Jointly |
Learning Active Contour Models for Medical Image Segmentation |
Customizable Architecture Search for Semantic Segmentation |
Local Features and Visual Words Emerge in Activations |
Hyperspectral Image Super-Resolution With Optimized RGB Guidance |
Adaptive Confidence Smoothing for Generalized Zero-Shot Learning |
PMS-Net: Robust Haze Removal Based on Patch Map for Single Images |
Deep Spherical Quantization for Image Search |
Large-Scale Interactive Object Segmentation With Human Annotators |
A Poisson-Gaussian Denoising Dataset With Real Fluorescence Microscopy Images |
Task Agnostic Meta-Learning for Few-Shot Learning |
Progressive Ensemble Networks for Zero-Shot Recognition |
Direct Object Recognition Without Line-Of-Sight Using Optical Coherence |
Atlas of Digital Pathology: A Generalized Hierarchical Histological Tissue Type-Annotated Database for Deep Learning |
Perturbation Analysis of the 8-Point Algorithm: A Case Study for Wide FoV Cameras |
Robustness of 3D Deep Learning in an Adversarial Setting |
SceneCode: Monocular Dense Semantic Reconstruction Using Learned Encoded Scene Representations |
StereoDRNet: Dilated Residual StereoNet |
The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation |
Learning Joint Reconstruction of Hands and Manipulated Objects |
Deep Single Image Camera Calibration With Radial Distortion |
CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth |
Translate-to-Recognize Networks for RGB-D Scene Recognition |
Re-Identification Supervised Texture Generation |
Action4D: Online Action Recognition in the Crowd and Clutter |
Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction |
Attribute-Aware Face Aging With Wavelet-Based Generative Adversarial Networks |
Noise-Tolerant Paradigm for Training Face Recognition CNNs |
Low-Rank Laplacian-Uniform Mixed Model for Robust Face Recognition |
Generalizing Eye Tracking With Bayesian Adversarial Learning |
Local Relationship Learning With Person-Specific Shape Regularization for Facial Action Unit Detection |
Point-To-Pose Voting Based Hand Pose Estimation Using Residual Permutation Equivariant Layer |
Improving Few-Shot User-Specific Gaze Adaptation via Gaze Redirection Synthesis |
AdaptiveFace: Adaptive Margin and Sampling for Face Recognition |
Disentangled Representation Learning for 3D Face Shape |
LBS Autoencoder: Self-Supervised Fitting of Articulated Meshes to Point Clouds |
PifPaf: Composite Fields for Human Pose Estimation |
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection |
Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos |
Local Temporal Bilinear Pooling for Fine-Grained Action Parsing |
Improving Action Localization by Progressive Cross-Stream Cooperation |
Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition |
A Neural Network Based on SPD Manifold Learning for Skeleton-Based Hand Gesture Recognition |
Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition |
Learning Spatio-Temporal Representation With Local and Global Diffusion |
Unsupervised Learning of Action Classes With Continuous Temporal Embedding |
Double Nuclear Norm Based Low Rank Representation on Grassmann Manifolds for Clustering |
SR-LSTM: State Refinement for LSTM Towards Pedestrian Trajectory Prediction |
Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes |
An Efficient Schmidt-EKF for 3D Visual-Inertial SLAM |
A Neural Temporal Model for Human Motion Prediction |
Multi-Agent Tensor Fusion for Contextual Trajectory Prediction |
Coordinate-Based Texture Inpainting for Pose-Guided Human Image Generation |
On Stabilizing Generative Adversarial Training With Noise |
Self-Supervised GANs via Auxiliary Rotation Loss |
Texture Mixer: A Network for Controllable Synthesis and Interpolation of Texture |
Object-Driven Text-To-Image Synthesis via Adversarial Training |
Zoom-In-To-Check: Boosting Video Interpolation via Instance-Level Discrimination |
Disentangling Latent Space for VAE by Label Relevant/Irrelevant Dimensions |
Spectral Reconstruction From Dispersive Blur: A Novel Light Efficient Spectral Imager |
Quasi-Unsupervised Color Constancy |
Deep Defocus Map Estimation Using Domain Adaptation |
Using Unknown Occluders to Recover Hidden Scenes |
Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation |
Learning Parallax Attention for Stereo Image Super-Resolution |
Knowing When to Stop: Evaluation and Verification of Conformity to Output-Size Specifications |
Spatial Attentive Single-Image Deraining With a High Quality Real Rain Dataset |
Focus Is All You Need: Loss Functions for Event-Based Vision |
Scalable Convolutional Neural Network for Image Compressed Sensing |
Event Cameras, Contrast Maximization and Reward Functions: An Analysis |
Convolutional Neural Networks Can Be Deceived by Visual Illusions |
PDE Acceleration for Active Contours |
Dichromatic Model Based Temporal Color Constancy for AC Light Sources |
Semantic Attribute Matching Networks |
Skin-Based Identification From Multispectral Image Data Using CNNs |
Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks |
Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments |
PIEs: Pose Invariant Embeddings |
Representation Similarity Analysis for Efficient Task Taxonomy & Transfer Learning |
Object Counting and Instance Segmentation With Image-Level Supervision |
Variational Autoencoders Pursue PCA Directions (by Accident) |
A Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes |
Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping |
PCAN: 3D Attention Map Learning Using Contextual Information for Point Cloud Based Retrieval |
Depth Coefficients for Depth Completion |
Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection |
Good News, Everyone! Context Driven Entity-Aware Captioning for News Images |
Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding |
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning |
Pointing Novel Objects in Image Captioning |
Informative Object Annotations: Tell Me Something I Don’t Know |
Engaging Image Captioning via Personality |
Vision-Based Navigation With Language-Based Assistance via Imitation Learning With Indirect Intervention |
TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments |
A Simple Baseline for Audio-Visual Scene-Aware Dialog |
End-To-End Learned Random Walker for Seeded Image Segmentation |
Efficient Neural Network Compression |
Cascaded Generative and Discriminative Learning for Microcalcification Detection in Breast Mammograms |
C3AE: Exploring the Limits of Compact Model for Age Estimation |
Adaptive Weighting Multi-Field-Of-View CNN for Semantic Segmentation in Pathology |
In Defense of Pre-Trained ImageNet Architectures for Real-Time Semantic Segmentation of Road-Driving Images |
Context-Aware Visual Compatibility Prediction |
Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks |
Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation |
Context-Aware Spatio-Recurrent Curvilinear Structure Segmentation |
An Alternative Deep Feature Approach to Line Level Keyword Spotting |
Dynamics Are Important for the Recognition of Equine Pain in Video |
LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving |
Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds |
PointPillars: Fast Encoders for Object Detection From Point Clouds |
Motion Estimation of Non-Holonomic Ground Vehicles From a Single Feature Correspondence Measured Over N Views |
From Coarse to Fine: Robust Hierarchical Localization at Large Scale |
Large Scale High-Resolution Land Cover Mapping With Multi-Resolution Data |
Leveraging Heterogeneous Auxiliary Tasks to Assist Crowd Counting |