[Today's CS Vision Paper Digest] 3 Jan 2019
Today's cs.CV computer vision paper digest
Thu, 3 Jan 2019
38 papers in total
Interesting:
-
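Style transfer that converts ancient flower, bird, and landscape paintings into photographs. Using domain transfer, the paper recasts ancient-painting processing as a natural-image processing problem, so models trained on natural images can be applied to the transferred paintings; classification and style models trained on real photos were carried over to ancient paintings this way. The researchers collected flower, bird, and landscape painting datasets, mainly from the Song and Qing dynasties, and built a domain style transfer network. A composite loss function ensures that the transferred image keeps the color and content of the source image. (from Zhejiang University)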
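The researchers collected three datasets of ancient paintings: flowers (2258+650), birds (2119+600), and landscapes (2009+600).
[Figure: network architecture]
[Figure: final results]
dataset: CFP, CBP, CLP; flower classifier: Oxford Flower; semantic segmentation task: PASCAL VOC 2012
ref: https://person.zju.edu.cn/0092050
-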
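EdgeConnect, a new image inpainting method based on edge completion. The paper splits inpainting into two stages: a generative model first predicts the edges of the missing region, and the edge map is then fed, as a prior for the missing area, together with the image into an inpainting network that reconstructs the image. (from Ontario Tech University)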
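The two-stage data flow can be sketched in a few lines of NumPy. This is a minimal illustration and not the EdgeConnect code: `toy_edge_generator` and `toy_inpaint_network` are hypothetical stand-ins for the learned networks (a gradient-threshold edge detector and a mean fill), used only to show how the mask, the edge prior, and the image are passed between stages.

```python
import numpy as np

def toy_edge_generator(gray, mask):
    """Hypothetical stand-in for the learned edge model.

    Returns a binary edge map computed from the known pixels only
    (simple gradient threshold); a real model would also predict
    edges inside the hole.
    """
    known = gray * (1.0 - mask)
    gy, gx = np.gradient(known)
    return (np.hypot(gx, gy) > 0.25).astype(np.float32)

def toy_inpaint_network(stacked):
    """Hypothetical stand-in for the learned inpainting network.

    `stacked` has channels [masked image, edge prior, mask]; the hole
    is filled with the mean of the known pixels.
    """
    masked_img, _edges, mask = stacked
    known_mean = masked_img[mask == 0].mean()
    return masked_img + mask * known_mean

def two_stage_inpaint(gray, mask):
    # Stage 1: obtain edge information to act as a prior for the hole.
    edges = toy_edge_generator(gray, mask)
    # Stage 2: feed the edge prior together with the masked image.
    masked_img = gray * (1.0 - mask)
    stacked = np.stack([masked_img, edges, mask], axis=0)
    return toy_inpaint_network(stacked), edges

rng = np.random.default_rng(0)
image = rng.random((32, 32)).astype(np.float32)
hole = np.zeros((32, 32), dtype=np.float32)
hole[8:16, 8:16] = 1.0  # 1 marks missing pixels

restored, edge_prior = two_stage_inpaint(image, hole)
print(restored.shape, edge_prior.shape)
```

In this toy version the known pixels pass through unchanged and only the hole is filled; the real networks are linked in the Code entry of this item.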
[Figure: sample results]
dataset: CelebA, Places2, and Paris Street View
Code: https://github.com/knazeri/edge-connect
related inpainting:
https://github.com/satoshiiizuka/siggraph2017_inpainting
https://github.com/JiahuiYu/generative_inpainting
-
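A mask-assisted crowd counting method. Crowd counting hinges on density estimation; adding a mask reduces the difficulty of density estimation, and mask estimation can itself be cast as a binary segmentation problem. The method adds an object-mask branch to a conventional approach, then combines the predicted mask with the input image to generate a better density map. (from Nanjing University; University of Adelaide, Australia)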
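As a rough illustration of why a mask can help, the sketch below gates a noisy density map with a binarized foreground mask before summing it into a count. The elementwise product is only one assumed way to fuse the two and is not claimed to match any of the paper's architectures; all arrays are synthetic.

```python
import numpy as np

def count_from_density(density):
    # The crowd count is the integral (sum) of the density map.
    return float(density.sum())

def fuse_mask(density, mask_prob, threshold=0.5):
    # Binarize the predicted mask (a two-class segmentation output),
    # then suppress density that falls on background pixels.
    mask = (mask_prob >= threshold).astype(density.dtype)
    return density * mask

# Synthetic example: 3 people, each contributing unit mass to the density map.
true_density = np.zeros((8, 8))
true_density[1, 1] = true_density[4, 5] = true_density[6, 2] = 1.0

# Hypothetical network output: true density plus spurious background response.
predicted_density = true_density + np.full((8, 8), 0.05)

# Hypothetical mask-branch output: high probability only where people are.
mask_prob = np.where(true_density > 0, 0.9, 0.1)

raw_count = count_from_density(predicted_density)
fused_count = count_from_density(fuse_mask(predicted_density, mask_prob))
print(raw_count, fused_count)
```

With these made-up numbers, the background response inflates the raw count, while the gated count stays close to the true value of 3.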
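The researchers propose five architectures that differ in how the mask is predicted and how it is fused into the density map prediction.
Crowd counting datasets: ShanghaiTech, UCF_CC_50, WorldExpo10, Mall
ref: http://cs-chan.com/downloads_crowd_dataset.html
https://github.com/svishwa/crowdcount-mcnn
https://irc.atr.jp/sets/TEMPOSAN_dataset/
the large CUHK dataset
-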
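Action2Vec builds an embedding space that links linguistic information with visual and spatial information, connecting actions and their language descriptions in a word2vec-like fashion. (from Georgia Tech)
[Figure: visualization of the embedding space]
The embedding space also supports vector arithmetic, demonstrated with algebraic operations on actions and subjects.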
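Word2vec-style arithmetic on embeddings amounts to adding and subtracting vectors and then looking up the nearest neighbour by cosine similarity. The sketch below uses tiny hand-made vectors; the labels and values are invented for illustration and are not taken from Action2Vec.

```python
import numpy as np

# Hypothetical 3-D embeddings: axes loosely mean [ball, uses feet, uses hands].
embeddings = {
    "kick ball":  np.array([1.0, 1.0, 0.0]),
    "throw ball": np.array([1.0, 0.0, 1.0]),
    "kick":       np.array([0.0, 1.0, 0.0]),
    "throw":      np.array([0.0, 0.0, 1.0]),
    "ball":       np.array([1.0, 0.0, 0.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(query, exclude=()):
    # Return the label whose embedding has the highest cosine similarity.
    candidates = {k: v for k, v in embeddings.items() if k not in exclude}
    return max(candidates, key=lambda k: cosine(query, candidates[k]))

# "kick ball" - "kick" + "throw" should land near "throw ball".
query = embeddings["kick ball"] - embeddings["kick"] + embeddings["throw"]
result = nearest(query, exclude={"kick ball", "kick", "throw"})
print(result)
```

The query terms are excluded from the lookup, as is usual for analogy tests, so that the answer is not trivially one of the inputs.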
dataset: UCF101, HMDB51, and Kinetics.
-
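Learning the physical dynamics of 3D rigid bodies: given an object's point cloud and an impulse vector, the model predicts the object's final pose after the force acts on it in a physical environment. The learned dynamics can also be used to estimate the dynamics of unseen objects. (from Stanford)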
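For reference, the classical rigid-body response to an impulse J applied at offset r from the center of mass is Δv = J / m and Δω = I⁻¹ (r × J). The sketch below evaluates these two formulas with NumPy; the mass, inertia tensor, and contact point are made-up values, and the code does not reproduce the paper's network or its simulator setup.

```python
import numpy as np

def apply_impulse(mass, inertia_world, r, impulse):
    """Velocity change of a rigid body hit by an impulse.

    mass          : scalar mass
    inertia_world : 3x3 inertia tensor about the center of mass (world frame)
    r             : vector from the center of mass to the contact point
    impulse       : impulse vector J
    """
    delta_v = impulse / mass
    delta_omega = np.linalg.solve(inertia_world, np.cross(r, impulse))
    return delta_v, delta_omega

# Made-up example: a 2 kg body with a diagonal inertia tensor.
mass = 2.0
inertia = np.diag([0.5, 0.5, 1.0])
r = np.array([0.1, 0.0, 0.0])        # contact point offset along +x
impulse = np.array([0.0, 4.0, 0.0])  # push along +y

dv, dw = apply_impulse(mass, inertia, r, impulse)
print(dv, dw)
```

An off-center push like this one produces both a linear velocity and a spin about the z axis, which is why the final resting pose depends on the object's shape as well as on the impulse.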
[Figure: network model; the input point cloud and the applied force are combined to predict the object's final pose]
dataset: ShapeNet
Simulation environments:
https://pybullet.org/
https://unity3d.com/
Authors:
https://github.com/davrempe
https://cs.stanford.edu/people/ssrinath/
https://geometry.stanford.edu/member/guibas/index.html
The hierarchical relation network
-
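Training models on obfuscated data to protect user privacy: the algorithm is trained on images that are hard for humans to recognize but still usable by machines. An obfuscation network is first trained to process the data, and the processed data is then used to train the target network. The approach reports good results on classification, attribute classification, and facial landmark detection. (from Deeping Source)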
Evaluation datasets: SVHN, CIFAR10, Pascal VOC 2012, CelebA, and MTFL.
ref: http://www.deepingsource.io/
-
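SiCloPe, a model that produces rotating views of a clothed person from a single image: from the subject's silhouette, it reconstructs a 3D model of the clothed body, which means a virtual try-on could show an outfit from the front, back, and sides. The work uses 2D silhouettes and 3D joint positions to describe the highly variable appearance of clothed people. It first synthesizes consistent silhouettes at new viewpoints from the input silhouette and joint data, then uses a generative network to obtain the 3D model of the subject. Finally, it generates the back view from the front view to obtain the texture for the surface of the 3D model. (from USC Institute for Creative Technologies)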
[Figure: silhouette synthesis network for new viewpoints]
[Figure: front-to-back mapping model]
[Figure: sample results]
dataset: rigged meshes, aXYZ, Renderpeople, Mixamo animation sequences, HDRI Haven
-
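SIXray, a large-scale security-inspection X-ray dataset containing 1,059,231 X-ray images, with 8,929 prohibited items in 6 classes labeled by hand. A distinguishing feature is that many objects occlude one another. The researchers propose class-balanced hierarchical refinement to cope with cluttered objects and data imbalance, using high-level visual features to assist mid-level features. Detection from the mid-level features works well, which makes weakly supervised learning feasible. (from USTC)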
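The two counts quoted in this summary make the imbalance concrete: the labeled prohibited items amount to well under 1% of the number of images. A quick check, using only those two numbers:

```python
total_images = 1_059_231   # X-ray images in SIXray, as quoted in the summary
labeled_items = 8_929      # hand-labeled prohibited items, as quoted in the summary

ratio = labeled_items / total_images
print(f"labeled items per image: {ratio:.4%}")
```

A ratio this small is the kind of imbalance that a class-balanced training scheme is meant to address.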
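[Figure: the dataset images consist of overlapping semi-transparent layers]
[Figure: the class-balanced hierarchical refinement method proposed in the paper]
[Figure: sample detections of prohibited items]
Security-inspection X-ray dataset: SIXray; ref: GDXray
-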
A character-based text detection method. (from Baidu)
Text detection datasets: the VGG SynthText dataset, ICDAR13, MSRA-TD500, Total-Text
Text recognition competitions and conferences, ref: http://u-pat.org/ICDAR2017/index.php
http://u-pat.org/ICDAR2017/program_competitions.php
http://rrc.cvc.uab.es/
http://tc11.cvc.uab.es/datasets/icdar15smartdoc-ch2_1
https://arxiv.org/pdf/1601.07140.pdf
-
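Generating a face spoofing dataset by 3D synthesis: a printed color face photo is converted into a 3D mesh, randomly bent and rotated, and then rendered through a perspective transformation into a virtual sample. (from USTC)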
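The bend, rotate, and project steps can be sketched with NumPy as follows. This is a toy under stated assumptions, not the paper's pipeline: the photo is a flat grid of points, bending wraps it around a vertical cylinder, the rotation is about the y axis only, and the camera is a pinhole with a hypothetical focal length and distance.

```python
import numpy as np

def make_flat_mesh(width=1.0, height=1.0, n=5):
    # Vertices of a flat "printed photo" lying in the z = 0 plane.
    xs, ys = np.meshgrid(np.linspace(-width / 2, width / 2, n),
                         np.linspace(-height / 2, height / 2, n))
    return np.stack([xs.ravel(), ys.ravel(), np.zeros(n * n)], axis=1)

def bend_around_cylinder(verts, radius):
    # Wrap the sheet around a vertical cylinder; arc length along x is preserved.
    theta = verts[:, 0] / radius
    x = radius * np.sin(theta)
    z = radius * (1.0 - np.cos(theta))
    return np.stack([x, verts[:, 1], z], axis=1)

def rotate_y(verts, angle):
    # Rotate every vertex about the y axis by `angle` radians.
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return verts @ rot.T

def perspective_project(verts, focal=1.0, camera_distance=3.0):
    # Pinhole camera looking along +z; the mesh sits camera_distance away.
    z = verts[:, 2] + camera_distance
    return focal * verts[:, :2] / z[:, None]

rng = np.random.default_rng(0)
mesh = make_flat_mesh()
bent = bend_around_cylinder(mesh, radius=rng.uniform(0.5, 2.0))
posed = rotate_y(bent, angle=rng.uniform(-0.3, 0.3))
uv = perspective_project(posed)
print(uv.shape)
```

To render an actual virtual sample, the photo's pixels would be texture-mapped onto the mesh and rasterized at the projected coordinates; that step is omitted here.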
-
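A survey of multi-output learning. (from University of Technology Sydney)
-
A survey of FPGA-based acceleration for deep learning. (from King Fahd University of Petroleum and Minerals, Saudi Arabia)
-
Lipi Gnani, an OCR system for documents printed in the Kannada script of India. (from Indian Institute of Science)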
Daily Computer Vision Papers
[1] Title: Improving Face Anti-Spoofing by 3D Virtual Synthesis
Authors:Jianzhu Guo, Xiangyu Zhu, Jinchuan Xiao, Zhen Lei, Genxun Wan, Stan Z. Li
[2] Title: Action2Vec: A Crossmodal Embedding Approach to Action Learning
Authors:Meera Hahn, Andrew Silva, James M. Rehg
[3] Title: Learning Generalizable Physical Dynamics of 3D Rigid Objects
Authors:Davis Rempe, Srinath Sridhar, He Wang, Leonidas J. Guibas
[4] Title: Improved Hyperspectral Unmixing With Endmember Variability Parametrized Using an Interpolated Scaling Tensor
Authors:Ricardo Augusto Borsoi, Tales Imbiriba, José Carlos Moreira Bermudez
[5] Title: Lipi Gnani - A Versatile OCR for Documents in any Language Printed in Kannada Script
Authors:Shiva Kumar H R, Ramakrishnan A G
[6] Title: Attribute-Aware Attention Model for Fine-grained Representation Learning
Authors:Kai Han, Jianyuan Guo, Chao Zhang, Mingjian Zhu
[7] Title: Learning Efficient Detector with Semi-supervised Adaptive Distillation
Authors:Shitao Tang, Litong Feng, Wenqi Shao, Zhanghui Kuang, Wei Zhang, Yimin Chen
[8] Title: Detecting Text in the Wild with Deep Character Embedding Network
Authors:Jiaming Liu, Chengquan Zhang, Yipeng Sun, Junyu Han, Errui Ding
[9] Title: Optical Fringe Patterns Filtering Based on Multi-Stage Convolution Neural Network
Authors:Bowen Lin, Shujun Fu, Caiming Zhang, Fengling Wang, Yuliang Li
[10] Title: Plugin Networks for Inference under Partial Evidence
Authors:Michal Koperski, Tomasz Konopczynski, Piotr Semberecki, Tomasz Trzcinski
[11] Title: SIXray : A Large-scale Security Inspection X-ray Benchmark for Prohibited Item Discovery in Overlapping Images
Authors:Caijing Miao, Lingxi Xie, Fang Wan, Chi Su, Hongye Liu, Jianbin Jiao, Qixiang Ye
[12] Title: On Minimum Discrepancy Estimation for Deep Domain Adaptation
Authors:Mohammad Mahfujur Rahman, Clinton Fookes, Mahsa Baktashmotlagh, Sridha Sridharan
[13] Title: Vector and Line Quantization for Billion-scale Similarity Search on GPUs
Authors:Wei Chen, Jincai Chen, Fuhao Zou, Yuan-Fang Li, Ping Lu, Qiang Wang, Wei Zhao
[14] Title: Ancient Painting to Natural Image: A New Solution for Painting Processing
Authors:Tingting Qiao, Weijing Zhang, Miao Zhang, Zixuan Ma, Duanqing Xu
[15] Title: EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning
Authors:Kamyar Nazeri, Eric Ng, Tony Joseph, Faisal Qureshi, Mehran Ebrahimi
[16] Title: Mapping Areas using Computer Vision Algorithms and Drones
Authors:Bashar Alhafni, Saulo Fernando Guedes, Lays Cavalcante Ribeiro, Juhyun Park, Jeongkyu Lee
[17] Title: Nasal Patches and Curves for Expression-robust 3D Face Recognition
Authors:Mehryar Emambakhsh, Adrian Evans
[18] Title: Handwritten Indic Character Recognition using Capsule Networks
Authors:Bodhisatwa Mandal, Suvam Dubey, Swarnendu Ghosh, Ritesh Sarkhel, Nibaran Das
[19] Title: Rethinking on Multi-Stage Networks for Human Pose Estimation
Authors:Wenbo Li, Zhicheng Wang, Binyi Yin, Qixiang Peng, Yuming Du, Tianzi Xiao, Gang Yu, Hongtao Lu, Yichen Wei, Jian Su
[20] Title: Gated-Dilated Networks for Lung Nodule Classification in CT scans
Authors:Mundher Al-Shabi, Hwee Kuan Lee, Maxine Tan
[21] Title: Training with the Invisibles: Obfuscating Images to Share Safely for Learning Visual Recognition Models
Authors:Tae-hoon Kim, Dongmin Kang, Kari Pulli, Jonghyun Choi
[22] Title: Not All Words are Equal: Video-specific Information Loss for Video Captioning
Authors:Jiarong Dong, Ke Gao, Xiaokai Chen, Junbo Guo, Juan Cao, Yongdong Zhang
[23] Title: Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion
Authors:Zhenpei Yang, Jeffrey Z.Pan, Linjie Luo, Xiaowei Zhou, Kristen Grauman, Qixing Huang
[24] Title: Multiple Sclerosis Lesion Inpainting Using Non-Local Partial Convolutions
Authors:Hao Xiong, Dacheng Tao
[25] Title: A Noise-Sensitivity-Analysis-Based Test Prioritization Technique for Deep Neural Networks
Authors:Long Zhang, Xuechao Sun, Yong Li, Zhenyu Zhang, Yang Feng
[26] Title: SiCloPe: Silhouette-Based Clothed People
Authors:Ryota Natsume, Shunsuke Saito, Zeng Huang, Weikai Chen, Chongyang Ma, Hao Li, Shigeo Morishima
[27] Title: Deep Information Theoretic Registration
Authors:Alireza Sedghi, Jie Luo, Alireza Mehrtash, Steve Pieper, Clare M. Tempany, Tina Kapur, Parvin Mousavi, William M. Wells III
[28] Title: Mask-aware networks for crowd counting
Authors:Shengqin Jiang, Xiaobo Lu, Yinjie Lei, Lingqiao Liu
[29] Title: Interest Point Detection based on Adaptive Ternary Coding
Authors:Zhenwei Miao, Kim-Hui Yap, Xudong Jiang
[30] Title: DCI: Discriminative and Contrast Invertible Descriptor
Authors:Zhenwei Miao, Kim-Hui Yap, Xudong Jiang, Subbhuraam Sinduja, Zhenhua Wang
[31] Title: Learning Spatial Common Sense with Geometry-Aware Recurrent Networks
Authors:Hsiao-Yu Fish Tung, Ricson Cheng, Katerina Fragkiadaki
[32] Title: Impact of Ground Truth Annotation Quality on Performance of Semantic Image Segmentation of Traffic Conditions
Authors:Vlad Taran, Yuri Gordienko, Alexandr Rokovyi, Oleg Alienin, Sergii Stirenko
[33] Title: Instant Automated Inference of Perceived Mental Stress through Smartphone PPG and Thermal Imaging
Authors:Youngjun Cho, Simon J. Julier, Nadia Bianchi-Berthouze
[34] Title: AVRA: Automatic Visual Ratings of Atrophy from MRI images using Recurrent Convolutional Neural Networks
Authors:Gustav Mårtensson, Daniel Ferreira, Lena Cavallin, J-Sebastian Muehlboeck, Lars-Olof Wahlund, Chunliang Wang, Eric Westman
[35] Title: A Survey on Multi-output Learning
Authors:Donna Xu, Yaxin Shi, Ivor W. Tsang, Yew-Soon Ong, Chen Gong, Xiaobo Shen
[36] Title: FPGA-based Accelerators of Deep Learning Networks for Learning and Classification: A Review
Authors:Ahmad Shawahna, Sadiq M. Sait, Aiman El-Maleh
[37] Title: Dense Morphological Network: An Universal Function Approximator
Authors:Ranjan Mondal, Sanchayan Santra, Bhabatosh Chanda
[38] Title: Deep Frame Prediction for Video Coding
Authors:Hyomin Choi, Ivan V. Bajic