Image Composition - A Comprehensive Survey on Deep Image Composition

A Comprehensive Survey on Deep Image Composition

Abstract

Image composition, a common image editing operation, aims to cut the foreground out of one image and paste it onto another image, yielding a composite image. However, many issues can make the composite image look unrealistic. These issues can be summarized as inconsistencies between the foreground and background, including appearance inconsistency (e.g., incompatible color and illumination) and geometry inconsistency (e.g., unreasonable size and location). Previous works on image composition target one or more of these issues. Since each issue is a nontrivial problem in its own right, there are research directions (e.g., image harmonization, object placement) that focus on only one of them. By integrating all of these efforts, we can obtain realistic composite images. Sometimes, we expect the composite image to be not only realistic but also aesthetically pleasing, in which case aesthetic evaluation needs to be considered. In this survey, we summarize the datasets and methods of the above research directions.

1 INTRODUCTION

Appearance inconsistencies include, but are not limited to:

  1. unnatural boundaries between the foreground and background;

  2. incompatible color and illumination statistics between the foreground and background;

  3. missing or implausible shadows and reflections of the foreground.

  • Image blending [148, 170] aims to address the unnatural boundary between the foreground and background, so that the foreground is blended into the background seamlessly.
  • Image harmonization [139, 31, 25] aims to adjust the color and illumination statistics of the foreground to make it more compatible with the background, so that the whole image looks more harmonious.
  • Shadow or reflection generation [91, 172, 127] focuses on generating plausible shadows or reflections for the foreground based on the foreground and background information.
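
All of these directions start from the raw cut-and-paste composite, which is simply a per-pixel blend of foreground and background under an alpha mask, I = αF + (1 − α)B. A minimal sketch (the function name and toy arrays are illustrative, not from the survey):

```python
import numpy as np

def composite(foreground, background, alpha):
    """Cut-and-paste compositing: I = alpha * F + (1 - alpha) * B.

    foreground, background: (H, W, 3) float arrays in [0, 1].
    alpha: (H, W, 1) float array; 1 inside the pasted object, 0 elsewhere.
    """
    return alpha * foreground + (1.0 - alpha) * background

# Toy example: paste a white square onto a black background.
bg = np.zeros((4, 4, 3))
fg = np.ones((4, 4, 3))
alpha = np.zeros((4, 4, 1))
alpha[1:3, 1:3] = 1.0                  # binary object mask
img = composite(fg, bg, alpha)
```

This naive blend ignores color, lighting, shadows, and geometry; the research directions above each repair one of those aspects.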

Geometry inconsistencies include, but are not limited to:

  1. the foreground object is too large or too small;
  2. the foreground object has no physical support (e.g., hanging in the air);
  3. the foreground object appears in a semantically unreasonable place (e.g., a boat on land);
  4. unreasonable occlusion;
  5. inconsistent perspective between the foreground and background.

In summary, the location, size, and shape of the foreground may be unreasonable.

  • Object placement and spatial transformation [3, 38, 73, 138, 169] aim to find a reasonable location, size, and shape for the foreground to avoid the above inconsistencies. Object placement [169, 138] mainly translates and resizes the foreground, while spatial transformation [73, 85] involves more complicated transformations of the foreground (e.g., perspective transformation).

After resolving the appearance and geometry inconsistencies between the foreground and background, the composite image will look more realistic. However, realism is sometimes not enough, and aesthetics is also highly desirable. For example, when placing a vase on a table during image composition, there may be many reasonable locations and sizes. Nevertheless, with a certain location and size, the composite image may be more visually appealing in light of composition rules and aesthetic criteria. In this case, we need to further evaluate the aesthetic value of a composite image. There are many works on aesthetic evaluation [134, 21, 92, 126], which judge an image from multiple aesthetic aspects (e.g., interesting lighting, harmony, vivid color, depth of field, rule of thirds, symmetry). In this paper, we focus on composition-aware aesthetic evaluation (e.g., rule of thirds, symmetry), because we can adjust the location and size of the foreground to achieve higher composition quality when creating a composite image. Note that the term "composition" in composition-aware aesthetic evaluation and in image composition are different concepts: the former refers to the visual layout of an image, while the latter refers to the process of combining the foreground and background.

In the following sections, we will first introduce the research directions that address different issues: object placement in Section 2, image blending in Section 3, image harmonization in Section 4, and shadow generation in Section 5, which is roughly consistent with the order of creating a composite image. As mentioned above, image composition involves many issues to be solved. Some methods only target one issue, while others target multiple issues at the same time. For better illustration, we summarize all the issues and the corresponding methods addressing each issue in Table 1.

The contributions of this paper can be summarized as follows:

  1. To the best of our knowledge, this is the first comprehensive survey on deep image composition.
  2. We summarize the issues to be solved in the image composition task and the corresponding research directions and methods, providing the big picture of deep image composition.
  3. For completeness, we also discuss the higher demands on image composition (i.e., from realism to aesthetics) as well as the methods that work against image composition (i.e., image manipulation detection).

7 IMAGE MANIPULATION DETECTION

Image manipulation detection methods can be categorized by the tampering technique they target: copy-move [29, 146, 117, 151, 152], image splicing [30, 150, 65], image inpainting [180], image enhancement [7, 8, 24], and so on. The manipulation type most relevant to image composition is image splicing, which will be the focus of this section.

A. Datasets and Evaluation Metrics

Image splicing detection is a long-standing research topic. In early pioneering works, Columbia Gray [109], Columbia Color [60], and Carvalho [33] were the commonly used datasets, which contain both authentic and spliced images. The rapid development of deep learning techniques has prompted a rethinking of the ability of neural networks to detect image splicing. Consequently, to meet the growing demand for public image forensics datasets, more splicing datasets have been constructed and released, among which the CASIA dataset [37] has been widely used in recent studies.

To evaluate detection performance on image splicing datasets, recent works generally adopt the following metrics: classification accuracy [122, 117, 41, 80, 103], Area Under the ROC Curve (AUC) [122, 150, 175, 153], F-score [122, 124, 150, 65, 175], Matthews Correlation Coefficient (MCC) [124, 65], mean Average Precision (mAP), and class-balanced Intersection over Union (cIoU) [65]. Among them, accuracy and AUC are commonly used for the classification task, while F-score and AUC are commonly used for the localization task.
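
As a concrete reference, the pixel-level F-score and MCC for the localization task can be computed from the confusion counts of a predicted binary mask against the ground-truth mask; a minimal sketch (the function name and API are our own, not from the survey):

```python
import numpy as np

def localization_scores(pred, gt):
    """Pixel-level F-score and MCC for binary splicing-localization masks.

    pred, gt: boolean arrays of equal shape (True = spliced pixel).
    """
    tp = int(np.sum(pred & gt))
    fp = int(np.sum(pred & ~gt))
    fn = int(np.sum(~pred & gt))
    tn = int(np.sum(~pred & ~gt))
    f1 = 2 * tp / max(2 * tp + fp + fn, 1)
    denom = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return f1, mcc

# A perfect prediction scores 1.0 on both metrics.
gt = np.array([[True, False], [False, True]])
f1, mcc = localization_scores(gt.copy(), gt)
```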

B. Traditional Methods

Traditional image manipulation detection methods rely heavily on prior knowledge of, or assumptions about, the datasets. According to the characteristics of the forgery, tampering clues can be divided into three groups:

  1. noise patterns [102, 100, 30, 20, 23, 81, 90, 116, 70];
  2. Color Filter Array (CFA) interpolation patterns [36, 46];
  3. JPEG-related compression artifacts [87, 11, 12, 43, 82, 1, 13, 61, 143].

The first group of methods assumes that noise patterns differ from image to image, because different camera devices have different capture parameters, and different post-processing techniques make the noise pattern of each image even more distinctive. In a composite image, the noise patterns of the foreground and background are usually different, which can be exploited to identify the spliced foreground.
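
The noise-inconsistency cue can be illustrated with a toy sketch: high-pass filter the image, measure the residual spread per block, and flag blocks whose noise level deviates from the image-wide median. The Laplacian filter, block size, and 2x-median threshold below are illustrative choices, not taken from any cited method:

```python
import numpy as np

def block_noise_map(img, block=8):
    """Per-block noise level from a high-pass residual (toy sketch)."""
    r = np.zeros_like(img)                 # 4-neighbour Laplacian residual
    r[1:-1, 1:-1] = (4 * img[1:-1, 1:-1] - img[:-2, 1:-1] - img[2:, 1:-1]
                     - img[1:-1, :-2] - img[1:-1, 2:])
    h, w = img.shape
    hb, wb = h // block, w // block
    tiles = np.abs(r[:hb * block, :wb * block]).reshape(hb, block, wb, block)
    return tiles.std(axis=(1, 3))          # residual spread per block

# Synthetic composite: a noisier 16x16 patch "spliced" into a cleaner image.
rng = np.random.default_rng(0)
img = rng.normal(0.0, 1.0, (64, 64))
img[16:32, 16:32] = rng.normal(0.0, 4.0, (16, 16))
levels = block_noise_map(img)
suspicious = levels > 2 * np.median(levels)
```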

The second group of methods is built on the assumption of Color Filter Array (CFA) consistency. Most images come from digital cameras with a single image sensor. These sensors are covered with a CFA that produces one color value per pixel, with the values arranged in a predefined pattern. Combining two different images can disrupt the CFA interpolation pattern in a variety of ways.
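
The CFA cue can be sketched as follows: in a demosaicked image, green samples on one checkerboard phase were interpolated from their neighbours and therefore fit a 4-neighbour average almost exactly, while sensed samples do not; splicing with a shifted or absent CFA pattern weakens this phase asymmetry. A toy sketch on synthetic data (not a cited algorithm):

```python
import numpy as np

def cfa_phase_variances(green):
    """Interpolation residual on the two checkerboard phases of the green
    channel; returns (sensed-phase, interpolated-phase) mean squared residual.
    """
    pred = 0.25 * (green[:-2, 1:-1] + green[2:, 1:-1]
                   + green[1:-1, :-2] + green[1:-1, 2:])
    resid = (green[1:-1, 1:-1] - pred) ** 2
    yy, xx = np.mgrid[1:green.shape[0] - 1, 1:green.shape[1] - 1]
    phase = (yy + xx) % 2
    return resid[phase == 0].mean(), resid[phase == 1].mean()

# Synthetic "demosaicked" green plane: odd-phase pixels are interpolated
# from the even-phase (sensed) pixels, mimicking bilinear demosaicking.
rng = np.random.default_rng(1)
g = rng.normal(size=(32, 32))
interp = (np.add.outer(np.arange(32), np.arange(32)) % 2 == 1)
g2 = g.copy()
g2[1:-1, 1:-1] = np.where(
    interp[1:-1, 1:-1],
    0.25 * (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]),
    g[1:-1, 1:-1])
sensed_var, interp_var = cfa_phase_variances(g2)  # interp_var ~ 0
```

A spliced region without this asymmetry (or with the phases swapped) would stand out against the rest of the image.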

The third group of methods exploits JPEG compression consistency. Most of these methods [13, 143] leverage JPEG quantization artifacts and the continuity of the JPEG compression grid to locate the spliced region.
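
The grid-continuity cue can be sketched by scoring the column-to-column discontinuity at each of the 8 possible horizontal grid phases: blocking artifacts make the jumps strongest at the true 8-pixel boundaries, and a spliced patch whose grid is shifted relative to the host image would disagree with the global phase. A toy 1-D version, with synthetic stripes standing in for JPEG blocking:

```python
import numpy as np

def grid_offset(img):
    """Horizontal JPEG-grid phase via column discontinuity (toy sketch)."""
    d = np.abs(np.diff(img, axis=1))       # d[:, j] = jump entering column j+1
    cols = (np.arange(d.shape[1]) + 1) % 8
    scores = np.array([d[:, cols == o].mean() for o in range(8)])
    return int(scores.argmax()), scores

# Synthetic blocking: 8-pixel-wide stripes whose boundaries sit at phase 3.
c = np.arange(64)
img = np.tile(((c - 3) // 8) % 2, (8, 1)).astype(float)
offset, _ = grid_offset(img)
```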

C. Deep Learning Methods

Since traditional methods are limited to hand-crafted features or prior knowledge, recent works have begun to exploit powerful deep learning techniques [5, 8, 159, 74, 83] to handle image splicing based on many aspects, such as perspective, geometry, noise patterns, frequency characteristics, color, and illumination. Among them, Liang et al. [83] focus on localizing the inharmonious region whose color and illumination statistics are incompatible with the other regions. These methods can be roughly divided into three groups:

  1. local patch comparison [7, 122, 117, 65, 4, 5];
  2. forgery feature extraction [103, 175, 8, 159, 153];
  3. adversarial learning [124, 74].

In the first group, the methods in [7, 122, 117] compare different regions within an image to discover suspicious ones. Huh et al. [65] assume EXIF consistency and reveal the spliced region by comparing the EXIF attributes of image patches. J-LSTM [4] is an LSTM-based patch comparison method that finds tampered regions by detecting the boundaries between tampered and authentic patches. H-LSTM [5] observes that manipulation distorts the natural statistics at the boundary of the tampered region.
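
The patch self-consistency idea behind [65] can be caricatured in a few lines: describe every patch, then flag patches that disagree with the image-wide consensus. Real systems learn the descriptor (e.g., trained to predict shared EXIF attributes); the mean/std descriptor and 2x-median threshold here are toy stand-ins:

```python
import numpy as np

def patch_consistency(img, patch=16):
    """Flag patches whose descriptor deviates from the consensus (toy)."""
    h, w = img.shape
    ph, pw = h // patch, w // patch
    tiles = img[:ph * patch, :pw * patch].reshape(ph, patch, pw, patch)
    desc = np.stack([tiles.mean(axis=(1, 3)), tiles.std(axis=(1, 3))], axis=-1)
    consensus = np.median(desc.reshape(-1, 2), axis=0)
    dist = np.linalg.norm(desc - consensus, axis=-1)
    return dist > 2 * np.median(dist)

# Synthetic composite: one patch with a shifted brightness statistic.
rng = np.random.default_rng(0)
img = rng.normal(0.0, 1.0, (64, 64))
img[16:32, 16:32] += 5.0                   # "spliced" patch
flags = patch_consistency(img)
```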

In the second group, forgery features are derived and incorporated into the network. Bayar and Stamm [8] design a constrained convolutional layer that jointly suppresses image content and adaptively learns prediction-error features. Zhou et al. [175] use Steganalysis Rich Model (SRM) filters [47] as kernels to produce noise features. Yang et al. [159] use constrained convolution kernels [8] to predict a pixel-wise tampering probability map. Wu et al. [153] propose ManTraNet, which employs both SRM kernels [47] and constrained convolution kernels [8] to extract noise features; they also design Z-score features to capture local anomalies from the noise features.
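
For intuition, here is the kind of noise residual that [175, 153] build on: convolving with a high-pass SRM kernel [47] suppresses smooth image content and keeps noise-like structure. The full SRM uses a bank of such filters; one classic 5x5 kernel is shown below, and the constrained layer of [8] learns kernels with the same zero-sum, high-pass structure:

```python
import numpy as np

def srm_residual(img):
    """Noise residual from one classic 5x5 SRM high-pass kernel.

    The kernel sums to zero, so a constant (content-only) region yields a
    zero residual; only noise-like deviations survive.
    """
    k = np.array([[-1,  2,  -2,  2, -1],
                  [ 2, -6,   8, -6,  2],
                  [-2,  8, -12,  8, -2],
                  [ 2, -6,   8, -6,  2],
                  [-1,  2,  -2,  2, -1]], dtype=float) / 12.0
    h, w = img.shape
    out = np.zeros((h - 4, w - 4))
    for dy in range(5):                    # "valid" 2-D correlation
        for dx in range(5):
            out += k[dy, dx] * img[dy:dy + h - 4, dx:dx + w - 4]
    return out

flat = srm_residual(np.full((10, 10), 7.0))  # constant image -> zero residual
```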

The third group applies adversarial learning [74] to the image splicing detection task. [74] is the first to introduce adversarial learning into image manipulation detection. They design a retoucher and an annotator, where the retoucher generates plausible composite images while the annotator aims to discover the spliced regions. The retoucher and the annotator are trained in an adversarial manner. A similar idea is explored in [64], where the generator performs image harmonization while the discriminator strives to localize the inharmonious region.

8 LIMITATIONS AND FUTURE DIRECTIONS

In the image composition task, ground-truth images are hard to acquire. Therefore, for some research directions (e.g., object placement, image blending), there is no benchmark dataset, and subjective evaluation (e.g., user study) is not very convincing, which makes it hard to compare different methods fairly. Hence, it is necessary to build benchmark datasets with ground-truth images for more reliable evaluation. For example, after the release of iHarmony4 [25], the first large-scale image harmonization dataset, a number of follow-up works [56, 129, 27, 54, 88, 10] have emerged, which not only boost harmonization performance but also offer different perspectives on this task.

Finally, we suggest some potential research directions for image composition:

  1. As mentioned above, in real-world applications we are likely to paste multiple foregrounds onto a background, and a foreground may be occluded. To handle occluded foregrounds, we may need to first complete them using image outpainting/inpainting or object completion techniques [140, 174, 14, 165, 156, 39]. To handle multiple foregrounds, we need to consider their relative depths and complex interactions to synthesize a realistic image.

  2. Most existing image composition methods work from image to image, i.e., 2D→2D. However, an intuitive solution to image composition is to infer the complete 3D geometry of the foreground/background together with the illumination information. Ideally, with complete geometry and illumination information, we could harmonize the foreground and generate its shadow accurately. Nevertheless, reconstructing the 3D world and estimating the illumination conditions from a single 2D image remain very challenging, and the noise or redundant information induced in this process may be detrimental to the final performance. Even so, 2D→3D→2D is an interesting and promising research direction, and we can explore the middle ground between 2D→2D and 2D→3D→2D to integrate the advantages of both.

  3. For real composite images, ground-truth images are hard to acquire, whereas for rendered composite images, ground-truth images are relatively easy to obtain. Specifically, we can create paired rendered composite images and rendered real images by varying the foreground, background, camera settings, and illumination conditions in 3D rendering software, which can alleviate the problem of lacking real ground-truth images.

REFERENCES

[1] Irene Amerini, Rudy Becarelli, Roberto Caldelli, and Andrea Del Mastio. Splicing forgeries localization through the use of first digit features. In IEEE International Workshop on Information Forensics and Security, 2014.
[2] Ibrahim Arief, Simon McCallum, and Jon Yngve Hardeberg. Realtime estimation of illumination direction for augmented reality on mobile devices. In Color and Imaging Conference, 2012.
[3] Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, and Trevor Darrell. Compositional GAN: Learning image-conditional binary composition. International Journal of Computer Vision, 128(10):2570–2585, 2020.
[4] Jawadul H Bappy, Amit K Roy-Chowdhury, Jason Bunk, Lakshmanan Nataraj, and BS Manjunath. Exploiting spatial structure for localizing manipulated image regions. In ICCV, 2017.
[5] Jawadul H Bappy, Cody Simons, Lakshmanan Nataraj, BS Manjunath, and Amit K Roy-Chowdhury. Hybrid LSTM and encoder–decoder architecture for detection of image forgeries. IEEE Transactions on Image Processing, 28(7):3286–3300, 2019.
[6] Jonathan T Barron and Jitendra Malik. Shape, illumination, and reflectance from shading. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(8):1670–1687, 2014.
[7] Belhassen Bayar and Matthew C Stamm. A deep learning approach to universal image manipulation detection using a new convolutional layer. In ACM Workshop on Information Hiding and Multimedia Security, 2016.
[8] Belhassen Bayar and Matthew C Stamm. Constrained convolutional neural networks: A new approach towards general purpose image manipulation detection. IEEE Transactions on Information Forensics and Security, 13(11):2691–2706, 2018.
[9] Subhabrata Bhattacharya, Rahul Sukthankar, and Mubarak Shah. A framework for photo-quality assessment and enhancement based on visual aesthetics. In ACM MM, 2010.
[10] Anand Bhattad and David A. Forsyth. Cut-and-paste neural rendering. arXiv preprint arXiv:2010.05907, 2020.
[11] Tiziano Bianchi and Alessandro Piva. Detection of nonaligned double JPEG compression based on integer periodicity maps. IEEE Transactions on Information Forensics and Security, 7(2):842–848, 2011.
[12] Tiziano Bianchi and Alessandro Piva. Image forgery localization via block-grained analysis of JPEG artifacts. IEEE Transactions on Information Forensics and Security, 7(3):1003–1017, 2012.
[13] Tiziano Bianchi, Alessia De Rosa, and Alessandro Piva. Improved DCT coefficient analysis for forgery localization in JPEG images. In ICASSP, 2011.
[14] Richard Strong Bowen, Huiwen Chang, Charles Herrmann, Piotr Teterwak, Ce Liu, and Ramin Zabih. OCONet: Image extrapolation by object completion. In CVPR, 2021.
[15] Ralph Allan Bradley and Milton E Terry. Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4):324–345, 1952.
[16] Peter J Burt and Edward H Adelson. A multiresolution spline with application to image mosaics. ACM Transactions on Graphics, 2(4):217–236, 1983.
[17] Vladimir Bychkovsky, Sylvain Paris, Eric Chan, and Frédo Durand. Learning photographic global tonal adjustment with a database of input / output image pairs. In CVPR, 2011.
[18] Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012, 2015.
[19] Bor-Chun Chen and Andrew Kae. Toward realistic image compositing with adversarial learning. In CVPR, 2019.
[20] Mo Chen, Jessica Fridrich, Miroslav Goljan, and Jan Lukáš. Determining image origin and integrity using sensor noise. IEEE Transactions on Information Forensics and Security, 3(1):74–90, 2008.
[21] Qiuyu Chen, Wei Zhang, Ning Zhou, Peng Lei, Yi Xu, Yu Zheng, and Jianping Fan. Adaptive fractional dilated convolution network for image aesthetics assessment. In CVPR, 2020.
[22] Dachuan Cheng, Jian Shi, Yanyun Chen, Xiaoming Deng, and Xiaopeng Zhang. Learning scene illumination by pairwise photos from rear and front mobile cameras. Computer Graphics Forum, 37(7):213–221, 2018.
[23] Giovanni Chierchia, Giovanni Poggi, Carlo Sansone, and Luisa Verdoliva. A Bayesian-MRF approach for PRNU-based image forgery detection. IEEE Transactions on Information Forensics and Security, 9(4):554–567, 2014.
[24] Hak-Yeol Choi, Han-Ul Jang, Dongkyu Kim, Jeongho Son, Seung-Min Mun, Sunghee Choi, and Heung-Kyu Lee. Detecting composite image manipulation based on deep neural networks. In IWSSIP, 2017.
[25] Wenyan Cong, Jianfu Zhang, Li Niu, Liu Liu, Zhixin Ling, Weiyuan Li, and Liqing Zhang. DoveNet: Deep image harmonization via domain verification. In CVPR, 2020.
[26] Wenyan Cong, Junyan Cao, Li Niu, Jianfu Zhang, Xuesong Gao, Zhiwei Tang, and Liqing Zhang. Deep image harmonization by bridging the reality gap. arXiv preprint arXiv:2103.17104, 2021.
[27] Wenyan Cong, Li Niu, Jianfu Zhang, Jing Liang, and Liqing Zhang. BargainNet: Background-guided domain translation for image harmonization. In ICME, 2021.
[28] Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. The Cityscapes dataset for semantic urban scene understanding. In CVPR, 2016.
[29] Davide Cozzolino, Giovanni Poggi, and Luisa Verdoliva. Efficient dense-field copy–move forgery detection. IEEE Transactions on Information Forensics and Security, 10(11):2284–2297, 2015.
[30] Davide Cozzolino, Giovanni Poggi, and Luisa Verdoliva. Splicebuster: A new blind image splicing detector. In IEEE International Workshop on Information Forensics and Security, 2015.
[31] Xiaodong Cun and Chi-Man Pun. Improving the harmony of the composite image by spatial-separated attention module. IEEE Transactions on Image Processing, 29:4759–4771, 2020.
[32] Ritendra Datta, Jia Li, and James Z Wang. Algorithmic inferencing of aesthetics and emotion in natural images: An exposition. In ICIP, 2008.
[33] Tiago José De Carvalho, Christian Riess, Elli Angelopoulou, Helio Pedrini, and Anderson de Rezende Rocha. Exposing digital image forgeries by illumination color classification. IEEE Transactions on Information Forensics and Security, 8(7):1182–1194, 2013.
[34] Yubin Deng, Chen Change Loy, and Xiaoou Tang. Image aesthetic assessment: An experimental survey. IEEE Signal Processing Magazine, 34(4):80–106, 2017.
[35] Sagnik Dhar, Vicente Ordonez, and Tamara L Berg. High level describable attributes for predicting aesthetics and interestingness. In CVPR, 2011.
[36] Ahmet Emir Dirik and Nasir Memon. Image tamper detection based on demosaicing artifacts. In ICIP, 2009.
[37] Jing Dong, Wei Wang, and Tieniu Tan. CASIA image tampering detection evaluation database. In IEEE China Summit and International Conference on Signal and Information Processing, 2013.
[38] Nikita Dvornik, Julien Mairal, and Cordelia Schmid. Modeling visual context is key to augmenting object detection datasets. In ECCV, 2018.
[39] Kiana Ehsani, Roozbeh Mottaghi, and Ali Farhadi. SeGAN: Segmenting and generating the invisible. In CVPR, 2018.
[40] Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John M. Winn, and Andrew Zisserman. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2):303–338, 2010.
[41] Yu Fan, Philippe Carré, and Christine Fernandez-Maloigne. Image splicing detection with local illumination estimation. In ICIP, 2015.
[42] Haoshu Fang, Jianhua Sun, Runzhong Wang, Minghao Gou, Yonglu Li, and Cewu Lu. InstaBoost: Boosting instance segmentation via probability map guided copy-pasting. In ICCV, 2019.
[43] Hany Farid. Exposing digital forgeries from JPEG ghosts. IEEE Transactions on Information Forensics and Security, 4(1):154–160, 2009.
[44] Raanan Fattal, Dani Lischinski, and Michael Werman. Gradient domain high dynamic range compression. In ACM Siggraph Computer Graphics, 2002.
[45] Ulrich Fecker, Marcus Barkowsky, and André Kaup. Histogram-based prefiltering for luminance and chrominance compensation of multiview video. IEEE Transactions on Circuits and Systems for Video Technology, 18(9):1258–1267, 2008.
[46] Pasquale Ferrara, Tiziano Bianchi, Alessia De Rosa, and Alessandro Piva. Image forgery localization via fine-grained analysis of CFA artifacts. IEEE Transactions on Information Forensics and Security, 7(5):1566–1577, 2012.
[47] Jessica Fridrich and Jan Kodovsky. Rich models for steganalysis of digital images. IEEE Transactions on Information Forensics and Security, 7(3):868–882, 2012.
[48] Marc-André Gardner, Kalyan Sunkavalli, Ersin Yumer, Xiaohui Shen, Emiliano Gambaretto, Christian Gagné, and Jean-François Lalonde. Learning to predict indoor illumination from a single image. ACM Transactions on Graphics, 36(6):1–14, 2017.
[49] Marc-André Gardner, Yannick Hold-Geoffroy, Kalyan Sunkavalli, Christian Gagné, and Jean-Francois Lalonde. Deep parametric indoor lighting estimation. In ICCV, 2019.
[50] Georgios Georgakis, Md. Alimoor Reza, Arsalan Mousavian, Phi-Hung Le, and Jana Kosecka. Multiview RGB-D dataset for object instance detection. In 3DV, 2016.
[51] Georgios Georgakis, Arsalan Mousavian, Alexander C. Berg, and Jana Kosecka. Synthesizing training data for object detection in indoor scenes. In RSS XIII, 2017.
[52] Golnaz Ghiasi, Yin Cui, Aravind Srinivas, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, and Barret Zoph. Simple copy-paste is a strong data augmentation method for instance segmentation. arXiv preprint arXiv:2012.07177, 2020.
[53] Ian J Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. NIPS, 2014.
[54] Zonghui Guo, Haiyong Zheng, Yufeng Jiang, Zhaorui Gu, and Bing Zheng. Intrinsic image harmonization. In CVPR, 2021.
[55] Jong Goo Han, Tae Hee Park, Yong Ho Moon, and Il Kyu Eom. Efficient Markov feature extraction method for image splicing detection using maximization and threshold expansion. Journal of Electronic Imaging, 25(2):023031, 2016.
[56] Guoqing Hao, Satoshi Iizuka, and Kazuhiro Fukui. Image harmonization with attention-based deep feature modulation. In BMVC, 2020.
[57] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In NeurIPS, 2017.
[58] Yan Hong, Li Niu, Jianfu Zhang, and Liqing Zhang. Shadow generation for composite image in real-world scenes. arXiv preprint arXiv:2104.10338, 2021.
[59] Le Hou, Chen-Ping Yu, and Dimitris Samaras. Squared earth mover's distance-based loss for training deep neural networks. arXiv preprint arXiv:1611.05916, 2016.
[60] Yu-Feng Hsu and Shih-Fu Chang. Detecting image splicing using geometry invariants and camera characteristics consistency. In ICME, 2006.
[61] Wu-Chih Hu and Wei-Hao Chen. Effective forgery detection using DCT+SVD-based watermarking for region of interest in key frames of vision-based surveillance. International Journal of Computational Science and Engineering, 8(4):297–305, 2013.
[62] Xiaowei Hu, Yitong Jiang, Chi-Wing Fu, and Pheng-Ann Heng. Mask-ShadowGAN: Learning to remove shadows from unpaired data. In ICCV, 2019.
[63] Xuefeng Hu, Zhihan Zhang, Zhenye Jiang, Syomantak Chaudhuri, Zhenheng Yang, and Ram Nevatia. SPAN: Spatial pyramid attention network for image manipulation localization. In ECCV, 2020.
[64] Hao-Zhi Huang, Sen-Zhe Xu, Jun-Xiong Cai, Wei Liu, and Shi-Min Hu. Temporally coherent video harmonization using adversarial networks. IEEE Transactions on Image Processing, 29:214–224, 2019.
[65] Minyoung Huh, Andrew Liu, Andrew Owens, and Alexei A Efros. Fighting fake news: Image splice detection via learned self-consistency. In ECCV, 2018.
[66] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. Image-to-image translation with conditional adversarial networks. In CVPR, 2017.
[67] Max Jaderberg, Karen Simonyan, Andrew Zisserman, and Koray Kavukcuoglu. Spatial transformer networks. In NeurIPS, 2015.
[68] Xin Jin, Le Wu, Geng Zhao, Xiaodong Li, Xiaokun Zhang, Shiming Ge, Dongqing Zou, Bin Zhou, and Xinghui Zhou. Aesthetic attributes assessment of images. In ACM MM, 2019.
[69] Dhiraj Joshi, Ritendra Datta, Elena Fedorovskaya, Quang-Tuan Luong, James Z Wang, Jia Li, and Jiebo Luo. Aesthetics and emotions in images. IEEE Signal Processing Magazine, 28(5):94–115, 2011.
[70] Thibaut Julliand, Vincent Nozick, and Hugues Talbot. Image noise and digital image forensics. In International Workshop on Digital Watermarking, 2015.
[71] Kevin Karsch, Kalyan Sunkavalli, Sunil Hadap, Nathan Carr, Hailin Jin, Rafael Fonte, Michael Sittig, and David Forsyth. Automatic scene inference for 3D object compositing. ACM Transactions on Graphics, 33(3):1–15, 2014.
[72] Michael Kazhdan and Hugues Hoppe. Streaming multigrid for gradient-domain operations on large images. ACM Transactions on Graphics, 27(3):1–10, 2008.
[73] Kotaro Kikuchi, Kota Yamaguchi, Edgar Simo-Serra, and Tetsunori Kobayashi. Regularized adversarial training for single-shot virtual try-on. In ICCV Workshop, 2019.
[74] Vladimir V Kniaz, Vladimir Knyaz, and Fabio Remondino. The point where reality meets fantasy: Mixed adversarial generators for image splice detection. In NeurIPS, 2019.
[75] Shu Kong, Xiaohui Shen, Zhe Lin, Radomir Mech, and Charless Fowlkes. Photo aesthetics ranking network with attributes and content adaptation. In ECCV, 2016.
[76] Pierre-Yves Laffont, Zhile Ren, Xiaofeng Tao, Chao Qian, and James Hays. Transient attributes for high-level understanding and editing of outdoor scenes. ACM Transactions on Graphics, 33(4):1–11, 2014.
[77] Jean-François Lalonde and Alexei A. Efros. Using color compatibility for assessing image realism. In ICCV, 2007.
[78] Donghoon Lee, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Ming-Hsuan Yang, and Jan Kautz. Context-aware synthesis and placement of object instances. In NeurIPS, 2018.
[79] Anat Levin, Assaf Zomet, Shmuel Peleg, and Yair Weiss. Seamless image stitching in the gradient domain. In ECCV, 2004.
[80] Ce Li, Qiang Ma, Limei Xiao, Ming Li, and Aihua Zhang. Image splicing detection based on Markov features in QDCT domain. Neurocomputing, 228:29–36, 2017.
[81] Chang-Tsun Li and Yue Li. Color-decoupled photo response non-uniformity for digital image forensics. IEEE Transactions on Circuits and Systems for Video Technology, 22(2):260–271, 2011.
[82] Weihai Li, Yuan Yuan, and Nenghai Yu. Passive detection of doctored JPEG image via block artifact grid extraction. Signal Processing, 89(9):1821–1829, 2009.
[83] Jing Liang, Li Niu, and Liqing Zhang. Inharmonious region localization. In ICME, 2021.
[84] Bin Liao, Yao Zhu, Chao Liang, Fei Luo, and Chunxia Xiao. Illumination animating and editing in a single picture using scene structure estimation. Computers & Graphics, 82:53–64, 2019.
[85] Chen-Hsuan Lin, Ersin Yumer, Oliver Wang, Eli Shechtman, and Simon Lucey. ST-GAN: Spatial transformer generative adversarial networks for image compositing. In CVPR, 2018.
[86] Tsung-Yi Lin, Michael Maire, Serge J. Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: Common objects in context. In ECCV, 2014.
[87] Zhouchen Lin, Junfeng He, Xiaoou Tang, and Chi-Keung Tang. Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis. Pattern Recognition, 42(11):2492–2501, 2009.
[88] Jun Ling, Han Xue, Li Song, Rong Xie, and Xiao Gu. Region-aware adaptive instance normalization for image harmonization. In CVPR, 2021.
[89] Bin Liu, Kun Xu, and Ralph R Martin. Static scene illumination estimation from videos with applications. Journal of Computer Science and Technology, 32(3):430–442, 2017.
[90] Bo Liu and Chi-Man Pun. Splicing forgery exposure in digital image by detecting noise discrepancies. International Journal of Computer and Communication Engineering, 4(1):33, 2015.
[91] Daquan Liu, Chengjiang Long, Hongpan Zhang, Hanning Yu, Xinzhi Dong, and Chunxia Xiao. ARShadowGAN: Shadow generative adversarial network for augmented reality in single light scenes. In CVPR, 2020.
[92] Dong Liu, Rohit Puri, Nagendra Kamath, and Subhabrata Bhattacharya. Composition-aware image aesthetics assessment. In WACV, 2020.
[93] Guilin Liu, Fitsum A Reda, Kevin J Shih, Ting-Chun Wang, Andrew Tao, and Bryan Catanzaro. Image inpainting for irregular holes using partial convolutions. In ECCV, 2018.
[94] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In ICCV, 2015.
[95] Kuo-Yen Lo, Keng-Hao Liu, and Chu-Song Chen. Assessment of photo aesthetics with efficiency. In ICPR, 2012.
[96] Fujun Luan, Sylvain Paris, Eli Shechtman, and Kavita Bala. Deep painterly harmonization. In Computer Graphics Forum, volume 37, pages 95–106, 2018.
[97] Wei Luo, Xiaogang Wang, and Xiaoou Tang. Content-based photo quality assessment. In ICCV, 2011.
[98] Wei Luo, Xiaogang Wang, and Xiaoou Tang. Content-based photo quality assessment. IEEE Transactions on Image Processing, 15(8):1930–1943, 2013.
[99] Yiwen Luo and Xiaoou Tang. Photo and video quality evaluation: Focusing on the subject. In ECCV, 2008.
[100] Siwei Lyu, Xunyu Pan, and Xing Zhang. Exposing region splicing forgeries with blind local noise estimation. International Journal of Computer Vision, 110(2):202–221, 2014.
[101] Shuang Ma, Jing Liu, and Chang Wen Chen. A-Lamp: Adaptive layout-aware multi-patch deep convolutional neural network for photo aesthetic assessment. In CVPR, 2017.
[102] Babak Mahdian and Stanislav Saic. Using noise inconsistencies for blind image forensics. Image and Vision Computing, 27(10):1497–1503, 2009.
[103] Owen Mayer and Matthew C Stamm. Learned forensic source similarity for unknown camera models. In ICASSP, pages 2012–2016, 2018.
[104] Shervin Minaee, Yuri Y Boykov, Fatih Porikli, Antonio J Plaza, Nasser Kehtarnavaz, and Demetri Terzopoulos. Image segmentation using deep learning: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
[105] Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
[106] Naila Murray, Luca Marchesotti, and Florent Perronnin. AVA: A large-scale database for aesthetic visual analysis. In CVPR, 2012.
[107] Naila Murray, Luca Marchesotti, and Florent Perronnin. Aesthetic critiques generation for photos. In ICCV, 2017.
[108] Thomas Nestmeyer, Jean-François Lalonde, Iain Matthews, and Andreas Lehrmann. Learning physics-guided face relighting under directional light. In CVPR, 2020.
[109] Tian-Tsong Ng, Shih-Fu Chang, and Q Sun. A data set of authentic and spliced image blocks. Columbia University, ADVENT Technical Report, pages 203–2004, 2004.
[110] Vu Nguyen, Tomas F Yago Vicente, Maozheng Zhao, Minh Hoai, and Dimitris Samaras. Shadow detection with conditional generative adversarial networks. In ICCV, 2017.
[111] Pere Obrador, Ludwig Schmidt-Hackenberg, and Nuria Oliver. The role of image composition in image aesthetics. In ICIP, 2010.
[112] Rohit Kumar Pandey, Sergio Orts Escolano, Chloe LeGendre, Christian Haene, Sofien Bouaziz, Christoph Rhemann, Paul Debevec, and Sean Fanello. Total relighting: Learning to relight portraits for background replacement. In SIGGRAPH, 2021.
[113] Patrick Pérez, Michel Gangnet, and Andrew Blake. Poisson image editing. In ACM SIGGRAPH, 2003.
[114] François Pitié, Anil C Kokaram, and Rozenn Dahyot. Automated colour grading using colour distribution transfer. Computer Vision and Image Understanding, 107(1-2):123–137, 2007.
[115] Thomas Porter and Tom Duff. Compositing digital images. In ACM Siggraph Computer Graphics, 1984.
[116] Chi-Man Pun, Bo Liu, and Xiao-Chen Yuan. Multi-scale noise estimation for image splicing forgery detection. Journal of Visual Communication and Image Representation, 38:195–206, 2016.
[117] Yuan Rao and Jiangqun Ni. A deep learning approach to detection of splicing and copy-move forgeries in images. In IEEE International Workshop on Information Forensics and Security, pages 1–6, 2016.
[118] Yogesh Singh Rawat and Mohan S Kankanhalli. Context-aware photography learning for smart mobile devices. ACM Transactions on Multimedia Computing, Communications, and Applications, 12(1):1–24, 2015.
[119] Erik Reinhard, Michael Ashikhmin, Bruce Gooch, and Peter Shirley. Color transfer between images. IEEE Computer Graphics and Applications, 21(5):34–41, 2001.
[120] Tal Remez, Jonathan Huang, and Matthew Brown. Learning to segment via cut-and-paste. In ECCV, 2018.
[121] Jian Ren, Xiaohui Shen, Zhe Lin, Radomir Mech, and David J Foran. Personalized image aesthetics. In ICCV, 2017.
[122] Paolo Rota, Enver Sangineto, Valentina Conotter, and Christopher Pramerdorfer. Bad teacher or unruly student: Can deep learning say something in image forensics analysis? In ICPR, 2016.
[123] S. Bhattacharya, R. Sukthankar, and M. Shah. A holistic approach to aesthetic enhancement of photographs. ACM Transactions on Multimedia Computing, Communications, and Applications, 7(1):1–21, 2011.
[124] Ronald Salloum, Yuzhuo Ren, and C-C Jay Kuo. Image splicing localization using a multi-task fully convolutional network (MFCN). Journal of Visual Communication and Image Representation, 51:201–209, 2018.
[125] Tiago A Schieber, Laura Carpi, Albert Díaz-Guilera, Panos M Pardalos, Cristina Masoller, and Martín G Ravetti. Quantification of network structural dissimilarities. Nature Communications, 8(1):1–10, 2017.
[126] Dongyu She, Yu-Kun Lai, Gaoxiong Yi, and Kun Xu. Hierarchical layout-aware graph convolutional network for unified aesthetics assessment. In CVPR, 2021.
[127] Yichen Sheng, Jianming Zhang, and Bedrich Benes. SSN: Soft shadow network for image compositing. In CVPR, 2021.
[128] Zhixin Shu, Sunil Hadap, Eli Shechtman, Kalyan Sunkavalli, Sylvain Paris, and Dimitris Samaras. Portrait lighting transfer using a mass transport approach. ACM Transactions on Graphics, 36(4):1, 2017.
[129] Konstantin Sofiiuk, Polina Popenova, and Anton Konushin. Foreground-aware semantic representations for image harmonization. In WACV, 2021.
[130] Shuangbing Song, Fan Zhong, Xueying Qin, and Changhe Tu. Illumination harmonization with gray mean scale. In Computer Graphics International Conference, 2020.
[131] Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, and Thomas A. Funkhouser. Semantic scene completion from a single depth image. In CVPR, 2017.
[132] Kalyan Sunkavalli, Micah K. Johnson, Wojciech Matusik, and Hanspeter Pfister. Multi-scale image harmonization. ACM Transactions on Graphics, 29(4):125:1–125:10, 2010.
[133] Richard Szeliski, Matthew Uyttendaele, and Drew Steedly. Fast Poisson blending using multi-splines. In ICCP, 2011.
[134] Hossein Talebi and Peyman Milanfar. NIMA: Neural image assessment. IEEE Transactions on Image Processing, 27(8):3998–4011, 2018.
[135] Fuwen Tan, Crispin Bernier, Benjamin Cohen, Vicente Ordonez, and Connelly Barnes. Where and who? Automatic semantic-aware person composition. In WACV, 2018.
[136] Xuehan Tan, Panpan Xu, Shihui Guo, and Wencheng Wang. Image composition of partially occluded objects. In Computer Graphics Forum, volume 38, pages 641–650, 2019.
[137] K. Thömmes and R. Hübner. Instagram likes for architectural photos can be predicted by quantitative balance measures and curvature. Frontiers in Psychology, 9(1):1050–1067, 2018.
[138] Shashank Tripathi, Siddhartha Chandra, Amit Agrawal, Ambrish Tyagi, James M. Rehg, and Visesh Chari. Learning to generate synthetic data via compositing. In CVPR, 2019.
[139] Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, and Ming-Hsuan Yang. Deep image harmonization. In CVPR, 2017.
[140] Basile Van Hoorick. Image outpainting and harmonization using generative adversarial networks. arXiv preprint arXiv:1912.10960, 2019.
[141] Hao Wang, Qilong Wang, Fan Yang, Weiqi Zhang, and Wangmeng Zuo. Data augmentation for object detection via progressive and selective instance-switching. arXiv preprint arXiv:1906.00358, 2019.
[142] Tianyu Wang, Xiaowei Hu, Qiong Wang, Pheng-Ann Heng, and Chi-Wing Fu. Instance shadow detection. In CVPR, 2020.
[143] Wei Wang, Jing Dong, and Tieniu Tan. Tampered region localization of digital color images based on JPEG compression noise. In International Workshop on Digital Watermarking, 2010.
[144] Wenshan Wang, Su Yang, Weishan Zhang, and Jiulong Zhang. Neural aesthetic image reviewer. IET Computer Vision, 13(8):749–758, 2019.
[145] Zhibo Wang, Xin Yu, Ming Lu, Quan Wang, Chen Qian, and Feng Xu. Single image portrait relighting via explicit multiple reflectance channel modeling. ACM Transactions on Graphics, 39(6):1–13, 2020.
[146] Bihan Wen, Ye Zhu, Ramanathan Subramanian, Tian-Tsong Ng, Xuanjing Shen, and Stefan Winkler. COVERAGE—a novel database for copy-move forgery detection. In ICIP, 2016.
[147] Shuchen Weng, Wenbo Li, Dawei Li, Hongxia Jin, and Boxin Shi. MISC: Multi-condition injection and spatially-adaptive compositing for conditional person image synthesis. In CVPR, 2020.
[148] Huikai Wu, Shuai Zheng, Junge Zhang, and Kaiqi Huang. GP-GAN: Towards realistic high-resolution image blending. In ACM MM, 2019.
[149] Yaowen Wu, Christian Bauckhage, and Christian Thurau. The good, the bad, and the ugly: Predicting aesthetic image labels. In ICPR, 2010.
[150] Yue Wu, Wael Abd-Almageed, and Prem Natarajan. Deep matching and validation network: An end-to-end solution to constrained image splicing localization and detection. In ACM MM, 2017.
[151] Yue Wu, Wael Abd-Almageed, and Prem Natarajan. BusterNet: Detecting copy-move image forgery with source/target localization. In ECCV, 2018.
[152] Yue Wu, Wael Abd-Almageed, and Prem Natarajan. Image copy-move forgery detection via an end-to-end deep neural network. In WACV, 2018.
[153] Yue Wu, Wael AbdAlmageed, and Premkumar Natarajan. ManTra-Net: Manipulation tracing network for detection and localization of image forgeries with anomalous features. In CVPR, 2019.
[154] Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 3D ShapeNets: A deep representation for volumetric shapes. In CVPR, 2015.
[155] Xuezhong Xiao and Lizhuang Ma. Color transfer in correlated color space. In ACM International Conference on Virtual Reality Continuum and Its Applications, 2006.
[156] Yuting Xiao, Yanyu Xu, Ziming Zhong, Weixin Luo, Jiawei Li, and Shenghua Gao. Amodal segmentation based on visible region segmentation and shape prior. AAAI, 2021.
[157] Ning Xu, Brian Price, Scott Cohen, and Thomas Huang. Deep image matting. In CVPR, 2017.
[158] Su Xue, Aseem Agarwala, Julie Dorsey, and Holly E. Rushmeier. Understanding and improving the realism of image composites. ACM Transactions on Graphics, 31(4):84:1–84:10, 2012.
[159] Chao Yang, Huizhou Li, Fangting Lin, Bin Jiang, and Hao Zhao. Constrained R-CNN: A general image manipulation detection model. In ICME, 2020.
[160] Raymond A Yeh, Chen Chen, Teck Yian Lim, Alexander G Schwing, Mark Hasegawa-Johnson, and Minh N Do. Semantic image inpainting with deep generative models. In CVPR, 2017.
[161] Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S Huang. Free-form image inpainting with gated convolution. In ICCV, 2019.
[162] Fangneng Zhan, Hongyuan Zhu, and Shijian Lu. Spatial fusion GAN for image synthesis. In CVPR, 2019.
[163] Fangneng Zhan, Shijian Lu, Changgong Zhang, Feiying Ma, and Xuansong Xie. Adversarial image composition with auxiliary illumination. In ACCV, 2020.
[164] Fangneng Zhan, Jiaxing Huang, and Shijian Lu. Hierarchy composition GAN for high-fidelity image synthesis. IEEE Transactions on Cybernetics, 2021.
[165] Xiaohang Zhan, Xingang Pan, Bo Dai, Ziwei Liu, Dahua Lin, and Chen Change Loy. Self-supervised scene de-occlusion. In CVPR, 2020.
[166] Bo Zhang, Li Niu, and Liqing Zhang. Image composition assessment with saliency-augmented multi-pattern pooling. arXiv preprint arXiv:2104.03133, 2021.
[167] He Zhang, Jianming Zhang, Federico Perazzi, Zhe Lin, and Vishal M Patel. Deep image compositing. In WACV, 2021.
[168] Jinsong Zhang, Kalyan Sunkavalli, Yannick Hold-Geoffroy, Sunil Hadap, Jonathan Eisenman, and Jean-François Lalonde. All-weather deep outdoor lighting estimation. In CVPR, 2019.
[169] Lingzhi Zhang, Tarmily Wen, Jie Min, Jiancong Wang, David Han, and Jianbo Shi. Learning object placement by inpainting for compositional data augmentation. In ECCV, 2020.
[170] Lingzhi Zhang, Tarmily Wen, and Jianbo Shi. Deep image blending. In WACV, 2020.
[171] Luming Zhang, Yue Gao, Roger Zimmermann, Qi Tian, and Xuelong Li. Fusion of multichannel local and global structural cues for photo aesthetics evaluation. IEEE Transactions on Image Processing, 23(3):1419–1429, 2014.
[172] Shuyang Zhang, Runze Liang, and Miao Wang. ShadowGAN: Shadow synthesis for virtual objects with conditional adversarial networks. Computational Visual Media, 5(1):105–115, 2019.
[173] Song-Hai Zhang, Zhengping Zhou, Bin Liu, Xi Dong, and Peter Hall. What and where: A context-based recommendation system for object insertion. Computational Visual Media, 6(1):79–93, 2020.
[174] Zibo Zhao, Wen Liu, Yanyu Xu, Xianing Chen, Weixin Luo, Lei Jin, Bohui Zhu, Tong Liu, Binqiang Zhao, and Shenghua Gao. Prior based human completion. In CVPR, 2021.
[175] Peng Zhou, Xintong Han, Vlad I Morariu, and Larry S Davis. Learning rich features for image manipulation detection. In CVPR, 2018.
[176] Ye Zhou, Xin Lu, Junping Zhang, and James Z Wang. Joint image and text representation for aesthetics analysis. In ACM MM, 2016.
[177] Zihan Zhou, Siqiong He, Jia Li, and James Z Wang. Modeling perspective effects in photographic composition. In ACM MM, 2015.
[178] Jun-Yan Zhu, Philipp Krahenbuhl, Eli Shechtman, and Alexei A Efros. Learning a discriminative model for the perception of realism in composite images. In ICCV, 2015.
[179] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV, 2017.
[180] Xinshan Zhu, Yongjun Qian, Xianfeng Zhao, Biao Sun, and Ya Sun. A deep learning approach to patch-based image inpainting forensics. Signal Processing: Image Communication, 67:90–99, 2018.

posted @ 2022-04-20 13:28 梁君牧