Image Forgery Detection Papers
Hybrid LSTM and Encoder–Decoder Architecture for Detection of Image Forgeries
Jawadul H. Bappy, Cody Simons, +2 authors A. Roy-Chowdhury
Published 2019
Computer Science, Medicine
IEEE Transactions on Image Processing
- This paper proposes a high-confidence manipulation localization architecture that utilizes resampling features, long short-term memory (LSTM) cells, and an encoder–decoder network to segment out manipulated regions from non-manipulated ones. Resampling features are used to capture artifacts such as JPEG quality loss, upsampling, downsampling, rotation, and shearing. The proposed network exploits larger receptive fields (spatial maps) and frequency-domain correlation to analyze the discriminative characteristics between manipulated and non-manipulated regions by incorporating the encoder and LSTM network. Finally, a decoder network learns the mapping from low-resolution feature maps to pixel-wise predictions for image tamper localization. With the predicted mask provided by the final (softmax) layer of the proposed architecture, end-to-end training is performed to learn the network parameters through back-propagation using ground-truth masks.
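- A minimal PyTorch sketch of the overall idea, assuming generic layers rather than the authors' exact configuration: an encoder produces low-resolution features, an LSTM scans them as a sequence (standing in for the resampling-feature / frequency-domain branch), and a decoder upsamples the fused features to a two-class pixel mask trained against a ground-truth mask.

```python
# Illustrative sketch only, not the paper's exact model; the resampling-feature branch
# is approximated here by LSTM-processed patch features.
import torch
import torch.nn as nn

class HybridLocalizer(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # Encoder: downsample the image to a coarse spatial feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # LSTM scans the feature map as a sequence of spatial positions.
        self.lstm = nn.LSTM(input_size=64, hidden_size=hidden, batch_first=True)
        # Decoder: map fused low-resolution features back to a pixel-level mask.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64 + hidden, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 2, 4, stride=2, padding=1),  # 2 classes: pristine / forged
        )

    def forward(self, x):
        f = self.encoder(x)                         # (B, 64, H/4, W/4)
        b, c, h, w = f.shape
        seq = f.permute(0, 2, 3, 1).reshape(b, h * w, c)
        lstm_out, _ = self.lstm(seq)                # frequency-like sequential stream
        g = lstm_out.reshape(b, h, w, -1).permute(0, 3, 1, 2)
        fused = torch.cat([f, g], dim=1)
        return self.decoder(fused)                  # (B, 2, H, W) logits

# End-to-end training against a ground-truth binary mask:
model = HybridLocalizer()
img = torch.randn(2, 3, 64, 64)
gt_mask = torch.randint(0, 2, (2, 64, 64))
loss = nn.CrossEntropyLoss()(model(img), gt_mask)
loss.backward()
```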
Optimization of a Pre-Trained AlexNet Model for Detecting and Localizing Image Forgeries
Soad Samir, E. Emary, +1 author H. Onsi
Published 2020
Computer Science
- A novel image forgery detection model using the AlexNet framework is introduced. We propose a modified model that optimizes AlexNet by using batch normalization instead of local response normalization, a maxout activation function instead of the rectified linear unit, and a softmax activation function in the last layer to act as a classifier. As a consequence, the proposed model can carry out feature extraction as well as forgery detection without the need for further manipulation. Through a number of experiments, we examine and differentiate the impacts of several important AlexNet design choices. The proposed model is applied to the CASIA v2.0, CASIA v1.0, DVMM, and NIST Nimble Challenge 2017 datasets, using k-fold cross-validation to divide them into training and test samples. The experimental results show that the proposed model achieves strong performance in detecting different sorts of forgeries, reaching 98.176% accuracy.
- Paper PDF: https://sci-hub.mksa.top/10.3390/info11050275
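- A hedged sketch of the described modifications (batch normalization in place of local response normalization, maxout in place of ReLU, softmax as the final classifier); the layer sizes, depth, and the two-class output are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class Maxout2d(nn.Module):
    """Maxout over k parallel convolutions: max_i(conv_i(x))."""
    def __init__(self, in_ch, out_ch, k=2, **conv_kwargs):
        super().__init__()
        self.k, self.out_ch = k, out_ch
        self.conv = nn.Conv2d(in_ch, out_ch * k, **conv_kwargs)

    def forward(self, x):
        y = self.conv(x)
        b, _, h, w = y.shape
        return y.view(b, self.out_ch, self.k, h, w).max(dim=2).values

class ModifiedAlexNet(nn.Module):
    def __init__(self, num_classes=2):   # authentic vs. forged
        super().__init__()
        self.features = nn.Sequential(
            Maxout2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.BatchNorm2d(64),           # replaces local response normalization
            nn.MaxPool2d(3, 2),
            Maxout2d(64, 192, kernel_size=5, padding=2),
            nn.BatchNorm2d(192),
            nn.MaxPool2d(3, 2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(192, num_classes),
        )

    def forward(self, x):
        # Softmax in the last layer acts as the forged/authentic classifier.
        return torch.softmax(self.classifier(self.features(x)), dim=1)

probs = ModifiedAlexNet()(torch.randn(1, 3, 227, 227))  # (1, 2) class probabilities
```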
DOA-GAN: Dual-Order Attentive Generative Adversarial Network for Image Copy-Move Forgery Detection and Localization
Ashraful Islam, Chengjiang Long, +1 author A. Hoogs
Published 2020
Computer Science
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- In this paper, we propose a Generative Adversarial Network with a dual-order attention model to detect and localize copy-move forgeries. In the generator, the first-order attention is designed to capture copy-move location information, and the second-order attention exploits more discriminative features for the patch co-occurrence. Both attention maps are extracted from the affinity matrix and are used to fuse location-aware and co-occurrence features for the final detection and localization branches of the network. The discriminator network is designed to further ensure more accurate localization results. To the best of our knowledge, we are the first to propose such a network architecture with the 1st-order attention mechanism from the affinity matrix.
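- A small sketch of how an affinity matrix over deep patch features can be turned into a first-order (location) attention map; the normalization, self-match suppression, and sigmoid squashing below are illustrative choices, and DOA-GAN's second-order attention and GAN branches are omitted.

```python
import torch
import torch.nn.functional as F

def first_order_attention(feat):
    """feat: (B, C, H, W) deep features of the image."""
    b, c, h, w = feat.shape
    f = F.normalize(feat.flatten(2), dim=1)          # (B, C, HW), unit-length descriptors
    affinity = torch.bmm(f.transpose(1, 2), f)       # (B, HW, HW) cosine similarities
    # Suppress self-similarity so a patch does not "match" itself.
    affinity = affinity - torch.eye(h * w).unsqueeze(0)
    # A location is suspicious if it has a strong match somewhere else in the image.
    attn = affinity.max(dim=2).values.view(b, 1, h, w)
    return torch.sigmoid(attn), affinity

feat = torch.randn(2, 256, 16, 16)
attn_map, affinity = first_order_attention(feat)     # (2, 1, 16, 16), (2, 256, 256)
```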
A hybrid copy-move image forgery detection technique based on Fourier-Mellin and scale invariant feature transforms
K. Meena, Vipin Tyagi
Published 2020
Computer Science
Multimedia Tools and Applications
- In digital images, the most common forgery is copy-move forgery, in which some region(s) of an image are replicated within the same image. Copy-move forgery detection (CMFD) techniques fall into two categories: keypoint-based and block-based. Keypoint-based techniques perform well under rotation and scaling but show very poor performance on smooth images. On the contrary, block-based techniques perform better on smooth images but are comparatively more time-consuming. In this paper, a hybrid technique is proposed by combining a block-based technique using the Fourier-Mellin Transform (FMT) with a keypoint-based technique using the Scale Invariant Feature Transform (SIFT). The input image to be checked for forgery is first divided into texture and smooth regions. Keypoints are then extracted from the texture part of the image using the SIFT descriptor, and the FMT is applied to the smooth part. The extracted features are then matched to detect duplicated regions of the image. The experimental results illustrate that the proposed technique performs better than other state-of-the-art CMFD techniques under various geometric transformations and post-processing operations in reasonable time.
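- A hedged sketch of the hybrid pipeline, assuming a simple local-variance rule for the texture/smooth split and using OpenCV's SIFT plus a log-polar resampling of the FFT magnitude as a Fourier-Mellin-style block descriptor; the filename, block size, and thresholds are placeholders, and the final descriptor matching is only indicated.

```python
import cv2
import numpy as np

def fmt_descriptor(block):
    """Rotation/scale-robust descriptor: log-polar resampling of the FFT magnitude."""
    mag = np.abs(np.fft.fftshift(np.fft.fft2(block))).astype(np.float32)
    h, w = mag.shape
    lp = cv2.warpPolar(mag, (w, h), (w / 2, h / 2), w / 2,
                       cv2.INTER_LINEAR + cv2.WARP_POLAR_LOG)
    return lp.flatten()

img = cv2.imread("suspect.png", cv2.IMREAD_GRAYSCALE)   # placeholder filename

# Split into "texture" and "smooth" parts via local variance (illustrative rule).
f = img.astype(np.float32)
local_mean = cv2.blur(f, (8, 8))
local_var = cv2.blur(f ** 2, (8, 8)) - local_mean ** 2
texture_mask = local_var > 100.0

# Keypoint branch: SIFT restricted to the textured part.
sift = cv2.SIFT_create()
kps, descs = sift.detectAndCompute(img, (texture_mask * 255).astype(np.uint8))

# Block branch: FMT descriptors for overlapping blocks lying in the smooth part.
B = 16
smooth_descs = []
for y in range(0, img.shape[0] - B, B // 2):
    for x in range(0, img.shape[1] - B, B // 2):
        if not texture_mask[y:y + B, x:x + B].any():
            smooth_descs.append(((y, x), fmt_descriptor(f[y:y + B, x:x + B])))
# Matching within each descriptor set (e.g., nearest-neighbour search) then flags duplicates.
```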
A Full-Image Full-Resolution End-to-End-Trainable CNN Framework for Image Forgery Detection
Francesco Marra, Diego Gragnaniello, +1 author G. Poggi
Published 2020
Computer Science
IEEE Access
- Due to limited computational and memory resources, current deep learning models accept only rather small images as input, calling for preliminary image resizing. This is not a problem for high-level vision tasks, where discriminative features are barely affected by resizing. On the contrary, in image forensics, resizing tends to destroy precious high-frequency details, heavily impacting performance. One can avoid resizing by means of patch-wise processing, at the cost of renouncing whole-image analysis. In this work, we propose a CNN-based image forgery detection framework which makes decisions based on full-resolution information gathered from the whole image. Thanks to gradient checkpointing, the framework is trainable end-to-end with limited memory resources and weak (image-level) supervision, allowing for the joint optimization of all parameters. Experiments on widespread image forensics datasets prove the good performance of the proposed approach, which largely outperforms all baselines and all reference methods.
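- A sketch of the memory trick that makes full-resolution, end-to-end training feasible: gradient checkpointing over image tiles, assuming a recent PyTorch. The tiny patch network, the mean pooling, and the image-level binary label are illustrative stand-ins for the paper's architecture and weak supervision.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

patch_net = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(64, 1)   # image-level (weak) supervision: forged vs. pristine

def forward_full_image(img, patch=256):
    feats = []
    for y in range(0, img.shape[2], patch):
        for x in range(0, img.shape[3], patch):
            tile = img[:, :, y:y + patch, x:x + patch]
            # Recompute this tile's activations during backward instead of caching them.
            feats.append(checkpoint(patch_net, tile, use_reentrant=False))
    pooled = torch.stack(feats, dim=0).mean(dim=0)   # whole-image aggregation
    return head(pooled)

img = torch.randn(1, 3, 1024, 1024, requires_grad=True)
label = torch.ones(1, 1)
loss = nn.functional.binary_cross_entropy_with_logits(forward_full_image(img), label)
loss.backward()
```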
An End-to-End Dense-InceptionNet for Image Copy-Move Forgery Detection
Junliu Zhong, Chi-Man Pun
Published 2020
Computer Science
IEEE Transactions on Information Forensics and Security
- A novel image copy-move forgery detection scheme using a Dense-InceptionNet is proposed in this paper. Dense-InceptionNet is an end-to-end deep neural network (DNN) with multi-dimensional dense-feature connections. It is the first DNN model to autonomously learn feature correlations and search for possible forgery snippets through matching clues. The proposed Dense-InceptionNet consists of Pyramid Feature Extractor (PFE), Feature Correlation Matching (FCM), and Hierarchical Post-Processing (HPP) modules. The PFE module extracts multi-dimensional and multi-scale dense features; the features of each layer in this extractor are directly connected to those of all preceding layers. The FCM module learns the high correlations of deep features and obtains three candidate matching maps. Finally, the HPP module, which uses the three matching maps to obtain a combination of cross-entropies, is amenable to better training via backpropagation. Experiments demonstrate that the efficiency of the proposed Dense-InceptionNet is much better than that of other state-of-the-art methods, while achieving the best relative performance against most known attacks.
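- A small sketch of the dense-feature-connection idea behind the PFE module, where each layer's output is concatenated with all preceding feature maps before the next convolution; layer widths and depth are illustrative, and the Inception-style branches, FCM, and HPP modules are omitted.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_ch=32, growth=16, layers=3):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Sequential(nn.Conv2d(in_ch + i * growth, growth, 3, padding=1), nn.ReLU())
            for i in range(layers)
        ])

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))   # connect to every earlier layer
        return torch.cat(feats, dim=1)

out = DenseBlock()(torch.randn(1, 32, 64, 64))   # (1, 32 + 3*16, 64, 64)
```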
Fast and Effective Image Copy-Move Forgery Detection via Hierarchical Feature Point Matching
Yuanman Li, Jiantao Zhou
Published 2019
Computer Science
IEEE Transactions on Information Forensics and Security
- Copy-move forgery is one of the most commonly used manipulations for tampering with digital images. Keypoint-based detection methods have been reported to be very effective in revealing copy-move evidence due to their robustness against various attacks, such as large-scale geometric transformations. However, these methods fail to handle cases where copy-move forgeries only involve small or smooth regions, in which the number of keypoints is very limited. To tackle this challenge, we propose a fast and effective copy-move forgery detection algorithm through hierarchical feature point matching. We first show that a sufficient number of keypoints can be generated even in small or smooth regions by lowering the contrast threshold and rescaling the input image. We then develop a novel hierarchical matching strategy to solve the keypoint matching problem over a massive number of keypoints. To reduce the false alarm rate and accurately localize the tampered regions, we further propose a novel iterative localization technique that exploits the robustness properties (including the dominant orientation and the scale information) and the color information of each keypoint. Extensive experimental results demonstrate the superior performance of our proposed scheme in terms of both efficiency and accuracy.
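- A short OpenCV sketch of the keypoint-generation step described: lowering SIFT's contrast threshold and rescaling the input so that small or smooth regions still produce keypoints. The threshold, scale factor, and filename are illustrative; the hierarchical matching and iterative localization stages are not shown.

```python
import cv2

img = cv2.imread("suspect.png", cv2.IMREAD_GRAYSCALE)   # placeholder filename

# Default SIFT discards low-contrast extrema; relax that to keep keypoints in smooth areas.
sift = cv2.SIFT_create(contrastThreshold=0.01)           # default is 0.04

# Rescaling the input further densifies keypoints in small regions.
scaled = cv2.resize(img, None, fx=2.0, fy=2.0, interpolation=cv2.INTER_CUBIC)

kps, descs = sift.detectAndCompute(scaled, None)
print(len(kps), "keypoints (hierarchical matching would follow)")
```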
RRU-Net: The Ringed Residual U-Net for Image Splicing Forgery Detection
Xiuli Bi, Yan Wei, +1 author Weisheng Li
Published 2019
Computer Science
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
- In this paper, we propose a ringed residual U-Net (RRU-Net) for image splicing forgery detection. The proposed RRU-Net is an end-to-end image essence attribute segmentation network that is independent of the human visual system and can accomplish forgery detection without any pre-processing or post-processing. The core idea of RRU-Net is to strengthen the way the CNN learns, inspired by the recall and consolidation mechanisms of the human brain and implemented through the propagation and feedback of residuals in the CNN. Residual propagation recalls the input feature information to solve the gradient degradation problem in deeper networks; residual feedback consolidates the input feature information to make the differences in image attributes between un-tampered and tampered regions more obvious.
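- A hedged sketch of the ringed-residual mechanism: a residual propagation path (the "recall") plus a residual feedback path that re-weights the block input with a gate computed from the block output (the "consolidation"). The exact RRU-Net block and its U-Net placement differ in detail.

```python
import torch
import torch.nn as nn

class RingedResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        self.gate = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())

    def forward(self, x):
        y = self.body(x) + x                 # residual propagation ("recall")
        x_fb = x * (1 + self.gate(y))        # residual feedback ("consolidation")
        return self.body(x_fb) + x_fb

out = RingedResidualBlock(32)(torch.randn(1, 32, 64, 64))
```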
ARGAN: Attentive Recurrent Generative Adversarial Network for Shadow Detection and Removal
Bin Ding, Chengjiang Long, +1 author Chunxia Xiao
Published 2019
Computer Science
2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- In this paper we propose an attentive recurrent generative adversarial network (ARGAN) to detect and remove shadows in an image. The generator consists of multiple progressive steps. At each step, a shadow attention detector is first exploited to generate an attention map that specifies shadow regions in the input image. Given the attention map, a negative residual produced by a shadow-remover encoder recovers a shadow-lighter or even shadow-free image. A discriminator is designed to classify whether the output image of the last progressive step is real or fake. Moreover, ARGAN is suitable for training with a semi-supervised strategy to make full use of abundant unsupervised data. Experiments on four public datasets have demonstrated that our ARGAN is robust in detecting both simple and complex shadows and produces more realistic shadow-removal results. It outperforms the state-of-the-art methods, especially in the detail of recovered shadow areas.
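- A loose sketch of one progressive generator step as described (an attention map, then a negative residual from a remover conditioned on that map); the layer choices are illustrative and the discriminator and recurrence across steps are simplified to a shared module applied repeatedly.

```python
import torch
import torch.nn as nn

class ProgressiveStep(nn.Module):
    def __init__(self):
        super().__init__()
        self.attention = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid())
        self.remover = nn.Sequential(nn.Conv2d(4, 3, 3, padding=1), nn.Tanh())

    def forward(self, img):
        attn = self.attention(img)                           # shadow attention map
        residual = self.remover(torch.cat([img, attn], 1))   # (negative) residual
        return img + attn * residual, attn                   # shadow-lighter image

step = ProgressiveStep()
x = torch.rand(1, 3, 128, 128)
for _ in range(3):                                           # multiple progressive steps
    x, attn = step(x)
```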
Image Copy-Move Forgery Detection using Color Features and Hierarchical Feature Point Matching
Yi-Lin Tsai, Jin-Jang Leou
Published in IMPROVE 2021
Computer Science
- In this study, an image copy-move forgery detection approach using color features and hierarchical feature point matching is proposed. The proposed approach contains three main stages, namely pre-processing and feature extraction, hierarchical feature point matching, and iterative forgery localization and post-processing. In the proposed approach, Gaussian-blurred images and difference-of-Gaussians (DoG) images are constructed. Hierarchical feature point matching is employed to find matched feature point pairs, using two matching strategies: group matching via scale clustering and group matching via overlapped gray-level clustering. Based on the experimental results obtained in this study, the performance of the proposed approach is better than that of three comparison approaches.
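- A short sketch of the pre-processing stage: Gaussian-blurred images and their differences (DoG images) built with OpenCV. The sigma values and filename are illustrative; feature-point detection and the two clustering-based matching strategies follow on top of this stack.

```python
import cv2
import numpy as np

img = cv2.imread("suspect.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)  # placeholder

sigmas = [1.0, 1.6, 2.56, 4.1]
blurred = [cv2.GaussianBlur(img, (0, 0), s) for s in sigmas]
dog = [blurred[i + 1] - blurred[i] for i in range(len(blurred) - 1)]      # DoG images

# Feature points are then taken as local extrema in the DoG stack and matched
# hierarchically (scale clustering, then overlapped gray-level clustering).
```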
TBNet: Two-Stream Boundary-aware Network for Generic Image Manipulation Localization
Zan Gao, Chao Sun, Zhiyong Cheng, Weili Guan, Anan Liu, Meng Wang
Published 2021
Computer Science
ArXiv
- Finding tampered regions in images is a hot research topic in machine learning and computer vision. Although many image manipulation localization algorithms have been proposed, most of them focus only on RGB images in different color spaces, and the frequency information that contains potential tampering clues is often ignored. Moreover, splicing and copy-move are two frequently used manipulation operations, but because their characteristics are quite different, specific methods have been designed individually for detecting either splicing or copy-move, and it is very difficult to apply these methods widely in practice. To solve these issues, in this work a novel end-to-end two-stream boundary-aware network (abbreviated as TBNet) is proposed for generic image manipulation localization, in which the RGB stream, the frequency stream, and the boundary artifact location are explored in a unified framework. Specifically, we first design an adaptive frequency selection module (AFS) to adaptively select the appropriate frequencies to mine inconsistent statistics and eliminate the interference of redundant statistics. Then, an adaptive cross-attention fusion module (ACF) is proposed to adaptively fuse the RGB feature and the frequency feature. Finally, the boundary artifact location network (BAL) is designed to locate boundary artifacts; its parameters are jointly updated by the outputs of the ACF, and its results are further fed into the decoder. Thus, the parameters of the RGB stream, the frequency stream, and the boundary artifact location network are jointly optimized, and their latent complementary relationships are fully mined. The results of extensive experiments performed on four public benchmarks of the image manipulation localization task, namely CASIA1.0, COVER, Carvalho, and In-the-Wild, demonstrate that the proposed TBNet can significantly outperform state-of-the-art generic image manipulation localization methods in terms of both MCC and F1 while maintaining robustness with respect to various attacks.
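- A hedged sketch of the adaptive-frequency-selection idea: features are moved to the frequency domain, re-weighted by a learned soft mask, and transformed back, so the network can emphasize bands that expose tampering. The actual AFS module in TBNet is more involved, and the mask shape here is an assumption.

```python
import torch
import torch.nn as nn

class AdaptiveFrequencySelection(nn.Module):
    def __init__(self, ch, h, w):
        super().__init__()
        # One learnable weight per (channel, frequency bin) of the half-spectrum.
        self.mask = nn.Parameter(torch.ones(ch, h, w // 2 + 1))

    def forward(self, x):
        spec = torch.fft.rfft2(x, norm="ortho")              # (B, C, H, W//2+1) complex
        spec = spec * torch.sigmoid(self.mask)               # soft frequency selection
        return torch.fft.irfft2(spec, s=x.shape[-2:], norm="ortho")

x = torch.randn(1, 64, 32, 32)
y = AdaptiveFrequencySelection(64, 32, 32)(x)                # (1, 64, 32, 32)
```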
PSCC-Net: Progressive Spatio-Channel Correlation Network for Image Manipulation Detection and Localization
Xiaohong Liu, Yao-Kai Liu, +1 author Xiaoming Liu
Published 2021
Computer Science
ArXiv
- To defend against manipulation of image content, such as splicing, copy-move, and removal, we develop a Progressive Spatio-Channel Correlation Network (PSCC-Net) to detect and localize image manipulations. PSCC-Net processes the image in a two-path procedure: a top-down path that extracts local and global features, and a bottom-up path that detects whether the input image is manipulated and estimates its manipulation masks at four scales, where each mask is conditioned on the previous one. Different from the conventional encoder-decoder and no-pooling structures, PSCC-Net leverages features at different scales with dense cross-connections to produce manipulation masks in a coarse-to-fine fashion. Moreover, a Spatio-Channel Correlation Module (SCCM) captures both spatial and channel-wise correlations in the bottom-up path, which endows features with holistic cues, enabling the network to cope with a wide range of manipulation attacks. Thanks to the light-weight backbone and progressive mechanism, PSCC-Net can process 1080p images at 50+ FPS. Extensive experiments demonstrate the superiority of PSCC-Net over the state-of-the-art methods on both detection and localization.
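- A sketch of the coarse-to-fine mask prediction, where each scale's mask is conditioned on an upsampled version of the previous one; the backbone and the spatio-channel correlation module are omitted, and the channel and scale choices are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProgressiveMaskHeads(nn.Module):
    def __init__(self, chans=(256, 128, 64, 32)):
        super().__init__()
        # +1 input channel at each scale for the upsampled previous mask.
        self.heads = nn.ModuleList([nn.Conv2d(c + 1, 1, 3, padding=1) for c in chans])

    def forward(self, feats):                      # feats: coarse -> fine feature maps
        mask = torch.zeros_like(feats[0][:, :1])   # start from an empty coarse mask
        masks = []
        for feat, head in zip(feats, self.heads):
            mask = F.interpolate(mask, size=feat.shape[-2:], mode="bilinear",
                                 align_corners=False)
            mask = torch.sigmoid(head(torch.cat([feat, mask], dim=1)))
            masks.append(mask)
        return masks                               # four masks, coarse to fine

feats = [torch.randn(1, c, s, s) for c, s in zip((256, 128, 64, 32), (16, 32, 64, 128))]
masks = ProgressiveMaskHeads()(feats)
```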
TransForensics: Image Forgery Localization with Dense Self-Attention
Jing Hao, Zhixin Zhang, +2 authors Shiliang Pu
Published 2021
Computer Science
ArXiv
- To tackle the challenging problem of image forgery localization, we introduce TransForensics, a novel method inspired by Transformers. The two major components in our framework are dense self-attention encoders and dense correction modules. The former models global context and all pairwise interactions between local patches at different scales, while the latter improves the transparency of the hidden layers and corrects the outputs from different branches. Compared to previous traditional and deep learning methods, TransForensics not only can capture discriminative representations and obtain high-quality mask predictions but is also not limited by tampering types and patch sequence orders. By conducting experiments on the main benchmarks, we show that TransForensics outperforms the state-of-the-art methods by a large margin.
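- A minimal sketch of the dense self-attention idea: CNN features are flattened into patch tokens, a Transformer encoder models all pairwise interactions between them, and the refined tokens are mapped back to a coarse forgery map; layer sizes are illustrative and the multi-scale branches and correction modules are omitted.

```python
import torch
import torch.nn as nn

feat = torch.randn(1, 256, 32, 32)                        # backbone features at one scale
tokens = feat.flatten(2).transpose(1, 2)                  # (1, 1024, 256) patch tokens

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True),
    num_layers=2,
)
refined = encoder(tokens)                                 # dense self-attention over patches

mask_logits = nn.Linear(256, 1)(refined)                  # per-patch forgery score
mask = mask_logits.transpose(1, 2).reshape(1, 1, 32, 32)  # coarse localization map
```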
GAN-Generated Image Detection With Self-Attention Mechanism Against GAN Generator Defect
Zhongjie Mi, Xinghao Jiang, Tanfeng Sun, Ke Xu
Published 2020
Computer Science
IEEE Journal of Selected Topics in Signal Processing
- With generative adversarial networks (GANs) achieving realistic image generation, fake-image detection research has become an imminent need. In this paper, a novel detection algorithm is designed to exploit a structural defect in GANs, taking advantage of the most vulnerable link in GAN generators: the up-sampling process conducted by the transposed convolution operation. The transposed convolution causes a lack of global information in the generated images. Therefore, a self-attention mechanism is adopted, equipping the algorithm with a much better comprehension of global information than other current work that adopts pure CNN networks, which is reflected in a significant increase in detection accuracy. Through a thorough comparison with current work and careful analysis, it is verified that our proposed algorithm outperforms other current works in the field. Also, with experiments conducted on other image-generation categories and on images that have undergone common real-life post-processing, our proposed algorithm shows decent robustness for various categories of images under different real-world circumstances, rather than being restricted to particular image types and pure laboratory settings.
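- A small demonstration of the structural defect being exploited: a toy transposed-convolution up-sampler whose output spectrum carries periodic up-sampling artifacts. This is only an illustration of the cue, not the self-attention detection network itself.

```python
import torch
import torch.nn as nn

# A toy "generator tail": up-sample a low-resolution feature map with transposed convs.
upsampler = nn.Sequential(
    nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1),
)
fake = upsampler(torch.randn(1, 16, 32, 32))              # (1, 1, 128, 128)

# Log-spectrum of the output: periodic peaks betray the up-sampling grid.
spec = torch.fft.fftshift(torch.fft.fft2(fake[0, 0])).abs().log1p()
print(spec.shape, spec.max().item())
```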
GRAFT: Unsupervised Adaptation to Resizing for Detection of Image Manipulation
Ludovic Darmet, K. Wang, F. Cayre
Published 2020
Computer Science
IEEE Access
- A large number of methods for the forensics of image manipulation rely on detecting fingerprints in residuals or noise. Therefore, these detection methods are bound to be sensitive to noise generated by the image acquisition process, as well as by any pre-processing. We show that a difference in pre-processing pipelines between training and testing sets induces performance losses for various classifiers. We focus on a particular pre-processing step: resizing. It corresponds to a typical scenario where images may be resized (e.g., downscaled to reduce storage) prior to being manipulated. This performance loss due to pre-resizing can be troublesome but has rarely been investigated in the image forensics field. We propose a new and effective adaptation method for one state-of-the-art image manipulation detection pipeline, which we call Gaussian mixture model Resizing Adaptation by Fine-Tuning (GRAFT). Adaptation is performed in an unsupervised fashion, i.e., without using any ground-truth labels in the pre-resized testing domain, for the detection of image manipulation on very small patches. Experimental results show that the proposed GRAFT method can effectively improve detection accuracy in this challenging scenario of unsupervised adaptation to resizing pre-processing.
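- A loose sketch of the adaptation idea using scikit-learn: a Gaussian mixture fitted on source-domain residual features is warm-started and refit for a few EM steps on unlabeled features from the resized testing domain. The synthetic features, component count, and iteration budget stand in for the real GRAFT pipeline.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(5000, 8))               # residual features, source domain
resized_feats = rng.normal(loc=0.3, size=(2000, 8))    # unlabeled, pre-resized target domain

gmm = GaussianMixture(n_components=4, warm_start=True, max_iter=100, random_state=0)
gmm.fit(train_feats)                                   # source-domain model

gmm.max_iter = 10                                      # only a few EM steps for fine-tuning
gmm.fit(resized_feats)                                 # unsupervised adaptation to resizing
print(gmm.means_[:1])
```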
Classic Papers
Self-Attention Generative Adversarial Networks
Han Zhang, I. Goodfellow, +1 author Augustus Odena
Published in ICML 2019
Computer Science, Mathematics
- In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps. In SAGAN, details can be generated using cues from all feature locations. Moreover, the discriminator can check that highly detailed features in distant portions of the image are consistent with each other. Furthermore, recent work has shown that generator conditioning affects GAN performance. Leveraging this insight, we apply spectral normalization to the GAN generator and find that this improves training dynamics. The proposed SAGAN achieves the state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing Frechet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset. Visualization of the attention layers shows that the generator leverages neighborhoods that correspond to object shapes rather than local regions of fixed shape.
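- A sketch of the self-attention block described in the abstract: 1x1 query/key/value convolutions, an attention map over all spatial positions, and a learnable residual scale gamma initialized to zero so the block starts as an identity mapping; the surrounding generator/discriminator and spectral normalization are omitted.

```python
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.q = nn.Conv2d(ch, ch // 8, 1)
        self.k = nn.Conv2d(ch, ch // 8, 1)
        self.v = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))        # start as identity mapping

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)          # (B, HW, C//8)
        k = self.k(x).flatten(2)                          # (B, C//8, HW)
        attn = torch.softmax(torch.bmm(q, k), dim=-1)     # every position attends everywhere
        v = self.v(x).flatten(2)                          # (B, C, HW)
        out = torch.bmm(v, attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                       # cues from all feature locations

y = SelfAttention2d(64)(torch.randn(1, 64, 32, 32))
```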
Generative Adversarial Nets
I. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron C. Courville, Yoshua Bengio
Published in NIPS 2014
Computer Science
- We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
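- A minimal sketch of the minimax game on a toy 2-D data distribution: D is trained to tell real samples from G(z), and G is trained (in the common non-saturating form) to make D mistake its samples for real. Network sizes, learning rates, and the toy data are illustrative.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))        # z -> fake sample
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), 1e-3)
opt_d = torch.optim.Adam(D.parameters(), 1e-3)

for step in range(100):
    real = torch.randn(32, 2) * 0.5 + 2.0            # toy "data distribution"
    z = torch.randn(32, 16)

    # Discriminator step: maximize log D(x) + log(1 - D(G(z))).
    d_loss = -(torch.log(D(real) + 1e-8)
               + torch.log(1 - D(G(z).detach()) + 1e-8)).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step (non-saturating form): push D(G(z)) toward 1.
    g_loss = -torch.log(D(G(z)) + 1e-8).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```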
GAN Adversarial Examples
Making GAN-Generated Images Difficult To Spot: A New Attack Against Synthetic Image Detectors
Xinwei Zhao, Matthew C. Stamm
Published 2021
Computer Science, Engineering
ArXiv
- In this paper, we propose a new anti-forensic attack capable of fooling GAN-generated image detectors. Our attack uses an adversarially trained generator to synthesize traces that these detectors associate with real images. Furthermore, we propose a technique to train our attack so that it can achieve transferability, i.e. it can fool unknown CNNs that it was not explicitly trained against. We demonstrate the performance of our attack through an extensive set of experiments, where we show that our attack can fool eight state-of-the-art detection CNNs with synthetic images created using seven different GANs.
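- A hedged sketch of what the attacked generator's objective might look like: a GAN term that makes attacked images look like real photographs to a discriminator, a term that pushes a (frozen) forensic detector toward the "real" class, and a fidelity term keeping the output close to the original GAN image. The toy networks, loss weights, and class index are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

def attack_generator_loss(G, D, detector, gan_img, real_label=0, alpha=1.0, beta=10.0):
    attacked = G(gan_img)
    adv = -torch.log(torch.sigmoid(D(attacked)) + 1e-8).mean()           # look like real photos
    fool = nn.functional.cross_entropy(                                  # flip the detector
        detector(attacked), torch.full((gan_img.size(0),), real_label))
    fidelity = nn.functional.l1_loss(attacked, gan_img)                  # stay visually close
    return adv + alpha * fool + beta * fidelity

# Toy modules; in practice only G would be updated, with D and the detector held fixed.
G = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))
D = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2), nn.Flatten(), nn.LazyLinear(1))
detector = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2), nn.Flatten(), nn.LazyLinear(2))
loss = attack_generator_loss(G, D, detector, torch.rand(2, 3, 32, 32))
loss.backward()
```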