PyTorch implementation of the SSIM (structural similarity index) loss function

Introduction to SSIM

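The structural similarity index (SSIM), introduced in [1], measures the structural similarity between two images. Unlike the widely used L2 loss, SSIM resembles the human visual system (HVS) in being sensitive to changes in local structure.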

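SSIM is built from three components, luminance, contrast, and structure, defined as follows:

$$l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}$$

$$c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$$

$$s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}$$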

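Combining the three components (luminance $l$, contrast $c$, structure $s$) gives SSIM:

$$\mathrm{SSIM}(x, y) = l(x, y) \cdot c(x, y) \cdot s(x, y)$$

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

Here $\mu_x$ and $\mu_y$ are the means of images $x$ and $y$, $\sigma_x^2$ and $\sigma_y^2$ are their variances, and $\sigma_{xy}$ is their covariance. $C_1 = (k_1 L)^2$, $C_2 = (k_2 L)^2$, and $C_3 = C_2 / 2$ are constants that keep the fractions stable when a denominator is close to zero; the second form of SSIM follows from the choice $C_3 = C_2 / 2$.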
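As a rough check of how the three components combine, the sketch below evaluates them on global statistics of two short pixel lists (not on local windows, as SSIM implementations normally do) and compares their product with the combined formula. The function names `ssim_components` and `ssim_combined` and the sample pixel values are made up for illustration; the sketch assumes $C_3 = C_2 / 2$ and population (not sample) statistics.

```python
from math import sqrt, isclose


def _stats(x, y):
    """Return means, variances and covariance of two equal-length sequences."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    return mu_x, mu_y, var_x, var_y, cov_xy


def ssim_components(x, y, L=255, k1=0.01, k2=0.03):
    """Luminance, contrast and structure terms from global statistics."""
    mu_x, mu_y, var_x, var_y, cov_xy = _stats(x, y)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    c3 = c2 / 2  # assumption: the usual choice C3 = C2 / 2
    sd_x, sd_y = sqrt(var_x), sqrt(var_y)
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    con = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    struct = (cov_xy + c3) / (sd_x * sd_y + c3)
    return lum, con, struct


def ssim_combined(x, y, L=255, k1=0.01, k2=0.03):
    """Single-fraction SSIM, valid when C3 = C2 / 2."""
    mu_x, mu_y, var_x, var_y, cov_xy = _stats(x, y)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


# Illustrative 8-bit pixel values (made up)
x = [52, 55, 61, 59, 79, 61, 76, 61]
y = [50, 58, 60, 62, 75, 65, 70, 66]

lum, con, struct = ssim_components(x, y)
print(lum * con * struct, ssim_combined(x, y))  # the two values agree
print(ssim_combined(x, x))                      # identical inputs give 1
```

The product of the three terms collapses to the single fraction because the factor $2\sigma_x\sigma_y + C_2$ cancels between the contrast and structure terms once $C_3 = C_2 / 2$.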

By default $k_1 = 0.01$ and $k_2 = 0.03$. $L$ is the dynamic range of the pixel values; for an image with 8-bit depth, $L = 2^8 - 1 = 255$.
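To make the role of $L$ concrete, here is a small sketch that computes the two stabilizing constants as $(k_1 L)^2$ and $(k_2 L)^2$ for a few dynamic ranges. The helper name `ssim_constants` is hypothetical; the ranges 1 and 2 correspond to images scaled to [0, 1] and [-1, 1] respectively.

```python
def ssim_constants(L, k1=0.01, k2=0.03):
    """Return (C1, C2) = ((k1 * L)^2, (k2 * L)^2) for dynamic range L."""
    return (k1 * L) ** 2, (k2 * L) ** 2


for L in (255, 1, 2):
    c1, c2 = ssim_constants(L)
    print(f"L={L}: C1={c1:.6g}, C2={c2:.6g}")
```

For 8-bit images this gives C1 of about 6.5025 and C2 of about 58.5225, so passing the wrong dynamic range changes the constants by several orders of magnitude.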

See the Wikipedia article [2] for a more detailed description.

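PyTorch implementation
A larger SSIM value means the two images are more similar; when they are identical, SSIM = 1. To use SSIM as a loss function it must therefore be negated, for example as loss = 1 - SSIM. Because PyTorch provides automatic differentiation, we only need to implement the forward computation of the SSIM loss and do not have to derive the gradient by hand. (The derivation itself can be found in [3].)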
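The following is a minimal sketch of the loss = 1 - SSIM idea with autograd. To stay short and self-contained it uses a simplified SSIM with a uniform (box) window computed by `F.avg_pool2d`, which is not the Gaussian-window version from [4]; the function name `simple_ssim`, the 7x7 window and the random tensors are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F


def simple_ssim(img1, img2, window_size=7, L=1.0):
    """Simplified SSIM with a uniform window; inputs have shape (N, C, H, W)."""
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    # Local means via average pooling (stride 1, no padding)
    mu1 = F.avg_pool2d(img1, window_size, stride=1)
    mu2 = F.avg_pool2d(img2, window_size, stride=1)
    # Local variances and covariance: E[X^2] - E[X]^2 and E[XY] - E[X]E[Y]
    sigma1_sq = F.avg_pool2d(img1 * img1, window_size, stride=1) - mu1 ** 2
    sigma2_sq = F.avg_pool2d(img2 * img2, window_size, stride=1) - mu2 ** 2
    sigma12 = F.avg_pool2d(img1 * img2, window_size, stride=1) - mu1 * mu2
    ssim_map = ((2 * mu1 * mu2 + c1) * (2 * sigma12 + c2)) / (
        (mu1 ** 2 + mu2 ** 2 + c1) * (sigma1_sq + sigma2_sq + c2)
    )
    return ssim_map.mean()


torch.manual_seed(0)
target = torch.rand(1, 3, 32, 32)                     # stand-in ground truth in [0, 1]
pred = torch.rand(1, 3, 32, 32, requires_grad=True)   # stand-in network output

loss = 1 - simple_ssim(pred, target)
loss.backward()  # autograd supplies d(loss)/d(pred); no manual derivative needed

print(loss.item(), pred.grad.shape)
```

Only the forward pass is written out; the gradient with respect to `pred` comes from the recorded computation graph.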

The implementation below comes from GitHub [4].

import torch
import torch.nn.functional as F
from math import exp


# Build a normalized 1-D Gaussian vector
def gaussian(window_size, sigma):
    gauss = torch.tensor([exp(-(x - window_size // 2) ** 2 / float(2 * sigma ** 2)) for x in range(window_size)])
    return gauss / gauss.sum()


# Build the 2-D Gaussian kernel as the matrix product of two 1-D Gaussian vectors.
# The channel argument expands the kernel to several channels (e.g. 3 for RGB).
def create_window(window_size, channel=1):
    _1D_window = gaussian(window_size, 1.5).unsqueeze(1)
    _2D_window = _1D_window.mm(_1D_window.t()).float().unsqueeze(0).unsqueeze(0)
    window = _2D_window.expand(channel, 1, window_size, window_size).contiguous()
    return window


# Compute SSIM.
# The SSIM formula is used directly, but means are computed by convolving with a
# normalized Gaussian kernel instead of taking a plain pixel average.
# Variance and covariance use Var(X) = E[X^2] - E[X]^2 and cov(X, Y) = E[XY] - E[X]E[Y],
# with each expectation again replaced by the Gaussian convolution.
def ssim(img1, img2, window_size=11, window=None, size_average=True, full=False, val_range=None):
    # Value range can be different from 255. Other common ranges are 1 (sigmoid) and 2 (tanh).
    # If val_range is not given, it is guessed from the values of img1.
    if val_range is None:
        if torch.max(img1) > 128:
            max_val = 255
        else:
            max_val = 1

        if torch.min(img1) < -0.5:
            min_val = -1
        else:
            min_val = 0
        L = max_val - min_val
    else:
        L = val_range

    padd = 0
    (_, channel, height, width) = img1.size()
    if window is None:
        # Shrink the window if the image is smaller than window_size
        real_size = min(window_size, height, width)
        window = create_window(real_size, channel=channel).to(img1.device)

    # Local means (groups=channel applies the kernel to each channel separately)
    mu1 = F.conv2d(img1, window, padding=padd, groups=channel)
    mu2 = F.conv2d(img2, window, padding=padd, groups=channel)

    mu1_sq = mu1.pow(2)
    mu2_sq = mu2.pow(2)
    mu1_mu2 = mu1 * mu2

    # Local variances and covariance
    sigma1_sq = F.conv2d(img1 * img1, window, padding=padd, groups=channel) - mu1_sq
    sigma2_sq = F.conv2d(img2 * img2, window, padding=padd, groups=channel) - mu2_sq
    sigma12 = F.conv2d(img1 * img2, window, padding=padd, groups=channel) - mu1_mu2

    C1 = (0.01 * L) ** 2
    C2 = (0.03 * L) ** 2

    v1 = 2.0 * sigma12 + C2
    v2 = sigma1_sq + sigma2_sq + C2
    cs = torch.mean(v1 / v2)  # contrast sensitivity

    ssim_map = ((2 * mu1_mu2 + C1) * v1) / ((mu1_sq + mu2_sq + C1) * v2)

    if size_average:
        ret = ssim_map.mean()
    else:
        # One SSIM value per image in the batch
        ret = ssim_map.mean(1).mean(1).mean(1)

    if full:
        return ret, cs
    return ret


# Classes to re-use window
class SSIM(torch.nn.Module):
    def __init__(self, window_size=11, size_average=True, val_range=None):
        super(SSIM, self).__init__()
        self.window_size = window_size
        self.size_average = size_average
        self.val_range = val_range

        # Assume 1 channel for SSIM
        self.channel = 1
        self.window = create_window(window_size)

    def forward(self, img1, img2):
        (_, channel, _, _) = img1.size()

        # Re-use the cached window only if channel count, dtype and device all match
        if channel == self.channel and self.window.dtype == img1.dtype and self.window.device == img1.device:
            window = self.window
        else:
            window = create_window(self.window_size, channel).to(img1.device).type(img1.dtype)
            self.window = window
            self.channel = channel

        # Pass val_range through so the value given to the constructor is not ignored
        return ssim(img1, img2, window=window, window_size=self.window_size, size_average=self.size_average, val_range=self.val_range)
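The kernel construction in the listing above can be inspected on its own. The sketch below re-implements the same 1-D Gaussian formula (under the hypothetical name `gaussian_1d`, so that the block runs standalone) and checks that the resulting 2-D window sums to 1, which is what lets a convolution with it act as a weighted mean.

```python
import torch
from math import exp


def gaussian_1d(window_size, sigma):
    # Same formula as `gaussian` in the listing, repeated so this block runs alone
    g = torch.tensor([exp(-(x - window_size // 2) ** 2 / (2 * sigma ** 2)) for x in range(window_size)])
    return g / g.sum()


g = gaussian_1d(11, 1.5)
window_2d = torch.outer(g, g)   # separable 2-D kernel, shape (11, 11)

print(window_2d.shape)
print(float(window_2d.sum()))   # close to 1, so the kernel computes a weighted mean

# Expanding to 3 channels yields one identical kernel per channel,
# the layout expected by F.conv2d with groups=3
window = window_2d.expand(3, 1, 11, 11).contiguous()
print(window.shape)
```

Because each 1-D vector is normalized, their outer product is normalized as well; no second normalization step is needed.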
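The comments in the listing rely on Var(X) = E[X^2] - E[X]^2, with the expectation replaced by a Gaussian-weighted convolution. The sketch below checks this on a single 11x11 patch by comparing the convolution route with a directly computed weighted variance. The random patch and the variable names are illustrative assumptions; float64 is used so the comparison is not dominated by rounding.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
img = torch.rand(1, 1, 11, 11, dtype=torch.float64)   # one patch, same size as the window

# Normalized 11x11 Gaussian weights (sigma = 1.5) built as an outer product
coords = torch.arange(11, dtype=torch.float64) - 5
g = torch.exp(-coords ** 2 / (2 * 1.5 ** 2))
g = g / g.sum()
w = torch.outer(g, g)                 # weights sum to 1
window = w.view(1, 1, 11, 11)

# Convolution route, as in the listing: E[X], then E[X^2] - E[X]^2
mu = F.conv2d(img, window)            # single output pixel, shape (1, 1, 1, 1)
var_conv = F.conv2d(img * img, window) - mu ** 2

# Direct route: weighted mean and weighted variance over the same patch
mu_direct = (w * img[0, 0]).sum()
var_direct = (w * (img[0, 0] - mu_direct) ** 2).sum()

print(torch.allclose(mu.squeeze(), mu_direct))
print(torch.allclose(var_conv.squeeze(), var_direct))
```

The same argument with two images gives cov(X, Y) = E[XY] - E[X]E[Y], which is how `sigma12` is obtained in the listing.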

References
[1] Wang Z, Bovik A C, Sheikh H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE transactions on image processing, 2004, 13(4): 600-612.

[2] https://en.wikipedia.org/wiki/Structural_similarity

[3] Zhao H, Gallo O, Frosio I, et al. Loss functions for neural networks for image processing[J]. arXiv preprint arXiv:1511.08861, 2015.

[4] https://github.com/jorge-pessoa/pytorch-msssim
[5] This article is reposted from https://blog.csdn.net/hyk_1996/article/details/87867285
