Principles and implementations of four fairly simple image salient-region detection methods: AC, HC, LC, and FT.

laviewpbt, edited 2014.8.4

  1. The LC algorithm

  Reference: Visual Attention Detection in Video Sequences Using Spatiotemporal Cues, Yun Zhai and Mubarak Shah, pages 4-5.


      When viewers watch a video sequence, they are attracted not only by the interesting events, but also sometimes by the interesting objects in still images. This is referred to as the spatial attention. Based on the psychological studies, human perception system is sensitive to the contrast of visual signals, such as color, intensity and texture. Taking this as the underlying assumption, we propose an efficient method for computing the spatial saliency maps using the color statistics of images. The algorithm is designed with a linear computational complexity with respect to the number of image pixels. The saliency map of an image is built upon the color contrast between image pixels. The saliency value of a pixel $I_k$ in an image $I$ is defined as

          $$\mathrm{Sal}_S(I_k) = \sum_{\forall I_i \in I} \lVert I_k - I_i \rVert$$

     where the value of $I_i$ is in the range [0, 255], and $\lVert \cdot \rVert$ represents the color distance metric.
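     The linear running time follows from regrouping that sum by gray value. A short derivation, where $a_n$ (the n-th gray level) and $f_n$ (the number of pixels with that level) are notation assumed here for illustration:

```latex
\mathrm{Sal}_S(I_k)
  = \sum_{\forall I_i \in I} \lVert I_k - I_i \rVert
  = \sum_{n=0}^{255} f_n \,\lvert a_m - a_n \rvert ,
  \qquad I_k = a_m
```

     Every pixel with the same gray value gets the same saliency, so a 256-entry table is enough: building it costs at most 256 × 256 operations whatever the image size, and the per-pixel work is one histogram update and one table lookup.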





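#include <stdlib.h>    //    malloc, free, abs
#include <string.h>    //    memset, memcpy
#include <math.h>      //    sqrt

extern void Normalize(float *DistMap, unsigned char *SaliencyMap, int Width, int Height, int Stride, int Method = 0);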
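
Normalize is only declared here; its body is not part of the article. As a rough idea of what such a routine might do, the sketch below assumes that Method 0 means linear min-max scaling to 0..255 and that the result is written to all three channels of a 24-bit image. The name NormalizeSketch and both assumptions are mine, not the author's.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the author's Normalize (Method 0 is ASSUMED to be
// linear min-max scaling); the real routine is not shown in the article.
void NormalizeSketch(const float *DistMap, unsigned char *SaliencyMap,
                     int Width, int Height, int Stride)
{
    assert(Width > 0 && Height > 0 && Stride >= Width * 3);
    const float *End = DistMap + Width * Height;
    float Min = *std::min_element(DistMap, End);
    float Max = *std::max_element(DistMap, End);
    float Scale = (Max > Min) ? 255.0f / (Max - Min) : 0.0f;    // a flat map becomes all zeros
    for (int Y = 0; Y < Height; Y++)
    {
        unsigned char *Row = SaliencyMap + Y * Stride;
        for (int X = 0; X < Width; X++)
        {
            unsigned char V = (unsigned char)((DistMap[Y * Width + X] - Min) * Scale + 0.5f);
            Row[X * 3] = Row[X * 3 + 1] = Row[X * 3 + 2] = V;    // same value in B, G and R
        }
    }
}
```

The guard on Max > Min avoids a division by zero when every pixel has the same saliency.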

/// <summary>
/// Purpose: image saliency detection based on the spatial attention model.
///    Reference: Visual Attention Detection in Video Sequences Using Spatiotemporal Cues, Yun Zhai and Mubarak Shah, pages 4-5.
///    Compiled: 2014.8.2
/// </summary>
/// <param name="Src">Image data to analyze; only 24-bit images are supported.</param>
/// <param name="SaliencyMap">Output saliency image, also 24-bit.</param>
/// <param name="Width">Width of the input image.</param>
/// <param name="Height">Height of the input image.</param>
/// <param name="Stride">Size of one image scan line in bytes.</param>
/// <remarks>The statistics are computed on pixel gray values.</remarks>

void __stdcall SalientRegionDetectionBasedonLC(unsigned char *Src, unsigned char *SaliencyMap, int Width, int Height, int Stride)
{
    int X, Y, Index, CurIndex, Value;
    unsigned char *Gray = (unsigned char *)malloc(Width * Height);
    int *Dist = (int *)malloc(256 * sizeof(int));
    int *HistGram = (int *)malloc(256 * sizeof(int));
    float *DistMap = (float *)malloc(Height * Width * sizeof(float));

    memset(HistGram, 0, 256 * sizeof(int));

    for (Y = 0; Y < Height; Y++)
    {
        Index = Y * Stride;
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)
        {
            Value = (Src[Index] + Src[Index + 1] * 2 + Src[Index + 2]) / 4;    //    keep the gray value so it need not be recomputed
            HistGram[Value]++;
            Gray[CurIndex] = Value;
            Index += 3;
            CurIndex++;
        }
    }

    for (Y = 0; Y < 256; Y++)
    {
        Value = 0;
        for (X = 0; X < 256; X++)
            Value += abs(Y - X) * HistGram[X];    //    equation (9) of the paper; for gray values the distance is just the absolute difference. This could be made faster, but the cost is small, so it is not worth it. Note that the int sum can overflow once the image exceeds roughly 8 million pixels.
        Dist[Y] = Value;
    }

    for (Y = 0; Y < Height; Y++)
    {
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)
        {
            DistMap[CurIndex] = Dist[Gray[CurIndex]];    //    saliency of every pixel in the image
            CurIndex++;
        }
    }

    Normalize(DistMap, SaliencyMap, Width, Height, Stride);    //    normalize the image data

    free(Gray);
    free(Dist);
    free(HistGram);
    free(DistMap);
}
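
The comment in the table-building loop notes that it could be made faster. One way to do that, sketched here as an illustration rather than as the author's code, is to walk the gray levels once: moving from level v to v + 1 adds 1 to the distance of every pixel at or below v and subtracts 1 from every pixel above it. The function name and the 64-bit output type are choices made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Fills Dist[v] = sum over x of |v - x| * Hist[x] for v = 0..255 in a single pass.
void BuildGrayContrastTable(const int Hist[256], int64_t Dist[256])
{
    int64_t Total = 0, First = 0;
    for (int X = 0; X < 256; X++)
    {
        assert(Hist[X] >= 0);
        Total += Hist[X];
        First += (int64_t)X * Hist[X];    // for v = 0 every distance is simply x
    }
    Dist[0] = First;

    int64_t Below = 0;                    // number of pixels with value <= v
    for (int V = 0; V < 255; V++)
    {
        Below += Hist[V];
        // Pixels at or below v move one step further away, pixels above v one step closer.
        Dist[V + 1] = Dist[V] + Below - (Total - Below);
    }
}
```

This replaces the 256 × 256 double loop with two passes over 256 entries, and the 64-bit sums sidestep overflow on large images.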








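  2. The HC algorithm

   Reference: Global Contrast Based Salient Region Detection, Ming-Ming Cheng, CVPR 2011.

  Code for this paper is available for download, but you need to ask the author for the archive password. Readers with a pudn account can download it from pudn directly, although that copy is written against an old version of OpenCV, needs some configuration before it will run, and apparently only the first half of it (the saliency detection part) works.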


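      In essence, HC is no different from LC above, but HC takes color into account instead of using only gray values as LC does. A color image can contain up to 256*256*256 colors, so a direct histogram-based approach is no longer practical. In practice, though, no image uses that many colors, so the author reduces the color count by quantizing each RGB channel into 12 levels, which leaves at most 12*12*12 colors in the mapped image. That allows a much smaller histogram to be built for acceleration. Because over-quantization introduces artifacts into the result, the author adds a smoothing step. Finally, unlike LC, the author's processing is done in Lab space; since Lab and RGB do not correspond exactly, the quantization itself is still carried out in RGB space.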
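
      The 12-levels-per-channel mapping can be sketched as follows. This is a minimal illustration assuming plain uniform bins; the function names are made up for the example and the rounding used in the paper's own code may differ.

```cpp
#include <cassert>

// Maps one 8-bit channel to a bin in 0..Levels-1 using uniform bins.
inline int QuantizeChannel(unsigned char Value, int Levels)
{
    assert(Levels >= 1 && Levels <= 256);
    return Value * Levels / 256;
}

// Packs the three bins of a BGR pixel into one histogram index in 0..Levels^3 - 1.
inline int QuantizedColorIndex(unsigned char Blue, unsigned char Green, unsigned char Red, int Levels = 12)
{
    int B = QuantizeChannel(Blue, Levels);
    int G = QuantizeChannel(Green, Levels);
    int R = QuantizeChannel(Red, Levels);
    return (R * Levels + G) * Levels + B;
}
```

      With 12 levels the index runs from 0 to 1727, so the histogram needs only 1728 bins instead of more than 16 million.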



    Original: 64330 colors                    Posterization                    Result: 1143 colors




               Original: 172373 colors                              Result: 1143 colors


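      The code that accompanies the paper includes an implementation of this algorithm. I only skimmed it and found it rather complicated, so I worked out an approach of my own.


      The first step is to find the most representative colors in the image, which can be done with an octree or, for example, by keeping only the high 4 bits of each channel. Second, the quantization must use some form of dithering, such as ordered dither or Floyd-Steinberg error diffusion. Going further, one can move beyond 8-bit indices and use a palette with more than 256 entries, such as 1024 or 4096, at the price of slightly more computation and more complex code. I use 256 colors. The quantization result is shown below:


        Original: 172373 colors                                  Result: 256 colors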
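
      To make the dithering step concrete, here is a minimal Floyd-Steinberg sketch. It is deliberately simplified: it works on a single gray channel and quantizes to evenly spaced levels, whereas the article's routine dithers a color image against an optimized palette. The function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Quantizes an 8-bit gray image in place to `Levels` evenly spaced values,
// pushing each pixel's rounding error onto its not-yet-visited neighbours.
void DitherFloydSteinberg(std::vector<unsigned char> &Image, int Width, int Height, int Levels)
{
    assert(Levels >= 2 && Levels <= 256 && (int)Image.size() == Width * Height);
    std::vector<float> Work(Image.begin(), Image.end());
    const float Step = 255.0f / (Levels - 1);
    for (int Y = 0; Y < Height; Y++)
    {
        for (int X = 0; X < Width; X++)
        {
            float Old = Work[Y * Width + X];
            float Clamped = std::min(255.0f, std::max(0.0f, Old));
            float New = (int)(Clamped / Step + 0.5f) * Step;    // nearest allowed level
            float Error = Old - New;
            Image[Y * Width + X] = (unsigned char)(New + 0.5f);
            // Classic weights: 7/16 right, 3/16 below-left, 5/16 below, 1/16 below-right.
            if (X + 1 < Width) Work[Y * Width + X + 1] += Error * 7 / 16;
            if (Y + 1 < Height)
            {
                if (X > 0) Work[(Y + 1) * Width + X - 1] += Error * 3 / 16;
                Work[(Y + 1) * Width + X] += Error * 5 / 16;
                if (X + 1 < Width) Work[(Y + 1) * Width + X + 1] += Error * 1 / 16;
            }
        }
    }
}
```

      Diffusing the error keeps the local average close to the original even with few levels, which is why dithering matters when only 256 colors are kept.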





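/// <summary>
/// Purpose: image saliency detection based on global contrast.
///    Reference: Global Contrast Based Salient Region Detection, Ming-Ming Cheng, CVPR 2011
///               http://mmcheng.net/salobj/
///    Compiled: 2014.8.3
/// </summary>
/// <param name="Src">Image data to analyze; only 24-bit images are supported.</param>
/// <param name="SaliencyMap">Output saliency image, also 24-bit.</param>
/// <param name="Width">Width of the input image.</param>
/// <param name="Height">Height of the input image.</param>
/// <param name="Stride">Size of one image scan line in bytes.</param>
///    <remarks>Processing is done in Lab space using an integer Lab conversion. Dithering reduces the image to 256 colors, a saliency lookup table is then computed from the histogram, and a final Gaussian blur reduces the graininess left by quantization.</remarks>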

void __stdcall SalientRegionDetectionBasedonHC(unsigned char *Src, unsigned char *SaliencyMap, int Width, int Height, int Stride)
{
    int X, Y, CurIndex;
    int FitX, FitY, FitWidth, FitHeight;
    float Value;
    unsigned char *Lab = (unsigned char *)malloc(Height * Stride);
    unsigned char *Mask = (unsigned char *)malloc(Height * Width);
    float *DistMap = (float *)malloc(Height * Width * sizeof(float));
    float *Dist = (float *)malloc(256 * sizeof(float));
    int *HistGram = (int *)malloc(256 * sizeof(int));

    GetBestFitInfoEx(Width, Height, 256, 256, FitX, FitY, FitWidth, FitHeight);
    unsigned char *Sample = (unsigned char *)malloc(FitWidth * FitHeight * 3);

    for (Y = 0; Y < Height; Y++)
        RGBToLAB(Src + Y * Stride, Lab + Y * Stride, Width);

    Resample(Lab, Width, Height, Stride, Sample, FitWidth, FitHeight, FitWidth * 3, 0);    //    nearest-neighbor interpolation

    RGBQUAD *Palette = (RGBQUAD *)malloc(256 * sizeof(RGBQUAD));
    GetOptimalPalette(Sample, FitWidth, FitHeight, FitWidth * 3, 256, Palette);

    ErrorDiffusionFloydSteinberg(Lab, Mask, Width, Height, Stride, Palette, true);    //    quantize the image to a smaller range first; here to 256 colors

    memset(HistGram, 0, 256 * sizeof(int));

    for (Y = 0; Y < Height; Y++)
    {
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)
        {
            HistGram[Mask[CurIndex]]++;
            CurIndex++;
        }
    }

    for (Y = 0; Y < 256; Y++)    //    compute saliency in the same way as LC
    {
        Value = 0;
        for (X = 0; X < 256; X++)
        {
            //    the palette was built from the Lab image, so these fields carry Lab components
            int DB = Palette[Y].rgbBlue - Palette[X].rgbBlue;
            int DG = Palette[Y].rgbGreen - Palette[X].rgbGreen;
            int DR = Palette[Y].rgbRed - Palette[X].rgbRed;
            Value += sqrt(DB * DB + DG * DG + DR * DR + 0.0) * HistGram[X];
        }
        Dist[Y] = Value;
    }

    for (Y = 0; Y < Height; Y++)
    {
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)
        {
            DistMap[CurIndex] = Dist[Mask[CurIndex]];
            CurIndex++;
        }
    }

    Normalize(DistMap, SaliencyMap, Width, Height, Stride);    //    normalize the image data

    GuassBlur(SaliencyMap, Width, Height, Stride, 1);    //    a final blur to remove the banding

    free(Lab);
    free(Mask);
    free(DistMap);
    free(Dist);
    free(HistGram);
    free(Sample);
    free(Palette);
}



          Original                HC result, 20ms            Direct implementation: 150000ms              Original author's result




  3. The AC algorithm

    Reference: Salient Region Detection and Segmentation, Radhakrishna Achanta, Francisco Estrada, Patricia Wils, and Sabine Süsstrunk, 2008, pages 4-5.


         saliency is determined as the local contrast of an image region with respect to its neighborhood at various scales.






     Objects that are smaller than a filter size are detected completely, while objects larger than a filter size are only partially detected (closer to edges). Smaller objects that are well detected by the smallest filter are detected by all three filters, while larger objects are only detected by the larger filters. Since the final saliency map is an average of the three feature maps (corresponding to detections of the three filters), small objects will almost always be better highlighted.
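
     Read alongside the implementation further down, the multi-scale contrast can be written compactly. The notation is assumed here for illustration and is not the paper's own: $\mu_{R_1}(x, y)$ is the Lab mean over the inner region around a pixel, $\mu_{R_2^{(s)}}(x, y)$ is the Lab mean over the outer region at scale $s$, and $n$ is the number of scales.

```latex
S(x, y) = \sum_{s=1}^{n} \bigl\lVert \mu_{R_1}(x, y) - \mu_{R_2^{(s)}}(x, y) \bigr\rVert_2
```

     The implementation sums the per-scale distances and then rescales the result to the output range, so summing and averaging produce the same final map.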


/// <summary>
/// Purpose: saliency is determined as the local contrast of an image region with respect to its neighborhood at various scales.
/// Reference: Salient Region Detection and Segmentation, Radhakrishna Achanta, Francisco Estrada, Patricia Wils, and Sabine Süsstrunk, 2008, pages 4-5
///    Compiled: 2014.8.2
/// </summary>
/// <param name="Src">Image data to analyze; only 24-bit images are supported.</param>
/// <param name="SaliencyMap">Output saliency image, also 24-bit.</param>
/// <param name="Width">Width of the input image.</param>
/// <param name="Height">Height of the input image.</param>
/// <param name="Stride">Size of one image scan line in bytes.</param>
/// <param name="R1">Radius of the inner region R1.</param>
/// <param name="MinR2">Minimum radius of the outer region.</param>
/// <param name="MaxR2">Maximum radius of the outer region.</param>
/// <param name="Scale">Number of scales used for the outer region.</param>
///    <remarks>Pixel saliency is obtained by accumulating local contrast over several scales.</remarks>

void __stdcall SalientRegionDetectionBasedonAC(unsigned char *Src, unsigned char *SaliencyMap, int Width, int Height, int Stride, int R1, int MinR2, int MaxR2, int Scale)
{
    int X, Y, Z, Index, CurIndex;
    unsigned char *MeanR1 = (unsigned char *)malloc(Height * Stride);
    unsigned char *MeanR2 = (unsigned char *)malloc(Height * Stride);
    unsigned char *Lab = (unsigned char *)malloc(Height * Stride);
    float *DistMap = (float *)malloc(Height * Width * sizeof(float));

    for (Y = 0; Y < Height; Y++)
        RGBToLAB(Src + Y * Stride, Lab + Y * Stride, Width);    //    note that this is also done in Lab space

    memcpy(MeanR1, Lab, Height * Stride);
    if (R1 > 0)    //    R1 == 0 means the original pixel is used as is
        BoxBlur(MeanR1, Width, Height, Stride, R1);

    memset(DistMap, 0, Height * Width * sizeof(float));

    for (Z = 0; Z < Scale; Z++)
    {
        //    outer radius goes linearly from MinR2 to MaxR2; a single scale uses MinR2, which avoids dividing by zero
        int Radius = (Scale > 1) ? (MaxR2 - MinR2) * Z / (Scale - 1) + MinR2 : MinR2;
        memcpy(MeanR2, Lab, Height * Stride);
        BoxBlur(MeanR2, Width, Height, Stride, Radius);
        for (Y = 0; Y < Height; Y++)
        {
            Index = Y * Stride;
            CurIndex = Y * Width;
            for (X = 0; X < Width; X++)    //    saliency of every pixel in the image
            {
                int DL = MeanR2[Index] - MeanR1[Index];
                int DA = MeanR2[Index + 1] - MeanR1[Index + 1];
                int DB = MeanR2[Index + 2] - MeanR1[Index + 2];
                DistMap[CurIndex] += sqrt(DL * DL + DA * DA + DB * DB + 0.0);
                Index += 3;
                CurIndex++;    //    advance the output position along with the input
            }
        }
    }

    Normalize(DistMap, SaliencyMap, Width, Height, Stride, 0);    //    normalize the image data

    free(MeanR1);
    free(MeanR2);
    free(Lab);
    free(DistMap);
}


  The core of it is just a box blur; note that this method also works in Lab space.
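
  BoxBlur itself is not listed in the article. One common way to get a box mean whose cost does not grow with the radius is a summed-area table; the sketch below shows that idea for a single channel. The function name, the float output, and the choice to clip the window at the image border are all assumptions of this example, not a description of the author's routine.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Mean of a single-channel image over a (2R+1) x (2R+1) window clipped at the
// borders, computed from a summed-area table so the cost is independent of R.
std::vector<float> BoxMean(const std::vector<unsigned char> &Src, int Width, int Height, int Radius)
{
    assert(Radius >= 0 && (int)Src.size() == Width * Height);
    const int W1 = Width + 1;
    // Sum[y * W1 + x] holds the sum of all pixels strictly above and to the left of (x, y).
    std::vector<long long> Sum((size_t)W1 * (Height + 1), 0);
    for (int Y = 0; Y < Height; Y++)
        for (int X = 0; X < Width; X++)
            Sum[(Y + 1) * W1 + X + 1] = Src[Y * Width + X]
                + Sum[Y * W1 + X + 1] + Sum[(Y + 1) * W1 + X] - Sum[Y * W1 + X];

    std::vector<float> Mean(Src.size());
    for (int Y = 0; Y < Height; Y++)
    {
        int Y0 = std::max(0, Y - Radius), Y1 = std::min(Height, Y + Radius + 1);
        for (int X = 0; X < Width; X++)
        {
            int X0 = std::max(0, X - Radius), X1 = std::min(Width, X + Radius + 1);
            long long Total = Sum[Y1 * W1 + X1] - Sum[Y0 * W1 + X1]
                            - Sum[Y1 * W1 + X0] + Sum[Y0 * W1 + X0];
            Mean[Y * Width + X] = (float)Total / ((X1 - X0) * (Y1 - Y0));
        }
    }
    return Mean;
}
```

  Constant cost per pixel matters here because the outer radius can reach half of the image size.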




 All of the results above were obtained with R1 = 0, MinR2 = Min(Width, Height) / 8, MaxR2 = Min(Width, Height) / 2, and Scale = 3.
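
 As a worked example of those settings, the helper below reproduces the linear radius spacing used in the scale loop of the AC function. The helper name is hypothetical, and the fallback to MinR2 for a single scale is an assumption added to avoid a division by zero.

```cpp
#include <cassert>

// Radius of the outer region at 0-based scale index Z, spaced linearly
// between MinR2 and MaxR2; a single scale falls back to MinR2.
inline int OuterRadius(int MinR2, int MaxR2, int Scale, int Z)
{
    assert(Scale >= 1 && Z >= 0 && Z < Scale);
    if (Scale == 1) return MinR2;
    return (MaxR2 - MinR2) * Z / (Scale - 1) + MinR2;
}
```

 For a 400 x 300 image, Min(Width, Height) is 300, so MinR2 = 37 and MaxR2 = 150 with integer division, and the three scales use radii 37, 93, and 150.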


  4. The FT algorithm

  Reference: Frequency-tuned Salient Region Detection, Radhakrishna Achanta, pages 4-5, CVPR 2009.

  The paper sets out five requirements for a saliency detector:


           1. Emphasize the largest salient objects.

           2. Uniformly highlight whole salient regions.

           3. Establish well-defined boundaries of salient objects.

           4. Disregard high frequencies arising from texture, noise and blocking artifacts.

           5. Efficiently output full resolution saliency maps.

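    The saliency computation that the paper finally proposes is also very simple:

          $$S(x, y) = \lVert I_\mu - I_{\omega_{hc}}(x, y) \rVert$$

       where $I_\mu$ is the mean image feature vector, $I_{\omega_{hc}}(x, y)$ is the corresponding image pixel vector value in the Gaussian blurred version (using a 5×5 separable binomial kernel) of the original image, and $\lVert \cdot \rVert$ is the L2 norm.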
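
       A 5×5 separable binomial kernel is the outer product of [1 4 6 4 1] / 16 with itself, so it can be applied as one horizontal pass followed by one vertical pass. The sketch below illustrates that on a single float channel; the function name and the border replication are assumptions of this example. The article's own implementation calls its library routine GuassBlurF with radius 1 instead.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Blurs a single-channel float image with the separable binomial kernel
// [1 4 6 4 1] / 16, horizontally and then vertically.
// Border pixels are replicated (an assumption made for this sketch).
std::vector<float> BinomialBlur5(const std::vector<float> &Src, int Width, int Height)
{
    assert(Width > 0 && Height > 0 && (int)Src.size() == Width * Height);
    static const float Kernel[5] = {1 / 16.0f, 4 / 16.0f, 6 / 16.0f, 4 / 16.0f, 1 / 16.0f};
    std::vector<float> Tmp(Src.size()), Dst(Src.size());

    for (int Y = 0; Y < Height; Y++)            // horizontal pass
        for (int X = 0; X < Width; X++)
        {
            float Acc = 0;
            for (int K = -2; K <= 2; K++)
            {
                int XX = std::min(Width - 1, std::max(0, X + K));
                Acc += Kernel[K + 2] * Src[Y * Width + XX];
            }
            Tmp[Y * Width + X] = Acc;
        }

    for (int Y = 0; Y < Height; Y++)            // vertical pass
        for (int X = 0; X < Width; X++)
        {
            float Acc = 0;
            for (int K = -2; K <= 2; K++)
            {
                int YY = std::min(Height - 1, std::max(0, Y + K));
                Acc += Kernel[K + 2] * Tmp[YY * Width + X];
            }
            Dst[Y * Width + X] = Acc;
        }
    return Dst;
}
```

       For the Lab image, the same function would be applied to each of the three channels in turn.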


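     The authors of this paper provide both MATLAB code and VC code, but the two do not actually correspond: the MATLAB code is wrong, because it computes the mean over the wrong data.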



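/// <summary>
/// Purpose: image saliency detection based on the frequency-tuned method.
///    Reference: Frequency-tuned Salient Region Detection, Radhakrishna Achanta, pages 4-5, CVPR 2009
///               http://ivrgwww.epfl.ch/supplementary_material/RK_CVPR09/
///    Compiled: 2014.8.2
/// </summary>
/// <param name="Src">Image data to analyze; only 24-bit images are supported.</param>
/// <param name="SaliencyMap">Output saliency image, also 24-bit.</param>
/// <param name="Width">Width of the input image.</param>
/// <param name="Height">Height of the input image.</param>
/// <param name="Stride">Size of one image scan line in bytes.</param>
///    <remarks>Processing is done in Lab space, but the integer RGB-to-Lab function from the library cannot be used; the original floating-point conversion is required. Otherwise many results come out indistinct, for reasons unknown.</remarks>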

void __stdcall SalientRegionDetectionBasedOnFT(unsigned char *Src, unsigned char *SaliencyMap, int Width, int Height, int Stride)
{
    int X, Y, Index, CurIndex;
    float *Lab = (float *)malloc(Height * Stride * sizeof(float));
    float *DistMap = (float *)malloc(Height * Width * sizeof(float));
    double MeanL = 0, MeanA = 0, MeanB = 0;    //    double accumulators keep the sums accurate on large images

    for (Y = 0; Y < Height; Y++)
        RGBToLABF(Src + Y * Stride, Lab + Y * Stride, Width);    //    floating-point conversion

    for (Y = 0; Y < Height; Y++)
    {
        Index = Y * Stride;
        for (X = 0; X < Width; X++)
        {
            MeanL += Lab[Index];
            MeanA += Lab[Index + 1];
            MeanB += Lab[Index + 2];
            Index += 3;
        }
    }
    MeanL /= (Width * Height);    //    mean of the image in Lab space, taken before blurring
    MeanA /= (Width * Height);
    MeanB /= (Width * Height);

    GuassBlurF(Lab, Width, Height, Stride, 1);    //    use Gaussian blur to eliminate fine texture details as well as noise and coding artifacts

    for (Y = 0; Y < Height; Y++)    //    the blur part of the MATLAB code on the website is wrong
    {
        Index = Y * Stride;
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)    //    saliency of each pixel (squared distance; no square root is taken)
        {
            DistMap[CurIndex++] = (float)((MeanL - Lab[Index]) * (MeanL - Lab[Index]) + (MeanA - Lab[Index + 1]) * (MeanA - Lab[Index + 1]) + (MeanB - Lab[Index + 2]) * (MeanB - Lab[Index + 2]));
            Index += 3;
        }
    }

    Normalize(DistMap, SaliencyMap, Width, Height, Stride);    //    normalize the image data

    free(Lab);
    free(DistMap);
}









           Original                         FT (50ms)                        AC (25ms)


         LC(2ms)                        AC(23ms)   



          Original                                  FT                                 AC


          LC                             AC






