How Search Engines Recognize Similar Images (Part 1)

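Tonight I read Ruan Yifeng's blog post on how search engines recognize similar images. It looked fun, so I implemented the algorithms as described. I had planned to write about Spring MVC, so there goes another evening. Some notes for the record: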

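Color distribution method:

For the underlying principle, see the blog post; I won't repeat it here.

There are two classes. The first implements the color distribution method: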

import java.awt.image.BufferedImage;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import javax.imageio.ImageIO;

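/**
 * Color distribution method.
 * Split 0-255 into four zones:
 * 0-63 is zone 0, 64-127 is zone 1, 128-191 is zone 2, 192-255 is zone 3.
 * Red, green and blue each have 4 zones, giving 64 combinations in total (4 to the power of 3).
 * Each image therefore yields a 64-dimensional vector, which can be treated as its fingerprint.
 *
 * @author Andy
 * date:2013/8/18
 */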

public class ColorDistribution {
    private int N=64;
    public int[] vectorValue=new int[64];
    
    public ColorDistribution(){
        
    }
    public void vectorCompute(String imagePath){
        //try-with-resources closes the stream even if reading fails
        try (FileInputStream in = new FileInputStream(imagePath)) {
            //read the image; ImageIO.read returns null if no registered reader can decode it
            BufferedImage bufferedImage = ImageIO.read(in);
            if (bufferedImage == null) {
                System.err.println("Unsupported image format: " + imagePath);
                return;
            }
            int green, red, blue;
            int imageWidth = bufferedImage.getWidth();
            int imageHeight = bufferedImage.getHeight();
            //process every pixel
            for (int i = bufferedImage.getMinX(); i < imageWidth; i++) {
                for (int j = bufferedImage.getMinY(); j < imageHeight; j++) {
                    Object data = bufferedImage.getRaster().getDataElements(i,
                            j, null);
                    red = bufferedImage.getColorModel().getRed(data);
                    green = bufferedImage.getColorModel().getGreen(data);
                    blue = bufferedImage.getColorModel().getBlue(data);
                    //bin index = red zone * 16 + green zone * 4 + blue zone (0..63)
                    vectorValue[zone(red) * 16 + zone(green) * 4 + zone(blue)]++;
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }
    //divide to four zones -0,1,2,3
    int zone(int colorValue){
        if(colorValue>=0 && colorValue<=63){
            return 0;
        }else if(colorValue>=64 && colorValue<=127){
            return 1;
        }else if(colorValue>=128 && colorValue<=191){
            return 2;
        }else{
            return 3;
        }
    }
    
    //test
    public static void main(String[] args){
        ColorDistribution cd=new ColorDistribution();
        cd.vectorCompute("C:\\Users\\red\\Desktop\\2.jpg");
        int[] a=cd.vectorValue;
        for(int i=0;i<a.length;++i){
            System.out.print(a[i]+" ");
        }
    }
}
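
As a quick check of the bin layout, here is a minimal sketch that computes the same 64-bin index with integer division and builds the histogram for an in-memory image. The class name BinIndexSketch and its helper methods are hypothetical and not part of the original code; it also assumes reading pixels through getRGB is acceptable in place of the raster and color model calls used above.

```java
import java.awt.image.BufferedImage;

public class BinIndexSketch {
    // Integer division by 64 maps 0..255 onto zones 0..3, matching zone() above.
    static int zone(int colorValue) {
        return colorValue / 64;
    }

    // Combine the three zones into one index in 0..63 (red is the most significant digit).
    static int binIndex(int red, int green, int blue) {
        return zone(red) * 16 + zone(green) * 4 + zone(blue);
    }

    // Histogram of an in-memory image, unpacking the packed ARGB value from getRGB.
    static int[] histogram(BufferedImage image) {
        int[] bins = new int[64];
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int rgb = image.getRGB(x, y);
                bins[binIndex((rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF)]++;
            }
        }
        return bins;
    }

    public static void main(String[] args) {
        // A 2x2 all-white test image: every pixel should land in bin 63.
        BufferedImage white = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 2; y++) {
            for (int x = 0; x < 2; x++) {
                white.setRGB(x, y, 0xFFFFFF);
            }
        }
        System.out.println(histogram(white)[63]);   // prints 4
        System.out.println(binIndex(64, 128, 192)); // prints 27 (1*16 + 2*4 + 3)
    }
}
```

Building the image in memory keeps the check independent of any file on disk.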

The second implements the Pearson product-moment correlation coefficient:

/**
 * Pearson product-moment correlation coefficient
 * @author Andy
 * date:2013/8/18
 */
public class PPMCC{
    private double exVector1;//sample mean
    private double exVector2;//sample mean
    private double sxVector1;//sample standard deviation
    private double sxVector2;//sample standard deviation
    
    //mean
    public double mean(int[] v){
        double sum=0;
        for(int i=0;i<v.length;++i){
            sum+=v[i];
        }
        return sum/(v.length);
    }
    //standard deviation
    public double standardDeviation(int[] v,double ex){
        double sx=0;
        for(int i=0;i<v.length;++i){
            double temp=v[i]-ex;
            sx+=temp*temp;
        }
        return Math.sqrt(sx/(v.length-1));
    }
    //Pearson correlation of two equal-length vectors; returns 0 if the lengths differ
    public double execute(int[] v1,int[] v2){
        int n=0;
        double r=0;
        if(v1.length==v2.length){
            n=v1.length;
        }else{
            return r;
        }
        exVector1=mean(v1);
        exVector2=mean(v2);
        sxVector1=standardDeviation(v1, exVector1);
        sxVector2=standardDeviation(v2, exVector2);
        //sum the products of the standardized values, then divide by n-1
        for(int i=0;i<v1.length;++i){
            r+=((v1[i]-exVector1)/sxVector1)*((v2[i]-exVector2)/sxVector2);
        }
        r/=n-1;
        return r;
    }
    
    
    //test
    public static void main(String[] args){
        ColorDistribution cd1=new ColorDistribution();
        ColorDistribution cd2=new ColorDistribution();
        cd1.vectorCompute("C:\\Users\\red\\Desktop\\2.jpg");
        cd2.vectorCompute("C:\\Users\\red\\Desktop\\1.jpg");
        PPMCC ppmcc=new PPMCC();
        double r=ppmcc.execute(cd1.vectorValue, cd2.vectorValue);
        System.out.println(r);
    }
}
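
For reference, the quantity that execute computes can be written as the sample Pearson coefficient in its standardized form. The symbols are my own mapping onto the code, not notation from the original post: x and y stand for v1 and v2, the barred values for exVector1 and exVector2, and s_x, s_y for sxVector1 and sxVector2 (n is 64 for these fingerprints).

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right),
\qquad
s_x = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 }
```

Mathematically r always lies between -1 and 1, and comparing a vector with itself gives exactly 1.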

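Finally, the results:

For the same image compared with itself, the Pearson coefficient comes out as 1.0000000000009, and I have not tracked down where the error comes from. Also, comparing two different pictures drawn in Windows Paint with a pen of the same color gives a result very close to 1. I don't know whether that is a problem with the algorithm or with my code. It's too late tonight, so I'll leave it here for now.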
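
One plausible reading of both observations, offered as a guess rather than a diagnosis: a value slightly above 1 looks like ordinary floating-point rounding across the divisions and the sum, and two Paint drawings on a mostly white canvas put nearly all of their pixels into the same white bin, so their fingerprints are almost proportional. The sketch below illustrates this with invented pixel counts (not measurements from the images above). The class name ResultCheckSketch and both helpers are hypothetical, and pearson uses the algebraically equivalent sum-of-products form instead of the PPMCC class.

```java
public class ResultCheckSketch {
    // Pearson correlation in sum-of-products form: sxy / sqrt(sxx * syy).
    static double pearson(int[] x, int[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    // Clamp a computed coefficient back into the valid range [-1, 1].
    static double clamp(double r) {
        return Math.max(-1.0, Math.min(1.0, r));
    }

    public static void main(String[] args) {
        // Invented fingerprints: mostly white (bin 63) with some black strokes (bin 0).
        int[] a = new int[64];
        int[] b = new int[64];
        a[63] = 9900; a[0] = 100;
        b[63] = 9800; b[0] = 200;
        System.out.println(pearson(a, b));          // very close to 1 although the stroke counts differ
        System.out.println(clamp(1.0000000000009)); // prints 1.0
    }
}
```

If this reading is right, the dominant background bin drives the score, so the color histogram alone says little about what was actually drawn.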

posted @ 2013-08-18 01:55 AndyDHG
