Lesson 10: Why HashMap slows down

The following passage describes how String.hashCode() can return 0 in certain cases:

Java offers the HashMap and Hashtable classes, which use the 
String.hashCode() hash function. It is very similar to DJBX33A (instead of 33, it uses the 
multiplication constant 31 and instead of the start value 5381 it uses 0). Thus it is also 
vulnerable to an equivalent substring attack. When hashing a string, Java also caches the 
hash value in the hash attribute, but only if the result is different from zero. 
Thus, the target value zero is particularly interesting for an attacker as it prevents caching 
and forces re-hashing. 
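
As a rough illustration of the "equivalent substring" idea mentioned in the passage: "Aa" and "BB" have the same hashCode() (2112), so any concatenation of those two blocks yields strings that all collide. The class and method names below (EquivalentSubstrings, collidingKeys) are hypothetical and only serve this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not from the original article.
final class EquivalentSubstrings {

    // Builds all 2^blocks strings made by concatenating "Aa" or "BB" blocks.
    // Since both blocks have equal length and equal hash, all results share one hash.
    static List<String> collidingKeys(int blocks) {
        List<String> keys = new ArrayList<>();
        keys.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>(keys.size() * 2);
            for (String prefix : keys) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            keys = next;
        }
        return keys;
    }

    public static void main(String[] args) {
        // 65*31 + 97 == 66*31 + 66 == 2112
        System.out.println("Aa".hashCode() + " " + "BB".hashCode());
        for (String key : collidingKeys(3)) {
            System.out.println(key + " -> " + key.hashCode());
        }
    }
}
```

Each extra block doubles the number of distinct keys with an identical hash, which is what makes this kind of input cheap to generate.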

Next, let's look at the hashCode() method of the String class. When every val[off++] in the code below is 0, hashCode() also returns 0:

    public int hashCode() {
        int h = hash; // cached hash value; initially 0
        if (h == 0 && count > 0) { // count is the number of characters
            int off = offset; // off is 0 here
            char val[] = value; // backing character array
            int len = count;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++]; // what happens if every val[off++] is ASCII code 0?
            }
            hash = h;
        }
        return h;
    }
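
To make the zero case concrete, here is a minimal sketch that re-implements the loop above without the caching and checks it against strings made only of NUL characters (char value 0). ZeroHashDemo, polyHash and nulString are hypothetical names used only for this example.

```java
// Illustrative sketch only; names are hypothetical, not from the original article.
final class ZeroHashDemo {

    // Same polynomial as the String.hashCode() loop shown above, minus the cache.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Builds a string of `length` NUL characters; new char arrays are zero-filled.
    static String nulString(int length) {
        return new String(new char[length]);
    }

    public static void main(String[] args) {
        for (int len = 1; len <= 4; len++) {
            String key = nulString(len);
            // Different lengths give distinct keys, yet h stays 31*0 + 0 = 0.
            System.out.println(len + " NUL chars -> " + key.hashCode());
        }
    }
}
```

Because h starts at 0 and every term added is 0, the result is 0 for any length, so arbitrarily many distinct keys share the hash value 0.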

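As we know, HashMap stores its entries in an array of buckets, each holding a linked list. If distinct keys all return 0 from hashCode(), this structure cannot do its job: every entry ends up in the linked list of the first array slot, so each put() has to walk that list and compare keys. In addition, as the quoted passage notes, a hash value of 0 is never cached, so it is recomputed on each call. The following code demonstrates the effect: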

package com.mantu.advance;

import java.util.HashMap;


public class Lesson10HashmapLeak {

    public static void main(String[] args){
        testHashMapNormal();
        testHashMapBug();
    }
    
    // Keys consist only of NUL characters (char 0), so every key's hashCode() is 0.
    public static void testHashMapBug(){
        HashMap<String,String> map = new HashMap<String,String>(100000);
        String xxx= asciiToString("0");
        String temp = xxx;
        long beginTime = System.currentTimeMillis();
        //System.out.println("Start time: "+System.currentTimeMillis());
        for(int i=0;i<100000;i++){
            map.put(xxx, i+"");

            // Grow the key by one character per iteration; reset it every 10000 iterations.
            if((i%10000)==0){
                xxx=temp;
            }
            else{
                xxx=xxx+temp;
            }
        }
        System.out.println("testHashMapBug() took: "+(System.currentTimeMillis()-beginTime)+" ms");
    }
    
    // Keys consist only of char 1, so keys of different lengths get different, non-zero hash codes.
    public static void testHashMapNormal(){
        HashMap<String,String> map = new HashMap<String,String>(100000);
        String xxx= asciiToString("1");
        String temp = xxx;
        long beginTime = System.currentTimeMillis();
        //System.out.println("Start time: "+System.currentTimeMillis());
        for(int i=0;i<100000;i++){
            map.put(xxx, i+"");

            // Grow the key by one character per iteration; reset it every 10000 iterations.
            if((i%10000)==0){
                xxx=temp;
            }
            else{
                xxx=xxx+temp;
            }
        }
        System.out.println("testHashMapNormal() took: "+(System.currentTimeMillis()-beginTime)+" ms");
    }
    // Converts a comma-separated list of decimal character codes (e.g. "0" or "72,105") into a String.
    public static String asciiToString(String value)
    {
        StringBuilder sbu = new StringBuilder();
        String[] chars = value.split(",");
        for (int i = 0; i < chars.length; i++) {
            sbu.append((char) Integer.parseInt(chars[i]));
        }
        return sbu.toString();
    }
}

The final results were:

Run with normal keys: 1887 ms

Run with keys whose hashCode() is 0: 7365 ms
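
To see why the zero-hash keys pile up in a single slot, the sketch below shows roughly how a bucket index can be derived from hashCode(). It assumes the scheme used by java.util.HashMap in Java 8 and later (XOR the upper 16 bits into the lower ones, then mask with the table length minus one); older JDKs mix the bits differently. BucketIndexDemo and its methods are hypothetical names, not JDK API.

```java
// Illustrative sketch only; names are hypothetical, and the spreading step
// assumes the Java 8+ HashMap scheme.
final class BucketIndexDemo {

    // Folds the high 16 bits of the hash code into the low 16 bits.
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // Smallest power of two that is >= the requested capacity.
    static int tableSize(int requested) {
        int n = 1;
        while (n < requested) {
            n <<= 1;
        }
        return n;
    }

    // Index of the bucket a key with this hash code would land in.
    static int bucketIndex(int hashCode, int tableSize) {
        return (tableSize - 1) & spread(hashCode);
    }

    public static void main(String[] args) {
        int size = tableSize(100000); // 131072 for the capacity used in the test
        String nul = "";
        String one = "";
        for (int len = 1; len <= 3; len++) {
            nul += '\u0000'; // key made of char 0, as in testHashMapBug()
            one += '\u0001'; // key made of char 1, as in testHashMapNormal()
            System.out.println("len " + len
                    + ": NUL key -> bucket " + bucketIndex(nul.hashCode(), size)
                    + ", char-1 key -> bucket " + bucketIndex(one.hashCode(), size));
        }
    }
}
```

Under these assumptions every NUL-only key maps to bucket 0 regardless of its length, while the char-1 keys spread over different buckets.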

posted @ 2016-12-09 22:41  【刘光亮】