19. Hash Tables

Content is from instructor Liu Yubo's (刘宇波) Algorithms and Data Structures course.

1. First Unique Character in a String

LeetCode 387 - First Unique Character in a String

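public class FirstUniqChar {

    public int firstUniqChar(String s) {
        // freq[c - 'a'] counts occurrences of c; assumes s holds only lowercase English letters
        int[] freq = new int[26];

        for (int i = 0; i < s.length(); i++) freq[s.charAt(i) - 'a']++;

        // The first character whose count is exactly 1 is the answer
        for (int i = 0; i < s.length(); i++) {
            if (freq[s.charAt(i) - 'a'] == 1) return i;
        }

        return -1; // no unique character
    }
}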
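
The freq array above behaves like a tiny hash table whose hash function is c - 'a'. As a hedged sketch of the same two-pass idea for inputs that are not limited to lowercase letters, the array can be swapped for a java.util.HashMap. The class name FirstUniqCharMap is made up for this illustration and is not part of the course code.

```java
import java.util.HashMap;
import java.util.Map;

class FirstUniqCharMap {

    // Same two-pass approach, but any char can be a key.
    static int firstUniqChar(String s) {
        Map<Character, Integer> freq = new HashMap<>();

        // Pass 1: count occurrences of each character.
        for (int i = 0; i < s.length(); i++) freq.merge(s.charAt(i), 1, Integer::sum);

        // Pass 2: return the index of the first character seen exactly once.
        for (int i = 0; i < s.length(); i++) {
            if (freq.get(s.charAt(i)) == 1) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstUniqChar("leetcode"));     // 0
        System.out.println(firstUniqChar("loveleetcode")); // 2
        System.out.println(firstUniqChar("aabb"));         // -1
    }
}
```

The array version is faster for a fixed small alphabet; the map version trades some speed for generality.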


2. Designing a Hash Function

good hash table primes

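The design of the hash function matters a great deal.
The more evenly the hash function spreads "keys" across "indices", the better.

General-purpose hash functions usually convert the key to an integer first, but that is not the only approach!

Principles
1. Consistency: if a == b, then hash(a) == hash(b)
2. Efficiency: the hash is cheap and simple to compute
3. Uniformity: hash values are evenly distributed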

int hash = 0;
for (int i = 0; i < s.length(); i++) hash = (hash * B + s.charAt(i)) % M; // B is the base, M is a prime

2.1 Integers


2.2 Floating-Point Numbers
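
As a hedged illustration of "convert to an integer first" for floating-point keys: one common approach, and the one documented for Java's own Double.hashCode, is to reinterpret the 64-bit IEEE 754 bit pattern as a long and fold it into 32 bits. The class and method names below are made up for this sketch.

```java
class FloatingPointHash {

    // Reinterpret the 64-bit pattern as an integer, then XOR the two halves together.
    static int hashOfDouble(double d) {
        long bits = Double.doubleToLongBits(d);
        return (int) (bits ^ (bits >>> 32));
    }

    // Map the 32-bit hash to a bucket index in [0, M).
    static int index(double d, int M) {
        return (hashOfDouble(d) & 0x7FFFFFFF) % M;
    }

    public static void main(String[] args) {
        double x = 3.14;
        System.out.println(hashOfDouble(x) == Double.hashCode(x)); // true
        System.out.println(index(x, 97));
    }
}
```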


2.3 Strings

int hash = 0;
for (int i = 0; i < s.length(); i++) hash = (hash * B + s.charAt(i)) % M; // B is the base, M is a prime
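
The loop above treats the string as a number written in base B and reduces it modulo M at every step. A minimal runnable sketch follows, assuming B = 31 and M = 97 (both values are chosen only for this illustration) and using a long accumulator so that hash * B cannot overflow when M is large.

```java
class StringHash {

    static final int B = 31; // assumed base
    static final int M = 97; // assumed prime modulus

    static int hash(String s) {
        long hash = 0; // long avoids int overflow in hash * B + c for large M
        for (int i = 0; i < s.length(); i++) hash = (hash * B + s.charAt(i)) % M;
        return (int) hash;
    }

    public static void main(String[] args) {
        System.out.println(hash("a"));  // 0, because 'a' is 97 and 97 % 97 == 0
        System.out.println(hash("b"));  // 1
        System.out.println(hash("ab")); // 1, a collision with "b"
    }
}
```

The last two lines show that distinct keys can land on the same index, which is why collision handling is needed.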


2.4 Composite Types


3. The hashCode Method in Java

import java.util.Objects;

public class Member {

    private int grade;
    private int cls;
    private String firstName;
    private String lastName;

    public Member(int grade, int cls, String firstName, String lastName) {
        this.grade = grade;
        this.cls = cls;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    /**
     * When a hash collision occurs, equals is called to decide whether the 2 objects are equal.
     * Names are compared case-insensitively, matching hashCode below.
     */
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;

        Member another = (Member) o;

        return grade == another.grade && cls == another.cls &&
               Objects.equals(lower(firstName), lower(another.firstName)) &&
               Objects.equals(lower(lastName), lower(another.lastName));
    }

    /**
     * Integer overflow is harmless here and is left unhandled.
     * Without this override, Object's default hashCode is typically derived from the object's address.
     */
    @Override
    public int hashCode() {
        int B = 31;

        int hash = 0;
        hash = hash * B + grade;
        hash = hash * B + cls;
        hash = hash * B + (firstName != null ? firstName.toLowerCase().hashCode() : 0);
        hash = hash * B + (lastName != null ? lastName.toLowerCase().hashCode() : 0);

        return hash;
    }

    // Null-safe lowercasing, so equals tolerates null names just as hashCode does
    private static String lower(String s) {
        return s == null ? null : s.toLowerCase();
    }
}
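
To see why equals and hashCode have to agree, here is a small self-contained sketch. It uses a hypothetical Name class (not the Member class above) with the same case-insensitive rule and lets a java.util.HashSet deduplicate the keys. If hashCode ignored the lowercasing, equal names could land in different buckets and the set would keep duplicates.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashCodeContractDemo {

    // Hypothetical key type: names are equal when they match ignoring case.
    static final class Name {
        private final String value;

        Name(String value) {
            this.value = Objects.requireNonNull(value);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Name)) return false;
            return value.toLowerCase().equals(((Name) o).value.toLowerCase());
        }

        // Must use the same normalization as equals: equal objects need equal hashes.
        @Override
        public int hashCode() {
            return value.toLowerCase().hashCode();
        }
    }

    // Counts how many distinct names remain after inserting all of them into a HashSet.
    static int distinctCount(String... names) {
        Set<Name> set = new HashSet<>();
        for (String n : names) set.add(new Name(n));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount("Bobo", "bobo", "BOBO", "Alice")); // 2
    }
}
```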

4. Handling Hash Collisions

Separate Chaining

(e.hashCode() & 0x7FFFFFFF) % M
0x7FFFFFFF is 31 one-bits in binary. (e.hashCode() & 0x7FFFFFFF) clears the highest bit, bit 32 (the sign bit), so the result is never negative.
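
A short sketch of why the mask is used rather than Math.abs: Math.abs(Integer.MIN_VALUE) overflows and stays negative, whereas the mask always yields a valid index. The class name and the choice of M = 53 are assumptions made for this example.

```java
class NonNegativeIndex {

    // Clear the sign bit, then reduce to a bucket index in [0, M).
    static int index(Object key, int M) {
        return (key.hashCode() & 0x7FFFFFFF) % M;
    }

    public static void main(String[] args) {
        Integer key = Integer.MIN_VALUE; // Integer.hashCode() is the int value itself

        System.out.println(Math.abs(key.hashCode()) < 0); // true: abs overflows here
        System.out.println(index(key, 53));               // 0: the mask leaves all bits zero
        System.out.println(index(-1, 53) >= 0);           // true
    }
}
```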


5. Implementing a Hash Table

Why the hash function takes the remainder modulo a prime
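
A small illustration of the question above, not a proof: when the keys share a common factor with the modulus, they pile up in a few buckets, while a prime modulus tends to spread them out. The keys and moduli below are picked only for this example.

```java
import java.util.Set;
import java.util.TreeSet;

class PrimeModulusDemo {

    // Number of distinct buckets hit when each (non-negative) key is reduced modulo M.
    static int distinctBuckets(int[] keys, int M) {
        Set<Integer> used = new TreeSet<>();
        for (int k : keys) used.add(k % M);
        return used.size();
    }

    public static void main(String[] args) {
        int[] keys = {10, 20, 30, 40, 50};

        // Modulo 4 the remainders are 2, 0, 2, 0, 2: only 2 buckets are used.
        System.out.println(distinctBuckets(keys, 4)); // 2

        // Modulo 7 the remainders are 3, 6, 2, 5, 1: every key gets its own bucket.
        System.out.println(distinctBuckets(keys, 7)); // 5
    }
}
```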

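import java.util.TreeMap;

/**
 * Hash table; keys must implement hashCode()
 * Bucketing idea: keys that hash to the same index are stored in the same group
 * Each group is a TreeMap, so keys must also be mutually comparable (e.g. implement Comparable)
 * Tolerance: the average number of elements allowed per group
 */
public class HashTable<K, V> {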

    /**
     * Upper tolerance: grow once the average group size reaches this value
     */
    private static final int upperTol = 10;
    /**
     * Lower tolerance: shrink once the average group size drops below this value
     */
    private static final int lowerTol = 2;

    private final int[] capacity = {
            53, 97, 193, 389, 769, 1543, 3079, 6151, 12289, 24593, 49157,
            98317, 196613, 393241, 786433, 1572869, 3145739, 6291469, 12582917,
            25165843, 50331653, 100663319, 201326611, 402653189, 805306457, 1610612741
    };
    private int capacityIndex = 0;

    private TreeMap<K, V>[] hashTable;
    private int M; // M is the length of hashTable; it is a prime
    private int size;

    public HashTable() {
        this.M = capacity[capacityIndex];
        size = 0;
        hashTable = new TreeMap[M];
        for (int i = 0; i < M; i++) hashTable[i] = new TreeMap<>();
    }

    private int hash(K key) {
        return (key.hashCode() & 0x7FFFFFFF) % M;
    }

    public int getSize() {
        return size;
    }

    /**
     * Add a key-value pair, or overwrite the value if the key already exists
     */
    public void add(K key, V value) {
        TreeMap<K, V> map = hashTable[hash(key)];

        if (map.containsKey(key)) map.put(key, value);
        else {
            map.put(key, value);
            size++;

            if (size >= upperTol * M && capacityIndex + 1 < capacity.length) resize(capacity[++capacityIndex]);
        }
    }

    /**
     * Remove a key and return its value, or null if the key is absent
     */
    public V remove(K key) {
        TreeMap<K, V> map = hashTable[hash(key)];

        V ret = null;
        if (map.containsKey(key)) {
            ret = map.remove(key);
            size--;

            if (size < lowerTol * M && capacityIndex - 1 >= 0) resize(capacity[--capacityIndex]);
        }

        return ret;
    }

    /**
     * Update the value of an existing key
     */
    public void set(K key, V newValue) {
        TreeMap<K, V> map = hashTable[hash(key)];

        if (!map.containsKey(key)) throw new IllegalArgumentException(key + " doesn't exist!");
        map.put(key, newValue);
    }

    /**
     * Check whether a key exists
     */
    public boolean contains(K key) {
        TreeMap<K, V> map = hashTable[hash(key)];
        return map.containsKey(key);
    }

    /**
     * Look up the value for a key, or null if the key is absent
     */
    public V get(K key) {
        return hashTable[hash(key)].get(key);
    }

    /**
     * Grow or shrink the table dynamically
     */
    private void resize(int newM) {
        TreeMap<K, V>[] newHashTable = new TreeMap[newM];
        for (int i = 0; i < newM; i++) newHashTable[i] = new TreeMap<>();

        int oldM = M;
        this.M = newM;

        for (int i = 0; i < oldM; i++) {
            TreeMap<K, V> map = hashTable[i];
            for (K key : map.keySet()) newHashTable[hash(key)].put(key, map.get(key));
        }

        hashTable = newHashTable;
    }
}
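
The growth rule in add can be traced without the full table. The sketch below tracks only size and the capacity index, assuming every add inserts a new key, and uses the first four primes from the capacity array. The class name ResizeTrace and its method are hypothetical helpers for this illustration.

```java
class ResizeTrace {

    static final int UPPER_TOL = 10;
    static final int[] CAPACITY = {53, 97, 193, 389}; // first entries of the table above

    // Returns M after n distinct keys have been added to an initially empty table.
    static int capacityAfterAdds(int n) {
        int idx = 0;
        int size = 0;
        for (int i = 0; i < n; i++) {
            size++;
            // Same condition as in add: grow when size >= upperTol * M and a larger prime exists.
            if (size >= UPPER_TOL * CAPACITY[idx] && idx + 1 < CAPACITY.length) idx++;
        }
        return CAPACITY[idx];
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterAdds(529)); // 53
        System.out.println(capacityAfterAdds(530)); // 97, since 530 >= 10 * 53
        System.out.println(capacityAfterAdds(970)); // 193, since 970 >= 10 * 97
    }
}
```

Because each growth step roughly doubles M, the average group size stays between lowerTol and upperTol as the table fills.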

6. Complexity Analysis


7. Hash Tables vs. Balanced Trees


8. More Ways to Handle Hash Collisions
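
One widely taught alternative to separate chaining is open addressing with linear probing: on a collision, step to the next slot until a free one is found. Whether this section originally covered exactly this method is an assumption. The sketch below is a minimal fixed-size set of int keys, with no resizing and no deletion (deletion would need tombstones), and the class name is made up.

```java
class LinearProbingSet {

    private final Integer[] slots; // null means the slot is empty
    private int size;

    LinearProbingSet(int capacity) {
        slots = new Integer[capacity];
    }

    private int hash(int key) {
        return (Integer.hashCode(key) & 0x7FFFFFFF) % slots.length;
    }

    // Returns the slot holding key, or -1 if key is absent.
    int slotOf(int key) {
        int i = hash(key);
        // Probe forward until an empty slot is hit or every slot has been checked.
        for (int probes = 0; probes < slots.length && slots[i] != null; probes++) {
            if (slots[i] == key) return i;
            i = (i + 1) % slots.length;
        }
        return -1;
    }

    boolean contains(int key) {
        return slotOf(key) >= 0;
    }

    // Returns true if key was newly inserted, false if it was already present.
    boolean add(int key) {
        if (contains(key)) return false;
        if (size == slots.length) throw new IllegalStateException("table is full");

        int i = hash(key);
        while (slots[i] != null) i = (i + 1) % slots.length; // linear probing
        slots[i] = key;
        size++;
        return true;
    }

    public static void main(String[] args) {
        LinearProbingSet set = new LinearProbingSet(7);
        set.add(10); // 10 % 7 == 3, stored in slot 3
        set.add(17); // also hashes to 3, probes to slot 4
        set.add(24); // also hashes to 3, probes to slot 5
        System.out.println(set.slotOf(24));   // 5
        System.out.println(set.contains(31)); // false
    }
}
```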


posted @ 2023-04-12 16:36  lidongdongdong~