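Tries

In this section: two data structures for searching with string keys.

Application: symbol tables with string keys

Algorithm: search algorithms for string keys

Data structures:

Trie (R-way trie)
Ternary search trie (TST)

Performance:

A search hit takes time proportional to the length of the search key
A search miss examines only a few characters, because the search stops at the first character that has no matching link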

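Trie

Properties:

The root is an empty node
Every node except the root is pointed to by exactly one other node, its parent
Every node has R links, where R is the size of the alphabet

Node implementation details:

val: the value stored at the node
Node[R]: the array of links
The characters of a key are stored implicitly in the links: a key is represented by the nodes on the path from the root, and its value is stored in the last node of that path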
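To make the "characters are implicit in the links" point concrete, here is a minimal sketch, assuming a lowercase-only alphabet (R = 26). The class `TriePathDemo` and its method `linkIndices` are hypothetical names used only for illustration. It shows which link index is followed for each character of a key; no node ever stores the character itself.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only: it shows which link index is
// followed for each character of a key in an R-way trie.
final class TriePathDemo {
    private static final int R = 26; // assumption: lowercase letters 'a'..'z' only

    // Returns the sequence of link indices that spell out the key.
    static int[] linkIndices(String key) {
        int[] path = new int[key.length()];
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (c < 'a' || c >= 'a' + R) {
                throw new IllegalArgumentException("character outside the alphabet: " + c);
            }
            path[i] = c - 'a'; // the character is encoded by the position of the link
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(linkIndices("sea"))); // [18, 4, 0]
    }
}
```

Under these assumptions, the key "sea" is the path root.next[18].next[4].next[0], and its value sits in the node at the end of that path.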

Implementation:

/**
 * Symbol table backed by a trie.
 * Data structure: trie (R-way trie)
 */
public class TrieST<T> {
    private static final int R = 26; // alphabet size: the 26 lowercase letters
    
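    private static class Node<T> {
        @SuppressWarnings("unchecked") // Java cannot create generic arrays directly
        private Node<T>[] next = new Node[R]; // one link per character of the alphabet
        private T val; // value of the key that ends at this node, or null
    }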

    private Node<T> root = new Node<>();

    public T get(String key) {
        Node<T> node = get(key, root, 0);
        if (node == null) {
            return null;
        }
        return node.val;
    }

    private Node<T> get(String key, Node<T> node, int i) {
        if (node == null) {
            return null;
        }
        if (i == key.length()) {
            return node;
        }
        int index = indexFor(key.charAt(i));
        return get(key, node.next[index], i + 1);
    }

    public void put(String key, T val) {
        put(key, val, root, 0);
    }

    private void put(String key, T val, Node<T> node, int i) {
        if (i == key.length()) {
            node.val = val;
            return;
        }
        int index = indexFor(key.charAt(i));
        if (node.next[index] == null) {
            node.next[index] = new Node<>();
        }
        put(key, val, node.next[index], i + 1);
    }

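    private int indexFor(char c) {
        if (c < 'a' || c > 'z') {
            throw new IllegalArgumentException("key must contain only lowercase letters a-z: " + c);
        }
        return c - 'a';
    }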
}

Test:

import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

class TrieSTTest {
    @Test
    public void testTrieST() {
        TrieST<Integer> trieST = new TrieST<>();
        trieST.put("hello", 10);
        trieST.put("world", 20);
        Assertions.assertEquals(10, trieST.get("hello"));
        Assertions.assertEquals(20, trieST.get("world"));
        Assertions.assertNull(trieST.get("no exists"));
    }
}
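The performance claims at the top of this section can be checked with a small experiment. The following is a minimal sketch, not the TrieST class above: `TrieProbeDemo` and `charsExamined` are hypothetical names, the trie stores keys only (no values), and keys are assumed to contain only lowercase letters.

```java
// A minimal sketch for illustration: a lowercase-only trie of keys (no values)
// that reports how many characters a lookup examines.
final class TrieProbeDemo {
    private static final int R = 26; // assumption: keys use only 'a'..'z'

    private static final class Node {
        final Node[] next = new Node[R];
        boolean isKey; // true if some key ends at this node
    }

    private final Node root = new Node();

    void add(String key) {
        Node node = root;
        for (int i = 0; i < key.length(); i++) {
            int index = key.charAt(i) - 'a';
            if (node.next[index] == null) {
                node.next[index] = new Node();
            }
            node = node.next[index];
        }
        node.isKey = true;
    }

    // Number of characters of the key that are examined before the search
    // either reaches the end of the key or hits a null link.
    int charsExamined(String key) {
        Node node = root;
        int examined = 0;
        for (int i = 0; i < key.length(); i++) {
            examined++;
            node = node.next[key.charAt(i) - 'a'];
            if (node == null) {
                break; // search miss: stop at the first null link
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        TrieProbeDemo trie = new TrieProbeDemo();
        trie.add("hello");
        trie.add("world");
        System.out.println(trie.charsExamined("hello"));     // 5
        System.out.println(trie.charsExamined("hexagonal")); // 3
        System.out.println(trie.charsExamined("zebra"));     // 1
    }
}
```

The hit on "hello" examines all five characters, while the misses stop as soon as a link is null, regardless of how long the search key is.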

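Ternary search trie

Purpose: avoid the excessive space consumption of R-way tries

Data structure:

In a ternary search trie, each node holds a character, a value, and three links
The three links lead to keys whose current character is less than, equal to, or greater than the node's character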
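To get a feel for the space difference, the sketch below builds both structures for the same keys and counts link slots (used or null) as a rough proxy for memory. It is an illustration under stated assumptions, not a benchmark: `TrieSpaceDemo`, `rWayLinkSlots` and `tstLinkSlots` are hypothetical names, keys are assumed to be non-empty and lowercase, and per-node object overhead is ignored.

```java
// A rough sketch for illustration: counts link slots in an R-way trie and
// in a ternary search trie built from the same keys.
final class TrieSpaceDemo {
    private static final int R = 26; // assumption: keys use only 'a'..'z'

    private static final class RNode {
        final RNode[] next = new RNode[R];
    }

    private static final class TNode {
        char c;
        TNode left, mid, right;
    }

    // Total link slots in an R-way trie holding the keys: R per node, root included.
    static int rWayLinkSlots(String... keys) {
        RNode root = new RNode();
        int nodes = 1;
        for (String key : keys) {
            RNode node = root;
            for (int i = 0; i < key.length(); i++) {
                int index = key.charAt(i) - 'a';
                if (node.next[index] == null) {
                    node.next[index] = new RNode();
                    nodes++;
                }
                node = node.next[index];
            }
        }
        return nodes * R;
    }

    // Total link slots in a TST holding the keys: 3 per node.
    static int tstLinkSlots(String... keys) {
        int[] nodes = {0}; // single-element array used as a mutable counter
        TNode root = null;
        for (String key : keys) {
            root = insert(root, key, 0, nodes);
        }
        return nodes[0] * 3;
    }

    // Standard recursive TST insertion; assumes the key is non-empty.
    private static TNode insert(TNode node, String key, int i, int[] nodes) {
        char c = key.charAt(i);
        if (node == null) {
            node = new TNode();
            node.c = c;
            nodes[0]++;
        }
        if (c < node.c) {
            node.left = insert(node.left, key, i, nodes);
        } else if (c > node.c) {
            node.right = insert(node.right, key, i, nodes);
        } else if (i < key.length() - 1) {
            node.mid = insert(node.mid, key, i + 1, nodes);
        }
        return node;
    }

    public static void main(String[] args) {
        String[] keys = {"hello", "help", "world"};
        System.out.println(rWayLinkSlots(keys)); // 12 nodes * 26 links = 312
        System.out.println(tstLinkSlots(keys));  // 11 nodes * 3 links = 33
    }
}
```

For these three keys the node counts are nearly the same, but each R-way node carries 26 links while each TST node carries 3, and the gap grows with the size of the alphabet.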

Implementation:

/**
 * Ternary search trie
 */
public class TST<T> {
    private static class Node<T> {
        private char c;
        private T val;
        private Node<T> left, mid, right;
    }

    private Node<T> root;

    public T get(String key) {
        if (key.isEmpty()) {
            return null; // a TST cannot store the empty key
        }
        Node<T> node = get(key, root, 0);
        if (node == null) {
            return null;
        }
        return node.val;
    }

    private Node<T> get(String key, Node<T> node, int i) {
        if (node == null) {
            return null;
        }
        char c = key.charAt(i);
        if (c < node.c) {
            return get(key, node.left, i);
        }
        if (c > node.c) {
            return get(key, node.right, i);
        }
        if (i == key.length() - 1) {
            return node;
        }
        return get(key, node.mid, i + 1);
    }

    public void put(String key, T val) {
        if (key.isEmpty()) {
            throw new IllegalArgumentException("key must have at least one character");
        }
        root = put(key, val, root, 0);
    }

    private Node<T> put(String key, T val, Node<T> node, int i) {
        char c = key.charAt(i);
        if (node == null) {
            node = new Node<>();
            node.c = c;
        }
        if (c < node.c) {
            node.left = put(key, val, node.left, i);
        } else if (c > node.c) {
            node.right = put(key, val, node.right, i);
        } else {
            if (i == key.length() - 1) {
                node.val = val;
            } else {
                node.mid = put(key, val, node.mid, i + 1);
            }
        }
        return node;
    }
}

Test:

import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

class TSTTest {
    @Test
    public void testTST() {
        TST<Integer> tst = new TST<>();
        tst.put("hello", 10);
        tst.put("world", 20);
        Assertions.assertEquals(10, tst.get("hello"));
        Assertions.assertEquals(20, tst.get("world"));
        Assertions.assertNull(tst.get("no exists"));
    }
}

Extensions

API for a symbol table with string keys:

public class StringST<Value> {
    StringST();

    void put(String key, Value val);
    Value get(String key);
    void delete(String key);
    boolean contains(String key);
    boolean isEmpty();
    String longestPrefixOf(String s); // the longest key that is a prefix of s
    Iterable<String> keysWithPrefix(String s); // all keys that have s as a prefix
    Iterable<String> keysThatMatch(String s); // all keys that match the pattern s, where . matches any character
    int size();
    Iterable<String> keys();
}
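The three prefix-related operations in this API map naturally onto an R-way trie. The following is a minimal sketch of one possible approach, not a complete StringST: the class name `PrefixTrieSketch` is hypothetical, keys are assumed to contain only lowercase letters, a null value is assumed to mean "no key ends here", and results are returned as lists in alphabetical order.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal sketch for illustration: an R-way trie over 'a'..'z' with three
// of the prefix operations from the API above.
final class PrefixTrieSketch<V> {
    private static final int R = 26; // assumption: keys use only 'a'..'z'

    private static final class Node<V> {
        @SuppressWarnings("unchecked")
        final Node<V>[] next = new Node[R];
        V val; // null means no key ends at this node
    }

    private final Node<V> root = new Node<>();

    void put(String key, V val) {
        Node<V> node = root;
        for (int i = 0; i < key.length(); i++) {
            int index = key.charAt(i) - 'a';
            if (node.next[index] == null) {
                node.next[index] = new Node<>();
            }
            node = node.next[index];
        }
        node.val = val;
    }

    // Follows one link; returns null for characters outside the alphabet.
    private Node<V> step(Node<V> node, char c) {
        int index = c - 'a';
        return (index >= 0 && index < R) ? node.next[index] : null;
    }

    // Longest key that is a prefix of s, or null if there is none.
    String longestPrefixOf(String s) {
        Node<V> node = root;
        int length = -1; // length of the longest matching key seen so far
        for (int i = 0; node != null; i++) {
            if (node.val != null) {
                length = i;
            }
            if (i == s.length()) {
                break;
            }
            node = step(node, s.charAt(i));
        }
        return length < 0 ? null : s.substring(0, length);
    }

    // All keys that start with the given prefix.
    List<String> keysWithPrefix(String prefix) {
        List<String> results = new ArrayList<>();
        Node<V> node = root;
        for (int i = 0; i < prefix.length() && node != null; i++) {
            node = step(node, prefix.charAt(i));
        }
        collect(node, new StringBuilder(prefix), results);
        return results;
    }

    // Depth-first walk that records every key in the subtrie rooted at node.
    private void collect(Node<V> node, StringBuilder path, List<String> results) {
        if (node == null) return;
        if (node.val != null) results.add(path.toString());
        for (int index = 0; index < R; index++) {
            path.append((char) ('a' + index));
            collect(node.next[index], path, results);
            path.deleteCharAt(path.length() - 1);
        }
    }

    // All keys that match the pattern, where '.' matches any single character.
    List<String> keysThatMatch(String pattern) {
        List<String> results = new ArrayList<>();
        match(root, new StringBuilder(), pattern, results);
        return results;
    }

    private void match(Node<V> node, StringBuilder path, String pattern, List<String> results) {
        if (node == null) return;
        int depth = path.length();
        if (depth == pattern.length()) {
            if (node.val != null) results.add(path.toString());
            return;
        }
        char c = pattern.charAt(depth);
        for (int index = 0; index < R; index++) {
            if (c == '.' || c == 'a' + index) {
                path.append((char) ('a' + index));
                match(node.next[index], path, pattern, results);
                path.deleteCharAt(path.length() - 1);
            }
        }
    }

    public static void main(String[] args) {
        PrefixTrieSketch<Integer> trie = new PrefixTrieSketch<>();
        String[] keys = {"she", "sells", "sea", "shells", "shore"};
        for (int i = 0; i < keys.length; i++) trie.put(keys[i], i);
        System.out.println(trie.longestPrefixOf("shellsort")); // shells
        System.out.println(trie.keysWithPrefix("sh"));         // [she, shells, shore]
        System.out.println(trie.keysThatMatch("s.."));         // [sea, she]
    }
}
```

All three operations reuse the same idea: walk down the links that the input dictates, then either remember the last node that held a value (longestPrefixOf) or enumerate the subtrie below the current node (keysWithPrefix, keysThatMatch).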

Posted on 2022-08-13 13:47 by 廖子博
