HammingDistance

Hamming distance

implementation 'org.apache.commons:commons-text:1.10.0'

The Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.
For further explanation about the Hamming Distance, take a look at its Wikipedia page at http://en.wikipedia.org/wiki/Hamming_distance.
Since:
1.0

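Baidu Baike:
Hamming distance is used in error-control coding for data transmission. It is the number of positions at which two strings of the same length hold different characters; d(x, y) denotes the Hamming distance between two words x and y. For binary strings, XOR the two strings and count the 1s in the result; that count is the Hamming distance.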
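The XOR-and-count description can be sketched in a few lines of Java. This is only an illustration under assumptions: the class name `BitHamming` and its `distance` method are hypothetical and not part of commons-text, and the bit strings 1011101 and 1011111 are parsed into `int` values, so the sketch only covers inputs that fit in 32 bits.

```java
// Hypothetical helper class for illustration; not part of commons-text.
class BitHamming {
    // Counts the bit positions at which x and y differ.
    static int distance(int x, int y) {
        // XOR yields a 1 bit exactly where the two inputs differ;
        // Integer.bitCount counts the 1 bits in that result.
        return Integer.bitCount(x ^ y);
    }

    public static void main(String[] args) {
        int a = Integer.parseInt("1011101", 2);
        int b = Integer.parseInt("1011111", 2);
        System.out.println(distance(a, b)); // prints 1
    }
}
```

For character strings that are not bit strings, a position-by-position comparison is needed instead of XOR.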

The smaller the Hamming distance, the more similar the two character sequences are.

Taking a program as an example:
Given two strings
a = "1011101"

b = "1011111"

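Program example

The implementation is simple. The precondition is that both character sequences are non-null and have the same length.

Explanation: compare the characters of the two sequences at each index; whenever they differ, add 1 to the distance. a and b differ only at the second-to-last character, so the distance between the two sequences is 1.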

public Integer apply(final CharSequence left, final CharSequence right) {
    if (left == null || right == null) {
        throw new IllegalArgumentException("CharSequences must not be null");
    }

    if (left.length() != right.length()) {
        throw new IllegalArgumentException("CharSequences must have the same length");
    }

    int distance = 0;

    for (int i = 0; i < left.length(); i++) {
        if (left.charAt(i) != right.charAt(i)) {
            distance++;
        }
    }

    return distance;
}
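To try the method on the strings from this post, a self-contained sketch may help. The class `HammingDemo` and its static `hamming` method are hypothetical names; the method simply mirrors the logic above so the example runs without commons-text on the classpath. With the library available, the equivalent call should be `new HammingDistance().apply(a, b)`.

```java
// Hypothetical demo class; the static method mirrors the logic shown above
// so that the example runs without commons-text on the classpath.
class HammingDemo {
    static int hamming(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("CharSequences must not be null");
        }
        if (left.length() != right.length()) {
            throw new IllegalArgumentException("CharSequences must have the same length");
        }
        int distance = 0;
        // Count the indexes at which the two sequences hold different characters.
        for (int i = 0; i < left.length(); i++) {
            if (left.charAt(i) != right.charAt(i)) {
                distance++;
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        System.out.println(hamming("1011101", "1011111")); // prints 1
        System.out.println(hamming("karolin", "kathrin")); // prints 3
        try {
            hamming("abc", "abcd"); // lengths differ, so this throws
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Unequal lengths are rejected rather than padded, because the Hamming distance is only defined for sequences of the same length.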
posted @ 2023-05-08 12:28  干翻苍穹