HammingDistance
Hamming distance
implementation 'org.apache.commons:commons-text:1.10.0'
The Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols differ.
For further explanation about the Hamming Distance, take a look at its Wikipedia page at http://en.wikipedia.org/wiki/Hamming_distance.
Since:
1.0
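Baidu Baike:
Hamming distance is used in error-control coding for data transmission. It is the number of positions at which two strings of equal length hold different characters; d(x, y) denotes the Hamming distance between two words x and y. For binary strings, XOR the two values and count the 1 bits in the result; that count is the Hamming distance.
The smaller the Hamming distance, the more similar the two character sequences are.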
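The XOR formulation applies to binary values rather than to arbitrary text. As a minimal sketch, assuming the two bit strings fit in an int, it could look like the following. The class name XorHamming and the method name hammingDistance are made up for illustration and are not part of commons-text.

```java
class XorHamming {
    // XOR leaves a 1 bit at every position where x and y differ;
    // Integer.bitCount then counts those 1 bits.
    static int hammingDistance(int x, int y) {
        return Integer.bitCount(x ^ y);
    }

    public static void main(String[] args) {
        // Parse the bit strings as base-2 integers (assumes at most 31 bits).
        int a = Integer.parseInt("1011101", 2);
        int b = Integer.parseInt("1011111", 2);
        System.out.println(hammingDistance(a, b)); // prints 1
    }
}
```

For longer bit strings the same idea would need long (Long.bitCount) or java.util.BitSet instead of int.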
Take a program as an example:
Given two strings
a = "1011101"
and
b = "1011111"
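Program example
The implementation is simple. The precondition is that both character sequences are non-null and have the same length.
Explanation: compare the characters of the two sequences at each index and add 1 to the distance whenever they differ. a and b differ only at the second-to-last character, so the distance between the two sequences is 1.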
public Integer apply(final CharSequence left, final CharSequence right) {
    // Both inputs are required.
    if (left == null || right == null) {
        throw new IllegalArgumentException("CharSequences must not be null");
    }
    // The Hamming distance is only defined for sequences of equal length.
    if (left.length() != right.length()) {
        throw new IllegalArgumentException("CharSequences must have the same length");
    }
    // Count the indexes at which the two sequences hold different chars.
    int distance = 0;
    for (int i = 0; i < left.length(); i++) {
        if (left.charAt(i) != right.charAt(i)) {
            distance++;
        }
    }
    return distance;
}
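
To try the method on the strings a and b without adding the dependency, the same logic can be copied into a small standalone class. This is only a sketch: HammingDemo and hamming are illustrative names, not library API. With commons-text on the classpath, the equivalent call would be new HammingDistance().apply(a, b), using org.apache.commons.text.similarity.HammingDistance.

```java
class HammingDemo {
    // Local copy of the logic shown above so the sketch runs on its own.
    static int hamming(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("CharSequences must not be null");
        }
        if (left.length() != right.length()) {
            throw new IllegalArgumentException("CharSequences must have the same length");
        }
        int distance = 0;
        for (int i = 0; i < left.length(); i++) {
            if (left.charAt(i) != right.charAt(i)) {
                distance++;
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        // The two strings from the example: they differ only at index 5.
        System.out.println(hamming("1011101", "1011111")); // prints 1

        // Unequal lengths are rejected rather than padded or truncated.
        try {
            hamming("abc", "abcd");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The comparison is per char, so it works for any text and not only for strings of 0s and 1s.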