「Day 11 & 12 & 13 & 14: Miscellaneous」

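String Hash

Definition

String hashing is a mapping, much like a $map$: each string corresponds to an integer value, and we compare those integers to decide whether two strings are the same. Different hash values mean the strings are definitely different; equal hash values are taken to mean the strings are equal. So how do we compute the value?

P3370 [Template] String Hash

Single hashing

For a string $s = s_{1} s_{2} s_{3} \dots s_{n}$, let $hash[0] = 0$ and $hash[i] = hash[i - 1] \cdot p + int(s[i])$. As long as the result stays below $2^{64}$, that is, within the range of unsigned long long, nothing goes wrong. But what if $hash[i]$ overflows? We can handle that by taking the value modulo some number:

$hash[i] = (hash[i - 1] \cdot p + int(s[i])) \bmod mod$

Note: $p$ and $mod$ are both primes with $p < mod$, and we want to pick a fairly large prime.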
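
Unrolling the recurrence shows what the hash actually is: a polynomial in $p$ whose coefficients are the character codes. As a worked sketch (with $hash[0] = 0$, matching the `ans = 0` starting value in the code):

$$hash[n] = \left(s_{1} p^{n-1} + s_{2} p^{n-2} + \dots + s_{n-1} p + s_{n}\right) \bmod mod$$

For example, the string `abc` has character codes 97, 98, 99, so $hash[3] = (97 p^{2} + 98 p + 99) \bmod mod$.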

Code

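#include<iostream>
#include<algorithm>
#include<string>
using namespace std;

// Competitive-programming shortcut: every "int" below is an unsigned 64-bit integer.
#define int unsigned long long

int n;
int tot = 0;
int a[10005];// hash of each input string, 1-indexed
int mod = 212370440130137957ll;
int p = 157;

// Polynomial hash of x: ans = (ans * p + x[i]) % mod for each character.
// Note: with this modulus ans * p can exceed 2^64, so the product may wrap around
// before % is applied. The result is still deterministic, so equal strings
// still get equal hashes.
int hx(const string &x){
	int len = x.size();
	int ans = 0;
	for(int i = 0;i < len;i ++){
		ans = (ans * p + (int)x[i]) % mod;
	}
	return ans;
}

signed main(){
	cin >> n;
	while(n --){
		string s;
		cin >> s;
		a[++ tot] = hx(s);
	}
	// After sorting, equal hashes are adjacent; count the first element of each run.
	sort(a + 1,a + tot + 1);
	int sum = 0;
	for(int i = 1;i <= tot;i ++){
		if(i == 1 || a[i] != a[i - 1]){
			sum ++;
		}
	}
	cout << sum << "\n";
	return 0;
}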

~~However, for strings like the following, this might blow up:~~

$S1 = abbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb$
$S2 = bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb$

In that case a hash collision may occur: two different strings end up with the same $Hash$ value.
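
To see that collisions are a real possibility, here is a small sketch that deliberately uses a tiny modulus. With `mod = 101` and `p = 31` (toy values assumed for this illustration, far smaller than anything you would use in practice) there are only 101 possible hash values, so among the 676 two-letter lowercase strings some must share one. The names `tiny_hash` and `find_collision` are invented for this example.

```cpp
#include <map>
#include <string>
#include <utility>
using namespace std;

// Same recurrence as in the single-hash code, but with a deliberately tiny modulus.
unsigned long long tiny_hash(const string &s) {
    const unsigned long long p = 31, mod = 101;  // toy values, assumed for the demo
    unsigned long long h = 0;
    for (char c : s) h = (h * p + (unsigned char)c) % mod;
    return h;
}

// Scan all two-letter lowercase strings and return the first pair that collides.
pair<string, string> find_collision() {
    map<unsigned long long, string> seen;  // hash value -> first string that produced it
    for (char a = 'a'; a <= 'z'; a++) {
        for (char b = 'a'; b <= 'z'; b++) {
            string s(1, a);
            s += b;
            unsigned long long h = tiny_hash(s);
            if (seen.count(h)) return make_pair(seen[h], s);
            seen[h] = s;
        }
    }
    // Not reached: 676 strings cannot all map to distinct values out of 101.
    return make_pair(string(), string());
}
```

The two strings returned by `find_collision()` are different but have the same `tiny_hash` value. A large modulus makes this much rarer, but it cannot rule it out.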

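Double hashing

If computing one hash is prone to collisions, then what is a collision going to do when I compute two?
This way the probability of a hash collision drops dramatically ~~(overjoyed)~~.

We can store the two hash values in a $pair<hash1,hash2>$. Two strings are treated as equal only when both components match, which makes collisions far less likely.

Code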

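#include<iostream>
#include<algorithm>
#include<string>
#include<utility>
#include<vector>
using namespace std;

// Every "int" below is an unsigned 64-bit integer.
#define int unsigned long long

int n;
// Two independent (base, modulus) pairs.
int mod1 = 212370440130137957ll;
int p1 = 157;
int mod2 = 192608173749137ll;
int p2 = 233;
vector<pair<int,int> > a;

// Returns both hashes of x; two strings count as equal only if both values match.
pair<int,int> hx(const string &x){
	int len = x.size();
	int h1 = 0;
	int h2 = 0;
	for(int i = 0;i < len;i ++){
		h1 = (h1 * p1 + (int)x[i]) % mod1;
		h2 = (h2 * p2 + (int)x[i]) % mod2;
	}
	return make_pair(h1,h2);
}

signed main(){
	cin >> n;
	while(n --){
		string s;
		cin >> s;
		a.push_back(hx(s));
	}
	// Pairs sort lexicographically, so equal pairs end up adjacent.
	sort(a.begin(),a.end());
	int sum = a.empty() ? 0 : 1;// no strings means no distinct strings
	for(int i = 1;i < a.size();i ++){
		if(a[i] != a[i - 1]){
			sum ++;
		}
	}
	cout << sum << "\n";
	return 0;
}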

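Substring hash value

Here $h[i]$ is the hash of the first $i$ characters and $p[i]$ is the $i$-th power of the base (an array, unlike the single base $p$ used earlier). The hash of the substring $s_{l} \dots s_{r}$ is $h[r]$ with the contribution of the first $l - 1$ characters removed. This form relies on unsigned long long wrap-around rather than an explicit % mod; with a modulus, the subtraction must also be reduced modulo $mod$.

int geth(int l,int r){ return h[r] - h[l - 1] * p[r - l + 1]; }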
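
The one-liner depends on two arrays that the post does not show being built. The following is a minimal sketch of one way to set them up, assuming 1-indexed positions, a base of 131 (an arbitrary choice for this example) and natural unsigned 64-bit overflow in place of an explicit modulus. The names `BASE`, `build` and `same` are hypothetical.

```cpp
#include <string>
#include <vector>
using namespace std;

typedef unsigned long long ull;

const ull BASE = 131;  // assumed base for this sketch
vector<ull> h, p;      // h[i]: hash of the first i characters, p[i]: BASE^i (both mod 2^64)

// Precompute prefix hashes and powers of the base for s.
void build(const string &s) {
    int n = s.size();
    h.assign(n + 1, 0);
    p.assign(n + 1, 1);
    for (int i = 1; i <= n; i++) {
        h[i] = h[i - 1] * BASE + (unsigned char)s[i - 1];
        p[i] = p[i - 1] * BASE;
    }
}

// Hash of the substring covering positions l..r (1-indexed, inclusive).
ull geth(int l, int r) {
    return h[r] - h[l - 1] * p[r - l + 1];
}

// True if the substrings [l1, r1] and [l2, r2] have the same hash.
bool same(int l1, int r1, int l2, int r2) {
    return geth(l1, r1) == geth(l2, r2);
}
```

After `build("abcabc")`, `same(1, 3, 4, 6)` is true because both ranges spell `abc`, while `same(1, 3, 2, 4)` compares `abc` with `bca` and is false. Each query costs O(1) after the O(n) precomputation.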

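Math

P1226 [Template] Fast Exponentiation

At its core this is the doubling idea: write $b$ in binary and multiply together the powers $a^{1}, a^{2}, a^{4}, \dots$ that correspond to its 1 bits. For example, $13 = 1101_{2}$, so $a^{13} = a^{8} \cdot a^{4} \cdot a^{1}$.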

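// Computes a^b mod m. Assumes (m - 1) * (m - 1) fits in long long.
long long qpow(long long a,long long b,long long m){
    long long ans = 1 % m;//1 % m so that m == 1 yields 0
    a %= m;//keep a below m so the products do not overflow
    while(b){
	//b & 1 is the lowest bit of b: if it is 1 we multiply this power in, otherwise we skip it
        if(b & 1) ans = (ans * a) % m;
	//shift b right by one bit, dropping the lowest bit
        b >>= 1;
	//square a every round: a, a^2, a^4, a^8, ...
        a = (a * a) % m;
    }
    return ans;
}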
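
A quick way to convince yourself that the routine is right is to compare it with a plain loop on small inputs. The sketch below is a self-contained copy of the function with explicit `long long` types, plus a hypothetical helper `slow_pow` used only for checking. It assumes `m` stays below about $3 \times 10^{9}$ so that `a * a` fits in a `long long`.

```cpp
typedef long long ll;

// Binary exponentiation: a^b mod m using O(log b) multiplications.
ll qpow(ll a, ll b, ll m) {
    ll ans = 1 % m;  // 1 % m so that m == 1 gives 0
    a %= m;
    while (b) {
        if (b & 1) ans = ans * a % m;  // lowest bit set: include the current power
        b >>= 1;                       // drop the lowest bit
        a = a * a % m;                 // a, a^2, a^4, a^8, ...
    }
    return ans;
}

// Naive reference: multiply b times. Only meant for checking small cases.
ll slow_pow(ll a, ll b, ll m) {
    ll ans = 1 % m;
    for (ll i = 0; i < b; i++) ans = ans * (a % m) % m;
    return ans;
}
```

For example, `qpow(2, 10, 9)` returns 7, since $2^{10} = 1024 = 113 \cdot 9 + 7$. The loop in `qpow` runs once per bit of `b`, whereas `slow_pow` runs `b` times.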

posted @ 2024-08-17 09:28 To_Carpe_Diem
