C. Equal Frequencies
C. Equal Frequencies
Let's call a string balanced if all characters that are present in it appear the same number of times. For example, "coder", "appall", and "ttttttt" are balanced, while "wowwow" and "codeforces" are not.
You are given a string of length consisting of lowercase English letters. Find a balanced string of the same length consisting of lowercase English letters that is different from the string in as few positions as possible. In other words, the number of indices such that should be as small as possible.
Input
Each test contains multiple test cases. The first line contains the number of test cases (). The description of the test cases follows.
Each test case consists of two lines. The first line contains a single integer () — the length of the string .
The second line contains the string of length consisting of lowercase English letters.
It is guaranteed that the sum of over all test cases does not exceed .
Output
For each test case, print the smallest number of positions where string and a balanced string can differ, followed by such a string .
If there are multiple solutions, print any. It can be shown that at least one balanced string always exists.
Example
input
4 5 hello 10 codeforces 5 eevee 6 appall
output
1 helno 2 codefofced 1 eeeee 0 appall
Note
In the first test case, the given string "hello" is not balanced: letters 'h', 'e', and 'o' appear in it once, while letter 'l' appears twice. On the other hand, string "helno" is balanced: five distinct letters are present in it, and each of them appears exactly once. Strings "hello" and "helno" differ in just one position: the fourth character. Other solutions are possible too.
In the second test case, string "codefofced" is balanced since only letters 'c', 'o', 'd', 'e', and 'f' are present in it, and each of them appears exactly twice.
In the third test case, string "eeeee" is balanced since only letter 'e' is present in it.
In the fourth test case, the given string "appall" is already balanced.
解题思路
写这题分为两部,先找出最小改变次数,再对字符串进行修改。
可以发现,由于最后字符串中每种字符的出现次数都相同,因此还是很容易想到枚举最后有多少种字符来求最小改变次数。假设最后有种字符,那么要满足,同时每种字符都有个。统计字符串中每个字符的出现次数,并按照个数从大到小排序。为了得到最小的改变次数,贪心的思路是保留前种字符,然后把从第个字符开始的剩余字符都删掉(如果有的话),当然如果不同种类的字符不足个,那么意味需要额外添加其他种类的字符。这里的删除是指把该字符换成前个原本存在的字符种类,添加是指把该字符换成原本不存在的字符种类。假设改变次数为,枚举前个字符如果该字符的出现的次数,那么就要加上;然后再枚举剩下字符,这些字符都是要删掉的,因此要加上这些字符出现的次数。枚举完所有合法的并取最小的作为答案,最后答案还要除以,因为一次修改被算了两次(删除和添加)。
关于上面贪心的正确性说明,这里是选择保留出现次数多的字符而删除出现次数少的字符,上面求的公式为。假设有字符和,个数分别为和,如果保留而删除,变化后数值就变成,如果贪心是正确的话,那么变化后的应该是变大的。因此我们要证明出,才能保证贪心的正确性,然后我试着证明了一下,可以参考下面的附录部分。
最后是修改字符了,当时比赛时太急了没想出来,补题时发现确实还是蛮复杂的。
根据上面的求最小出现次数的过程我们就已经得到了最后要保留的字符种类个数,这里记为,以及每种字符保留的个数。然后我们要做的是根据前面统计出来的字符出现次数来替换字符串中的字符,当枚举到,如果字符的出现次数大于,那么我们就要把换成出现次数小于的字符种类;如果字符的出现次数小于就不用管了,因为后面一定会存在大于的字符。
实现的方法是先算出每个字符需要添加或删除的数量,具体做法是根据前面统计出来的每种字符出现次数,记为数组,把前个字符都减去,如果得到的结果大于则说明该需要将该种类的字符替换成结果小于的字符。第个开始之后的字符就不用管了,如果存在则对应的必定为正数,说明需要进行替换且出现的次数就是要被替换的次数。在修改字符串的过程中开一个指针指向第个字符,每次枚举到下一个字符前都检查一下所指向的字符在中是不是小于,不是的话就往前移。当发现,则将替换成指向的字符,同时进行相应的修改。
AC代码如下:
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 const int N = 1e5 + 10; 5 6 char str[N]; 7 int cnt[26], p[26]; 8 9 void solve() { 10 int n; 11 scanf("%d %s", &n, &str); 12 memset(cnt, 0, sizeof(cnt)); 13 for (int i = 0; i < n; i++) { 14 cnt[str[i] - 'a']++; // 统计原字符串每个字符出现的次数 15 } 16 for (int i = 0; i < 26; i++) { 17 p[i] = i; 18 } 19 sort(p, p + 26, [&](int i, int j) { // 字符按照出现次数从大到小排序 20 return cnt[i] > cnt[j]; 21 }); 22 int ret = n, sum; // ret记录修改的次数,sum记录最后字符串中字符的种类个数 23 for (int i = 1; i <= 26; i++) { // 枚举字符的种类,最多有26种 24 if (n % i == 0) { 25 int s = n / i, t = 0; 26 for (int j = 0; j < i; j++) { // 前i个字符要保留 27 t += abs(cnt[p[j]] - s); 28 } 29 for (int j = i; j < 26; j++) { // 剩余的字符都要删除 30 t += cnt[p[j]]; 31 } 32 if (ret > t / 2) ret = t / 2, sum = i; // 答案记得除以2 33 } 34 } 35 for (int i = 0; i < sum; i++) { 36 cnt[p[i]] -= n / sum; // cnt[p[i]]>0,表示需要把字符p[i]需要替换成<0的字符 37 } 38 for (int i = 0, j = sum - 1; i < n; i++) { 39 while (j >= 0 && cnt[p[j]] >= 0) { // j指向cnt[p[j]]<0的字符,表示要替换成p[j] 40 j--; 41 } 42 if (cnt[str[i] - 'a'] > 0) { // 修改str[i] 43 cnt[str[i] - 'a']--, cnt[p[j]]++; // 消除 44 str[i] = p[j] + 'a'; // 替换成p[j] 45 } 46 } 47 printf("%d\n%s\n", ret, str); 48 } 49 50 int main() { 51 int t; 52 scanf("%d", &t); 53 while (t--) { 54 solve(); 55 } 56 57 return 0; 58 }
附录
已知 ,
求证 。
证明:
即得
得证。
主要是我忘了不等式
一直记的是
不过如果我反过来证貌似得不到结果:
写到这里的时候我想到 可以分类讨论:
- 当 :
- 当 :
- 当 :
综上所述
同时得到不等式的上下界:
得证。
参考资料
Codeforces Round #844 (Div. 1 + Div. 2, based on VK Cup 2022 - Elimination Round) A-D:https://www.cnblogs.com/BlankYang/p/17056807.html
本文来自博客园,作者:onlyblues,转载请注明原文链接:https://www.cnblogs.com/onlyblues/p/17060826.html
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 单线程的Redis速度为什么快?
· 展开说说关于C#中ORM框架的用法!
· Pantheons:用 TypeScript 打造主流大模型对话的一站式集成库
· SQL Server 2025 AI相关能力初探
· 为什么 退出登录 或 修改密码 无法使 token 失效