[ABC268G] Random Student ID
Problem Statement
Takahashi Elementary School has $N$ new students. For $i = 1, 2, \ldots, N$, the name of the $i$-th new student is $S_i$ (which is a string consisting of lowercase English letters). The names of the $N$ new students are distinct.
The $N$ students will be assigned a student ID $1, 2, 3, \ldots, N$ in ascending lexicographical order of their names. However, instead of the ordinary order of lowercase English letters where a is the minimum and z is the maximum, we use the following order:
- First, Principal Takahashi chooses a string $P$ from the $26!$ permutations of the string abcdefghijklmnopqrstuvwxyz of length $26$, uniformly at random.
- The lowercase English characters that occur earlier in $P$ are considered smaller.
For each of the $N$ students, find the expected value, modulo $998244353$, of the student ID assigned (see Notes).
What is the lexicographical order?
A string $S = S_1S_2\ldots S_{|S|}$ is said to be lexicographically smaller than a string $T = T_1T_2\ldots T_{|T|}$ if one of the following 1. and 2. holds. Here, $|S|$ and $|T|$ denote the lengths of $S$ and $T$, respectively.
1. $|S| \lt |T|$ and $S_1S_2\ldots S_{|S|} = T_1T_2\ldots T_{|S|}$.
2. There exists an integer $1 \leq i \leq \min\lbrace |S|, |T| \rbrace$ satisfying the following two conditions:
   - $S_1S_2\ldots S_{i-1} = T_1T_2\ldots T_{i-1}$
   - $S_i$ is a smaller character than $T_i$.
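The definition above can be sketched as a comparison routine for one fixed permutation $P$. The names `make_rank` and `less_under` are hypothetical helpers introduced here for illustration, and the sketch assumes `perm` is a permutation of the 26 lowercase letters.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>

// rank[c - 'a'] = position of letter c in perm; an earlier position means a smaller letter.
// Assumes perm is a permutation of the 26 lowercase letters.
std::array<int, 26> make_rank(const std::string& perm) {
    std::array<int, 26> rank{};
    for (int i = 0; i < 26; ++i) rank[perm[i] - 'a'] = i;
    return rank;
}

// Returns true if s is lexicographically smaller than t under the given letter ranks.
bool less_under(const std::string& s, const std::string& t,
                const std::array<int, 26>& rank) {
    std::size_t m = std::min(s.size(), t.size());
    for (std::size_t i = 0; i < m; ++i) {
        // Condition 2: the first differing position decides the order.
        if (s[i] != t[i]) return rank[s[i] - 'a'] < rank[t[i] - 'a'];
    }
    // Condition 1: one string is a prefix of the other, so the shorter one is smaller.
    return s.size() < t.size();
}
```

With the ordinary alphabet, aa is smaller than ab; if $P$ starts with ba, the order of those two flips, while a stays smaller than aa under every permutation.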
Notes
We can prove that the sought expected value is always a rational number. Moreover, under the Constraints of this problem, when the value is represented as $\frac{P}{Q}$ by two coprime integers $P$ and $Q$, we can prove that there is a unique integer $R$ such that $R \times Q \equiv P\pmod{998244353}$ and $0 \leq R \lt 998244353$. Find such $R$.
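As a minimal sketch of how such an $R$ can be computed: since $998244353$ is prime, Fermat's little theorem gives $Q^{-1} \equiv Q^{998244351} \pmod{998244353}$ whenever $Q$ is not a multiple of the modulus. The names `pow_mod` and `frac_mod` are hypothetical helpers, not part of the problem.

```cpp
#include <cstdint>

constexpr std::int64_t MOD = 998244353;

// Binary exponentiation: base^exp modulo MOD, for exp >= 0.
std::int64_t pow_mod(std::int64_t base, std::int64_t exp) {
    std::int64_t result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = result * base % MOD;
        base = base * base % MOD;
        exp >>= 1;
    }
    return result;
}

// Returns R with R * q == p (mod MOD) and 0 <= R < MOD.
// Assumes p >= 0 and q is not divisible by MOD.
std::int64_t frac_mod(std::int64_t p, std::int64_t q) {
    return p % MOD * pow_mod(q, MOD - 2) % MOD;
}
```

For example, `frac_mod(5, 2)` yields `499122179`, the value that represents $\frac{5}{2}$ in Sample Output 1.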
Constraints
- $2 \leq N$
- $N$ is an integer.
- $S_i$ is a string of length at least $1$ consisting of lowercase English letters.
- The sum of lengths of the given strings is at most $5 \times 10^5$.
- $i \neq j \Rightarrow S_i \neq S_j$
Input
Input is given from Standard Input in the following format:
$N$
$S_1$
$S_2$
$\vdots$
$S_N$
Output
Print $N$ lines. For each $i = 1, 2, \ldots, N$, the $i$-th line should contain the expected value, modulo $998244353$, of the student ID assigned to Student $i$.
Sample Input 1
3
a
aa
ab
Sample Output 1
1
499122179
499122179
The expected value of the student ID assigned to Student $1$ is $1$; the expected value of the student ID assigned to each of Students $2$ and $3$ is $\frac{5}{2}$.
Note that the answer should be printed modulo $998244353$. For example, the sought expected value for Students $2$ and $3$ is $\frac{5}{2}$, and we have $2 \times 499122179 \equiv 5\pmod{998244353}$, so $499122179$ should be printed.
Sample Input 2
3
a
aa
aaa
Sample Output 2
1
2
3
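Expectation is linear: the expectation of a sum equals the sum of the expectations.
Here we want a rank. A student's ID is 1 plus the number of names smaller than theirs, so each smaller name contributes exactly 1 to the rank. The answer is the expectation of the sum of these contributions, so it splits into a sum of expectations, one per other name, each of which is simply the probability that the other name is smaller.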
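Written out, with $[\cdot]$ denoting the indicator that equals 1 when the condition holds and 0 otherwise (notation assumed here, not taken from the problem), the decomposition is:

```latex
\[
\mathbb{E}[\mathrm{ID}_i]
  = \mathbb{E}\Bigl[\, 1 + \sum_{j \neq i} [S_j < S_i] \,\Bigr]
  = 1 + \sum_{j \neq i} \Pr[S_j < S_i]
\]
```

So it is enough to know, for every pair of names, the probability that one precedes the other.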
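Build a trie over the names and let \(x\) be the node where a name ends. Any ancestor of \(x\) that also ends a name is a proper prefix of this name, so that name is always smaller and always counts toward the rank. Any other name ending in the subtree rooted at \(x\) has this name as a proper prefix, so it is always larger and never counts. Every remaining name first differs from this one at a position where the two letters are distinct, and a uniformly random permutation puts either letter first with equal probability, so such a name is smaller with probability \(\frac 12\) and contributes \(\frac 12\) to the expected rank. With \(a\) the number of ancestors of \(x\) that end a name and \(\mathrm{sz}_x\) the number of names passing through \(x\) (its own included), the answer is \(1 + a + \frac 12 (N - \mathrm{sz}_x - a)\). Counting these quantities in one DFS gives every answer.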
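To sanity-check the reasoning on tiny inputs, here is a brute-force sketch. It is not the solution itself: it averages each name's rank over every ordering of the letters that actually occur. This relies on the assumption that restricting a uniformly random permutation of all 26 letters to a subset yields a uniformly random ordering of that subset. The function name `brute_expected_ids` is hypothetical, and `letters` must list each letter used in `names` exactly once.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Averages the rank (student ID) of every name over all orderings of `letters`.
// Only feasible for a handful of distinct letters, since it tries |letters|! orderings.
std::vector<double> brute_expected_ids(const std::vector<std::string>& names,
                                       std::string letters) {
    std::sort(letters.begin(), letters.end());  // start from the first permutation
    std::vector<double> sum(names.size(), 0.0);
    long long perms = 0;
    do {
        int rank[26] = {};
        for (std::size_t k = 0; k < letters.size(); ++k)
            rank[letters[k] - 'a'] = static_cast<int>(k);
        // Lexicographic comparison under the current letter order.
        auto smaller = [&](const std::string& s, const std::string& t) {
            return std::lexicographical_compare(
                s.begin(), s.end(), t.begin(), t.end(),
                [&](char x, char y) { return rank[x - 'a'] < rank[y - 'a']; });
        };
        for (std::size_t i = 0; i < names.size(); ++i) {
            int id = 1;  // 1 plus the number of smaller names
            for (std::size_t j = 0; j < names.size(); ++j)
                if (j != i && smaller(names[j], names[i])) ++id;
            sum[i] += id;
        }
        ++perms;
    } while (std::next_permutation(letters.begin(), letters.end()));
    for (double& v : sum) v /= static_cast<double>(perms);
    return sum;
}
```

On Sample Input 1 (names a, aa, ab with letters "ab") this gives 1, 2.5, 2.5, matching \(1 + a + \frac 12 (N - \mathrm{sz}_x - a)\) for each name.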
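#include<bits/stdc++.h>
// inv2 is the modular inverse of 2 modulo P
const int N=5e5+5,P=998244353,inv2=(P+1)>>1;
// tr: trie children; sz[u]: number of names passing through node u;
// tag[u]: index of the name ending at u (0 if none);
// cnt: 1 + number of name-ending nodes strictly above the current node
int n,tr[N][26],idx,sz[N],tag[N],ans[N],cnt=1;
char s[N];
// Insert the 1-indexed string s as name number id
void insert(char s[],int id)
{
    int len=strlen(s+1),u=0;
    for(int i=1;i<=len;i++)
    {
        if(!tr[u][s[i]-'a'])
            tr[u][s[i]-'a']=++idx;
        u=tr[u][s[i]-'a'];
        sz[u]++;
    }
    tag[u]=id;
}
void dfs(int x)
{
    if(tag[x])
    {
        // n-sz[x]-(cnt-1) names are neither prefixes nor extensions: each adds 1/2.
        // cnt adds 1 for the name itself plus 1 per prefix that is also a name.
        ans[tag[x]]=(1LL*inv2*(n-sz[x]-cnt+1)+cnt)%P;
        ++cnt; // this name is a prefix of everything in its subtree
    }
    for(int i=0;i<26;i++)
        if(tr[x][i])
            dfs(tr[x][i]);
    if(tag[x])
        --cnt; // leaving the subtree, undo the increment
}
int main()
{
    scanf("%d",&n);
    for(int i=1;i<=n;i++)
    {
        scanf("%s",s+1);
        insert(s,i);
    }
    dfs(0);
    for(int i=1;i<=n;i++)
        printf("%d\n",ans[i]);
}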