[CF827C DNA Evolution] Solution

Problem link: https://www.luogu.com.cn/problem/CF827C

Problem

Everyone knows that DNA strands consist of nucleotides. There are four types of nucleotides: "A", "T", "G", "C". A DNA strand is a sequence of nucleotides. Scientists decided to track evolution of a rare species, which DNA strand was string $ s $ initially.

Evolution of the species is described as a sequence of changes in the DNA. Every change is a change of some nucleotide, for example, the following change can happen in DNA strand "AAGC": the second nucleotide can change to "T" so that the resulting DNA strand is "ATGC".

Scientists know that some segments of the DNA strand can be affected by some unknown infections. They can represent an infection as a sequence of nucleotides. Scientists are interested if there are any changes caused by some infections. Thus they sometimes want to know the value of impact of some infection to some segment of the DNA. This value is computed as follows:

  • Let the infection be represented as a string $ e $ , and let scientists be interested in DNA strand segment starting from position $ l $ to position $ r $ , inclusive.
  • Prefix of the string $ eee... $ (i.e. the string that consists of infinitely many repeats of string $ e $ ) is written under the string $ s $ from position $ l $ to position $ r $ , inclusive.
  • The value of impact is the number of positions where letter of string $ s $ coincided with the letter written under it.

Being a developer, Innokenty is interested in bioinformatics also, so the scientists asked him for help. Innokenty is busy preparing VK Cup, so he decided to delegate the problem to the competitors. Help the scientists!

Approach

Notice that \(|e|\leqslant 10\); this small bound is the likely way in.

[Figure: the original string on top; below it, the query range filled with repetitions of \(e\), here with \(|e|=3\).]

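Consider the contribution of the \(i\)-th character of \(e\) to the answer. Suppose \(e_i\) is first written under position \(j\) of the original string (ignoring for now whether it matches). Then every later position \(k\) that \(e_i\) is written under must satisfy \(k\bmod|e|=j\bmod|e|\). To account for matching, it is enough to treat the four letters as four separate cases and count the matching positions inside the range, which can be maintained with Fenwick trees (binary indexed trees) indexed by three keys.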
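The residue observation can be checked on small inputs with a brute-force sketch. The two helper functions below, `naiveImpact` and `residueImpact`, are hypothetical names introduced only for this illustration; the first follows the statement literally, the second groups positions by their residue modulo \(|e|\) as described above. Positions are 1-indexed, as in the problem.

```cpp
#include <cassert>
#include <string>

// Literal reading of the statement: write e, e, e, ... under s[l..r]
// and count the positions where the letters coincide.
int naiveImpact(const std::string& s, int l, int r, const std::string& e) {
    int m = static_cast<int>(e.size());
    assert(m >= 1 && 1 <= l && l <= r && r <= static_cast<int>(s.size()));
    int matches = 0;
    for (int k = l; k <= r; ++k)
        if (s[k - 1] == e[(k - l) % m]) ++matches;
    return matches;
}

// Same value, grouped by character of e: e[i] (0-indexed) is written under
// exactly those k in [l, r] with k % m == (l + i) % m.
int residueImpact(const std::string& s, int l, int r, const std::string& e) {
    int m = static_cast<int>(e.size());
    assert(m >= 1 && 1 <= l && l <= r && r <= static_cast<int>(s.size()));
    int matches = 0;
    for (int i = 0; i < m && l + i <= r; ++i)
        for (int k = l; k <= r; ++k)
            if (k % m == (l + i) % m && s[k - 1] == e[i]) ++matches;
    return matches;
}
```

Both functions are deliberately slow; the inner loop of `residueImpact` is the part that a per-residue, per-letter counting structure replaces.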

One of the three keys is the letter; the other two are discussed next.

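Putting this together, take each position \(i\) of the original string and split it into \(10\) copies, one for every \(p\in\mathbb{N}\) with \(1\leqslant p\leqslant 10\). For each such \(p\), compute \(x=i\bmod p\); the other two keys are then \(p\) and \(x\). A query only has to iterate over the \(|e|\) characters of \(e\), using \(p=|e|\).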
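As a small illustration of this splitting, the sketch below lists the ten \((p, x)\) pairs under which a single position is registered. The function name `keysFor` is hypothetical and not part of the original solution.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// For a 1-indexed position i, return the pairs (p, i mod p) for p = 1..10.
// Each pair, together with the letter at position i, selects one Fenwick tree.
std::vector<std::pair<int, int>> keysFor(int i) {
    assert(i >= 1);
    std::vector<std::pair<int, int>> keys;
    for (int p = 1; p <= 10; ++p) keys.emplace_back(p, i % p);
    return keys;
}
```

For example, position 7 is registered under (1,0), (2,1), (3,1), (4,3), (5,2), (6,1), (7,0), (8,7), (9,7) and (10,7).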

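Total time complexity: \(O((n+q)\times\log n\times 10)\), where the \(n\) term covers the initial insertion of every position into its \(10\) trees.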
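As a rough sanity check, the Fenwick operations can be counted directly. The limits \(n, q\leqslant 10^5\) used here are an assumption inferred from the array bound `N 100010` in the code, not something stated in this write-up. Building performs \(10n\) point updates, a modification performs \(20\) (ten removals and ten insertions), and a query performs \(2|e|\leqslant 20\) prefix sums, each costing \(O(\log n)\):

\[
\bigl(10n + 20q\bigr)\log_2 n \;\approx\; \bigl(10\cdot 10^5 + 20\cdot 10^5\bigr)\cdot 17 \;\approx\; 5.1\times 10^{7}.
\]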

Code

// Problem: CF827C DNA Evolution
// Contest: Luogu
// URL: https://www.luogu.com.cn/problem/CF827C
// Memory Limit: 500 MB
// Time Limit: 2000 ms

#include<bits/stdc++.h>
using namespace std;
// #define int long long
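// Fast integer reader: skips non-digit characters, then parses an optionally
// signed integer. It also consumes the one character right after the number.
inline int read()
{
	int x=0, f=1; char ch=getchar();
	while(ch<'0'||ch>'9') { if(ch=='-') f=-1; ch=getchar(); }
	while(ch>='0'&&ch<='9') { x=(x<<1)+(x<<3)+(ch^48); ch=getchar(); }
	return x*f;
}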
#define N 100010
//#define M
//#define mo
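int n, m, i, j, k, T; 
// cnt[p][x][c] is a Fenwick tree over the positions i with i%p==x whose letter has code c
int cnt[11][11][5][N]; 
int mp[150], q, l, r, x, y, e, o, ans; // mp maps 'A','C','T','G' to 1..4
char s[N], t[20], c; // s: DNA strand (1-indexed), t: infection string e (1-indexed)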

// Point update: add y at position x of one Fenwick tree
void add(int cnt[], int x, int y)
{
	while(x<=n)
	{
		cnt[x]+=y; 
		x+=x&-x; 
	}
}

// Prefix sum over positions 1..x of one Fenwick tree
int qiu(int cnt[], int x)
{
	int ans=0; 
	while(x)
	{
		ans+=cnt[x]; 
		x-=x&-x; 
	}
	return ans; 
}

signed main()
{
//	freopen("tiaoshi.in","r",stdin);
//	freopen("tiaoshi.out","w",stdout);
	mp['A']=1; mp['C']=2; mp['T']=3; mp['G']=4; 
	scanf("%s", s+1); 
	n=strlen(s+1); 
	// Register every position i in the 10 trees selected by (j, i%j, letter)
	for(i=1; i<=n; ++i)
		for(j=1; j<=10; ++j)
			add(cnt[j][i%j][mp[(int)s[i]]], i, 1); 
	q=read(); 
	while(q--)
	{
		o=read(); 
		if(o==1)
		{
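			x=read(); scanf(" %c", &c); // " %c" skips any whitespace left before the letter
			// Remove the old letter from all 10 trees, then insert the new one
			for(j=1; j<=10; ++j)
				add(cnt[j][x%j][mp[(int)s[x]]], x, -1); 
			s[x]=c; 
			for(j=1; j<=10; ++j)
				add(cnt[j][x%j][mp[(int)s[x]]], x, 1); 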
		}
		if(o==2)
		{
			l=read(); r=read(); 
			scanf("%s", t+1); 
			m=strlen(t+1); ans=0; 
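			// t[i] is written under every position k in [l,r] with k%m==j%m;
			// count those positions whose letter equals t[i]
			for(i=1, j=l; i<=m && j<=r; ++i, ++j) 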
				ans+=(qiu(cnt[m][j%m][mp[(int)t[i]]], r)
				-qiu(cnt[m][j%m][mp[(int)t[i]]], l-1)); 
			printf("%d\n", ans); 
		}
	}
	return 0;
}
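For testing outside a judge, the same idea can be wrapped in a small class. This is a sketch under stated assumptions, not the author's code: the class name `DnaIndex` and its methods `set` and `impact` are hypothetical, and the packing of \((p, x)\) pairs into 55 slots is a design choice of this sketch that avoids storing trees for residues \(x\geqslant p\), which can never occur.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical wrapper: one Fenwick tree per (period p, residue x, letter).
class DnaIndex {
public:
    explicit DnaIndex(const std::string& dna)
        : n_(static_cast<int>(dna.size())), s_(" " + dna),
          tree_(kSlots * 4, std::vector<int>(n_ + 1, 0)) {
        for (int i = 1; i <= n_; ++i) apply(i, s_[i], +1);
    }

    // Change the nucleotide at 1-indexed position pos to c.
    void set(int pos, char c) {
        assert(1 <= pos && pos <= n_);
        apply(pos, s_[pos], -1);
        s_[pos] = c;
        apply(pos, c, +1);
    }

    // Impact of infection e on the 1-indexed segment [l, r].
    int impact(int l, int r, const std::string& e) const {
        int m = static_cast<int>(e.size());
        assert(1 <= m && m <= 10 && 1 <= l && l <= r && r <= n_);
        int total = 0;
        for (int i = 0; i < m && l + i <= r; ++i) {
            const std::vector<int>& t = tree_[slot(m, (l + i) % m, e[i])];
            total += prefix(t, r) - prefix(t, l - 1);
        }
        return total;
    }

private:
    static constexpr int kSlots = 55;  // 1 + 2 + ... + 10 (period, residue) pairs

    static int letter(char c) {
        switch (c) { case 'A': return 0; case 'C': return 1; case 'T': return 2; default: return 3; }
    }
    // Pairs (p, x) with 0 <= x < p are packed consecutively at p*(p-1)/2 + x.
    static int slot(int p, int x, char c) { return (p * (p - 1) / 2 + x) * 4 + letter(c); }

    static int prefix(const std::vector<int>& t, int i) {
        int sum = 0;
        for (; i > 0; i -= i & -i) sum += t[i];
        return sum;
    }
    // Add delta at position pos in the 10 trees that pos belongs to for letter c.
    void apply(int pos, char c, int delta) {
        for (int p = 1; p <= 10; ++p) {
            std::vector<int>& t = tree_[slot(p, pos % p, c)];
            for (int i = pos; i <= n_; i += i & -i) t[i] += delta;
        }
    }

    int n_;
    std::string s_;                       // 1-indexed copy of the strand
    std::vector<std::vector<int>> tree_;  // 220 Fenwick trees of size n + 1
};
```

A typical use is `DnaIndex d("ATGCATGC"); d.impact(2, 6, "TTT"); d.set(4, 'T');`. Assuming 4-byte `int` and \(n = 10^5\), the 220 trees take roughly 88 MB, compared with roughly 242 MB for the `cnt[11][11][5][N]` array above.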

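Summary

The way into this problem is \(|e|\leq 10\). A good line of thought is to start with \(|e|=1\), then consider \(2\) and \(3\).

When a problem has an unusually restricted value in its constraints, working through that value from its smallest cases upward tends to expose the property needed to solve it.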

posted @ 2022-05-24 18:08  zhangtingxi