F. Sum of Progression

You are given an array $a$ of $n$ numbers. There are also $q$ queries of the form $s, d, k$.

For each query $q$, find the sum of elements $a_s + a_{s+d} \cdot 2 + \dots + a_{s + d \cdot (k - 1)} \cdot k$. In other words, for each query, it is necessary to find the sum of $k$ elements of the array with indices starting from the $s$-th, taking steps of size $d$, multiplying each element by its serial number in the resulting sequence.


Each test consists of several testcases. The first line contains one integer $t$ ($1 \le t \le 10^4$), the number of testcases. The next lines contain descriptions of the testcases.

The first line of each testcase contains two numbers $n, q$ ($1 \le n \le 10^5$, $1 \le q \le 2 \cdot 10^5$), the number of elements in the array $a$ and the number of queries.

The second line contains $n$ integers $a_1, \ldots, a_n$ ($-10^8 \le a_1, \ldots, a_n \le 10^8$), the elements of the array $a$.

The next $q$ lines each contain three integers $s$, $d$, and $k$ ($1 \le s, d, k \le n$, $s + d \cdot (k - 1) \le n$).

It is guaranteed that the sum of $n$ over all testcases does not exceed $10^5$, and that the sum of $q$ over all testcases does not exceed $2 \cdot 10^5$.


For each testcase, print q numbers in a separate line — the desired sums, separated with space.



5
3 3
1 1 2
1 2 2
2 2 1
1 1 2
3 1
-100000000 -100000000 -100000000
1 1 3
5 3
1 2 3 4 5
1 2 3
2 3 2
1 1 5
3 1
100000000 100000000 100000000
1 1 3
7 7
34 87 5 42 -44 66 -32
2 2 2
4 3 1
1 3 2
6 2 1
5 2 2
2 5 2
6 1 2


5 1 3
-600000000
22 12 55
600000000
171 42 118 66 -108 23 2



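  For a query $(s, d, k)$, the indices of the corresponding sequence form an arithmetic progression with common difference $d$ whose first term is $a_{(s-1) \bmod d + 1}$. Hence $a_s$ is the $l = \left\lceil \frac{s}{d} \right\rceil$-th element of that sequence, and $a_{s+d(k-1)}$ is its $r = \left\lceil \frac{s+d(k-1)}{d} \right\rceil$-th element. The answer $a_s + 2a_{s+d} + \cdots + k \, a_{s+d(k-1)}$ can then be obtained from prefix sums: $\left( l \, a_s + (l+1) \, a_{s+d} + \cdots + r \, a_{s+d(k-1)} \right) - (l-1) \left( a_s + a_{s+d} + \cdots + a_{s+d(k-1)} \right)$.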
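The identity above can be checked on a single query with a minimal sketch. The function names `bruteForce` and `viaPrefixSums` are made up for this illustration, and the array is assumed to be stored 1-indexed with an unused `a[0]`. The second function rebuilds the prefix sums for one progression on every call, so it only demonstrates the formula and is not meant as an efficient solution.

```cpp
#include <cassert>
#include <vector>

using LL = long long;

// Direct evaluation of a_s*1 + a_{s+d}*2 + ... + a_{s+d(k-1)}*k.
// `a` is 1-indexed: a[0] is an unused placeholder (assumption of this sketch).
LL bruteForce(const std::vector<LL>& a, int s, int d, int k) {
    int n = (int)a.size() - 1;
    assert(s >= 1 && d >= 1 && k >= 1 && s + 1LL * d * (k - 1) <= n);
    LL sum = 0;
    for (int i = 0; i < k; i++) sum += (i + 1LL) * a[s + i * d];
    return sum;
}

// Same value, computed from prefix sums over the progression that contains index s.
LL viaPrefixSums(const std::vector<LL>& a, int s, int d, int k) {
    int n = (int)a.size() - 1;
    assert(s >= 1 && d >= 1 && k >= 1 && s + 1LL * d * (k - 1) <= n);
    int first = (s - 1) % d + 1;          // first index of the progression
    std::vector<LL> w(1, 0), p(1, 0);     // w: weighted prefix sums, p: plain prefix sums
    for (int idx = first, pos = 1; idx <= n; idx += d, pos++) {
        w.push_back(w.back() + pos * a[idx]);
        p.push_back(p.back() + a[idx]);
    }
    int l = (s + d - 1) / d;                 // position of a_s, ceil(s / d)
    int r = (s + d * (k - 1) + d - 1) / d;   // position of a_{s+d(k-1)}
    return (w[r] - w[l - 1]) - (l - 1LL) * (p[r] - p[l - 1]);
}
```

For the sample array `1 2 3 4 5`, both functions give 22 for the query `1 2 3` and 12 for the query `2 3 2`, matching the expected output.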

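  Since every query guarantees $s + d(k-1) \le n$, we can roughly treat the constraint as $d \cdot k = n$, and an expression of this form suggests sqrt decomposition. Answering one query by brute force takes $O\left(\frac{n}{d}\right)$ time, so if brute force is used only when $d > \sqrt{n}$, each such query costs $O(\sqrt{n})$ and all queries together cost $O(q\sqrt{n})$.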
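As a rough sketch of why $\sqrt{n}$ is a natural cut-off, let $B$ be a threshold on $d$ (the symbol $B$ is introduced here only for illustration). Steps up to $B$ are handled by tables and larger steps by brute force, and the two costs are of the same order when $B \approx \sqrt{n}$:

```latex
\begin{aligned}
T(B) &= \underbrace{O(nB)}_{\text{tables for } d \le B}
      + \underbrace{O\!\left(q \cdot \frac{n}{B}\right)}_{\text{brute force for } d > B}, \\
T(\sqrt{n}) &= O\!\left(n\sqrt{n} + q\sqrt{n}\right).
\end{aligned}
```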

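  For $d \le \sqrt{n}$, we first precompute the prefix sums of every sequence whose index step is at most $\sqrt{n}$, which takes $O(n\sqrt{n})$ time. Each such query can then be answered in $O(1)$ with the formula above.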

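  The preprocessing needs two kinds of prefix sums. The first is weighted: s1[i][j] denotes the sum of the first $\left\lceil \frac{j}{i} \right\rceil$ weighted elements of the sequence with index step $i$ that contains index $j$, with the recurrence $s1[i][j] = s1[i][j-i] + \left\lceil \frac{j}{i} \right\rceil \cdot a_j$. The second is a plain prefix sum: s2[i][j] denotes the sum of the first $\left\lceil \frac{j}{i} \right\rceil$ elements of that same sequence, with the recurrence $s2[i][j] = s2[i][j-i] + a_j$.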
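The two recurrences can be sketched in isolation as below. The struct name `SmallStepTables` and the `threshold` parameter are hypothetical, the tables are stored as nested vectors for brevity, and the array is again assumed to be 1-indexed with an unused `a[0]`. Queries are supported only for steps up to the threshold.

```cpp
#include <cassert>
#include <vector>

using LL = long long;

// Sketch of the small-step tables; names are illustrative only.
struct SmallStepTables {
    int n, B;
    std::vector<std::vector<LL>> s1, s2;   // s1[i][j], s2[i][j] for 1 <= i <= B, 0 <= j <= n

    SmallStepTables(const std::vector<LL>& a, int threshold)
        : n((int)a.size() - 1), B(threshold),
          s1(threshold + 1, std::vector<LL>(a.size(), 0)),
          s2(threshold + 1, std::vector<LL>(a.size(), 0)) {
        for (int i = 1; i <= B; i++) {
            for (int j = 1; j <= n; j++) {
                LL rank = (j + i - 1) / i;                 // ceil(j / i)
                LL prevW = j > i ? s1[i][j - i] : 0;
                LL prevP = j > i ? s2[i][j - i] : 0;
                s1[i][j] = prevW + rank * a[j];            // weighted recurrence
                s2[i][j] = prevP + a[j];                   // plain recurrence
            }
        }
    }

    // O(1) answer for a query with d <= B.
    LL query(int s, int d, int k) const {
        assert(1 <= d && d <= B && s >= 1 && k >= 1 && s + 1LL * d * (k - 1) <= n);
        int r = s + d * (k - 1);          // last index of the query
        LL rankBefore = (s - 1) / d;      // ceil(s / d) - 1
        LL w = s1[d][r], p = s2[d][r];
        if (s > d) {                      // drop the part of the progression before a_s
            w -= s1[d][s - d];
            p -= s2[d][s - d];
        }
        return w - rankBefore * p;
    }
};
```

With the last sample array `34 87 5 42 -44 66 -32` and a threshold of 3, `query(2, 2, 2)` gives 171 and `query(5, 2, 2)` gives -108, as in the expected output.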

  The AC code follows; its time complexity is $O(n\sqrt{n} + q\sqrt{n})$.

#include <bits/stdc++.h>
using namespace std;

typedef long long LL;

const int N = 1e5 + 10, M = 310;

int a[N];
LL s1[M][N], s2[M][N];

void solve() {
    int n, m;
    scanf("%d %d", &n, &m);
    for (int i = 1; i <= n; i++) {
        scanf("%d", a + i);
    }
    // Precompute both prefix sums for every step i up to the threshold 300.
    for (int i = 1; i <= 300; i++) {
        for (int j = 1; j <= n; j++) {
            s1[i][j] = (j + i - 1ll) / i * a[j];    // weight is ceil(j / i)
            s2[i][j] = a[j];
            if (j - i > 0) {
                s1[i][j] += s1[i][j - i];
                s2[i][j] += s2[i][j - i];
            }
        }
    }
    while (m--) {
        int s, d, k;
        scanf("%d %d %d", &s, &d, &k);
        if (d <= 300) {
            // Small step: O(1) from the tables; l is the index just before a_s in the progression (0 if none).
            int l = max(0, s - d), r = s + d * (k - 1);
            printf("%lld ", s1[d][r] - s1[d][l] - (l + d - 1) / d * (s2[d][r] - s2[d][l]));
        }
        else {
            // Large step: at most n / d terms, so sum them directly.
            LL ret = 0;
            for (int i = 0; i < k; i++) {
                ret += (i + 1ll) * a[s + i * d];
            }
            printf("%lld ", ret);
        }
    }
    printf("\n");
}

int main() {
    int t;
    scanf("%d", &t);
    while (t--) {
        solve();
    }

    return 0;
}

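  Here is also the approach I tried first. It likewise precomputes all the prefix sums: for each common difference $d$ of the indices there are $d$ different first terms, so the total time complexity is $O\left(\sum_{i=1}^{n} i \cdot \frac{n}{i}\right) = O(n^2)$. But there are at most $q$ queries, which means only $q$ of these prefix-sum arrays need to be computed.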

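  So for each query $(s, d, k)$ I record the pair $((s-1) \bmod d + 1, \, d)$. If the pair has not appeared before, I compute the prefix sums of the sequence with index step $d$ and first term $a_{(s-1) \bmod d + 1}$, store the result, and tag it with an ID. Any later query with the same pair can then be answered in $O(1)$.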
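The caching idea can be sketched as follows. This is an illustrative reconstruction rather than the original submission: the struct name `ProgressionCache` is made up, and a `std::map` keyed by the pair stands in for the ID tagging described above. The array is assumed to be 1-indexed with an unused `a[0]`.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using LL = long long;

// Illustrative sketch: prefix sums are built lazily, once per (first index, step) pair.
struct ProgressionCache {
    std::vector<LL> a;   // 1-indexed copy of the array, a[0] unused
    // (first index, step) -> {weighted prefix sums, plain prefix sums}
    std::map<std::pair<int, int>, std::pair<std::vector<LL>, std::vector<LL>>> cache;

    explicit ProgressionCache(std::vector<LL> arr) : a(std::move(arr)) {}

    LL query(int s, int d, int k) {
        int n = (int)a.size() - 1;
        assert(s >= 1 && d >= 1 && k >= 1 && s + 1LL * d * (k - 1) <= n);
        int first = (s - 1) % d + 1;
        auto it = cache.find({first, d});
        if (it == cache.end()) {               // first time this progression is seen
            std::vector<LL> wNew(1, 0), pNew(1, 0);
            for (int idx = first, pos = 1; idx <= n; idx += d, pos++) {
                wNew.push_back(wNew.back() + pos * a[idx]);
                pNew.push_back(pNew.back() + a[idx]);
            }
            it = cache.emplace(std::make_pair(first, d),
                               std::make_pair(std::move(wNew), std::move(pNew))).first;
        }
        const std::vector<LL>& w = it->second.first;
        const std::vector<LL>& p = it->second.second;
        int l = (s - 1) / d;    // number of progression elements before a_s
        int r = l + k;          // position of the last queried element
        return (w[r] - w[l]) - l * (p[r] - p[l]);
    }
};
```

Two queries such as `1 2 3` and `3 2 2` share the pair $(1, 2)$, so the second one reuses the stored prefix sums. The cache keeps growing as new pairs arrive, which is the source of the memory problem discussed next.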

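  Now consider the time and space complexity. In the worst case every query has a different pair $((s-1) \bmod d + 1, \, d)$, and these are the $q$ sequences that are most expensive to compute, i.e. those of lengths $\frac{n}{1}, \frac{n}{2}, \frac{n}{2}, \frac{n}{3}, \ldots, \frac{n}{\sqrt{2q}}$. The time and space complexity are therefore both $O\left(\sum_{i=1}^{\sqrt{2q}} i \cdot \frac{n}{i}\right) = O(\sqrt{q} \cdot n)$.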
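The bound $\sqrt{2q}$ can be motivated by a short counting argument, sketched here with $D$ as an illustrative name for the largest step among the $q$ most expensive sequences. Step $i$ contributes $i$ sequences, each of length about $\frac{n}{i}$:

```latex
\begin{aligned}
\sum_{i=1}^{D} i = \frac{D(D+1)}{2} \ge q
  &\;\Longrightarrow\; D \approx \sqrt{2q}, \\
\sum_{i=1}^{D} i \cdot \frac{n}{i} = nD
  &\approx n\sqrt{2q} = O(\sqrt{q} \cdot n).
\end{aligned}
```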

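  When $n$ and $q$ both take the maximum values given in the statement, the memory used is roughly $633 \times 10^5 \times 8 \times 2 / 2^{20} \approx 965$ MB. The limit given by the problem is 1024 MB, yet I kept getting MLE.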
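The estimate can be reproduced with a small helper. The function name `worstCaseMiB` is hypothetical, and the calculation counts only the two arrays of 8-byte prefix sums, as in the formula above, so it says nothing about any other memory the program uses.

```cpp
#include <cassert>
#include <cmath>

// Rough size in MiB of the two cached prefix-sum tables in the worst case:
// ceil(sqrt(2q)) * n elements, 8 bytes each, two tables.
double worstCaseMiB(long long n, long long q) {
    assert(n > 0 && q > 0);
    double steps = std::ceil(std::sqrt(2.0 * q));   // 633 for q = 2 * 10^5
    double bytes = steps * n * 8.0 * 2.0;
    return bytes / (1 << 20);
}
```

Calling `worstCaseMiB(100000, 200000)` gives a value a little under 966, in line with the figure quoted above.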

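  So the optimization is to process the queries offline: group them by $((s-1) \bmod d + 1, \, d)$, handle all queries with the same step and first term together, and release the space once the group is done. This brings the space complexity down to $O(n)$.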

  The AC code follows; its time complexity is $O(q \log q + \sqrt{q} \cdot n)$.

#include <bits/stdc++.h>
using namespace std;

typedef long long LL;

const int N = 2e5 + 10;

int a[N];
LL s1[N], s2[N];
array<int, 3> q[N];
int p[N];
LL ans[N];

LL get(int x, int y) {    // map the pair (first index, step) to a single integer
    return ((x - 1) % y + 1) * 100001ll + y;
}

void solve() {
    int n, m;
    scanf("%d %d", &n, &m);
    for (int i = 1; i <= n; i++) {
        scanf("%d", a + i);
    }
    for (int i = 0; i < m; i++) {
        scanf("%d %d %d", &q[i][0], &q[i][1], &q[i][2]);
        p[i] = i;
    }
    // Sort query indices so that queries on the same progression are adjacent.
    sort(p, p + m, [&](int i, int j) {
        return get(q[i][0], q[i][1]) < get(q[j][0], q[j][1]);
    });
    for (int i = 0; i < m; i++) {
        int s = q[p[i]][0], d = q[p[i]][1], k = q[p[i]][2];
        for (int i = (s - 1) % d + 1, j = 1; i <= n; i += d, j++) {    // build the prefix sums first
            s1[j] = s1[j - 1] + 1ll * a[i] * j;
            s2[j] = s2[j - 1] + a[i];
        }
        int j = i;
        while (j < m && get(q[p[i]][0], q[p[i]][1]) == get(q[p[j]][0], q[p[j]][1])) {    // handle all queries on the same sequence together
            s = q[p[j]][0], d = q[p[j]][1], k = q[p[j]][2];
            int l = (s + d - 1) / d - 1, r = (s + d * (k - 1) + d - 1) / d;
            ans[p[j++]] = s1[r] - s1[l] - (s2[r] - s2[l]) * l;
        }
        i = j - 1;
    }
    for (int i = 0; i < m; i++) {
        printf("%lld ", ans[i]);
    }
    printf("\n");
}

int main() {
    int t;
    scanf("%d", &t);
    while (t--) {
        solve();
    }

    return 0;
}



  Editorial for Codeforces Round 920 (Div. 3): https://codeforces.com/blog/entry/124757

  Codeforces Round 920 (Div. 3) F Sum of Progression, sqrt decomposition + prefix sums: https://zhuanlan.zhihu.com/p/678014982

