四毛子算法教学

介绍

四毛子算法是一种可以接近 \(O(n)\) 级求解 \(RMQ\) 问题的算法，正宗的四毛子算法涉及到比较多的知识点且常数较大，我们常常使用朴素的简化版四毛子算法。

正宗四毛子算法引入

1. 什么是笛卡尔树

笛卡尔树如果你接触过平衡树，那将并不会陌生，它是 \(treap\) 的一种，当然常常的，我们为了维护序列上的平衡树，即文艺平衡树，以下标作为键值。

稍微提提与线段树对标的文艺平衡树是什么东西，且它的优势之处。对于普通的平衡树而言，它是一种以值作为比较对象的平衡二叉树，即左子树的值都比当前节点小，右子树则比自身大，那么我们如果将这个值换成下标，则左子树是比自身下标小的数，右子树是比自身下标大的数。那么显而易见，中序遍历拿到的序列，是从下标小的到下标大的，即为原序列。

笛卡尔树是一种 \(treap\)，在满足上述前提，同时满足堆的特性，我们在文艺平衡树中如果以 \(rnk\) 作为它的堆的比较值，那么它就是一棵正儿八经的 \(treap\) 了。而在 \(RMQ\) 问题中，我们常常以该下标处的值为堆的比较对象。

如图所示，容易看出笛卡尔树不仅下标作为了 \(BST\) 的键值，它的 \(val\) 也作为了堆的优先级，越上面的越小，这是个小根堆。

怎么建树？这里有一种单调栈建法，以小根堆为例，我们单调栈维护 \(val\) 从小到大的递增栈，那么我们从左往右依次加，栈中即为在当前插入点下标更小的元素，对于当前节点来说，如果在它之前且比它大的最后一个元素，显然即为它的左儿子。而栈顶即为最近下标比它小，且值也比它小，理应它为对方的右儿子。

参照大根堆建树代码

int root;
int a[N];
int child[N][2];

inline void build()
{
    stack<int> st;
    forn(i, 1, n)
    {
        int last = 0;
        while (!st.empty() and a[st.top()] < a[i])last = st.top(), st.pop();
        if (!st.empty())child[st.top()][1] = i;
        child[i][0] = last, st.push(i);
    }
    while (!st.empty())
    {
        if (st.size() == 1)root = st.top();
        st.pop();
    }
}

那么笛卡尔树的特点，我们可以用来干嘛？首先最基础的一点，大部分的 \(BST\) 都可以类似线段树一样，从中间往两边递归建树，满足 \(\log{n}\) 的期望高度，而如果我们的 \(treap\) 如果也参照如此，那么很显然会破坏堆的性质，虽然在平衡树当中 \(rnk\) 仅仅只是防止退化，不满足堆的特点也没关系。但我们可以利用这种建树方式，也可以达到 \(O(n)\) 级建造 \(treap\) 类树。

另一个特点，堆的特点。对于 \([l,r]\) 的 \(rmq\) 结果即为它们的 \(LCA\) 对应的值。

稍微证明下：

例如以这个为例，我们容易知道 \(val=10\) 的点的父亲的下标一定是 \(<left_{all}\)，所以这个容易推广到所有情形，对于一个 \(LCA\) 来说它的父亲及其以上部分的点都一定不包括 \([l,r]\) 区间。而从 \(l \rightarrow LCA\) 和 \(LCA \rightarrow r\)，这两条路径中。

显然 \(l \rightarrow LCA\) 要么直接走到，要么中途经历了拐弯。直接走到很简单，但很显然有一个显而易见的性质，\(l\le LCA \le r\)，这个性质来源于左右子树，而 \(LCA\) 又是在 \(fa(LCA)\) 之下的最高点，即以 \(LCA\) 作为的堆顶的堆的顶点，它上方的 \(fa\) 要么
\(< [l,r]\) 要么 \(>[l,r]\)，一定不在 \([l,r]\) 中，得证。这样一来，我们就可以把 \(RMQ\) 问题改为树上 \(LCA\) 问题了。

2.欧拉序求LCA怎么求

欧拉序有常见的两种：

这个就是第一种，这种我们只需要在 \(dfs\) 每个相邻节点结束加入即可，同时记录最早访问时间，即为 \(dfn\) 序。容易观察到每个点至多有 \(\text{度数} +1\) 个点，总个数为 \(2 \times n-1\) ，这个我们可以根据树的总边数为 \(n-1\)，然后每个点至少访问一次共 \(n\) 次，所以总共为 \(2n-1\)。

特点：

欧拉序上任意两个点最先出现的时候之间出现的点深度最浅的即为 \(LCA\)，根据 \(dfs\) 的过程不难得知这是正确的，因为我们是从一个点往上回溯到 LCA 再遍历另一个点。
欧拉序任意两个点的深度之差为 \(1\) 或者 \(-1\)。不难得知这也是正确的，因为我们 \(dfs\) 向下或者回溯向上递归一层就会加入一个点。

另一种即为常说的 \(dfs\) 序，特点就是一个点的 dfs 序的起点和终点即为子树上的点，在这篇文章中不再赘述。

那么知道这个性质以后，我们求出欧拉序以后，直接使用 \(ST\) 表，维护每个 \(2^i\) 次方长的答案即可，两段 \(2^i\) 次方长的答案拼成一段 \(2^{i+1}\) 长的答案，即根据这两段答案的 \(deep\) 比较，谁更浅当然谁就是这段的 \(LCA\) 了。这样就可以欧拉序求出 LCA 了。当然这种 \(m\log{m}，m=2n-1\) 的预处理以后，我们可以做到 \(O(1)\) 回答 \(LCA\)，当然预处理也可以通过分块 \(ST\) 表做到 \(O(m)\)，接下来阐述。

3. 分块+ST表

这是一个个人觉得比较天才的东西，确保这两个知识点你已经很熟悉了。我们知道 \(ST\) 表的复杂度瓶颈为 \(O(n\log{n})\) 的预处理，那么如果我们预处理要求只能使用 \(O(n)\)，我们能做到什么程度？前人已经帮我们回答了这个问题：

如果我们取块长为 \(\dfrac{\lceil \log{n}\rceil}{2}\)，则有块数为：\(\dfrac{2n}{\lceil \log{n}\rceil}\)，我们预处理出每个块的答案，然后把每个块看做单点，作为 ST 表维护的区间信息，即 \([l,r]\) 为第 \(l\) 块到第 \(r\) 块的答案。这样一来实际在 ST 表中的 \(n=\dfrac{2n}{\lceil \log{n}\rceil}\)，那么我们的预处理复杂度变为了：

\[\dfrac{2n}{\lceil \log{n}\rceil} \times \log{\dfrac{2n}{\lceil \log{n}\rceil}}=\dfrac{2n}{\lceil \log{n}\rceil} \times (\log2 +\log{n}-\log{\log{\lceil n \rceil}}) \]

\[\sim \dfrac{2n}{\lceil \log{n}\rceil} \times \log{n}\sim 2n \sim O(n) \]

而在 \(2e7\) 左右的条件下，块长仅仅为 \(14\) 左右。所以对于散块而言，其实直接有的人觉得可以暴力：\(O(14)\) 了，但在刻意制造数据下 \(14n\) 并不小，我们继续尝试优化散块复杂度。

4.+-1 的 RMQ 问题

注意到欧拉序的特点，相邻两个数深度要么 \(+1\)，要么 \(-1\)，那么对于每个块而言我们可以把这个信息记录下来，可以考虑状压，\(+1\) 就是 \(0\) 相比于前一个点高度 \(+1\)，而 \(-1\) 用 \(1\) 记录，则是相比较于前一个点深度为 \(-1\)。现在询问你深度最浅的点。

\(+1-1-1=011\) 这样压缩以后，我们深度变化就可以知道了 \(1,0,-1\)，显然最浅的点为 \(3\) 这个位置对应的点。那么散块提取出来的实际上就是一种 \(+1+1-1...\) 这种序列的状压情况。我们观察到 \(siz\) 不大，在 \(2e7\) 以下也仅仅为 \(14\)，实际写一下：

\(2^{\dfrac{\lceil \log{n}\rceil}{2}}=\sqrt{n}\)，并不是很大，那么我们可以枚举所有的 \(+1-1\) 对应的状压情况，并处理出这种情况下的答案，枚举的时候注意到还有枚举序列求答案，那么预处理复杂度为：

\[\sqrt{n} \times \dfrac{\lceil \log{n}\rceil}{2} \le n \]

这个复杂度极小，那么我们可以求出每种状态数从开头往后走完最佳答案位置。我们对于一个散块的 \([l,r]\) 查询而言，根据掩码位运算，使得当前块的状压量只保留 \([l,r]\) 的情况，并且需要移动使得 \(l\) 处对齐到开始位置，保证是从 \(l\) 位置开始时候的真正状态情况，就能做到 \(O(1)\) 了。\(idx\) 求出该点在块中 \(id\)，\(pos\) 为获取块 \(id\)，\(Diff\) 则为每个块的状压数，\(minIdx\) 则为每种状态量对应的答案位置，我们将 \(l\) 加上这个偏移位置就正确了。

inline int query(const int l, const int r)
{
    const int L = idx(l), R = idx(r);
    const int status = (Diff[pos(l)] & (1 << R) - 1) >> L;
    return a[eulr[l + minIdx[status]]];
}

例题：P3793 由乃救爷爷

正宗的四毛子常数是巨大的，因为要建树：\(O(n)\)，遍历树获得欧拉序 \(O(n)\)，\(m=2n-1\)，构建 \(ST\) 表：\(O(m)\)，预处理每个块的状压数和每种状态量答案：\(O(m+\sqrt{m}\times \dfrac{log{m}}{2})\)，总的预处理复杂度为：\(6n\) 左右，前提是不加上预处理分块的起点终点那些辅助数组以及 \(LOG2\) 之类的东西。除此之外，空间开销也是巨大的，所以并不常用，给出一份本题正宗四毛子实现代码，有些点会 \(MLE\)。

参照代码

#include <bits/stdc++.h>

// #pragma GCC optimize(2)
// #pragma GCC optimize("Ofast,no-stack-protector,unroll-loops,fast-math")
// #pragma GCC target("sse,sse2,sse3,ssse3,sse4.1,sse4.2,avx,avx2,popcnt,tune=native")

#define isPbdsFile

#ifdef isPbdsFile

#include <bits/extc++.h>

#else

#include <ext/pb_ds/priority_queue.hpp>
#include <ext/pb_ds/hash_policy.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <ext/pb_ds/trie_policy.hpp>
#include <ext/pb_ds/tag_and_trait.hpp>
#include <ext/pb_ds/hash_policy.hpp>
#include <ext/pb_ds/list_update_policy.hpp>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/exception.hpp>
#include <ext/rope>

#endif

using namespace std;
using namespace __gnu_cxx;
using namespace __gnu_pbds;
typedef long long ll;
typedef long double ld;
typedef pair<int, int> pii;
typedef pair<ll, ll> pll;
typedef tuple<int, int, int> tii;
typedef tuple<ll, ll, ll> tll;
typedef unsigned int ui;
typedef unsigned long long ull;
typedef __int128 i128;
#define hash1 unordered_map
#define hash2 gp_hash_table
#define hash3 cc_hash_table
#define stdHeap std::priority_queue
#define pbdsHeap __gnu_pbds::priority_queue
#define sortArr(a, n) sort(a+1,a+n+1)
#define all(v) v.begin(),v.end()
#define yes cout<<"YES"
#define no cout<<"NO"
#define Spider ios_base::sync_with_stdio(false);cin.tie(nullptr);cout.tie(nullptr);
#define MyFile freopen("..\\input.txt", "r", stdin),freopen("..\\output.txt", "w", stdout);
#define forn(i, a, b) for(int i = a; i <= b; i++)
#define forv(i, a, b) for(int i=a;i>=b;i--)
#define ls(x) (x<<1)
#define rs(x) (x<<1|1)
#define endl '\n'
//用于Miller-Rabin
[[maybe_unused]] static int Prime_Number[13] = {0, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};

template <typename T>
int disc(T* a, int n)
{
    return unique(a + 1, a + n + 1) - (a + 1);
}

template <typename T>
T lowBit(T x)
{
    return x & -x;
}

template <typename T>
T Rand(T l, T r)
{
    static mt19937 Rand(time(nullptr));
    uniform_int_distribution<T> dis(l, r);
    return dis(Rand);
}

template <typename T1, typename T2>
T1 modt(T1 a, T2 b)
{
    return (a % b + b) % b;
}

template <typename T1, typename T2, typename T3>
T1 qPow(T1 a, T2 b, T3 c)
{
    a %= c;
    T1 ans = 1;
    for (; b; b >>= 1, (a *= a) %= c)if (b & 1)(ans *= a) %= c;
    return modt(ans, c);
}

template <typename T>
void read(T& x)
{
    x = 0;
    T sign = 1;
    char ch = getchar();
    while (!isdigit(ch))
    {
        if (ch == '-')sign = -1;
        ch = getchar();
    }
    while (isdigit(ch))
    {
        x = (x << 3) + (x << 1) + (ch ^ 48);
        ch = getchar();
    }
    x *= sign;
}

template <typename T, typename... U>
void read(T& x, U&... y)
{
    read(x);
    read(y...);
}

template <typename T>
void write(T x)
{
    if (typeid(x) == typeid(char))return;
    if (x < 0)x = -x, putchar('-');
    if (x > 9)write(x / 10);
    putchar(x % 10 ^ 48);
}

template <typename C, typename T, typename... U>
void write(C c, T x, U... y)
{
    write(x), putchar(c);
    write(c, y...);
}


template <typename T11, typename T22, typename T33>
struct T3
{
    T11 one;
    T22 tow;
    T33 three;

    bool operator<(const T3 other) const
    {
        if (one == other.one)
        {
            if (tow == other.tow)return three < other.three;
            return tow < other.tow;
        }
        return one < other.one;
    }

    T3() { one = tow = three = 0; }

    T3(T11 one, T22 tow, T33 three) : one(one), tow(tow), three(three)
    {
    }
};

template <typename T1, typename T2>
void uMax(T1& x, T2 y)
{
    if (x < y)x = y;
}

template <typename T1, typename T2>
void uMin(T1& x, T2 y)
{
    if (x > y)x = y;
}

struct
{
    unsigned z1, z2, z3, z4, b;

    unsigned rand_()
    {
        b = (z1 << 6 ^ z1) >> 13;
        z1 = (z1 & 4294967294U) << 18 ^ b;
        b = (z2 << 2 ^ z2) >> 27;
        z2 = (z2 & 4294967288U) << 2 ^ b;
        b = (z3 << 13 ^ z3) >> 21;
        z3 = (z3 & 4294967280U) << 7 ^ b;
        b = (z4 << 3 ^ z4) >> 12;
        z4 = (z4 & 4294967168U) << 13 ^ b;
        return z1 ^ z2 ^ z3 ^ z4;
    }

    void srand(const unsigned x)
    {
        z1 = x;
        z2 = ~x ^ 0x233333333U;
        z3 = x ^ 0x1234598766U;
        z4 = ~x + 51;
    }

    int read()
    {
        int a = rand_() & 32767;
        int b = rand_() & 32767;
        return a * 32768 + b;
    }
} in;

constexpr int N = 2e7 + 10;
int n, m, root;
unsigned S;
int a[N];
int child[N][2];

inline void build()
{
    stack<int> st;
    forn(i, 1, n)
    {
        int last = 0;
        while (!st.empty() and a[st.top()] < a[i])last = st.top(), st.pop();
        if (!st.empty())child[st.top()][1] = i;
        child[i][0] = last, st.push(i);
    }
    while (!st.empty())
    {
        if (st.size() == 1)root = st.top();
        st.pop();
    }
}

int deep[N], eulr[N << 1], cnt, dfn[N];

inline void dfs(const int curr, const int fa)
{
    eulr[++cnt] = curr, dfn[curr] = cnt;
    deep[curr] = deep[fa] + 1;
    forn(i, 0, 1)
    {
        const int nxt = child[curr][i];
        if (!nxt or nxt == fa)continue;
        dfs(nxt, curr);
        eulr[++cnt] = curr;
    }
}

constexpr int MX = 4e7 + 10;
constexpr int SIZE = (log2(MX) + 1) / 2 + 1;
constexpr int CNT = (MX + SIZE - 2) / (SIZE - 1) + 1;
int blockSize, blockCnt;
#define s(x) ((x-1)*blockSize+1)
#define e(x) min(x*blockSize,cnt)
#define pos(x) ((x-1)/blockSize+1)
#define idx(x) ((x-1)%blockSize+1)
constexpr int T = ceil(log2(CNT));
int st[CNT][T];
int Diff[CNT], minIdx[1 << SIZE | 1];

inline void init()
{
    blockSize = (log2(cnt) + 1) / 2;
    blockCnt = (cnt + blockSize - 1) / blockSize;
    forn(i, 1, blockCnt)
    {
        int minDeep = MX, ans = 0;
        forn(j, s(i), e(i))if (deep[eulr[j]] < minDeep)minDeep = deep[eulr[j]], ans = eulr[j];
        st[i][0] = ans;
    }
    const int k = log2(cnt) + 1;
    forn(j, 1, k)
    {
        forn(i, 1, blockCnt-(1<<j)+1)
        {
            const int L = st[i][j - 1], R = st[i + (1 << j - 1)][j - 1];
            st[i][j] = deep[L] < deep[R] ? L : R;
        }
    }
    forn(i, 1, blockCnt)
    {
        int pre = N;
        forn(j, s(i), e(i))
        {
            if (deep[eulr[j]] < pre)Diff[i] |= 1 << idx(j) - 1;
            pre = deep[eulr[j]];
        }
    }
    forn(status, 1, (1<<blockSize)-1)
    {
        int minDeep = 0, minCurr = 0;
        int curr = 0;
        forn(i, 1, blockSize)
        {
            curr += status & 1 << i - 1 ? -1 : 1;
            if (curr < minDeep)minDeep = curr, minCurr = i;
        }
        minIdx[status] = minCurr;
    }
}

inline int query(const int l, const int r)
{
    const int L = idx(l), R = idx(r);
    const int status = (Diff[pos(l)] & (1 << R) - 1) >> L;
    return a[eulr[l + minIdx[status]]];
}

inline int queryMax(const int l, const int r)
{
    const int L = pos(l), R = pos(r);
    if (L == R)return query(l, r);
    int ans = max(query(l, e(L)), query(s(R), r));
    if (L + 1 <= R - 1)
    {
        const int queryL = L + 1, queryR = R - 1;
        const int k = log2(queryR - queryL + 1);
        const int v1 = st[queryL][k], v2 = st[queryR - (1 << k) + 1][k];
        uMax(ans, deep[v1] < deep[v2] ? a[v1] : a[v2]);
    }
    return ans;
}

ull ans;

inline void solve()
{
    cin >> n >> m >> S;
    in.srand(S);
    forn(i, 1, n)a[i] = in.read();
    build();
    dfs(root, 0);
    init();
    while (m--)
    {
        int l = in.read() % n + 1, r = in.read() % n + 1;
        if (dfn[l] > dfn[r])swap(l, r);
        ans += queryMax(dfn[l], dfn[r]);
    }
    cout << ans;
}

signed int main()
{
    // MyFile
    Spider
    //------------------------------------------------------
    // clock_t start = clock();
    int test = 1;
    //    read(test);
    // cin >> test;
    forn(i, 1, test)solve();
    //    while (cin >> n, n)solve();
    //    while (cin >> test)solve();
    // clock_t end = clock();
    // cerr << "time = " << double(end - start) / CLOCKS_PER_SEC << "s" << endl;
}

我们这里给出常用的朴素四毛子做法，还是基于第三点，我们散块直接暴力就行了，当然散块如果你预处理了前后缀最值，那么很显然不涉及到 \(l,r\) 属于同一个块 \(r-l<14\) 就是纯 \(O(1)\) 查询和预处理了。同一个块最坏则为 \(O(14)\)，但这显然是 \(2e7\) 级别的数据下，如果放在常见的 \(1e5\) 或者 \(1e6\) 以下，这个预处理复杂度是远小于倍增处理 \(ST\) 表的。

参照代码

#include <bits/stdc++.h>

// #pragma GCC optimize(2)
// #pragma GCC optimize("Ofast,no-stack-protector,unroll-loops,fast-math")
// #pragma GCC target("sse,sse2,sse3,ssse3,sse4.1,sse4.2,avx,avx2,popcnt,tune=native")

#define isPbdsFile

#ifdef isPbdsFile

#include <bits/extc++.h>

#else

#include <ext/pb_ds/priority_queue.hpp>
#include <ext/pb_ds/hash_policy.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <ext/pb_ds/trie_policy.hpp>
#include <ext/pb_ds/tag_and_trait.hpp>
#include <ext/pb_ds/hash_policy.hpp>
#include <ext/pb_ds/list_update_policy.hpp>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/exception.hpp>
#include <ext/rope>

#endif

using namespace std;
using namespace __gnu_cxx;
using namespace __gnu_pbds;
typedef long long ll;
typedef long double ld;
typedef pair<int, int> pii;
typedef pair<ll, ll> pll;
typedef tuple<int, int, int> tii;
typedef tuple<ll, ll, ll> tll;
typedef unsigned int ui;
typedef unsigned long long ull;
typedef __int128 i128;
#define hash1 unordered_map
#define hash2 gp_hash_table
#define hash3 cc_hash_table
#define stdHeap std::priority_queue
#define pbdsHeap __gnu_pbds::priority_queue
#define sortArr(a, n) sort(a+1,a+n+1)
#define all(v) v.begin(),v.end()
#define yes cout<<"YES"
#define no cout<<"NO"
#define Spider ios_base::sync_with_stdio(false);cin.tie(nullptr);cout.tie(nullptr);
#define MyFile freopen("..\\input.txt", "r", stdin),freopen("..\\output.txt", "w", stdout);
#define forn(i, a, b) for(int i = a; i <= b; i++)
#define forv(i, a, b) for(int i=a;i>=b;i--)
#define ls(x) (x<<1)
#define rs(x) (x<<1|1)
#define endl '\n'
//用于Miller-Rabin
[[maybe_unused]] static int Prime_Number[13] = {0, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};

template <typename T>
int disc(T* a, int n)
{
    return unique(a + 1, a + n + 1) - (a + 1);
}

template <typename T>
T lowBit(T x)
{
    return x & -x;
}

template <typename T>
T Rand(T l, T r)
{
    static mt19937 Rand(time(nullptr));
    uniform_int_distribution<T> dis(l, r);
    return dis(Rand);
}

template <typename T1, typename T2>
T1 modt(T1 a, T2 b)
{
    return (a % b + b) % b;
}

template <typename T1, typename T2, typename T3>
T1 qPow(T1 a, T2 b, T3 c)
{
    a %= c;
    T1 ans = 1;
    for (; b; b >>= 1, (a *= a) %= c)if (b & 1)(ans *= a) %= c;
    return modt(ans, c);
}

template <typename T>
void read(T& x)
{
    x = 0;
    T sign = 1;
    char ch = getchar();
    while (!isdigit(ch))
    {
        if (ch == '-')sign = -1;
        ch = getchar();
    }
    while (isdigit(ch))
    {
        x = (x << 3) + (x << 1) + (ch ^ 48);
        ch = getchar();
    }
    x *= sign;
}

template <typename T, typename... U>
void read(T& x, U&... y)
{
    read(x);
    read(y...);
}

template <typename T>
void write(T x)
{
    if (typeid(x) == typeid(char))return;
    if (x < 0)x = -x, putchar('-');
    if (x > 9)write(x / 10);
    putchar(x % 10 ^ 48);
}

template <typename C, typename T, typename... U>
void write(C c, T x, U... y)
{
    write(x), putchar(c);
    write(c, y...);
}


template <typename T11, typename T22, typename T33>
struct T3
{
    T11 one;
    T22 tow;
    T33 three;

    bool operator<(const T3 other) const
    {
        if (one == other.one)
        {
            if (tow == other.tow)return three < other.three;
            return tow < other.tow;
        }
        return one < other.one;
    }

    T3() { one = tow = three = 0; }

    T3(T11 one, T22 tow, T33 three) : one(one), tow(tow), three(three)
    {
    }
};

template <typename T1, typename T2>
void uMax(T1& x, T2 y)
{
    if (x < y)x = y;
}

template <typename T1, typename T2>
void uMin(T1& x, T2 y)
{
    if (x > y)x = y;
}

struct
{
    unsigned z1, z2, z3, z4, b;

    unsigned rand_()
    {
        b = (z1 << 6 ^ z1) >> 13;
        z1 = (z1 & 4294967294U) << 18 ^ b;
        b = (z2 << 2 ^ z2) >> 27;
        z2 = (z2 & 4294967288U) << 2 ^ b;
        b = (z3 << 13 ^ z3) >> 21;
        z3 = (z3 & 4294967280U) << 7 ^ b;
        b = (z4 << 3 ^ z4) >> 12;
        z4 = (z4 & 4294967168U) << 13 ^ b;
        return z1 ^ z2 ^ z3 ^ z4;
    }

    void srand(const unsigned x)
    {
        z1 = x;
        z2 = ~x ^ 0x233333333U;
        z3 = x ^ 0x1234598766U;
        z4 = ~x + 51;
    }

    int read()
    {
        int a = rand_() & 32767;
        int b = rand_() & 32767;
        return a * 32768 + b;
    }
} in;

constexpr int N = 2e7 + 10;
int n, m, root;
unsigned S;
int a[N];
constexpr int SIZE = (log2(N) + 1) / 2 + 1;
constexpr int CNT = (N + SIZE - 2) / (SIZE - 1) + 1;
int blockSize, blockCnt;
#define s(x) ((x-1)*blockSize+1)
#define e(x) min(x*blockSize,n)
#define pos(x) ((x-1)/blockSize+1)
#define idx(x) ((x-1)%blockSize+1)
constexpr int T = ceil(log2(CNT));
int st[CNT][T];
int Diff[CNT], minIdx[1 << SIZE | 1];
int pre[N], suf[N];

inline void init()
{
    blockSize = (log2(n) + 1) / 2;
    blockCnt = (n + blockSize - 1) / blockSize;
    forn(i, 1, blockCnt)
    {
        int ans = 0;
        forn(j, s(i), e(i))uMax(ans, a[j]);
        int curr = 0;
        forn(j, s(i), e(i))uMax(curr, a[j]), pre[j] = curr;
        curr = 0;
        forv(j, e(i), s(i))uMax(curr, a[j]), suf[j] = curr;
        st[i][0] = ans;
    }
    const int k = log2(blockCnt) + 1;
    forn(j, 1, k)
    {
        forn(i, 1, blockCnt-(1<<j)+1)
        {
            const int L = st[i][j - 1], R = st[i + (1 << j - 1)][j - 1];
            st[i][j] = max(L, R);
        }
    }
}

ull ans;

inline int queryMax(const int l, const int r)
{
    const int L = pos(l), R = pos(r);
    int ans = 0;
    if (L == R)
    {
        forn(i, l, r)uMax(ans, a[i]);
        return ans;
    }
    uMax(ans, suf[l]);
    uMax(ans, pre[r]);
    if (L + 1 <= R - 1)
    {
        const int queryL = L + 1, queryR = R - 1;
        const int k = log2(queryR - queryL + 1);
        const int v1 = st[queryL][k], v2 = st[queryR - (1 << k) + 1][k];
        uMax(ans, max(v1, v2));
    }
    return ans;
}

inline void solve()
{
    cin >> n >> m >> S;
    in.srand(S);
    forn(i, 1, n)a[i] = in.read();
    init();
    while (m--)
    {
        int l = in.read() % n + 1, r = in.read() % n + 1;
        if (l > r)swap(l, r);
        ans += queryMax(l, r);
    }
    cout << ans;
}

signed int main()
{
    // MyFile
    Spider
    //------------------------------------------------------
    // clock_t start = clock();
    int test = 1;
    //    read(test);
    // cin >> test;
    forn(i, 1, test)solve();
    //    while (cin >> n, n)solve();
    //    while (cin >> test)solve();
    // clock_t end = clock();
    // cerr << "time = " << double(end - start) / CLOCKS_PER_SEC << "s" << endl;
}

当然了，四毛子当中的涉及到的这些知识点才是最值得学习和体会的。

posted @ 2024-02-19 00:15 Athanasy 阅读(1026) 评论(0) 编辑收藏举报

刷新页面返回顶部

Athanasy