[DP] Slope Optimization
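Introductory problem: PKU 3709
Many people seem to learn slope optimization through this problem, so after reading up on the technique, let's get started with it.
Even so, I still had to lean on an expert's program. ORZ ORZ...
Understanding slope optimization really comes down to two things: being able to write the slope inequality, and understanding how candidates get eliminated.
So let's look at the problem: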
Time Limit: 4000MS | Memory Limit: 65536K
Total Submissions: 5602 | Accepted: 1805
Description
The explosively increasing network data in various application domains has raised privacy concerns for the individuals involved. Recent studies show that simply removing the identities of nodes before publishing the graph/social network data does not guarantee privacy. The structure of the graph itself, along with its basic form the degree of nodes, can reveal the identities of individuals.
To address this issue, we study a specific graph-anonymization problem. We call a graph k-anonymous if for every node v, there exist at least k-1 other nodes in the graph with the same degree as v. And we are interested in achieving k-anonymous on a graph with the minimum number of graph-modification operations.
We simplify the problem. Pick n nodes out of the entire graph G and list their degrees in ascending order. We define a sequence k-anonymous if for every element s, there exist at least k-1 other elements in the sequence equal to s. To make the given sequence k-anonymous, you could do one operation only: decrease some of the numbers in the sequence. And we define the cost of the modification as the sum of the differences of all numbers you modified. E.g. the sequence 2, 2, 3, 4, 4, 5, 5, with k=3, can be modified to 2, 2, 2, 4, 4, 4, 4, which satisfies the 3-anonymous property, and the cost of the modification will be |3-2| + |5-4| + |5-4| = 3.
Given a sequence with n numbers in ascending order and k, we want to know the modification with minimal cost among all modifications which make the sequence k-anonymous.
Input
The first line of the input file contains a single integer T (1 ≤ T ≤ 20) – the number of tests in the input file. Each test starts with a line containing two numbers n (2 ≤ n ≤ 500000) – the amount of numbers in the sequence and k (2 ≤ k ≤ n). It is followed by a line with n integer numbers—the degree sequence in ascending order. And every number s in the sequence is in the range [0, 500000].
Output
For each test, output one line containing a single integer—the minimal cost.
Sample Input
2
7 3
2 2 3 4 4 5 5
6 2
0 3 3 4 8 9
Sample Output
3
5
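The DP recurrence is easy to come up with: f[i]=MIN{f[j]-sum[j]+sum[i]-a[j+1]*(i-j)}
Here sum[i] is the prefix sum a[1]+...+a[i], and the last group a[j+1..i] is lowered to its smallest value a[j+1]. The decision j is either 0 or satisfies k <= j <= i-k, so that every group has at least k elements.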
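Before optimizing, it can help to have a direct O(n^2) evaluation of this recurrence to check small cases against. The following is only a sketch under assumptions: `bruteForce` is a hypothetical helper name, the input is passed as a 0-indexed vector, and it is far too slow for n = 500000.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Direct O(n^2) evaluation of f[i] = min{ f[j] + sum[i] - sum[j] - a[j+1]*(i-j) }.
// `deg` is the ascending sequence (0-indexed); `bruteForce` is a hypothetical name.
long long bruteForce(const std::vector<long long>& deg, int k) {
    int n = (int)deg.size();
    assert(k >= 1 && k <= n);
    const long long INF = std::numeric_limits<long long>::max() / 4;
    std::vector<long long> a(n + 2, 0), sum(n + 1, 0), f(n + 1, INF);
    for (int i = 1; i <= n; i++) {
        a[i] = deg[i - 1];            // shift to the 1-indexed layout used in the text
        sum[i] = sum[i - 1] + a[i];
    }
    f[0] = 0;                         // empty prefix costs nothing
    for (int i = k; i <= n; i++) {
        for (int j = 0; j + k <= i; j++) {   // last group a[j+1..i] has >= k elements
            if (f[j] >= INF) continue;       // prefix j cannot be made k-anonymous
            f[i] = std::min(f[i], f[j] + (sum[i] - sum[j]) - a[j + 1] * (i - j));
        }
    }
    return f[n];
}
```

On the two sample sequences, `bruteForce({2,2,3,4,4,5,5}, 3)` returns 3 and `bruteForce({0,3,3,4,8,9}, 2)` returns 5, matching the sample output.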
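Clearly, the optimal decision is at least as good as every other decision.
So consider two decisions A and B with A<B, where B is at least as good as A.
Substituting A and B into the recurrence gives the inequality:
f[A]+sum[i]-sum[A]-a[A+1]*(i-A)>= f[B]+sum[i]-sum[B]-a[B+1]*(i-B)
Both sides contain +sum[i], which cancels, leaving:
f[A]-sum[A]-a[A+1]*(i-A)>= f[B]-sum[B]-a[B+1]*(i-B)
Expanding (i-A) and (i-B) gives:
f[A]-sum[A]-a[A+1]*i+a[A+1]*A>=f[B]-sum[B]-a[B+1]*i+a[B+1]*B
Rearranging: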
[(f[A]-sum[A]+a[A+1]*A)-(f[B]-sum[B]+a[B+1]*B)]>=i*(a[A+1]-a[B+1])
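Since the sequence is sorted and B>A, we have a[B+1]>=a[A+1], so a[A+1]-a[B+1]<=0. When it is strictly negative, dividing by it flips the inequality: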
[(f[A]-sum[A]+a[A+1]*A)-(f[B]-sum[B]+a[B+1]*B)]/(a[A+1]-a[B+1])<=i
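When a[A+1]=a[B+1] the divisor is zero and the quotient form is undefined, so in that case only the multiplication form can be used.
If decisions A and B satisfy the inequality above, B is at least as good as A.
So if they do not satisfy it, is B simply worse than A?
It is not that simple. Why?
Because a[A+1]-a[B+1]<=0, the right-hand side i*(a[A+1]-a[B+1]) of the multiplication form can only get smaller as i grows.
So the inequality may still come to hold at some later i; and once it holds, it keeps holding for every larger i.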
Let dy(A,B)=(f[A]-sum[A]+a[A+1]*A)-(f[B]-sum[B]+a[B+1]*B)
dx(A,B)=a[A+1]-a[B+1].
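The point of dy and dx is that cost(A,i)-cost(B,i) equals dy(A,B)-i*dx(A,B), so comparing two decisions never needs a division. Here is a small sketch of that identity; the `Tables` struct and the names `cost` and `bAtLeastAsGood` are assumptions made for illustration, not part of the original program.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for the arrays in the text (1-indexed; a needs index B+1).
struct Tables {
    std::vector<long long> f, sum, a;

    // Value of the recurrence when j is chosen as the decision for position i.
    long long cost(int j, int i) const {
        return f[j] + sum[i] - sum[j] - a[j + 1] * (i - j);
    }
    long long dy(int A, int B) const {
        return (f[A] - sum[A] + a[A + 1] * A) - (f[B] - sum[B] + a[B + 1] * B);
    }
    long long dx(int A, int B) const { return a[A + 1] - a[B + 1]; }

    // Multiplication form of the slope inequality; also valid when dx == 0.
    bool bAtLeastAsGood(int A, int B, int i) const {
        assert(A < B);
        return dy(A, B) >= (long long)i * dx(A, B);
    }
};
```

For any contents of the arrays, `bAtLeastAsGood(A, B, i)` agrees with `cost(A, i) >= cost(B, i)`, because the two sides differ only by the rearrangement shown in the derivation.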
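We maintain a queue whose first element is initially decision 0.
Let head and tail be the indices of the queue's front and back. If dy(queue[head],queue[head+1]) >= i*dx(queue[head],queue[head+1]), the front element can be discarded outright: once queue[head] is no better than queue[head+1], it never will be better for any later i either.
What about the tail?
Take two elements x<y. Even if y is worse than x at the current i, we showed above that this need not remain true for larger i, so y cannot be dropped on that basis alone. Adding a third element z makes things clearer. If dy(x,y)/dx(x,y)>=dy(y,z)/dx(y,z), then whenever y is better than x, z is also at least as good as y. Either way y is superseded by another element, so keeping it is useless and we can remove it.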
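The two queue rules can be written as standalone predicates, with the tail rule cross-multiplied so that no division by dx is needed. This is only a sketch: the names `headIsObsolete` and `middleIsRedundant` are hypothetical, and the arrays are passed in explicitly instead of being globals.

```cpp
#include <cassert>
#include <vector>

typedef long long llg;
typedef std::vector<llg> Vec;

// dy and dx as defined in the text; all arrays are 1-indexed.
llg dy(const Vec& f, const Vec& sum, const Vec& a, int p, int q) {
    return (f[p] - sum[p] + a[p + 1] * p) - (f[q] - sum[q] + a[q + 1] * q);
}
llg dx(const Vec& a, int p, int q) { return a[p + 1] - a[q + 1]; }

// Head rule: h0 can be popped once its successor h1 is at least as good at position i.
bool headIsObsolete(const Vec& f, const Vec& sum, const Vec& a, int h0, int h1, int i) {
    return dy(f, sum, a, h0, h1) >= (llg)i * dx(a, h0, h1);
}

// Tail rule: dy(x,y)/dx(x,y) >= dy(y,z)/dx(y,z), multiplied through by
// dx(x,y)*dx(y,z) >= 0 so the comparison direction is unchanged.
bool middleIsRedundant(const Vec& f, const Vec& sum, const Vec& a, int x, int y, int z) {
    assert(x < y && y < z);
    return dy(f, sum, a, x, y) * dx(a, y, z) >= dy(f, sum, a, y, z) * dx(a, x, y);
}
```

With the arrays of the first sample (k=3), decision 0 becomes obsolete against decision 3 at i=6, and decision 4 is redundant between decisions 3 and 5.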
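Now consider the boundary cases where dx is 0, so that only the signs of dy(x,y) and dy(y,z) matter. Treating them as extreme values, there are four combinations: (INF, INF), (-INF, INF), (INF, -INF), (-INF, -INF). Cases 1, 2 and 4 all agree with the rule derived above, but case 3 is different: there y is better than both x and z, so wouldn't removing y, as our program does, give WA? Fortunately this case cannot occur in the queue. dx(x,y)==0 means a[x+1]==a[y+1], and since the sequence is sorted, sum[y]-sum[x]=a[x+1]*(y-x); the sum and a terms cancel and dy(x,y) reduces to f[x]-f[y]. If that were positive, y would beat x for every i, so x would either have been removed by the tail test when y was pushed or, if x was at the head, be popped by the head test on the next iteration, before z arrives. So for two adjacent queue elements with a[x+1]==a[y+1], dy(x,y) is always less than or equal to 0 by the time z is pushed, and this problem can be solved by maintaining the queue in this way.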
All that remains is to turn the process into code:
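#include <cstdio>

typedef long long llg;
const int N = 500010;

int n, k, queue[N];        // queue: candidate decisions j, in increasing order
llg sum[N], f[N], a[N];    // a: sorted input (1-indexed), sum: prefix sums, f: DP values

// Numerator of the slope between decisions j1 < j2.
llg dy(int j1, int j2)
{
    return (f[j1]-sum[j1]+a[j1+1]*j1) - (f[j2]-sum[j2]+a[j2+1]*j2);
}

// Denominator of the slope; always <= 0 because a is ascending.
llg dx(int j1, int j2)
{
    return (a[j1+1] - a[j2+1]);
}

void dp()
{
    int i, j, head, tail, x, y, z;
    head = tail = 0;
    queue[0] = 0;
    f[0] = 0;
    for(i = 1; i <= n; i++)
    {
        // Pop the head while its successor is at least as good for this i.
        while(head<tail && dy(queue[head], queue[head+1])>=i*dx(queue[head], queue[head+1]))
            head++;
        j = queue[head];
        f[i] = f[j] + sum[i] - sum[j] - a[j+1]*(i-j);
        if(i >= 2*k-1)
        {
            // z becomes a legal decision for i+1: z >= k and (i+1)-z == k.
            z = i-k+1;
            while(head < tail)
            {
                x = queue[tail-1];
                y = queue[tail];
                // Cross-multiplied form of dy(x,y)/dx(x,y) >= dy(y,z)/dx(y,z).
                if(dy(x,y)*dx(y,z) >= dy(y,z)*dx(x,y)) tail--;
                else break;
            }
            queue[++tail] = z;
        }
    }
}

int main()
{
    int t, i;
    scanf("%d", &t);
    while(t--)
    {
        scanf("%d%d", &n, &k);
        sum[0] = 0;
        for(i = 1; i <= n; i++)
        {
            // %I64d is the MSVC/MinGW format for long long; use %lld on other toolchains.
            scanf("%I64d", a+i);
            sum[i] = sum[i-1] + a[i];
        }
        dp();
        printf("%I64d\n", f[n]);
    }
    return 0;
}
/* Code copied from the internet QAQ */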
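To experiment away from the judge, the same loop can be wrapped in a function that takes the sequence in memory. This is a sketch under assumptions, not the original author's code: `minCost` and `que` are hypothetical names, the input vector is 0-indexed, and the arrays are local vectors instead of globals.

```cpp
#include <cassert>
#include <vector>

// Queue-based DP from the text, callable on in-memory data.
// `deg` is the ascending sequence (0-indexed); `minCost` is a hypothetical name.
long long minCost(const std::vector<long long>& deg, int k) {
    int n = (int)deg.size();
    assert(k >= 1 && k <= n);
    std::vector<long long> a(n + 2, 0), sum(n + 1, 0), f(n + 1, 0);
    for (int i = 1; i <= n; i++) {
        a[i] = deg[i - 1];
        sum[i] = sum[i - 1] + a[i];
    }
    auto dy = [&](int p, int q) {
        return (f[p] - sum[p] + a[p + 1] * p) - (f[q] - sum[q] + a[q + 1] * q);
    };
    auto dx = [&](int p, int q) { return a[p + 1] - a[q + 1]; };

    std::vector<int> que(n + 1);
    int head = 0, tail = 0;
    que[0] = 0;                                  // decision 0 is always available
    for (int i = 1; i <= n; i++) {
        // Head rule: drop decisions that a later one has caught up with.
        while (head < tail && dy(que[head], que[head + 1]) >= i * dx(que[head], que[head + 1]))
            head++;
        int j = que[head];
        f[i] = f[j] + sum[i] - sum[j] - a[j + 1] * (i - j);
        if (i >= 2 * k - 1) {
            int z = i - k + 1;                   // becomes legal for position i+1
            // Tail rule: remove y when it can never be the unique best choice.
            while (head < tail) {
                int x = que[tail - 1], y = que[tail];
                if (dy(x, y) * dx(y, z) >= dy(y, z) * dx(x, y)) tail--;
                else break;
            }
            que[++tail] = z;
        }
    }
    return f[n];
}
```

Calling `minCost({2,2,3,4,4,5,5}, 3)` gives 3 and `minCost({0,3,3,4,8,9}, 2)` gives 5, the same as the sample output.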