X264 rev1198 MB-Tree Ratecontrol

The rest, as they say, is history.

That description is hardly an exaggeration.
The commit in the git repository is here: http://git.videolan.org/gitweb.cgi?p=x264.git;a=commit;h=bb66c482242a0747823661b212114c1a2f015fe3

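DS was excited enough to write an episode-review-style reflection post about it:
A Tree of Thought
Well, I won't poke fun at the would-be literary pose here...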

What I'm interested in, though, is how MB-Tree actually works, so let's look at DS's original text.

About a year and a half ago, I had an idea: what if we made a graph of how each block of the video referenced other blocks temporally and used this graph to increase quality on blocks which are referenced a lot and lower it on those which are referenced less?

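The idea here is that if we could record and organize how often each MB in each frame is referenced, that frequency could feed back into the bit allocation strategy used when encoding each MB, and so improve quality.
This is easy to follow: if an MB that was encoded with some deviation (say, it carries wrong image information) is frequently used as a reference, its error is passed on to every MB predicted from it, and the damage accumulates. This is where the "propagation" DS talks about later comes from.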
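To make the block-level idea concrete, here is a minimal sketch of how reference counts could be turned into per-block quantizer offsets. It is my own simplification, not the code from the x264 commit: the `intra`/`inter` cost fields, the one-reference-per-block layout, the log2 mapping, and the `strength` parameter are all assumptions chosen for illustration.

```python
import math

def propagate(frames, strength=2.0):
    """Walk the frames backward and accumulate, for every block, how much
    later information depends on it; then map that to a QP offset.

    frames: list of frames; each frame is a list of block dicts with
      'intra' : hypothetical cost of coding the block without prediction
      'inter' : hypothetical cost of coding it with prediction
      'ref'   : index of the block it references in the previous frame,
                or None if it references nothing
    Returns a list (per frame) of lists (per block) of QP offsets.
    A negative offset means "spend more bits here".
    """
    incoming = [[0.0] * len(f) for f in frames]

    # Later frames push their dependency back onto the blocks they reference.
    for t in range(len(frames) - 1, 0, -1):
        for i, blk in enumerate(frames[t]):
            if blk["ref"] is None:
                continue
            total = blk["intra"] + incoming[t][i]
            # Share of this block's information that came from its reference.
            fraction = 1.0 - min(blk["inter"], blk["intra"]) / blk["intra"]
            incoming[t - 1][blk["ref"]] += total * fraction

    # Heavily referenced blocks get a lower quantizer (negative offset).
    return [
        [-strength * math.log2((b["intra"] + inc) / b["intra"])
         for b, inc in zip(f, incs)]
        for f, incs in zip(frames, incoming)
    ]

# Block 0: static background, predicted perfectly from the previous frame.
# Block 1: moving character, prediction does not help at all.
first = [{"intra": 100.0, "inter": 100.0, "ref": None},
         {"intra": 100.0, "inter": 100.0, "ref": None}]
later = [{"intra": 100.0, "inter": 0.0, "ref": 0},
         {"intra": 100.0, "inter": 100.0, "ref": 1}]
offsets = propagate([first, later, later])
```

In this toy input the static background block in the first frame collects all the dependency from the later frames and gets the largest quality boost, while the moving block, which nothing usefully predicts from, is left at offset zero.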

My guess was it had a simple 2-pass heuristic for I-frame quantizers; low-motion scenes would get higher quality I-frames and high-motion scenes would get lower quality I-frames; it seemed pretty straightforward. The way I decided to do it was via a concept I called “propagation.”

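This refers to the heuristic quantization scheme that, by DS's guess, Mainconcept applies to I-frames: low-motion scenes get high-quality I-frames, while high-motion scenes get lower-quality ones.
The advantage of this strategy is also easy to see: in a low-motion scene a large number of MBs are reused in the prediction of later blocks, so keeping their error small slows down the accumulation of error. In a high-motion scene, the moving parts often end up as intra-predicted blocks anyway (this is, of course, decided by the MSE).
The other advantage comes from the human eye: its ability to resolve a moving scene falls off as the motion gets faster, and because of persistence of vision the falloff accelerates as the motion speeds up. Errors in high-motion scenes are therefore hard to spot, so encoding those scenes at a lower bitrate also agrees with subjective quality.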

Clearly, the propagation method is just another way of implementing the basic concept of qcomp: frames whose data doesn’t propagate far are basically high complexity, and frames whose data does propagate far are basically low complexity. As a result, I disabled qcomp when testing this idea. And the tests bore me out: there was a significant improvement across most test clips! But on a few clips, especially anime, there was a very significant loss of quality.

As one might expect, in anime, the vast majority of complexity is usually confined to a small portion of the frame–for example, a character walking across an otherwise-static frame. Furthermore, the sharp lines making up the character are much more “complex” than the static background. Thus, it can appear to the propagation algorithm that a series of frames is complex, when in reality only the character’s motion is complex, while the rest of the frame is static. The algorithm then lowers the quality of all the frames, including the background, greatly decreasing quality, despite the fact that it should only have lowered quality on the moving character instead. If only we could apply this propagation algorithm to individual blocks instead of the whole frame…

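Right, the qcomp parameter, which picks the degree of quantization to optimize quality, should be familiar to everyone. It is in the same spirit as the Mainconcept approach above, namely giving low-motion content a higher bitrate. But if you follow only this principle and crudely set qcomp=1 for certain frames, the result is terrible for anime, where foreground and background complexity differ wildly.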
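For readers who have not looked at qcomp closely, here is a rough sketch of what the parameter does at the frame level. It is a simplified model as I understand it, not x264's actual rate control code: the function names are made up for illustration, and the real encoder also blurs complexity over neighbouring frames and applies a global rate factor, both omitted here.

```python
def relative_qscale(complexity, qcomp=0.6):
    """Simplified model: the quantizer scale grows as complexity**(1 - qcomp).

    qcomp = 1.0 -> same quantizer for every frame (constant quantizer)
    qcomp = 0.0 -> quantizer tracks complexity fully (roughly constant bitrate)
    0.6 is x264's default value for qcomp.
    """
    return complexity ** (1.0 - qcomp)

def relative_bits(complexity, qcomp=0.6):
    """Bits spent are modelled as complexity divided by the quantizer scale."""
    return complexity / relative_qscale(complexity, qcomp)

# A frame 16x as complex as the baseline (complexity 1.0):
q_const_quant = relative_qscale(16.0, qcomp=1.0)    # quantizer unchanged
q_const_rate = relative_qscale(16.0, qcomp=0.0)     # quantizer scaled up 16x
bits_const_rate = relative_bits(16.0, qcomp=0.0)    # same bits as the baseline
q_halfway = relative_qscale(16.0, qcomp=0.5)        # in between
```

The point to notice is that this is one number per frame: a small moving character and a large static background inside the same frame get the same treatment, which is exactly the anime problem DS describes.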
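And so the MB-Tree idea resurfaced.
See? It came full circle. The principle looks simple, yet had nobody really thought of it before? -.- Or perhaps this discussion, which appeared on the mailing list in November of last year (2008), gave DS plenty of inspiration: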
http://mailman.videolan.org/pipermail/x264-devel/2008-November/005116.html

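At the end DS gives a Chireiden (地灵殿) example: in a bullet-hell game the complexity gap between foreground and background is enormous, and MB-Tree handles it well, with SSIM improving by 60%.
As for subjective testing, it's too late today; I'll find time to try it over the weekend.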

Article link: http://airness.hjlp.org/html/y2009/441.html

posted @ 2009-11-08 00:51  ciey