GPU Downsampling for Point-Based Rendering

Abstract: Does EWA splatting for point-based rendering give better images than my real-time GPU multipass supersampling method? Of course not!

　　After careful benchmarking and weighing the trade-offs, I decided to abandon EWA filtering and filter point-based renderings with GPU supersampling instead. This is the filtering approach used in offline rendering, and the same one NVIDIA Gelato takes.

　　The procedure is the same as Gelato's. The whole scene is rasterized into a render target (RT) as large as the hardware allows. The RT is then convolved in the framebuffer with the filters that offline renderers use (Catmull-Rom, Gaussian, sinc and so on), split into one pass in X and one pass in Y. The resulting image quality is far better than what EWA filtering gives and comes close to an offline render, which is what I want for previewing relighting results. There is almost no performance loss, and it avoids the poor MSAA that the hardware provides.
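Written out, the two passes amount to a normalized separable convolution. The symbols below are introduced here for illustration only and are not part of the original code: $s$ is the supersampling rate, $r$ the filter radius in output pixels, $f$ the 1D filter kernel sampled at positions $x_k$, $S$ the supersampled source, $T$ the intermediate RT and $D$ the final image. Reads outside the source are assumed to clamp to the edge.

```latex
\begin{aligned}
w_k &= \frac{f(x_k)}{\sum_{j=0}^{2rs-1} f(x_j)}, \qquad k = 0,\dots,2rs-1,\\
T(u, y) &= \sum_{k=0}^{2rs-1} w_k \, S\!\left(su + \tfrac{s}{2} - rs + k,\; y\right),\\
D(u, v) &= \sum_{k=0}^{2rs-1} w_k \, T\!\left(u,\; sv + \tfrac{s}{2} - rs + k\right).
\end{aligned}
```

Each output pixel therefore costs $2 \cdot 2rs$ texture reads instead of the $(2rs)^2$ a single 2D pass would need.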

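　　The implementation is simple. Render the image into an RT (render target) sized according to the supersampling rate the user has set. Then sample the filter to build a lookup table and compute the real filter radius. Set the texture wrap mode of the original RT to GL_CLAMP_TO_EDGE; GL_CLAMP and GL_REPEAT must not be used. Create an FBO and two temporary RTs: the first stores the result of filtering in the X direction, the second stores the fully filtered result, which is written out as an image in the requested format if the user asks for one. Below are a few code fragments and figures, for reference only. The filter code comes from the RenderMan Interface Specification.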

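// Supersampling rate; for simplicity the same rate is used in X and Y.
int SSRate = RT.width() / DstSizeX;
// Filter radius measured in source (supersampled) texels.
int RealRadius = FilterRadius*SSRate;

// Sample the filter once per source texel and keep the sum for normalization.
// Weights is assumed to be a std::vector<float> declared elsewhere.
float WeightSum = 0.0f;
for( int i=-RealRadius; i<RealRadius; i++ )
{
    float W = RiCatmullRomFilter( (i+0.5f)/(float)RealRadius, 0, FilterRadius, FilterRadius );
    WeightSum += W;
    Weights.push_back( W );
}

// Normalize so that the weights sum to 1.
float* WeightPtr = new float[ Weights.size() ];
for( size_t i=0; i<Weights.size(); i++ )
{
    WeightPtr[i] = Weights[i] / WeightSum;
    printf("%f\n",WeightPtr[i]); // debug output
}

// Filter texture: a (Weights.size() x 1) float rectangle texture holding the weights in alpha.
glGenTextures(1,&FilterTex);
glBindTexture(GL_TEXTURE_RECTANGLE_ARB,FilterTex);
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB,0,GL_ALPHA32F_ARB,(GLsizei)Weights.size(),1,0,GL_ALPHA,GL_FLOAT,WeightPtr);
// The parameters are enums, so use the integer setter; nearest filtering keeps the weights unblended.
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB,GL_TEXTURE_MIN_FILTER,GL_NEAREST);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB,GL_TEXTURE_MAG_FILTER,GL_NEAREST);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB,GL_TEXTURE_WRAP_S,GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB,GL_TEXTURE_WRAP_T,GL_CLAMP_TO_EDGE);

// The texture owns a copy of the data now, so the CPU-side buffers can go.
delete [] WeightPtr;
Weights.clear();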
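The host code above calls RiCatmullRomFilter without listing it. As a rough sketch of what such kernels look like, here are textbook 1D forms of the three filters mentioned in this post. The names CatmullRom1D, Gaussian1D, Sinc1D and BuildWeights are hypothetical, the Gaussian falloff constant is an assumed choice, and none of this is copied from the RenderMan Interface Specification. Note also that this sketch measures tap positions in output pixels (it divides by the supersampling rate), which is an assumption and differs from the listing above.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Catmull-Rom cubic (Keys kernel with a = -0.5); support is |x| < 2.
inline float CatmullRom1D(float x)
{
    x = std::fabs(x);
    if (x >= 2.0f) return 0.0f;
    if (x < 1.0f)  return 1.5f*x*x*x - 2.5f*x*x + 1.0f;
    return -0.5f*x*x*x + 2.5f*x*x - 4.0f*x + 2.0f;
}

// Gaussian truncated at |x| = radius; the constant 2 is an assumed falloff.
inline float Gaussian1D(float x, float radius)
{
    if (std::fabs(x) >= radius) return 0.0f;
    const float t = x / radius;
    return std::exp(-2.0f * t * t);
}

// Normalized sinc, sin(pi x)/(pi x), truncated at |x| = radius.
inline float Sinc1D(float x, float radius)
{
    if (std::fabs(x) >= radius) return 0.0f;
    if (std::fabs(x) < 1e-6f)   return 1.0f;
    const float kPi = 3.14159265358979f;
    return std::sin(kPi * x) / (kPi * x);
}

// Samples a kernel at the centres of 2*radius*ssRate source texels and
// normalizes the taps so they sum to 1. Positions are in output-pixel units.
template <typename Kernel>
std::vector<float> BuildWeights(Kernel kernel, int radius, int ssRate)
{
    assert(radius > 0 && ssRate > 0);
    const int realRadius = radius * ssRate;
    std::vector<float> weights;
    float sum = 0.0f;
    for (int i = -realRadius; i < realRadius; ++i) {
        const float x = (i + 0.5f) / static_cast<float>(ssRate);
        weights.push_back(kernel(x));
        sum += weights.back();
    }
    for (float& w : weights) w /= sum;
    return weights;
}
```

For the Gaussian or sinc, wrap the call in a lambda that fixes the radius, for example `BuildWeights([](float x){ return Gaussian1D(x, 2.0f); }, 2, 4)`.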

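//Copyright Bo Schwarzstein ( bo[dot]schwarzstein[at]gmail[dot]com ) 2008
//Filter pass in the X direction; the Y-direction variants are given in the comments.
//TEX0 binds the source image
//TEX1 binds the filter weights
//WeightNum is the number of weights, used as the loop bound
//SSRate is used to calculate the correct pixel offset in the source image
//sampler2DRect/texture2DRect are the names defined by GL_ARB_texture_rectangle
#extension GL_ARB_texture_rectangle : enable
uniform sampler2DRect TEX0;
uniform sampler2DRect TEX1;
uniform int WeightNum;
uniform int SSRate;

void main()
{
    vec4 WPOS = gl_FragCoord;
    //gl_FragCoord already sits on the pixel center, so scaling by SSRate gives the window center in the source
    vec2 Center = vec2( WPOS.x*float(SSRate), WPOS.y );//Y pass: vec2 Center = vec2( WPOS.x, WPOS.y*float(SSRate) );
    //Accumulate in a local variable: gl_FragColor has no defined initial value
    vec4 Sum = vec4(0.0);
    for( int i=0; i<WeightNum; i++ )
    {
        float Weight = texture2DRect( TEX1, vec2(float(i)+0.5,0.5) ).a;
        //Offset of tap i from the window center, so the taps straddle Center symmetrically
        float Offset = float(i) + 0.5 - 0.5*float(WeightNum);
        Sum += Weight*texture2DRect( TEX0, Center + vec2(Offset,0.0) );//Y pass: vec2(0.0,Offset)
    }
    gl_FragColor = Sum;
}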
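To check the GPU output, it can help to run the same two passes on the CPU. The following is only a sketch under stated assumptions: the Image struct and the DownsampleX/DownsampleY functions are hypothetical names, the image is single-channel, the supersampling rate and tap count are assumed to be even, and out-of-range reads clamp to the edge in the spirit of GL_CLAMP_TO_EDGE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image, stored row-major.
struct Image {
    int width;
    int height;
    std::vector<float> pixels;

    Image(int w, int h, float value = 0.0f)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), value) {}

    // Clamp-to-edge read: coordinates outside the image return the nearest edge texel.
    float at(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return pixels[static_cast<std::size_t>(y) * width + x];
    }

    void set(int x, int y, float v) {
        pixels[static_cast<std::size_t>(y) * width + x] = v;
    }
};

// X pass: shrinks the width by ssRate and keeps the height.
Image DownsampleX(const Image& src, const std::vector<float>& weights, int ssRate)
{
    assert(ssRate > 0 && src.width % ssRate == 0 && !weights.empty());
    Image dst(src.width / ssRate, src.height);
    const int taps = static_cast<int>(weights.size());
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            // First source texel of the window centred on output pixel x.
            const int first = x * ssRate + ssRate / 2 - taps / 2;
            float sum = 0.0f;
            for (int i = 0; i < taps; ++i)
                sum += weights[i] * src.at(first + i, y);
            dst.set(x, y, sum);
        }
    }
    return dst;
}

// Y pass: same filter applied along columns, shrinking the height by ssRate.
Image DownsampleY(const Image& src, const std::vector<float>& weights, int ssRate)
{
    assert(ssRate > 0 && src.height % ssRate == 0 && !weights.empty());
    Image dst(src.width, src.height / ssRate);
    const int taps = static_cast<int>(weights.size());
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            const int first = y * ssRate + ssRate / 2 - taps / 2;
            float sum = 0.0f;
            for (int i = 0; i < taps; ++i)
                sum += weights[i] * src.at(x, first + i);
            dst.set(x, y, sum);
        }
    }
    return dst;
}
```

Calling `DownsampleY(DownsampleX(src, weights, rate), weights, rate)` gives the fully filtered image; with normalized weights a constant input should come out unchanged, which is a quick sanity check for the weight table.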

Figure: a close-up region.

Figure: filter radius 2x2, supersampling rate 4x4; Gaussian filter on the left, Catmull-Rom filter on the right.

　　Some readers will probably say, "Why on earth don't you render in tiles?" Yes, I tried that. But with tiled rendering the whole point-based technique falls apart: the points produced with the modified perspective projection matrix come out wrong, so the perspective-correct point size can no longer be computed. Splitting the frame into many tiles also wastes a good deal of pixel fill rate, so it ends up no faster than rendering straight into one very large RT.

posted on 2008-05-22 16:00 Bo Schwarzstein