quad differential

To understand how these instructions work, it helps to understand the basic execution architecture of GPUs and how fragment programs map to that architecture.

GPUs run a bunch of threads in 'lock-step' over the same program, with each thread having its own set of registers.

So the GPU fetches an instruction, then executes that instruction N times, once for each running thread.

To deal with conditional branches and such, they also have an 'active mask' for the currently running group of threads.
Threads that are not active in the mask don't actually run (so their registers don't change).
Whenever there is a conditional branch or join (branch target) the thread mask is changed appropriately.
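
As a rough illustration of lock-step execution with an active mask, here is a small CPU-side simulation in Python. It is only a sketch: run_masked, set_y and the register names are made up for this example, and real hardware does this in parallel rather than in a loop.

```python
# CPU-side sketch of lock-step execution with an active mask.
# All names here are illustrative, not any real GPU API.

def run_masked(regs, mask, op):
    """Execute one instruction for every thread whose mask bit is set."""
    for tid, active in enumerate(mask):
        if active:
            op(regs[tid])  # inactive threads keep their old register values


def set_y(value):
    """Build an 'instruction' that writes value into register y."""
    def op(r):
        r["y"] = value
    return op


# Four threads running the same program, each with its own registers.
regs = [{"x": float(i), "y": 0.0} for i in range(4)]
all_on = [True] * 4

# Simulated program:  if (x < 2.0) y = 1.0; else y = 2.0;
cond = [r["x"] < 2.0 for r in regs]                       # per-thread condition
then_mask = [a and c for a, c in zip(all_on, cond)]       # mask for the 'then' side
else_mask = [a and not c for a, c in zip(all_on, cond)]   # mask for the 'else' side

run_masked(regs, then_mask, set_y(1.0))  # every thread steps through this...
run_masked(regs, else_mask, set_y(2.0))  # ...and this, but only active ones write
# At the join, the mask goes back to all_on.

ys = [r["y"] for r in regs]
```

Both sides of the branch are issued to the whole group; the mask alone decides which threads actually write their registers.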

Now when a fragment program is run, the fragments to be run are arranged into quads -- 2x2 squares of 4 pixels that always run together in a thread group.

Each thread in the group knows its own pixel coordinate, and can easily find the coordinate of the adjacent pixel in the quad by flipping the lowest bit of the x (or y) coord.
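
The bit flip can be written out directly. This is a minimal Python sketch, assuming integer pixel coordinates and quads aligned to even coordinates; quad_neighbors and quad_origin are hypothetical helper names.

```python
def quad_neighbors(x, y):
    """Return the horizontal and vertical neighbours of pixel (x, y)
    inside its 2x2 quad, found by flipping the lowest coordinate bit."""
    return (x ^ 1, y), (x, y ^ 1)


def quad_origin(x, y):
    """Return the even-aligned corner of the quad containing (x, y)."""
    return (x & ~1, y & ~1)


horizontal, vertical = quad_neighbors(4, 7)
```

For pixel (4, 7) the horizontal neighbour is (5, 7) and the vertical neighbour is (4, 6); all of them share the quad origin (4, 6).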

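When the GPU executes a DDX or DDY instruction (the instructions behind GLSL's dFdx and dFdy), what happens is that it peeks at the registers for the thread for the adjacent pixel and does a subtract with the value from the current pixel, subtracting the value for the lower coordinate (lowest bit 0) from the value for the higher coordinate (lowest bit 1), so that a quantity that increases with x or y gives a positive derivative.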
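
Here is a small Python sketch of that register peek, assuming a quad is stored as a dict from pixel coordinate to a register dict. The names ddx, ddy and quad_regs are invented for the example. The difference is taken as higher-coordinate value minus lower-coordinate value, so a quantity that grows with x gives a positive result.

```python
def ddx(quad_regs, x, y, name):
    """Horizontal difference of register `name` within the quad row of (x, y)."""
    lo = quad_regs[(x & ~1, y)][name]  # pixel whose lowest x bit is 0
    hi = quad_regs[(x | 1, y)][name]   # pixel whose lowest x bit is 1
    return hi - lo


def ddy(quad_regs, x, y, name):
    """Vertical difference of register `name` within the quad column of (x, y)."""
    lo = quad_regs[(x, y & ~1)][name]
    hi = quad_regs[(x, y | 1)][name]
    return hi - lo


# One quad at origin (2, 0); each thread has computed v = x * x + y.
quad_regs = {
    (x, y): {"v": float(x * x + y)}
    for y in (0, 1)
    for x in (2, 3)
}

left = ddx(quad_regs, 2, 0, "v")
right = ddx(quad_regs, 3, 0, "v")
down = ddy(quad_regs, 2, 1, "v")
```

Both pixels in a row see the same horizontal difference (here 9 - 4 = 5), which is why derivative values come in pairs.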

This has implications if you use dFdx or dFdy in a conditional branch -- if one of the threads in a quad is active while the other is not, the GPU will still look at the register of the inactive thread, which might have any old value in it, so the result could be anything.
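
The hazard can be reproduced in a toy Python simulation. This is a sketch under stated assumptions: STALE stands in for whatever an earlier instruction happened to leave in the register, and the two-pixel row, the register name t and the branch body are all made up for illustration.

```python
STALE = 123.0  # stand-in for a leftover value from earlier instructions

# One row of a quad: the left pixel takes the branch, the right one does not.
regs = {(0, 0): {"t": STALE}, (1, 0): {"t": STALE}}
active = {(0, 0): True, (1, 0): False}

# Inside the branch: t = 10.0 * (x + 1), executed by active threads only.
for (x, y), r in regs.items():
    if active[(x, y)]:
        r["t"] = 10.0 * (x + 1)

# A derivative taken inside the branch still reads the inactive
# neighbour's register, which never received the new value.
inside = regs[(1, 0)]["t"] - regs[(0, 0)]["t"]

# Computing t for every thread before the branch avoids the problem.
for (x, y), r in regs.items():
    r["t"] = 10.0 * (x + 1)
hoisted = regs[(1, 0)]["t"] - regs[(0, 0)]["t"]
```

Here inside is 113.0, which is meaningless, while hoisted is the expected 10.0. The usual way around this in a shader is the same: take the derivative before the conditional, while every thread in the quad is still active, and only use the result inside it.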

uniform vec3      iResolution;           // viewport resolution (in pixels)
uniform float     iTime;                 // shader playback time (in seconds)
uniform float     iTimeDelta;            // render time (in seconds)
uniform float     iFrameRate;            // shader frame rate
uniform int       iFrame;                // shader playback frame
uniform float     iChannelTime[4];       // channel playback time (in seconds)
uniform vec3      iChannelResolution[4]; // channel resolution (in pixels)
uniform vec4      iMouse;                // mouse pixel coords. xy: current (if MLB down), zw: click
uniform samplerXX iChannel0..3;          // input channel. XX = 2D/Cube
uniform vec4      iDate;                 // (year, month, day, time in seconds)

// i came across dFdx here: https://www.shadertoy.com/view/MlGcDz
// and had some questions about the explanation of it here: https://stackoverflow.com/a/16368768/230851
// so this is a minor exploration.
//
// this shows that the value of dFdx() is duplicated for groups of pixels.
// the value we're taking the derivative of is a 45° line,
// and on the right side of the screen we use dFdx to produce the yellow pixels,
// and on the left side we manually generate little 2x2 pixel quads.
// the results look identical on my machine.
// the extra two diagonals are pixel-based, for reference.

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    float fc_delta = fragCoord.y - fragCoord.x;
    float f = fc_delta < 0.0 ? 1.0 : 0.0;

    float dx = dFdx(f);

    // if left of the vertical divider at x = iResolution.y / 2
    if (fragCoord.x < iResolution.y / 2.0) {
        int fcx = int(fragCoord.x);
        int fcy = int(fragCoord.y);
        if (fcx / 2 == fcy / 2) {
            dx = 1.0;
        }
        else {
            dx = 0.0;
        }
    }

    // add two pixel-based diagonals for comparison.
    dx += (abs(fc_delta) - 50.0 == 0.0) ? 1.0 : 0.0;

    float r = abs(fragCoord.x - iResolution.y / 2.0) <= 1.0 ? 1.0 : dx;

    vec3 col = vec3(r, dx, f);

    fragColor = vec4(col,1.0);
}
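
To check why the two halves can look identical, the comparison can be replayed on the CPU. The Python sketch below assumes the GPU computes one difference per quad along the quad's first row and reuses it for all four pixels (a 'coarse' derivative). That is an assumption: whether a given GPU does this is implementation-dependent. The function names are invented for the example.

```python
def f(x, y):
    """Same step function as the shader: 1.0 where y - x < 0, else 0.0."""
    return 1.0 if (y - x) < 0 else 0.0


def coarse_ddx(x, y):
    """ASSUMPTION: one difference per quad, taken along the quad's first row."""
    x0, y0 = x & ~1, y & ~1
    return f(x0 + 1, y0) - f(x0, y0)


def fine_ddx(x, y):
    """Per-row difference, for contrast."""
    return f(x | 1, y) - f(x & ~1, y)


def manual(x, y):
    """The shader's hand-made 2x2 blocks: fcx / 2 == fcy / 2."""
    return 1.0 if x // 2 == y // 2 else 0.0


SIZE = 16
coarse_mismatches = [
    (x, y) for y in range(SIZE) for x in range(SIZE)
    if coarse_ddx(x, y) != manual(x, y)
]
fine_mismatches = [
    (x, y) for y in range(SIZE) for x in range(SIZE)
    if fine_ddx(x, y) != manual(x, y)
]
```

Under the coarse assumption there are no mismatches, which matches the full 2x2 blocks seen in the shader. With per-row differences only the lower row of each diagonal quad lights up, so pixels such as (2, 3) differ from the manual version.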

stackoverflow explanation-of-dfdx

posted @ 2023-03-14 11:20 fndefbwefsowpvqfx