quad differential
To understand how the derivative instructions (dFdx and dFdy) work, it helps to understand the basic execution architecture of GPUs and how fragment programs map to that architecture.
GPUs run a group of threads in lock-step over the same program, with each thread having its own set of registers. The GPU fetches an instruction, then executes that instruction N times, once for each running thread. To deal with conditional branches, the GPU also keeps an 'active mask' for the currently running group of threads. Threads that are not active in the mask don't actually execute the instruction, so their registers don't change. Whenever there is a conditional branch or a join (a branch target), the active mask is updated accordingly.
When a fragment program is run, the fragments are arranged into quads, 2x2 squares of 4 pixels that always run together in a thread group. Each thread in the group knows its own pixel coordinate and can easily find the coordinate of the adjacent pixel in the quad by flipping the lowest bit of its x (or y) coordinate.
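The bit flip described above can be sketched in Shadertoy-style GLSL. This is only an illustration, assuming quads are aligned to even pixel coordinates as the explanation describes; quadNeighborX and quadNeighborY are hypothetical helper names, not built-ins, and the integer bit operations need GLSL ES 3.00 (WebGL2).

```glsl
// Hypothetical helpers (not part of the original shader or of GLSL itself).
// Assumption: quads start at even pixel coordinates, so the neighbor inside
// the quad is found by flipping the lowest bit of x or y.
ivec2 quadNeighborX(ivec2 p) { return ivec2(p.x ^ 1, p.y); }
ivec2 quadNeighborY(ivec2 p) { return ivec2(p.x, p.y ^ 1); }

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    ivec2 p  = ivec2(fragCoord);   // integer pixel coordinate (0.5 truncates to 0)
    ivec2 nx = quadNeighborX(p);   // horizontal neighbor in the same quad
    ivec2 ny = quadNeighborY(p);   // vertical neighbor in the same quad

    // red:   1.0 when the horizontal neighbor lies to the right (lowest x bit is 0)
    // green: 1.0 when the vertical neighbor lies above (lowest y bit is 0)
    fragColor = vec4(float(nx.x > p.x), float(ny.y > p.y), 0.0, 1.0);
}
```

Each pixel is colored by its position inside its quad, so the 2x2 pattern repeats across the whole image.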
When the GPU executes a DDX or DDY instruction, it peeks at the registers of the thread for the adjacent pixel in the quad and takes the difference with the value from the current pixel: the value at the lower coordinate (lowest bit 0) is subtracted from the value at the higher coordinate (lowest bit 1). Both pixels of the pair therefore get the same result.
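As a rough illustration of that subtraction, the sketch below emulates a horizontal quad difference by re-evaluating a function at both pixels of the quad row, since a shader cannot read a neighbor's registers directly. The names field and manualDdx are hypothetical, and the even-aligned quad layout is an assumption taken from the explanation above.

```glsl
// Hypothetical field to differentiate: 1.0 below the 45° line y = x, else 0.0.
float field(vec2 p) { return (p.y - p.x < 0.0) ? 1.0 : 0.0; }

// Emulates a per-row horizontal quad difference by evaluating field() at
// both pixels of this thread's quad row (higher x minus lower x).
float manualDdx(vec2 fragCoord)
{
    ivec2 p   = ivec2(fragCoord);
    float xLo = float(p.x & ~1) + 0.5;   // pixel center whose lowest x bit is 0
    float xHi = xLo + 1.0;               // pixel center whose lowest x bit is 1
    return field(vec2(xHi, fragCoord.y)) - field(vec2(xLo, fragCoord.y));
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    float hw = dFdx(field(fragCoord));   // hardware derivative
    float sw = manualDdx(fragCoord);     // emulated per-row difference
    fragColor = vec4(hw, sw, 0.0, 1.0);  // yellow where both are 1.0
}
```

Pixels that come out only red or only green mark places where the hardware and the emulation disagree. That can legitimately happen, because an implementation may share one derivative across the whole quad rather than computing one per row.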
This has implications if you use dFdx or dFdy inside a conditional branch: if one thread in a quad is active while its neighbor is not, the GPU will still read the register of the inactive thread, which might hold any stale value, so the result could be anything.
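One common way to sidestep this hazard is to take the derivative before branching and only use the result inside the branch. The sketch below is a minimal example of that pattern, not a prescribed fix; the circular mask and the sine pattern are arbitrary choices made for illustration.

```glsl
void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 uv = fragCoord / iResolution.xy;
    float v = sin(uv.x * 40.0);              // arbitrary value to differentiate

    // Risky variant (left commented out): the mask edge cuts through quads,
    // so some threads of a quad would skip the branch and the derivative
    // would be undefined there.
    // if (length(uv - 0.5) < 0.3) { edge = abs(dFdx(v)); }

    // Safer: take the derivative while every thread in the quad is active...
    float dv = abs(dFdx(v));

    // ...then only consume the precomputed value inside the branch.
    float edge = 0.0;
    if (length(uv - 0.5) < 0.3) {
        edge = dv * iResolution.x / 40.0;    // rescale to roughly [0, 1]
    }
    fragColor = vec4(vec3(edge), 1.0);
}
```

The shader further down this page follows the same pattern: it calls dFdx(f) before the if statement and only overwrites the result inside it.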
// i came across dFdx here: https://www.shadertoy.com/view/MlGcDz
// and had some questions about the explanation of it here: https://stackoverflow.com/a/16368768/230851
// so this is a minor exploration.
//
// this shows that the value of dFdx() is duplicated for groups of pixels.
// the value we're taking the derivative of is a 45° line,
// and on the right side of the screen we use dFdx to produce the yellow pixels,
// and on the left side we manually generate little 2x2 pixel quads.
// the results look identical on my machine.
// the extra two diagonals are pixel-based, for reference.
void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    // signed vertical offset (in pixels) from the 45° line y = x
    float fc_delta = fragCoord.y - fragCoord.x;
    // step function: 1.0 below the line, 0.0 on or above it
    float f = fc_delta < 0.0 ? 1.0 : 0.0;
    // hardware derivative, taken before any branching
    float dx = dFdx(f);
    // left of the divider at x = iResolution.y / 2.0:
    // overwrite dx with hand-built 2x2 quads along the diagonal
    if (fragCoord.x < iResolution.y / 2.0) {
        int fcx = int(fragCoord.x);
        int fcy = int(fragCoord.y);
        dx = (fcx / 2 == fcy / 2) ? 1.0 : 0.0;
    }
    // add two pixel-based diagonals for comparison.
    dx += (abs(fc_delta) - 50.0 == 0.0) ? 1.0 : 0.0;
    // mark the divider by forcing the red channel to 1.0
    float r = abs(fragCoord.x - iResolution.y / 2.0) <= 1.0 ? 1.0 : dx;
    vec3 col = vec3(r, dx, f);
    fragColor = vec4(col, 1.0);
}
stackoverflow explanation-of-dfdx