Today, NVIDIA announced the release of beta versions of the SDK and C compiler for its Compute Unified Device Architecture (CUDA) technology. The C compiler includes a set of C language extensions that enable developers to write C code that targets NVIDIA's GPUs directly. These extensions are supported by software libraries and a special CUDA driver that exposes the GPU to the OS and applications as a math coprocessor.
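To give a sense of what those C extensions look like, here's a minimal sketch of a CUDA-style program (names and launch parameters are illustrative, not taken from NVIDIA's announcement): a `__global__` qualifier marks a function that runs on the GPU, and the `<<<blocks, threads>>>` syntax launches it from ordinary host C code.

```cuda
#include <cuda_runtime.h>

// Runs on the GPU: each thread computes one element of the sum.
__global__ void add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 256;
    float *a, *b, *c;

    // Buffers live in GPU memory, allocated through the CUDA runtime.
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));
    cudaMalloc(&c, n * sizeof(float));

    // Launch one block of 256 threads; the driver handles the dispatch.
    add<<<1, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}
```

The point is that, aside from the qualifier and the launch syntax, this is plain C; the CUDA driver and libraries do the work of treating the GPU as a coprocessor.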

NVIDIA's CUDA approach is a bit different from AMD/ATI's "Close to Metal" (CTM) initiative. With CTM, AMD/ATI has opened up the low-level ISA so that its graphics products can be programmed directly in assembly language. The idea here is that a development community will build the sorts of libraries and higher-level tools that NVIDIA is providing in a prepackaged but closed form with CUDA. People who want to do math coprocessing with NVIDIA's parts will have to rely on the quality, stability, and performance of the company-provided driver, whereas CTM lets you roll your own interface to the hardware if you don't like what's on offer from AMD/ATI.

Clearly, there are advantages and disadvantages to both approaches, and both ultimately represent attempts by NVIDIA and AMD/ATI to get people who need this sort of GPU-driven math coprocessing power to standardize on one or the other vendor's hardware. Both of these moves are also about building an application base around existing GPU hardware before Intel moves into the market with whatever it's planning to unleash, so that Chipzilla (ideally) finds that the existing customers for GPUs-as-coprocessors are already tied to either AMD/ATI or NVIDIA.

Both companies can pursue this kind of vendor lock-in because applications must be custom-written to a particular platform (CUDA or CTM) to take real advantage of the GPU as a coprocessor. So a software vendor that wants to offload some parallel processing to the GPU will have to pick a vendor - AMD/ATI or NVIDIA - and write software with that vendor's hardware in mind... or will they?

The beauty of CTM is that someone could theoretically write a middleware layer for it that implements CUDA's C extensions and APIs. This would mean that you could switch hardware vendors relatively easily, depending on performance and price considerations. It would also mean that CUDA would become the de facto standard for GPU support in applications. There may be some legal considerations that would prevent this from being done, but from a technical standpoint it seems feasible.