DirectX 10 Preview Translation

 

The Direct3D 10 December 2005 Technology Preview

The Direct3D 10 Technology Preview showcases the newest set of graphics APIs for games and other high-performance multimedia applications on next-generation graphics hardware. This technology preview provides reference material, conceptual content, developer libraries, tutorials and samples that demonstrate how to use Direct3D 10. Additional content will be provided in upcoming SDK releases.

Samples and applications built with the Direct3D 10 December 2005 Technology Preview require the Windows Vista December 2005 CTP to run. The Windows Vista December 2005 CTP is available to MSDN subscribers.

This documentation set is intended for developers using the C/C++ programming language.

Legal Information

Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

1995-2005 Microsoft Corporation. All rights reserved.

(Translator's note: it is now 2006.)

Microsoft, MS-DOS, Windows, Windows NT, Direct3D, DirectAnimation, DirectDraw, DirectInput, DirectMusic, DirectPlay, DirectShow, DirectSound, DirectX, Visual C++, Visual Studio, Win32, Xbox, Xbox 360 and XNA are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Direct3D 10 Graphics

Discover the Direct3D 10 graphics features in one of these sections:

Programming Guide - This section contains architecture descriptions, functional block diagrams, descriptions of the building blocks in the pipeline, code snippets, and sample applications.

 

Reference - This section contains the reference pages for the Direct3D 10, D3DX10, and DXGI APIs. This includes the syntax for the API methods, functions, instructions, and data structures. It includes an explanation of how each API method works and often includes code snippets.

Programming Guide

The programming guide contains information about how to use the Direct3D 10 programmable pipeline to create real-time 3D graphics for games.

Reference

The Direct3D 10 API is described in the following sections:

Direct3D reference

D3DX reference

DXGI reference

API Features

The Direct3D 10 graphics pipeline represents a fundamental architecture change, rebuilt from the ground up in hardware and software to power the next generation of games and 3D multimedia applications. It is built upon the Windows Vista Display Driver Model (WDDM) infrastructure, enabling new performance and behavioral enhancements and guaranteeing full virtualization of GPU memory.

Translator's note: full virtualization of GPU memory presumably means that, from the programmer's point of view, the distinction between system memory and video memory can be ignored.

Developers familiar with Direct3D 9 will discover a series of functional enhancements and performance improvements in Direct3D 10, including:

  • The ability to process entire primitives (with adjacency), and to amplify and de-amplify data, in the new geometry shader stage.

  • The ability to output vertex data generated by the pipeline to memory using the stream output stage.

  • New objects and paradigms provided to minimize CPU overhead spent on validation and processing in the runtime and driver.

    • Organization of pipeline state into 5 immutable state objects, enabling fast configuration of the pipeline.

    • Organization of shader constant variables into constant buffers, minimizing bandwidth overhead for supplying shader constant data.

    • The ability to perform per-primitive material swapping and setup using a geometry shader.

  • New resource types (including shader-indexable arrays of textures) and resource formats.

  • Increased generalization of resources in memory and ubiquity of resource access - resource views enable interpretation of resources in memory as different types or representations.

  • A full set of required functionality: legacy hardware capability bits (caps) have been removed in favor of a rich set of guaranteed functionality. To enable this and other design improvements, the Direct3D 10 API only targets Direct3D 10-class hardware and later.

  • Layered Runtime - The Direct3D 10 API is constructed with layers, starting with the basic functionality at the core and building optional and developer-assist functionality (debug, etc.) in outer layers.

  • Full HLSL integration - All Direct3D 10 shaders are written in HLSL and implemented with the shader common core.

  • An increase in the number of render targets, textures, and samplers. There is also no shader length limit.

  • Full support for:
    • Integer and bitwise shader operations
    • Readback of depth/stencil and multisampled resources in the shader
    • Multisample Alpha-to-Coverage

There are additional behavioral differences that Direct3D 9 developers should also be aware of. For a more complete list, refer to Direct3D 9 to Direct3D 10 Considerations.

State Objects

In Direct3D 10, device state is grouped into five immutable State Objects:

  • Input Layout Object - This group of state (see D3D10_INPUT_LAYOUT_DESC) affects the input assembler state. This includes state such as the number of elements in the input buffer and the signature of the input data. The input assembler is a new stage in the pipeline whose job is to stream primitives from memory into the pipeline. To find out more about the input assembler stage and the input layout object, see Create the Input Layout.

  • Rasterizer Object - This group of state (see D3D10_RASTERIZER_DESC) affects the rasterizer stage. This object includes state such as fill or cull modes, enabling a scissor rectangle for clipping, and setting multisample parameters. This stage rasterizes primitives into pixels, performing operations like clipping and mapping primitives to the viewport. To find out more about the rasterizer stage and the rasterizer state object, see Set Rasterizer State.

  • DepthStencil Object - This group of state (see D3D10_DEPTH_STENCIL_DESC) configures interactions with the depth buffer and sets up stencil testing. To find out more about the output merger stage and the depth-stencil state object, see Set Depth Stencil State.

  • Blend Object - This group of state (see D3D10_BLEND_DESC) defines how Pixel Shader outputs are blended with the current render target value in the Output Merger. See Set Blend State.

  • Sampler Object - This group of state (see D3D10_SAMPLER_DESC) describes a texture Sampler. Samplers are used by the shader stages to filter textures in memory. Set sampler state in a shader by calling VSSetSamplers, PSSetSamplers or GSSetSamplers.

Differences between Direct3D 9 and Direct3D 10:

In Direct3D 10, the sampler object is no longer bound to a specific texture - it just describes how to do filtering given any attached resource.

A state object is immutable, that is, once it is created it cannot be changed without being destroyed and recreated. This facilitates create-time validation and mapping, allows state-setting to be pipelined, and makes caching of these objects in hardware possible, minimizing state-setting overhead at runtime.

For example, you could create several sampler objects with various sampler-state combinations. Changing the pipeline sampler state is then done by calling the appropriate Set API which takes a handle to the object. No state is passed, just an object handle so changing state becomes as fast as sending an object handle to the device. By encapsulating the state in these state objects, the number of calls is significantly reduced and the time for each call is also reduced since no state is passed.

You can create up to 4096 of each type of state object on a device. The Direct3D 10 Effects system will automatically manage efficient creation and destruction of state objects for your application.
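
The create-once, bind-by-handle pattern described in this section can be modelled outside Direct3D. The following is a minimal sketch, not the Direct3D 10 API: `SamplerDesc`, `SamplerState`, `Device`, `createSamplerState` and `setSampler` are hypothetical stand-ins. The only details taken from the text are that a state object cannot change after creation, that validation happens at creation time, and that binding passes a handle rather than state values.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical description of sampler state (not D3D10_SAMPLER_DESC).
struct SamplerDesc {
    int  filter;       // 0 = point, 1 = linear (illustrative values)
    bool wrapAddress;  // wrap (true) or clamp (false)
};

// Immutable state object: the description is const after construction.
class SamplerState {
public:
    explicit SamplerState(const SamplerDesc& d) : desc_(d) {}
    const SamplerDesc& desc() const { return desc_; }
private:
    const SamplerDesc desc_;
};

// Hypothetical device: validates once at creation, binds by handle.
class Device {
public:
    // Validation happens here, once, instead of on every bind.
    const SamplerState* createSamplerState(const SamplerDesc& d) {
        if (d.filter < 0 || d.filter > 1) return nullptr;
        objects_.emplace_back(new SamplerState(d));
        return objects_.back().get();
    }
    // Binding copies a single pointer; no state values are passed.
    void setSampler(const SamplerState* s) { bound_ = s; }
    const SamplerState* boundSampler() const { return bound_; }
    std::size_t objectCount() const { return objects_.size(); }
private:
    std::vector<std::unique_ptr<SamplerState>> objects_;
    const SamplerState* bound_ = nullptr;
};
```

An application would typically create all the combinations it needs up front and then switch between them per draw by calling `setSampler` with a different handle.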

API Layers

The Direct3D 10 runtime is constructed with layers, starting with the basic functionality at the core and building optional and developer-assist functionality in outer layers.

Core Layer

The core layer exists by default, providing a very thin mapping between the API and the device driver and minimizing overhead for high-frequency calls. Because the core layer is essential for performance, it performs only critical validation.

The remaining layers are optional. As a general rule, layers add functionality, but do not modify existing behavior. For example, core functions will have the same return values independent of the debug layer being instantiated, although additional debug output may be provided if the debug layer is instantiated.

Create layers when a device is created by calling D3D10CreateDevice and supplying one or more D3D10_CREATE_DEVICE_FLAG values.
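
Selecting layers with a flag word is an ordinary bit-mask pattern. The sketch below assumes hypothetical flag names and values (`kLayerDebug` and so on); they are illustrative only and are not copied from the D3D10_CREATE_DEVICE_FLAG enumeration in d3d10.h. It also assumes, as the Thread-Safe Layer description states, that thread safety is on unless a single-threaded flag is supplied.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layer flags; names and values are illustrative only.
enum LayerFlags : std::uint32_t {
    kLayerNone           = 0,
    kLayerSingleThreaded = 1u << 0,  // turns the thread-safe layer off
    kLayerDebug          = 1u << 1,
    kLayerSwitchToRef    = 1u << 2
};

struct LayerConfig {
    bool debug;
    bool switchToRef;
    bool threadSafe;
};

// Decode a flag word the way a device-creation function might.
inline LayerConfig decodeLayers(std::uint32_t flags) {
    LayerConfig c;
    c.debug       = (flags & kLayerDebug) != 0;
    c.switchToRef = (flags & kLayerSwitchToRef) != 0;
    c.threadSafe  = (flags & kLayerSingleThreaded) == 0;  // on by default
    return c;
}
```

Flags combine with bitwise OR, for example `decodeLayers(kLayerDebug | kLayerSwitchToRef)`.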

Debug Layer

This layer provides extensive additional parameter and consistency validation (such as validating shader linkage and pipeline binding, validating parameter consistency, and reporting error descriptions). Its debug output is also provided as a queue of report strings, accessible via the ID3D10InfoQueue interface. Any errors produced by the core runtime layer will also be highlighted with warnings by the debug layer.

This layer is implemented in D3D10SDKLayers.DLL, which is only available with the SDK installed.

The debug layer performs extensive additional parameter and consistency validation, and returns error reports in a queue of report strings.

Shader-Reflection Layer

This layer enables an application to use Get methods to retrieve shader bytecode from the device.

Switch-to-Ref Layer

This layer provides the ability to transition between hardware and reference rasterizer implementations of the graphics pipeline for debugging purposes. All device state, resources and objects are maintained through this transition.

This layer is implemented in D3D10SDKLayers.DLL, which is only available if you have the SDK installed.

Thread-Safe Layer

This layer is designed to allow multi-threaded applications to access the device from multiple threads.

Direct3D 10 enables an application to exercise explicit control over the device synchronization primitive, using device functions that can be invoked at any time over the lifetime of the device. These functions can enable and disable the use of the critical section (temporarily enabling or disabling multithread protection), and can take and release the critical section lock so that the lock is held across multiple Direct3D 10 API entry points.

This layer is enabled by default, but if not present has no performance impact on single-thread accessed devices. Use D3D10_CREATE_DEVICE_SINGLETHREADED (in D3D10CreateDevice) to turn this layer off.
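
Holding a device lock across several API calls can be illustrated with a re-entrant mutex. This is a sketch under stated assumptions, not the Direct3D 10 interface: `LockedDevice`, `enter`, `leave`, `draw` and `DeviceLock` are hypothetical names, and `std::recursive_mutex` stands in for the runtime's critical section.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical device guarded by a re-entrant lock.
class LockedDevice {
public:
    // Take and release the device lock explicitly.
    void enter() { mutex_.lock(); ++depth_; }
    void leave() { --depth_; mutex_.unlock(); }
    // Every API entry point takes the same lock internally, so calling
    // it while already holding the lock simply nests.
    void draw() { enter(); ++drawCalls_; leave(); }
    int drawCalls() const { return drawCalls_; }
    int lockDepth() const { return depth_; }
private:
    std::recursive_mutex mutex_;
    int depth_ = 0;
    int drawCalls_ = 0;
};

// RAII helper: holds the device lock for the lifetime of the object,
// so a group of calls cannot be interleaved with another thread's calls.
class DeviceLock {
public:
    explicit DeviceLock(LockedDevice& d) : dev_(d) { dev_.enter(); }
    ~DeviceLock() { dev_.leave(); }
    DeviceLock(const DeviceLock&) = delete;
    DeviceLock& operator=(const DeviceLock&) = delete;
private:
    LockedDevice& dev_;
};
```

Wrapping a sequence such as "set state, then draw" in one `DeviceLock` scope keeps the sequence atomic with respect to other threads that use the same device.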

Differences between Direct3D 9 and Direct3D 10:

Unlike Direct3D 9, the Direct3D 10 API defaults to fully thread-safe.

 

Reference Counting

Direct3D 10 pipeline Set functions do not hold a reference to the DeviceChild objects. This means that each application must hold a reference to the DeviceChild object for as long as the object needs to be bound to the pipeline. When the reference count of an object drops to zero, the object will be unbound from the pipeline and destroyed. This style of reference holding is also known as weak-reference holding because each pipeline binding location holds a weak reference to the interface/object that is bound to it.

For example:

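pDevice->CreateRasterizerState( ..., &pRasterizerState ); // create the rasterizer state object; the app holds one reference
pDevice->RSSetState( pRasterizerState ); // bind it; the pipeline does not add a reference
 
pDevice->RSGetState( &pCurRasterizerState ); // retrieve the bound object; this adds a reference
// pCurRasterizerState will be equal to pRasterizerState.
pCurRasterizerState->Release(); // release the reference returned by RSGetState
 
pRasterizerState->Release(); // release the app's original reference
// Translator's note: if any reference is never released (forgotten, or the
// Release call accidentally removed), the D3D resource leaks.
// Since app released the final ref on this object, it is unbound.
pDevice->RSGetState( &pCurRasterizerState );
// pCurRasterizerState will be equal to NULL.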
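
The weak-binding behaviour can be reproduced with a small model. This is a sketch, not the Direct3D 10 runtime: `StateObject`, `MiniDevice`, `setState`, `getState` and `onDestroyed` are hypothetical names. It assumes what the text describes: Set does not add a reference, Get returns a counted reference, and the final release unbinds the object before destroying it.

```cpp
#include <cassert>

class MiniDevice;

// Hypothetical reference-counted device child.
class StateObject {
public:
    explicit StateObject(MiniDevice& dev) : dev_(dev), refs_(1) {}
    void addRef() { ++refs_; }
    void release();  // defined after MiniDevice
private:
    MiniDevice& dev_;
    unsigned refs_;
};

// Hypothetical device whose binding slot holds a weak (uncounted) pointer.
class MiniDevice {
public:
    StateObject* createState() { return new StateObject(*this); }
    void setState(StateObject* s) { bound_ = s; }  // no addRef: weak binding
    StateObject* getState() {                      // Get returns a counted reference
        if (bound_) bound_->addRef();
        return bound_;
    }
    // Called by the object when its last reference goes away.
    void onDestroyed(StateObject* s) { if (bound_ == s) bound_ = nullptr; }
private:
    StateObject* bound_ = nullptr;
};

inline void StateObject::release() {
    if (--refs_ == 0) {
        dev_.onDestroyed(this);  // unbind from the pipeline first
        delete this;
    }
}
```

The two releases in this model balance two different references: one from creation and one from `getState`. Dropping only one of them would leave the object alive and bound.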

Differences between Direct3D 9 and Direct3D 10:

In Direct3D 9, pipeline Set functions hold a reference to the device's objects; in Direct3D 10, pipeline Set functions do not hold a reference to the DeviceChild objects.

Pipeline Stages

The Direct3D 10 programmable pipeline is designed for generating graphics for real-time gaming applications. The conceptual diagram below illustrates the data flow from input to output through each of the programmable stages.

 

Figure 1.  Direct3D 10 Pipeline Stages

All of the stages are configurable via the Direct3D 10 API. Stages featuring common shader cores (the rounded rectangular blocks) are programmable using the HLSL programming language. As you will see, this makes the pipeline extremely flexible and adaptable. The purpose of each of the stages is listed below.

  • Input Assembler Stage - The input assembler stage is responsible for supplying data (triangles, lines and points) to the pipeline.

  • Vertex Shader Stage - The vertex shader stage processes vertices, typically performing operations such as transformations, skinning, and lighting. A vertex shader always takes a single input vertex and produces a single output vertex.

  • Geometry Shader Stage - The geometry shader processes entire primitives. Its input is a full primitive (which is three vertices for a triangle, two vertices for a line, or a single vertex for a point). In addition, each primitive can also include the vertex data for any edge-adjacent primitives. This could include at most an additional three vertices for a triangle or an additional two vertices for a line. The Geometry Shader also supports limited geometry amplification and de-amplification. Given an input primitive, the Geometry Shader can discard the primitive, or emit one or more new primitives.

  • Stream Output Stage - The stream output stage is designed for streaming primitive data from the pipeline to memory on its way to the rasterizer. Data can be streamed out and/or passed into the rasterizer. Data streamed out to memory can be recirculated back into the pipeline as input data or read back by the CPU.

  • Rasterizer Stage - The rasterizer is responsible for clipping primitives, preparing primitives for the pixel shader and determining how to invoke pixel shaders.

  • Pixel Shader Stage - The pixel shader stage receives interpolated data for a primitive and generates per-pixel data such as color.

  • Output Merger Stage - The output merger stage is responsible for combining various types of output data (pixel shader values, depth and stencil information) with the contents of the render target and depth/stencil buffers to generate the final pipeline result.
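
The geometry shader's ability to discard a primitive or emit several can be sketched on the CPU. This is an illustration only, not HLSL and not the Direct3D 10 API: `Triangle`, `GeometryShader`, `runGeometryStage` and `discardOrDuplicate` are hypothetical names, and the "shader" is an ordinary C++ function that receives one whole triangle and appends zero or more triangles to an output stream.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct Vec3 { float x, y, z; };
struct Triangle { Vec3 v[3]; };

// A "geometry shader" here is any function that receives one whole
// primitive and appends zero or more primitives to the output.
using GeometryShader =
    std::function<void(const Triangle&, std::vector<Triangle>&)>;

inline std::vector<Triangle> runGeometryStage(const std::vector<Triangle>& in,
                                              const GeometryShader& gs) {
    std::vector<Triangle> out;
    for (const Triangle& t : in) gs(t, out);
    return out;
}

// Example shader: drop triangles with zero area in the XY plane and emit
// every other triangle twice, the second copy shifted by +1 along x.
inline void discardOrDuplicate(const Triangle& t, std::vector<Triangle>& out) {
    float area2 = (t.v[1].x - t.v[0].x) * (t.v[2].y - t.v[0].y)
                - (t.v[2].x - t.v[0].x) * (t.v[1].y - t.v[0].y);
    if (area2 == 0.0f) return;           // de-amplification: emit nothing
    out.push_back(t);
    Triangle shifted = t;
    for (Vec3& v : shifted.v) v.x += 1.0f;
    out.push_back(shifted);              // amplification: one in, two out
}
```

The vertex shader, by contrast, is always one vertex in and one vertex out, so it could be modelled as a plain map over the vertex array.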

Input Assembler Stage

At the front of the pipeline, there is a new stage that streams primitives from memory into the pipeline. The input assembler stage (IA) reads vertex data from user-filled buffers and assembles the data into vertices and primitives (points, lines or triangles).

The data is streamed out to the pipeline for processing by the shader stages. The input assembler can also generate new primitive types (a line list with adjacency or a triangle list with adjacency) to feed the geometry shader adjacency data. This is done by calling a Draw API.

While the IA is generating primitives, it may attach some useful information that is consumed by the shader cores. These system-generated values include information that identifies each primitive, different instances of a primitive, and different vertices. This data is then provided to the shader cores to minimize processing time by processing only those primitives, instances, or vertices that have not already been processed.
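
How such identifiers could be attached while a triangle strip is assembled can be sketched as follows. This is an assumption-laden model, not the input assembler's specification: `StripTriangle` and `assembleStrip` are hypothetical names, IDs are assumed to be sequential from zero, and the alternating winding order of real strips is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct StripTriangle {
    unsigned primitiveId;  // one ID per assembled primitive
    unsigned vertexId[3];  // IDs of the three vertices it uses
};

// Assemble a triangle strip of vertexCount vertices, tagging each
// primitive and vertex with a sequential ID starting at 0.
// Triangle p of a strip uses vertices p, p+1 and p+2.
inline std::vector<StripTriangle> assembleStrip(unsigned vertexCount) {
    std::vector<StripTriangle> tris;
    for (unsigned p = 0; p + 2 < vertexCount; ++p) {
        StripTriangle t;
        t.primitiveId = p;
        t.vertexId[0] = p;
        t.vertexId[1] = p + 1;
        t.vertexId[2] = p + 2;
        tris.push_back(t);
    }
    return tris;
}
```

A strip of five vertices therefore yields three primitives, and a shader that receives the primitive ID can tell them apart without any extra per-vertex data.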

IA Stage API

These are the steps required to initialize and execute the input assembler stage:

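  • Create Input Buffers - Create and initialize the input buffers that supply the input data.

  • Create the Input Layout - Create an input-layout object that does two things: defines how the input data is organized as it is streamed into the IA stage, and compares the input data to the vertex shader inputs to make sure they are compatible.

  • Binding Objects To The IA Stage - Bind the created objects (the input buffers and the input-layout object) to the IA stage.

  • Specify The Primitive Topology - Tell the IA stage how to assemble the input data into primitives.

  • Example of system-generated values in the IA stage - Shows how system values are attached to a triangle strip.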

Create Input Buffers

Step 1: Declare the vertex data

// Vertex data for one triangle
SimpleVertex vertices[] =
{
    D3DXVECTOR3( 0.0f, 0.5f, 0.5f ),
    D3DXVECTOR3( 0.5f, -0.5f, 0.5f ),
    D3DXVECTOR3( -0.5f, -0.5f, 0.5f ),
};

This is a simple array of vertices: one triangle, made up of three vertices.

Step 2: Create a vertex buffer

D3D10_BUFFER_DESC bd;
bd.Usage = D3D10_USAGE_DEFAULT;
bd.ByteWidth = sizeof( SimpleVertex ) * 3;
bd.BindFlags = D3D10_BIND_VERTEX_BUFFER;
bd.CPUAccessFlags = 0;
bd.MiscFlags = 0;
 
D3D10_SUBRESOURCE_UP InitData;
InitData.pSysMem = vertices;
InitData.SysMemPitch = sizeof( vertices );
InitData.SysMemSlicePitch = sizeof( vertices );
if( FAILED( g_pd3dDevice->CreateBuffer( &bd, &InitData, &g_pVertexBuffer ) ) )
    return FALSE;

A vertex buffer is organized as an array of elements; each element contains the data associated with one vertex. There are two parts to creating the buffer resource.

First, the buffer description is initialized with settings that define how the application expects to use the buffer. These settings are important for speed and type checking. For example, the usage D3D10_USAGE_DEFAULT is for a resource that is not expected to be updated by the CPU very often. This determines what type of memory the runtime will create the resource in. Video RAM, for example, would be used for a resource that is constantly changing, so that the GPU is never interrupted by the need to get data; the D3D10_USAGE_DEFAULT flag ensures that the resource cannot be mapped and can only be modified with UpdateSubresource.

Mapping in Direct3D 10 is analogous to locking in Direct3D 9. The default usage specifies the resource as one that cannot be locked; in other words, the application does not expect the resource to get locked and updated by the CPU. This means that the GPU can use the resource without fear of the resource stalling the GPU pipeline because the CPU wants access to it. For more about resource usages, see Resources.

In this example, the other important setting is the binding flag. This flag (D3D10_BIND_VERTEX_BUFFER) means that any resource (a buffer in this case) created using this description, can only be bound to the pipeline as a vertex buffer resource. This restricts the class of operations that can be done to this resource and once again enables the GPU to schedule its use for maximum performance.

The CPU access flag determines whether or not the CPU can access the buffer; a buffer that is declared as read only can reside anyplace where it can be read quickly.

Second, create the vertex buffer by calling CreateBuffer with the buffer description and a second description of the subresource. Each resource is made up of an array of subresources, and each subresource is made up of an array of elements. Each different resource has a specific hierarchy of resource and subresources and elements (according to figure 1 on the resources page). The subresource description not only points to the actual resource data, but also contains information about the size and layout of the data.

Using this figure for a buffer, you can see the memory pitch and the memory slice for a buffer. This description gives the pipeline a clear idea of how to walk the resource and read/write to it. Since a buffer is a bag-of-bits, there is a 1D structure to its layout. As a result, the system memory pitch and system memory slice pitch are the same: the size of the vertex data declaration. A buffer has the easiest memory pitch and memory slice pitch layout since it is a 1D layout.
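
The size arithmetic behind the buffer description can be checked in isolation. The sketch below does not include d3d10.h: `Float3` stands in for D3DXVECTOR3, and `BufferDescModel`, `InitDataModel`, `describeBuffer` and `describeInitData` are hypothetical stand-ins for the description structures. It only shows that the byte width is the vertex size times the vertex count and that, for a 1D buffer, pitch and slice pitch are both the total size.

```cpp
#include <cassert>
#include <cstddef>

struct Float3 { float x, y, z; };    // stand-in for D3DXVECTOR3
struct SimpleVertex { Float3 Pos; };

// Hypothetical stand-ins for the buffer and subresource descriptions.
struct BufferDescModel { std::size_t byteWidth; };
struct InitDataModel {
    const void* sysMem;
    std::size_t sysMemPitch;
    std::size_t sysMemSlicePitch;
};

static const SimpleVertex kTriangle[3] = {
    { {  0.0f,  0.5f, 0.5f } },
    { {  0.5f, -0.5f, 0.5f } },
    { { -0.5f, -0.5f, 0.5f } }
};

// Byte width of a vertex buffer holding vertexCount vertices.
inline BufferDescModel describeBuffer(std::size_t vertexCount) {
    BufferDescModel bd;
    bd.byteWidth = sizeof(SimpleVertex) * vertexCount;
    return bd;
}

// For a 1D buffer, pitch and slice pitch are both the total data size.
inline InitDataModel describeInitData(const void* data, std::size_t bytes) {
    InitDataModel init;
    init.sysMem = data;
    init.sysMemPitch = bytes;
    init.sysMemSlicePitch = bytes;
    return init;
}
```

Using `sizeof` on the array, as the original snippet does, only works while the array itself is in scope; once it decays to a pointer, the byte count has to be passed explicitly as in `describeInitData`.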

Create the Input Layout

The input layout describes how the data will get interpreted by the IA stage as it is streamed in from user memory. This layout is described by D3D10_INPUT_ELEMENT_DESC, which includes information like: the format of the data, what semantics are specified, and how to interpret instancing data. Tutorial 2 creates the input layout in 2 steps:

First, declare the input layout as shown here:

// Define the input layout
D3D10_INPUT_ELEMENT_DESC layout[] =
{
    { L"POSITION", 0, 
      DXGI_FORMAT_R32G32B32_FLOAT, 
      0, 0, 
      D3D10_INPUT_PER_VERTEX_DATA, 0 }, 
};

This example uses the input layout description to interpret the data stored in a single vertex buffer. The members in the description include:

  • Semantic Name and semantic index - Identifies how to interpret the data. You can use any number of arbitrary semantics, the semantics from Direct3D 9, or the additional semantics required by the hardware in Direct3D 10. The new semantics required by the hardware are called system values; one such example is the position semantic: SV_POSITION. All system value semantics begin with the SV_ prefix.

  • Format - This is the format of the data stored in the buffer. Direct3D 10 has many predefined format types including: 16, 32, 64 and 128 bit formats, signed and unsigned formats, typed and typeless formats, and integer and floating-point formats. Typeless formats are available when you want to allocate the proper amount of space for the data, but do not yet know what type the data will be at the time the input layout object is created.

  • InputSlot and AlignedByteOffset - These two parameters define the input entry point of the stage and any offset from the beginning of the stream to the data (this means you can use a header in your data buffers if you like). Every pipeline stage uses input slots and output slots to identify the input and output ports for streaming data. Each slot is a zero-based integer, and every stage has limitations on how many slots are supported (see d3d10.h).

  • The input slot class - Tells the input assembler how to apply the data read from the input buffer. Data is identified as vertex data (read the data directly) and non-instanced or instanced. As the input assembler reads the vertex buffer, these options determine how much data to stream onto the next pipeline stage.

Translator's note: instanced data describes geometry that can be reused, for example the same tree drawn many times at different positions.

  • InstanceDataStepRate - If the input buffer data is instanced, the input assembler needs to know how many instances to draw before incrementing the pointer.
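
The byte offsets and the instance step rate in the list above reduce to simple arithmetic. The sketch below is not D3D10_INPUT_ELEMENT_DESC: `ElementModel`, `layoutElements` and `instanceElementIndex` are hypothetical names. It assumes that elements in one slot are packed back to back, and it treats a step rate of zero as "never advance", which is an assumption of this sketch rather than something the text states.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ElementModel {
    std::string semantic;     // e.g. "POSITION"
    std::size_t sizeInBytes;  // e.g. 12 for three 32-bit floats
    std::size_t offset;       // filled in by layoutElements
};

// Pack the elements of one input slot back to back.
// Returns the resulting vertex stride in bytes.
inline std::size_t layoutElements(std::vector<ElementModel>& elems) {
    std::size_t offset = 0;
    for (ElementModel& e : elems) {
        e.offset = offset;
        offset += e.sizeInBytes;
    }
    return offset;
}

// Index of the per-instance element used when drawing instance
// instanceId, given how many instances share one element.
inline unsigned instanceElementIndex(unsigned instanceId, unsigned stepRate) {
    return stepRate == 0 ? 0 : instanceId / stepRate;
}
```

With a step rate of 2, instances 0 and 1 read element 0, instances 2 and 3 read element 1, and so on.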

Second, use the input layout declaration to generate the input layout object by calling ID3D10Device::CreateInputLayout, as shown here:

ID3D10InputLayout* g_pVertexLayout = NULL;
g_pd3dDevice->CreateInputLayout( layout, 1, pShaderBytecode, &g_pVertexLayout );

To get the pointer to the shader byte code, use D3D10CompileShader to compile the shader and return an ID3D10Blob interface. Then use ID3D10Blob::GetBufferPointer() to get a pointer to the shader byte code.

This function takes the following parameters:

  • The input layout from step 1. As said before, this describes how the IA will interpret the data in the vertex buffer.

  • The number of element declarations in the input buffer. Each input buffer is laid out as an array of elements (possibly with a header and therefore an offset - see step 1). Since this example uses a single vertex buffer, each element is the data stored for a single vertex.

  • The shader input signature. To type check the data coming from the input stream with the data that will be generated for the next pipeline stage (which is a vertex shader), the input element description is compared against the shader input declaration (using the shader signature). The shader signature is part of the compiled shader. You cannot get this directly; instead, you create a shader reflection object, which can get a pointer to the compiled shader. This pointer is then supplied to CreateInputLayout.

  • If the function is successful, it returns a pointer to the Input Layout object. This interface will be used in a moment to set this object in the device.

Binding Objects To The IA Stage

With the input buffer resources and the input layout object created, you just need to set these objects to the device, which binds these objects to the IA stage. This is done by calling ID3D10Device::IASetInputLayout.

// Set the input layout
g_pd3dDevice->IASetInputLayout( g_pVertexLayout );
 
// Set vertex buffer
UINT stride = sizeof( SimpleVertex );
UINT offset = 0;
g_pd3dDevice->IASetVertexBuffers( 0, 1, &g_pVertexBuffer, &stride, &offset );

Setting the input layout object to the device only requires a pointer to the object. Setting the vertex buffer is a little more complicated.

设置输入层次对象只需要它的指针,设置顶点缓存稍微复杂一点。

This example requires only a single vertex buffer, but as the name of the API method suggests, IASetVertexBuffers takes an array of vertex buffers. Setting one (or more) vertex buffers requires that you bind each buffer to a unique input slot on the IA and specify how the data is stored in the buffer: the offset to the start of the data and the size of each element (the stride). With these two pieces of information, the IA stage knows how to step through the vertex buffer one vertex at a time.

这个例子只需要一个简单的顶点缓存。但是你可以从API函数的名字中看见,SetVertexBuffers函数需要输入一组顶点缓存。设置一个(或者多个)顶点缓存需要你把每个缓存绑定到IA上某个数据槽上,并且定义不同数据缓存存储的方式(比如申明偏移,起点,大小等)。使用这两个信息,IA阶段知道怎么遍历顶点缓存读取顶点数据。
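The role of the stride and offset can be illustrated without any Direct3D calls. The following is a minimal sketch in plain C++, assuming a vertex that holds only a three-component position; the SimpleVertex fields and the VertexAddress helper are hypothetical names introduced for illustration and are not part of the API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed vertex layout for illustration: a single 3-component position.
struct SimpleVertex {
    float x, y, z;
};

// Returns a pointer to vertex `index` in a raw buffer, given the byte offset of
// the first vertex and the byte stride between consecutive vertices.
const std::uint8_t* VertexAddress(const std::uint8_t* buffer,
                                  std::size_t offset,
                                  std::size_t stride,
                                  std::size_t index)
{
    assert(buffer != nullptr && stride > 0);
    return buffer + offset + index * stride;
}
```

With an offset of 0 and a stride of sizeof( SimpleVertex ), as in the call above, vertex 2 starts 2 * sizeof( SimpleVertex ) bytes into the buffer.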

In addition, if your application uses an index buffer, follow steps similar to those for the vertex buffer: create the index buffer by calling ID3D10Device::CreateIndexBuffer (only one index buffer is allowed) and set it to the device with ID3D10Device::SetIndexBuffer.

同时,如果你的程序使用索引缓存,使用和添加顶点缓存类似的步骤:使用ID3D10Device::CreateIndexBuffer创建(只可以创建一个),使用ID3D10Device::SetIndexBuffer设置到设备上。

Specify The Primitive Topology

With the size of the input data fully specified in the input layout and the input buffer declarations, the IA still needs to know how a primitive is described by the data. This is called the primitive topology because it tells you how to assemble a primitive from vertices.

除了通过输入层次和输入缓存申明知道输入数据的大小外,IA还需要知道图元是如何描述的。这被叫做图元拓扑,因为它告诉你如何把顶点组织为图元。

Direct3D 10 supports several primitive topologies including primitives built from points, lines or triangles, connected in strips (continuously connected primitives) or lists (a list of unconnected primitives), with or without adjacent primitives. See illustrations for each of these in primitive topologies.

D3D10支持几种图元拓扑,包括由点,线和三角形构成的,组织为条带(连续的图元)或链表(不连续的图元),具备或者不具备邻接信息的图元格式。这些拓扑详见primitive topologies

Set the primitive topology by calling IASetPrimitiveTopology:

通过调用IASetPrimitiveTopology来设置图元拓扑。

g_pd3dDevice->IASetPrimitiveTopology( D3D10_PRIMITIVE_TOPOLOGY_TRIANGLELIST );

This example defines the data as a triangle list without adjacency. The rest of the choices are listed in D3D10_PRIMITIVE_TOPOLOGY.

这个例子定义了数据作为没有邻接信息的三角形条带保存。其他的选择在D3D10_PRIMITIVE_TOPOLOGY里列出。
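To make the effect of the topology choice concrete, here is a small sketch that counts how many complete primitives a given number of vertices produces. The Topology enum and PrimitiveCount function are hypothetical names for illustration (the real values are listed in D3D10_PRIMITIVE_TOPOLOGY), and only the topologies without adjacency are covered.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum for illustration; it is not the API enumeration.
enum class Topology { PointList, LineList, LineStrip, TriangleList, TriangleStrip };

// Number of complete primitives assembled from `vertexCount` vertices.
// Leftover vertices that do not form a whole primitive are ignored.
std::size_t PrimitiveCount(Topology topology, std::size_t vertexCount)
{
    switch (topology) {
    case Topology::PointList:     return vertexCount;
    case Topology::LineList:      return vertexCount / 2;
    case Topology::LineStrip:     return vertexCount >= 2 ? vertexCount - 1 : 0;
    case Topology::TriangleList:  return vertexCount / 3;
    case Topology::TriangleStrip: return vertexCount >= 3 ? vertexCount - 2 : 0;
    }
    assert(false && "unknown topology");
    return 0;
}
```

For example, six vertices form two triangles as a triangle list but four triangles as a triangle strip, because each strip vertex after the second completes a new triangle.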

Example of the Input Assembler Stage Generating System Values

As stated earlier, system values are generated by the IA to allow certain efficiencies in shader operations by attaching data such as:

前面提到过,系统值由IA生成来提高Shader操作效率。系统值由如下几个:

实例编号(VS可见),顶点编号(VS可见),图元编号(GS/PS可见)

  • InstanceID (visible to VS)
  • VertexID (visible to VS)
  • PrimitiveID (visible to GS/PS)

A subsequent shader stage may look for these system values to optimize processing in that stage. For instance, the VS stage may look for the InstanceID to grab additional per-vertex data for the shader or to perform other operations; the GS and PS stages may use the PrimitiveID to grab per-primitive data in the same way.

一些绘制阶段会读取这些值来优化绘制过程。对于实例,VS阶段会读取实例编号来增加附加的逐顶点信息让Shader做其他操作。GSPS阶段会使用图元编号来增加逐图元的信息。

Here's an example of the IA stage showing how system values may be attached to an instanced triangle strip:

这是IA阶段的一个例子,显示了系统值如何被添加到一个三角形条带上。

 

Figure 1.  IA Example

This example shows two instances of geometry that share vertices. The figure at the top left shows the first instance (U) of the geometry - the first two tables show the data that the IA generates to describe instance U. The input assembler generates the VertexID, PrimitiveID, and the InstanceID to label this primitive. The data ends with the strip cut, which separates this triangle strip from the next one.

这个例子描述了两个共享顶点的几何模型实例。左上角的图形表示第一个实例物体U,前两个表显示了IA如何描述实例UIA生成VertexIDPrimitiveIDInstanceID来标记这些图元。数据在条带切断的时候结束,把这个三角形条带和另外一个划分开来。

The rest of the figure pertains to the second instance (V) of geometry that shares vertices E, H, G, and I. Notice the corresponding InstanceID, VertexID and PrimitiveIDs that are generated.

剩下的图像描述了第二个图形V的集合,使用了顶点EHGI。注意对应的InstanceIDVertexIDPrimitiveID是系统生成的。
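The labeling described above can be imitated on the CPU. The sketch below is a simplification and not the figure's exact data: it assumes a non-indexed triangle strip with no strip cut, and it assumes that VertexID and PrimitiveID both restart at zero for each instance. All type and function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct VertexIds    { unsigned instanceId; unsigned vertexId; };
struct PrimitiveIds { unsigned instanceId; unsigned primitiveId; };

struct SystemValues {
    std::vector<VertexIds> vertices;
    std::vector<PrimitiveIds> primitives;
};

// Labels a triangle strip of `vertexCount` vertices drawn `instanceCount` times.
// Assumption: both VertexID and PrimitiveID restart at zero for each instance.
SystemValues LabelInstancedStrip(unsigned vertexCount, unsigned instanceCount)
{
    assert(vertexCount >= 3);
    SystemValues out;
    for (unsigned instance = 0; instance < instanceCount; ++instance) {
        for (unsigned v = 0; v < vertexCount; ++v)
            out.vertices.push_back({instance, v});
        // A strip of n vertices contains n - 2 triangles.
        for (unsigned p = 0; p + 2 < vertexCount; ++p)
            out.primitives.push_back({instance, p});
    }
    return out;
}
```

Drawing a four-vertex strip twice yields eight labeled vertices and four labeled triangles, with the second instance's IDs starting again from zero.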

Primitive Topologies

Primitive topologies describe how data is organized into primitives. Direct3D supports the following primitive types:

图元拓扑描述了输入数据如何组织成图元,D3D支持这些图元类型:

 

The winding direction of a triangle indicates the direction in which the vertices are ordered. It can be either clockwise or counterclockwise.

转动的方向表示三角形顶点的排序方向。可以是顺时针也可以是逆时针。

A leading vertex is the first vertex in a sequence of three vertices.

引导顶点就是顶点序列中三个顶点的第一个。
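One common way to determine the winding direction of a triangle is to look at the sign of its signed area. The sketch below is a generic illustration and not a Direct3D routine; it assumes 2D positions with x pointing right and y pointing up, and the names Point2, Winding and ClassifyWinding are hypothetical.

```cpp
#include <cassert>

struct Point2 { float x, y; };

enum class Winding { Clockwise, CounterClockwise, Degenerate };

// Classifies the vertex order of triangle (a, b, c) using twice its signed area.
// Assumption: x points right and y points up; with y pointing down (as in a
// render target) the two results swap.
Winding ClassifyWinding(Point2 a, Point2 b, Point2 c)
{
    const float signedArea2 = (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
    if (signedArea2 > 0.0f) return Winding::CounterClockwise;
    if (signedArea2 < 0.0f) return Winding::Clockwise;
    return Winding::Degenerate;
}
```

Swapping any two vertices of a triangle flips the sign of the area and therefore the reported winding.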

Point List

A point list is a collection of vertices that are rendered as isolated points. Use them in 3D scenes for star fields, or dotted lines on the surface of a polygon. Your application can apply materials and textures to a point list.

点链表是一组独立绘制的顶点的集合。它们在3D场景中绘制星形物或者多边形表面的点划线。在程序中可以对点应用材质和纹理。

Line List

A line list is a list of isolated, straight line segments. Line lists are useful for such tasks as adding sleet or heavy rain to a 3D scene. Applications create a line list by filling an array of vertices. Note that the number of vertices in a line list must be an even number greater than or equal to two. You can apply materials and textures to a line list.

线链表是一组独立线段的集合。线链表对于3D场景中添加条状物或者绘制大雨非常有效。程序通过填充一组顶点到顶点缓存中来创建线链表。注意顶点缓存中的顶点数必须是一个大于或者等于2的奇数。你可以对线链表使用材质或者纹理。

Line List with Adjacency

A line list that also contains adjacency information.

线链表可以拥有邻接信息。

The adjacency information specifies the neighboring vertices around a primitive and is used by a geometry shader to calculate things like edges that require knowledge of a primitive and any geometry that shares edges or vertices with it.

邻接信息定义了图元的邻接顶点,在GS中使用邻接信息来计算边缘需要知道图元和任何与之共享边缘的顶点。

Line Strip

A line strip is a primitive that is composed of connected line segments. Use line strips for creating polygons that are not closed.

线条带是有连接的线段组成的图元。使用线条带创建的多边形不是闭合的。

Line Strip with Adjacency

A line strip that also contains adjacency information.

线条带也可以拥有邻接信息。

Triangle List

A triangle list is a list of unconnected triangles. A triangle list must have at least three vertices and the total number of vertices must be divisible by three.

三角形链表是不连接的三角形的集合。三角形链表使用至少三个顶点,并且总顶点数必须可以被3整除。

Triangle lists are also useful for creating primitives that have sharp edges.
三角形链表可以用来创建有尖锐边缘的物体。

Triangle List with Adjacency

A triangle list that also contains adjacency information.


Triangle Strip

A triangle strip is a series of connected triangles. Because the triangles are connected, the application does not need to repeatedly specify all three vertices for each triangle.

三角形条带是一系列连接的三角形。因为三角形是相互连接的,所以程序不需要对每个三角形独立定义所有三个顶点。

Most objects are composed of triangle strips. This is because triangle strips can be used to specify complex objects in a way that makes efficient use of memory and processing time.

许多物体由三角形条带构成。这是因为三角形条带能够定义复杂的物体,使内存使用和处理时间更有效。

Triangle Strip with Adjacency

A triangle strip that also contains adjacency information.


Shader Stages

All Direct3D 10 shader stages expose all features of the Shader Model 4.0 common shader core. The Direct3D 10 pipeline contains three programmable shader stages:

所有的D3D10 Shader阶段体现了SM4.0通用Shader内核的细节。D3D10Shader流水线包含3个可编程Shader阶段。

Vertex Shader Stage

The vertex shader stage processes vertices from the input assembler, performing per-vertex operations such as transformations, skinning, morphing, and per-vertex lighting. Vertex shaders always operate on a single input vertex and produce a single output vertex. The vertex shader stage must always be active for the pipeline to execute. If no vertex modification or transformation is required, a pass-through vertex shader must be created and set to the pipeline.

VS阶段处理从IA阶段输入的顶点,执行对每个顶点的操作,诸如几何变换,表面变换,变形和逐顶点光照。VS只对一个顶点执行操作,输出一个顶点。如果没有顶点变换需求,则需要创建一个空的VS设置到流水线中。

Each vertex shader input vertex can be comprised of up to 16 32-bit vectors (up to 4 components each) and each output vertex can be comprised of as many as 16 32-bit 4-component vectors. All vertex shaders must have a minimum of one input and one output, which can be as little as one scalar value.

每个VS输入顶点最多能够有1632位的向量(4个通道)构成;每个输出顶点可以由16324通道的向量构成。所有的VS至少拥有一个只具有一个通道的输入向量和输出向量。

The vertex shader stage can consume two system generated values from the input assembler: VertexID and InstanceID (see System Values and Semantics). Since VertexID and InstanceID are both meaningful at a vertex level, and IDs generated by hardware can only be fed into the first stage that understands them, these ID values can only be fed into the vertex shader stage.

VS能够识别两个系统生成值:VertexIDInstanceID(参见System Value and Semantics)。因为VertexIDInstanceID在顶点处理阶段都是有意义的,而且硬件生成的ID只能输入到第一个能够识别它们的阶段中,因此这些ID只能在VS阶段输入。

Vertex shaders are always run on all vertices, including adjacent vertices in input primitive topologies with adjacency. The number of times that the vertex shader has been executed can be queried from the CPU using the VSInvocations pipeline statistic.

VS对所有顶点执行,包括具有邻接信息的输入图元的邻接顶点。VS执行的次数能够通过在CPU上调用VSInvocations统计数据来获得。

The vertex shader can perform load and texture sampling operations where screen-space derivatives are not required (using HLSL intrinsic functions samplelevel, samplecmplevelzero, samplegrad).

Vs能够在不需要屏幕位置导数的情况下执行采样纹理操作(使用HLSL内置函数samplelevel, samplecmplevelzero, samplegrad)。

Geometry Shader Stage

The geometry shader runs application-specified shader code with vertices as input and the ability to generate vertices on output. Unlike vertex shaders, which operate on a single vertex, the geometry shader's inputs are the vertices for a full primitive (two vertices for a line, three vertices for a triangle, or a single vertex for a point). Geometry shaders can also bring in the vertex data for the edge-adjacent primitives as input (an additional two vertices for a line, an additional three for a triangle).

GS对一组顶点执行操作,并输出一组顶点。和VS不一样的是,它不是仅对一个顶点操作,而是对整个图元的所有顶点操作(线条有两个顶点,三角形有三个顶点,点有一个顶点)。GS同样能够得到邻接顶点的信息(线条可以多得到两个顶点,三角形可以多得到三个顶点)。

The geometry shader stage can consume the SV_PrimitiveID System Value that is auto-generated by the IA. This allows per-primitive data to be fetched or computed if desired.

BS阶段能够读取IA阶段自动生成的SV_PrimitiveID系统值。这似的逐图元的数据能够在需要的情况下被读取和计算。

The geometry shader stage is capable of outputting multiple vertices forming a single selected topology (GS output topologies available are: tristrip, linestrip, and pointlist). The number of primitives emitted can vary freely within any invocation of the geometry shader, though the maximum number of vertices that could be emitted must be declared statically. Strip lengths emitted from a GS invocation can be arbitrary, and new strips can be created via the RestartStrip HLSL intrinsic function.

GS阶段能够输出多个顶点组成一个选定的图元拓扑(GS输出的图元拓扑保多:三角形条带,线条带和点链表)。图元输出的个数能够自由变化,但是最大顶点数必须静态的申明。GS输出的条带长度是绝对的,新的条带可以通过HLSL内置函数RestartStrip来创建。

Geometry shader output may be fed to the rasterizer stage and/or to a vertex buffer in memory via the stream output stage. Output fed to memory is expanded to individual point/line/triangle lists (exactly as they would be passed to the rasterizer).

GS输出到光栅化阶段或者通过流输出阶段输出到显存中的顶点缓存中。输出到内存的是独立的顶点、直线或者三角形链表(当然也可能被送到光栅化过程中)。

When a geometry shader is active, it is invoked once for every primitive passed down or generated earlier in the pipeline. Each invocation of the geometry shader sees as input the data for the invoking primitive, whether that is a single point, a single line, or a single triangle. A triangle strip from earlier in the pipeline results in one invocation of the geometry shader for each individual triangle in the strip (as if the strip were expanded into a triangle list). All the input data for each vertex in the individual primitive is available (for example, three vertices for a triangle), plus adjacent vertex data if applicable and available.

当一个GS被激活时,它对每个传过来的或者被流水线预先生成的图元执行一次操作。每次GS操作把输入数据当作图元处理,也就是当作一个点,一条直线或者一个三角形。从前端流水传来的三角条带会引起GS对条带中每个独立的三角形做一次操作(就好像把条带扩展成为了三角链表)。每个独立图元的每个顶点的所有输入数据都可以取到(比如三角形的三个顶点),并且在预先申明的情况下还附加了邻接顶点信息。
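The expansion of a strip into individual triangles can be sketched as follows. This is an illustration only: the Triangle type and ExpandStrip function are hypothetical, and the rule used for odd-numbered triangles (swapping the last two indices so every triangle keeps the same winding direction) is an assumption about the vertex order rather than something stated in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Triangle { std::size_t a, b, c; };

// Expands a triangle strip of `vertexCount` vertices into one index triple per
// triangle, which is how a geometry shader would see the strip.
// Assumption: odd-numbered triangles swap their last two indices so that all
// triangles share the same winding direction.
std::vector<Triangle> ExpandStrip(std::size_t vertexCount)
{
    std::vector<Triangle> triangles;
    for (std::size_t i = 0; i + 2 < vertexCount; ++i) {
        if (i % 2 == 0)
            triangles.push_back({i, i + 1, i + 2});
        else
            triangles.push_back({i, i + 2, i + 1});
    }
    return triangles;
}
```

A five-vertex strip expands to three triangles, so the geometry shader would be invoked three times for it.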

A geometry shader outputs data one vertex at a time by appending vertices to an output stream object. Three types of stream object are available: PointStream, LineStream, and TriangleStream, all of which are templated objects. The topology of the output is fixed by the object type declared for the GS stage, while the format of the vertices appended to the stream is determined by the template type. Each invocation of a geometry shader executes independently of other invocations, except that data added to the streams is written serially. The outputs of a given invocation are independent of other invocations (though ordering is respected). A geometry shader generating triangle strips will start a new strip on every invocation.

GS通过一次输出一个顶点,把顶点附加到输出流对象的最后来输出数据。输出流的拓扑通过设置绘制状态来确定,从PointStream, LineStreamTriangleStream中选择一个。这里有三种输出流对象,PointStream, LineStreamTriangleStream,它们都是模板对象。输出的拓扑通过它们对应的对象类型确定,添加到流上的顶点格式由模板类型决定。GS实例的执行操作相对同步其他的GS操作来说是独立的,但是添加流数据是按顺序进行的。GS的输出和其他GS的调用是相互独立的。每个GS每次会重新创建一个三角条带。

When a geometry shader output is identified as a System Interpreted Value (e.g. SV_RenderTargetArrayIndex or SV_Position), hardware looks at this data and performs some behavior dependent on the value, in addition to being able to pass the data itself to the next shader stage for input. When such data output from the geometry shader has meaning to the hardware on a per-primitive basis (such as SV_RenderTargetArrayIndex or SV_ViewportArrayIndex), rather than on a per-vertex basis (such as SV_ClipDistance[n] or SV_Position), the per-primitive data is taken from the leading vertex emitted for the primitive.

因为GS的输出是一个系统识别值(比如SV_RenderTargetArrayIndex or SV_Position),硬件会找到这些数据并且执行和系统值相关的操作,同时也为了能够把数据传送到下一个Shader阶段作为输入。因为GS输出的数据是基于图元的(比如SV_RenderTargetArrayIndex SV_ViewportArrayIndex)而不是基于顶点的(比如SV_ClipDistance[n]SV_Position),逐图元数据就要从输出的引导索引开始读取。

Partially completed primitives could be generated by the geometry shader if the geometry shader ends and the primitive is incomplete. Incomplete primitives are silently discarded. This is similar to the way the IA treats partially completed primitives.

部分完整的图元会当GS在图元没有生成完全结束时被GS生成。不完整的图元会被默认忽略,就和IA阶段对待部分完整的图元一样。

The geometry shader can perform load and texture sampling operations where screen-space derivatives are not required (samplelevel, samplecmplevelzero, samplegrad).

GS能够在不需要屏幕坐标位置导数的情况下进行采样纹理的操作(samplelevel, samplecmplevelzero, samplegrad

Algorithms that can be implemented in the geometry shader include:

使用GS可以实现的算法有:

  • Point Sprite Expansion 点精灵扩展
  • Dynamic Particle Systems 动态粒子系统
  • Fur/Fin Generation 皮毛生成
  • Shadow Volume Generation Shadow Volume生成Shadow Volume
  • Single Pass Render-to-Cubemap Pass绘制CubeMap
  • Per-Primitive Material Swapping 逐图元材质变换
  • Per-Primitive Material Setup - Including generation of barycentric coordinates as primitive data so that a pixel shader can perform custom attribute interpolation. 逐图元材质重建,包括生成图元的重心坐标数据,使PS能够执行其他属性的插值。

Pixel Shader Stage

A pixel shader is invoked by the rasterizer stage to calculate a per-pixel value for each pixel in a primitive that gets rendered. The pixel shader enables rich shading techniques such as per-pixel lighting and post-processing. A pixel shader is a program that combines constant variables, texture values, interpolated per-vertex values, and other data to produce per-pixel outputs. The stage preceding the rasterizer stage (the GS stage, or the VS stage if the geometry shader is NULL) must output vertex positions in homogeneous clip space.

PS在光栅化阶段后调用,用于计算一些被绘制图元的逐象素信息。PS包括丰富的光照技术,包括逐象素光照和后期处理。PS是包含常数变量,纹理值,顶点插值数据和其他逐象素数据的程序。PS前的光栅化阶段(GS或者当GS是空时为VS)必须向象素空间输出顶点位置。

A pixel shader can take up to 32 inputs for the current pixel location, each a 4-component vector of 32-bit values. All 32 inputs can be fed with data from earlier in the pipeline only when the geometry shader is active. In the absence of the geometry shader, at most 16 4-component elements of data can be input from upstream in the pipeline.

PS能够对每个象素输入32324通道的数据。只有当GS激活时所有的32位输入才能被完全填充。没有GS的话,只有最多164通道的数据能够从流水线前端输入。

Input data available to the pixel shader includes vertex attributes that can be chosen, on a per-element basis, to be interpolated with or without perspective correction, or be treated as per-primitive constants. In addition, declarations in a pixel shader can indicate which attributes to apply centroid evaluation rules to. Centroid evaluation is relevant only when multisampling is enabled, since cases arise where the pixel center may not be covered by the primitive (though subpixel center(s) are covered, hence causing the pixel shader to run once for the pixel). Attributes declared with centroid mode must be evaluated at a location covered by the primitive, preferably at a location as close as possible to the (non-covered) pixel center.

PS有效的输入数据包括被选中的逐元素的顶点数据,它们可以被透视校正或者不校正,然后再插值得到,或者作为逐图元的常量。同时,PS的申明还表示了哪个属性会被应用重心赋值规则得到。重心赋值规则只在多采样的时候生效,因为存在象素中心可能不被图元覆盖的情况(虽然子象素中心是被覆盖的,但这样会导致PS为这个象素再运行一次)。被申明为重心模式的属性必须在被图元覆盖的区域赋值,而且距离(没有被覆盖的)象素中心越近越好。

A pixel shader can output up to 8 values for the current pixel location, each a 4-component vector of 32-bit values, to be combined with the render target(s), or no color (if the pixel is discarded). A pixel shader can also output an optional 32-bit float scalar depth value for the depth test (SV_Depth).

PS对于RT的当前象素位置能够输出8324通道的数据,或者不输出颜色(如果象素取消绘制的话)。PS同样能够输出可选的32位浮点深度值,用作深度测试。

For each primitive entering the rasterizer, the pixel shader is invoked once for each pixel covered by the primitive. When multisampling, the pixel shader is invoked once per covered pixel, though depth/stencil tests occur for each covered multisample, and multisamples that pass the tests are updated with the pixel shader output color(s).

对于进入光栅化后每个图元,PS对于图元覆盖的每个象素只调用一次。当多采样时,PS对每个覆盖的象素调用一次,虽然D/S测试对每个覆盖的采样点调用一次,通过D/S测试的采样点使用PS的输出颜色更新。

If there is no geometry shader, the IA is capable of producing one scalar per-primitive system-generated value for the pixel shader, the SV_PrimitiveID, which can be read as input to the pixel shader. The pixel shader can also retrieve the SV_IsFrontFace value, generated by the rasterizer stage.

如果没有GS的话,IA能够创建一个逐图元生成的值给PS,那就是SV_PrimitiveID,能够作为PS的输入被PS读取。PS同样也能够得到在光栅化过程仲生成的SV_IsFrontFace值。

One of the inputs to the pixel shader can be declared with the name SV_Position, which means it will be initialized with the pixel's float32 xyzw position. Note that w is the reciprocal of the linearly interpolated 1/w value. When the rendertarget is a multisample buffer or a standard rendertarget, the xy components of position contain pixel center coordinates (which have a fraction of 0.5f).

PS的输入之一可以被申明为SV_Position,意味着它可以被初始化为象素的32位浮点数xyzw位置。注意w时线形插值得到的1/w的倒数。当RT时一个多采样缓存或者标准RT时,xy通道包含象素重心点的坐标(小数位偏移0.5f

The pixel shader instruction set includes several instructions that produce or use derivatives of quantities with respect to screen space x and y. The most common use for derivatives is to perform level-of-detail calculations for texture sampling and, in the case of anisotropic filtering, to select samples along the axis of anisotropy. Typically, hardware implementations run a pixel shader on multiple pixels (for example a 2x2 grid) simultaneously, so that derivatives of quantities computed in the pixel shader can be reasonably approximated as deltas of the values at the same point of execution in adjacent pixels.

PS的指令集包括几条生成和使用屏幕位置的x,y的导数的指令。使用最多的导数是纹理采样和多异向性滤波的LOD计算,选择沿轴方向的异向。典型的多采样硬件实现就是对多个象素(比如说2x2网格)运行类似的PS,所以PS里计算的导数理论上逼近对临近象素执行PS的实际值。
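The idea of approximating derivatives as deltas between adjacent pixels can be shown with a small CPU-side sketch. It assumes a 2x2 quad of values that the shader has computed at the same point of execution; the Quad and Derivatives types and the ApproximateDerivatives function are hypothetical, and real hardware may pick the differences in other ways.

```cpp
#include <cassert>

// Values of some quantity computed by the pixel shader at the four pixels of a
// 2x2 quad, indexed as value[row][column].
struct Quad { float value[2][2]; };

struct Derivatives { float ddx, ddy; };

// Approximates the screen-space derivatives for the pixel at (row, column) as
// differences between horizontally and vertically adjacent pixels in the quad.
Derivatives ApproximateDerivatives(const Quad& quad, int row, int column)
{
    assert(row >= 0 && row < 2 && column >= 0 && column < 2);
    Derivatives d;
    d.ddx = quad.value[row][1] - quad.value[row][0];
    d.ddy = quad.value[1][column] - quad.value[0][column];
    return d;
}
```

If the quantity is a texture coordinate, a large delta means the texture is being minified on screen, which is what drives the selection of a coarser mip level.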

Stream Output Stage

The stream output stage (SO) is located in the pipeline right after the geometry shader stage and just before the rasterization stage.

流输出阶段在流水线后紧随GS之后在光栅化之前的一个阶段。

 

Figure 1.  Pipeline Block Diagram - the crosshatched stage is the Stream Output stage

The purpose of the SO stage is to write vertex data streamed out of the GS stage (or the VS stage if the GS stage is inactive) to one or more buffer resources in memory. Data streamed out to memory can be read back into the pipeline in a subsequent rendering pass, or can be copied to a staging resource for readback to the CPU. Since variable amounts of data can be generated by a geometry shader, the amount of data streamed out can vary. The DrawAuto API allows this variable amount of data to be processed in a subsequent pass without the need to query (from the CPU) the amount of data written to stream output.

SO阶段的目的是把GS阶段输出的顶点数据(或者在GS没有激活的情况下是VS阶段)写到显存中的一个或多个缓存资源中。输出到显存中的流数据能够被下一个绘制pass读回到流水线中,或者能够拷贝到某些资源中被CPU读回。因为GS生成数据的数量是变化的,流输出的大小也是可变的。DrawAuto API允许可变数量的数据在下一个pass中处理而不需要(向CPU查询)流数据的的数量。

SO Stage API

These are the steps required to initialize and execute the stream output stage:

这些是初始化和执行SO阶段的步骤

Compile a Geometry Shader

Given the following geometry shader (from Tutorial13):

有如下的GS

struct GSPS_INPUT
{
    float4 Pos : SV_POSITION;
    float3 Norm : TEXCOORD0;
    float2 Tex : TEXCOORD1;
};
 
[maxvertexcount(3)]
void GS( triangle GSPS_INPUT input[3], inout TriangleStream<GSPS_INPUT> TriStream )
{
    GSPS_INPUT output;
    
    //
    // Calculate the face normal
    //
    float3 faceEdgeA = input[1].Pos.xyz - input[0].Pos.xyz;
    float3 faceEdgeB = input[2].Pos.xyz - input[0].Pos.xyz;
    float3 faceNormal = normalize( cross(faceEdgeA, faceEdgeB) );
    
    for( int v=0; v<3; v++ )
    {
        output.Pos = input[v].Pos + float4(faceNormal*Explode,0);
        output.Pos = mul( output.Pos, View );
        output.Pos = mul( output.Pos, Projection );
        
        output.Norm = input[v].Norm;
        
        output.Tex = input[v].Tex;
        
        TriStream.Append( output );
    }
    
    TriStream.RestartStrip();
}

This shader calculates a face normal for each triangle, and outputs position, normal and texture coordinate data. A geometry shader looks just like a vertex or pixel shader, with the following exceptions:

这个Shader对每个三角形计算表面法向,并输出位置,法向和纹理坐标数据。GSVSPS类似,只是有如下区别:

  • GS function attribute - the attribute that precedes the function (the function itself returns void) does one thing: it declares the maximum number of vertices that can be output by the shader. In this case,

GS函数返回类型 函数返回类型只做一件事,申明Shader能够输出的最大数量的顶点数,在这个例子中

[maxvertexcount(3)]

defines the output to be a maximum of 3 vertices.

定义了最大输出三个顶点。

  • GS input parameter declarations - This function takes two input parameters:

GS输出参数申明-这个函数带两个参数:

    triangle GSPS_INPUT input[3], inout TriangleStream<GSPS_INPUT> TriStream

The first parameter is an array of vertices (3 in this case) defined by a GSPS_INPUT struct (which defines per-vertex data as a position, a normal and a texture coordinate). The first parameter also uses the triangle keyword which means the input assembler stage must output data to the geometry shader as one of the triangle primitive types (triangle list or triangle strip).

第一个参数是一个使用GSPS_INPUT结构(定义了逐顶点的数据,比如一个位置,一个法向和一个纹理坐标)定义的顶点数组(在这个例子中是3个)。第一个参数同样使用了triagnle关键字,意味着IA阶段必须给GS输入一种三角形图元类型(三角链表或者三角条带)。

The second parameter is a triangle stream defined by the type

第二个参数是一个三角形数据流类型

 TriangleStream<GSPS_INPUT>

This means the parameter is an array of triangles, each of which is made up of three vertices (that contain the data from the members of GSPS_INPUT).

这表示参数是一组三角形,每个由3个顶点构成(包含GSPS_INPUT中定义的数据)。

Use the triangle and TriangleStream keywords to identify individual triangles or a stream of triangles in a GS.

使用trianglegrianglestream关键字在GS中定义独立的三角形和三角形流数据。

  • GS intrinsic function - The lines of code in the shader function use common-shader-core HLSL intrinsic functions except the last two lines, which call Append and RestartStrip. These functions are only available to a geometry shader. Append informs the geometry shader to append the output to the current strip; RestartStrip creates a new primitive strip. A new strip is implicitly created in every invocation of the GS stage.

GS内置函数-Shader函数使用的代码都是使用通用shader核心的HLSL内置函数,除了最后两行,调用了AppendRestartStrip。这些函数只对GS有效。Append通知GS把输出顶点附加到当前条带最后。RestartStrip创建一个新的图元条带。一个新的条带在每次调用GS时被隐式生成。

The rest of the shader looks very similar to a vertex or pixel shader. The geometry shader uses a struct to declare input parameters and marks the position member with the SV_POSITION semantic to tell the hardware that this is position data. The input structure identifies the other two input parameters as texture coordinates (even though one of them will contain a face normal). You could use your own custom semantic for the face normal if you prefer.

剩下的ShaderVSPS看起来很相似。GS使用结构来申明输入参数,使用SV_POSITION来标记位置成员,告诉硬件这是个位置数据。输入结构定义了其他两个输入参数作为纹理坐标(其中一个包含表面法向)。如果你喜欢的话,你能够使用你自己的语义来表示法向。

Having designed the geometry shader, call D3D10CompileShader to compile it like this:

设计完GS之后,调用D3D10CompileShader编译Shader,如下

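DWORD dwShaderFlags = D3D10_SHADER_ENABLE_STRICTNESS;
ID3D10Blob* pShader = NULL;
 
// pSrcData points to the shader source text; SrcDataLen (a placeholder name) is its length in bytes.
D3D10CompileShader( pSrcData, SrcDataLen, 
  "Tutorial13.fx", NULL, NULL, "GS", "gs_4_0", 
  dwShaderFlags, &pShader, NULL );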

Just like vertex and pixel shaders, you will need a shader flag to tell the compiler how you want the shader compiled (for debug, optimized for speed etc...), the entry point function, and the shader model to validate against. This example creates a geometry shader built from the Tutorial13.fx file, using the GS function. The shader is compiled for shader model 4.0.

就像VSPS一样,你需要一个Shader标记来告诉编译器你希望如何编译Shader(为了调试,优化等等),还有入口函数和Shader模型等信息。这个例子从Tutorial13.fx文件中创建了一个GS,使用GS函数。Shader被编译为Shader Model4.0

Create a Geometry Shader Object with Stream Output

Once you know that you will be streaming the data from the geometry shader, and you have successfully compiled the shader, the next step is to call CreateGeometryShaderWithStreamOutput to create the geometry shader object.

一旦你知道你会从GS中得到流数据,并且你已经编译好了Shader,下一步就是调用CreateGeometryShaderWithStreamOutput来创建GS对象。

But first, you need to declare the SO stage input signature. This signature matches or validates the GS outputs and the SO inputs at object creation time. Here's an example of the SO declaration:

首先你必须申明SO阶段输入符号,这个符号在对象创建的时候和GS的输出及SO的输入匹配或者使之生效。下面是一个SO申明的例子:

D3D10_SO_DECLARATION_ENTRY pDecl[] =
{
    // semantic name, semantic index, start component, component count, output slot
    { L"SV_POSITION", 0, 0, 4, 0 },   // output all components of position
    { L"TEXCOORD0", 0, 0, 3, 0 },     // output the first 3 components of the normal
    { L"TEXCOORD1", 0, 0, 2, 0 },     // output the first 2 texture coordinates
};
 
D3D10Device->CreateGeometryShaderWithStreamOutput( pShaderBytecode, pDecl, 3, 
    sizeof(pDecl), &pGS );

This function takes several parameters including:

这个函数包含以下几个参数:

  • A pointer to the compiled geometry shader (or vertex shader if no geometry shader will be present and data will be streamed out directly from the VS). To get this pointer call D3D10CompileShader to compile a shader (and return an ID3D10Blob interface); then call ID3D10Blob::GetBufferPointer() to get a pointer to the shader byte code.

一个指向编译好的GS(或者VS,如果没有GS的话,那么数据会从VS中输入)的指针。调用D3D10CompileShader编译Shader(返回ID3D10Blob接口)然后调用ID3D10Blob::GetBufferPointer()得到这个指向Shader二进制代码的指针。

  • A pointer to an array of declarations that describe the input data for the stream output stage. See D3D10_SO_DECLARATION_ENTRY. You can supply up to 64 declarations, one for each different type of element to be output from the SO stage.

指向描述输入到SO阶段的输入数据格式的一组申明的指针。参见D3D10_S0_DECLARATION_ENTRY。你能够最多提供64个申明,每个表示输入到SO阶段的不同类型的数据元素。

  • The number of elements that are written out by the SO stage.

输入到SO阶段的元素个数

  • A pointer to the geometry shader object created (see ID3D10GeometryShader Interface).

一个指向创建好的GS对象的指针(参见ID3D10GeometryShader接口)。

The stream output declaration defines the way data is written to a buffer resource. You can add as many components as you want to the output declaration. The SO stage supports writing out to a single buffer resource, or many buffer resources. When writing to a single buffer, the SO stage supports writing many different elements per-vertex. When writing to more than one buffer, the SO stage only supports writing one element to each buffer.

SO申明定义了数据写入缓存资源的方式。你能够添加足够多的申明项到输出申明中。SO阶段支持输出到单个(或多个)缓存资源中。当写单个缓存时,SO阶段支持逐顶点写多个不同的元素。当写入多个缓存时,SO只支持对每个缓存写一个元素。
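When several elements are written per vertex to a single buffer, the size of each streamed vertex follows from the declaration. The sketch below adds up the component counts for one output slot. SoEntry is a hypothetical mirror of a declaration entry and BytesPerVertex is a hypothetical helper; neither is part of the API, and the sketch assumes every component is 32 bits wide.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of one stream output declaration entry, for illustration only.
struct SoEntry {
    const char* semanticName;
    unsigned semanticIndex;
    unsigned startComponent;
    unsigned componentCount;   // 1 to 4 components of 32 bits each
    unsigned outputSlot;
};

// Bytes written per vertex to the buffer bound at `slot`.
std::size_t BytesPerVertex(const SoEntry* entries, std::size_t count, unsigned slot)
{
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < count; ++i) {
        assert(entries[i].componentCount >= 1 && entries[i].componentCount <= 4);
        if (entries[i].outputSlot == slot)
            bytes += entries[i].componentCount * sizeof(float);
    }
    return bytes;
}
```

For the declaration shown earlier (4 + 3 + 2 components, all in slot 0), this gives 9 components, or 36 bytes per vertex when a float is 4 bytes.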

Set The Output Targets

The last step is to set the SO buffers. Data can be streamed out into one or more buffers in memory for use later. This example shows how to create a single buffer that can be used for vertex data as well as for the SO stage to stream data into:

最后一步是设置SO缓存。数据能够输出到一个或多个缓存,被将来的程序使用。这个例子显示了如何创建一个能够用作让SO阶段输出流数据并保存顶点数据的缓存:

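        ID3D10Buffer *m_pBuffer = NULL;
        int m_nBufferSize = 1000000;
 
        D3D10_BUFFER_DESC bufferDesc =
        {
                 m_nBufferSize,                                       // size in bytes
                 D3D10_USAGE_DEFAULT,                                 // usage
                 D3D10_BIND_VERTEX_BUFFER | D3D10_BIND_STREAM_OUTPUT, // bind flags
                 0,                                                   // CPU access flags
                 0                                                    // miscellaneous flags
        };
        D3D10Device->CreateBuffer( &bufferDesc, NULL, &m_pBuffer );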

Create a buffer by calling CreateBuffer. The buffer size is 1,000,000 bytes (about one megabyte), with default usage. Default usage suits a resource that the GPU both reads and writes, which is how a stream output target is used. The bind flags identify the pipeline stages that the resource can be bound to. Any resource used by the SO stage must be created with the D3D10_BIND_STREAM_OUTPUT bind flag.

调用CreateBuffer创建缓存。默认用途(D3D10_USAGE_DEFAULT)的缓存大小以MB为单位。这种用途(D3D10_USAGE_DEFAULT)适用于希望很少被CPU更新的缓存资源。绑定标记定义了缓存绑定的流水线阶段。所以被SO阶段数用的资源必须使用D3D10_BIND_STREAM_OUTPUT绑定标记创建。

Once the buffer is successfully created, set it to the current device by calling SOSetTargets:

一旦缓存被成功创建,调用SOSetTargets把它设置到当前设备上。

    UINT offset[1] = { 0 };
 
    D3D10Device->SOSetTargets( 1, &m_pBuffer, offset );

This call takes the number of buffers, a pointer to the buffers, and an array of offsets (one offset into each of the buffers to the start of the buffer data).

函数调用参数有缓存的输入,缓存指针和一个偏移量的数组(一个偏移量表示对应缓存相对缓存头数据偏移多少)。

Rasterizer Stage

The rasterizer stage transforms primitives to their pixel locations. To do so, the stage clips vertices to the view frustum, sets up the primitives for mapping to the 2D viewport, and determines how to invoke pixel shaders, if any are present. Some of these features are optional (like pixel shaders); however, the rasterizer always performs clipping, performs a perspective divide to transform the points into homogeneous space, and maps the vertices to the viewport.

光栅化阶段把图元变换到它们的象素位置。为了做到这个,这个阶段剔除视锥范围外的顶点,把图元匹配到2维视口上,并且在PS存在的情况下确定如何调用PS。这些功能里有些是可选的(比如PS),当然,光栅器一定会执行象素剔除,通过透视变换把点变换到透视空间,把顶点匹配到视口上等操作。

Vertices (x,y,z,w) coming into the rasterizer stage are assumed to be in homogeneous clip space. In this coordinate space the X axis points right, Y points up and Z points away from the camera.

到光栅化阶段的顶点(x,y,z,w)被认为是在一致剔除空间(即透视空间)中。在这个坐标系中,x轴指向右边,y轴指向上边,z轴指向相机前方。

Rasterizer API

The rasterizer stage is controlled with the following features:

光栅化阶段的API通过以下方式控制

Set the Viewport

A viewport maps vertex positions (in clip space) into rendertarget positions. This step scales the 3D positions into 2D space. A rendertarget is oriented with the Y axis pointing down; this requires that the Y coordinates get flipped during the viewport scale. In addition, the x and y extents (range of the x and y values) are scaled to fit the viewport size according to the following formulas:

视口把顶点位置(在象素剔除空间(透视空间))匹配到RT位置上。这一步把3维位置缩放到2维空间上。RT坐标中y轴向下,这需要Y坐标在视口变换中翻转。同时,xy扩展(xy值的范围)要通过下列方程缩放来匹配视口大小:

X = (X + 1) * Viewport.Width * 0.5 + Viewport.TopLeftX
Y = (1 - Y) * Viewport.Height * 0.5 + Viewport.TopLeftY
Z = Viewport.MinZ + Z * (Viewport.MaxZ - Viewport.MinZ) 

Tutorial 1 creates a viewport that is 640 x 480 like this:

练习1创建了一个640×480的视口如下:

    D3D10_VIEWPORT vp[1];
    vp[0].Width = 640.0f;
    vp[0].Height = 480.0f;
    vp[0].MinZ = 0;
    vp[0].MaxZ = 1;
    vp[0].TopLeftX = 0;
    vp[0].TopLeftY = 0;
    g_pd3dDevice->RSSetViewports( 1, vp );

The viewport description specifies the size of the viewport, the range to map depth to (using MinZ and MaxZ), and the placement of the top left of the viewport. MinZ must be less than or equal to MaxZ; the range for both MinZ and MaxZ is between 0.0 and 1.0, inclusive. It is common for the viewport to map to a rendertarget but it is not necessary; additionally, the viewport does not have to have the same size or position as the rendertarget.

视口对象描述了视口的大小,深度范围(使用MinZMaxZ)和视口的左上点。MinZ必须小于等于MaxZMinZMaxZ都必须在0.01.0之间。视口一般要和RT匹配,但是不是必须的。同时视口也不必要和RT有相同的大小和位置。

Only one viewport can be set active at a time. The pipeline uses a default viewport (and scissor rectangle - see next section) during rasterization. The default is always the first viewport (or scissor rectangle) in the array. To perform a per-primitive selection of the viewport in the geometry shader, specify the ViewportArrayIndex semantic on the appropriate GS output component in the GS output signature declaration.

一次绘制时只有一个视口可以被激活。流水线在光栅化过程中使用默认视口(和裁减方框-见下一节)。默认视口通常是视口数组中的第一个视口(或裁减方框)。想在GS中执行逐图元选择视口的操作,就要通过Gs输出符号申明在GS输出项中定义合适的ViewportArrayIndex语义。

Set the Scissor Rectangle

A scissor rectangle gives you another opportunity to reduce the number of pixels that will be sent to the output merger (OM) stage. Pixels inside the scissor rectangle (a 2D rectangle on the surface of the viewport) are passed on to the OM stage, while pixels outside of the scissor rectangle are discarded. Only one scissor rectangle can be set active at a time in the geometry shader. The extents of the scissor rectangle are specified as integers, and its size is unlimited.

裁减方框给你另外一个机会去减少送到输出合并阶段的象素数目。在裁减方框里的象素(在视口表面上的一个2D的长方形)被保存下来送到OM阶段,而裁减方框外的则被忽略。裁减方框的大小(整数)是无限的。

To enable the scissor rectangle, set the ScissorEnable member (in D3D10_RASTERIZER_DESC). The default scissor rectangle is an empty rectangle; that is, all rect values are 0. In other words, if you enable scissoring without setting up a scissor rectangle, you will not send any pixels to the OM stage. The most common setup is to initialize the scissor rectangle to the size of the viewport.

要使裁减方框生效,使用ScissorEnable成员(在D3D10_RASTERIZER_DESC中)。默认的裁减方框是一个空的方框,这就是说方框的所有参数都是0。换句话说,如果你没有设定裁减方框同时打开了裁减开关,你就不能送任何象素到OM阶段中。最普通的创建方式是把裁减方框初始化成视口大小。
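
The effect of the empty default rectangle can be illustrated with a short C++ sketch of a scissor test. The Rect struct and the PassesScissor function are hypothetical names used only here, and the edge convention (left/top inclusive, right/bottom exclusive) is an assumption rather than something the text states.

```cpp
#include <cassert>

// Hypothetical stand-in for a scissor rectangle, in integer pixel units.
struct Rect { long left, top, right, bottom; };

// Returns true when pixel (x, y) survives the scissor test.
// Assumption: left/top edges are inclusive, right/bottom edges are exclusive.
bool PassesScissor(const Rect& r, long x, long y, bool scissorEnable) {
    assert(r.left <= r.right && r.top <= r.bottom);  // well-formed rectangle
    if (!scissorEnable) return true;  // test disabled: nothing is discarded
    return x >= r.left && x < r.right && y >= r.top && y < r.bottom;
}
```

Under these assumptions an all-zero rectangle rejects every pixel once scissoring is enabled, which is why the rectangle is usually initialized to the viewport size.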

To set an array of scissor rectangles to the device, call RSSetScissorRects:

通过调用RSSetScissorRects来设置裁减方框到设备中。

    D3D10_RECT rects[1];
    rects[0].left = 0;      // the single rectangle lives at index 0
    rects[0].right = 640;
    rects[0].top = 0;
    rects[0].bottom = 480;

    D3DDevice->RSSetScissorRects( 1, rects );

This method takes two parameters: the number of rectangles in the array, and the array of rectangles.

这个函数有两个参数:一组方框和方框的数目。

The pipeline uses a default scissor rectangle index during rasterization (the default is a 0 size rectangle with clipping disabled). To override this, specify the SV_ViewportArrayIndex semantic to a GS output component in the GS output signature declaration. This will cause the GS stage to mark this GS output component as a system-generated component with this semantic. The rasterizer stage recognizes this semantic and will use the parameter it is attached to as the scissor rectangle index to access the array of scissor rectangles. Don't forget to tell the rasterizer stage to use the scissor rectangle you define by enabling the ScissorEnable value in the D3D10_RASTERIZER_DESC before creating the rasterizer object.

流水线在光栅化过程中使用默认的裁减方框索引(默认的是一个大小为0的方框,不做剔除)。可以在GS输出申明中对GS输出项设置SV_ViewportArrayIndex语义来改变默认设置。这样做会导致GS阶段标记使用这个语义GS输出项作为系统生成项。光栅化阶段识别到这个语义,并使用附加在其上的裁减方框数组索引来读取裁减数组对象素做裁减。在创建光栅化对象时,不要忘记设置D3D10_RASTERIZER_DESC中的ScissorEnable值使光栅化阶段可以应用裁减方框。

Set Rasterizer State

In Direct3D 10, rasterizer state is encapsulated in a rasterizer state object. You may create up to 4096 rasterizer state objects which can then be set to the device by passing a handle to the state object.

D3D10中,光栅化对象封装了光栅化状态。你能够创建最多4096个光栅化对象,并能通过传送一个句柄到设备上来设定光栅化对象。

Here is an example of creating a rasterizer state object:

这里描述了如何创建光栅化对象:

    ID3D10RasterizerState * g_pRasterState;

    D3D10_RASTERIZER_DESC rasterizerState;
    rasterizerState.FillMode = D3D10_FILL_SOLID;
    rasterizerState.CullMode = D3D10_CULL_BACK;       // remove back faces
    rasterizerState.FrontCounterClockwise = true;     // counterclockwise triangles face front
    rasterizerState.DepthBias = 0;                    // no depth bias
    rasterizerState.DepthBiasClamp = 0;
    rasterizerState.SlopeScaledDepthBias = 0;
    rasterizerState.DepthClipEnable = true;
    rasterizerState.ScissorEnable = true;
    rasterizerState.MultisampleEnable = false;
    rasterizerState.AntialiasedLineEnable = false;
    pd3dDevice->CreateRasterizerState( &rasterizerState, &g_pRasterState );

This set of state accomplishes perhaps the most basic rasterizer setup:

这个集合创建了最基本的光栅化设置

  • Solid fill mode 面填充模式
  • Cull out, that is, remove back faces; assume counterclockwise winding order for primitives 剔除模式,剔除背面;假设图元顶点按逆时针顺序旋转排列。
  • Turn off depth bias but enable depth clipping and enable the scissor rectangle
  • Turn off multisampling and line antialiasing 关闭多采样和线形反走样

In addition, basic rasterizer operations always include clipping (to the view frustum), the perspective divide, and the viewport scale. After successfully creating the rasterizer state object, set it to the device like this:

通常基本的光栅化操作包括:裁减(对于视锥而言),透视变换和视口变换。成功创建光栅化对象后,把它设置到设备中:

    pd3dDevice->RSSetState(g_pRasterState);
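
How the cull mode and the winding-order setting interact can be sketched in stand-alone C++. Everything below is illustrative: the Vec2 and CullMode types and the two functions are hypothetical, the triangle is assumed to be in rendertarget space (Y pointing down), and dropping zero-area triangles is a simplification of this sketch.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Hypothetical mirror of the three cull settings.
enum class CullMode { None, Front, Back };

// Twice the signed area of a triangle in rendertarget space (Y pointing down).
// With Y down, a positive value means the vertices run clockwise on screen.
float SignedArea2(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// Returns true when the triangle would be discarded by face culling.
bool IsCulled(Vec2 a, Vec2 b, Vec2 c, CullMode mode, bool frontCounterClockwise) {
    float area2 = SignedArea2(a, b, c);
    if (area2 == 0.0f) return true;            // degenerate: this sketch drops it
    if (mode == CullMode::None) return false;
    bool clockwise = area2 > 0.0f;
    bool frontFacing = (clockwise != frontCounterClockwise);
    return mode == CullMode::Front ? frontFacing : !frontFacing;
}
```

With back-face culling and counterclockwise front faces, a triangle whose vertices run clockwise on the rendertarget is treated as back-facing and removed.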

Output Merger Stage

The output merger stage (OM) generates the final rendered pixel color using a combination of pipeline state, the pixel data generated by the pixel shaders, the contents of the rendertargets, and the contents of the depth-stencil buffers. The OM stage is the final step for determining which pixels are visible (using the depth-stencil buffers) and blending the final pixel colors.

输出合并阶段(OM)使用流水线状态,PS生成的象素数据,Rt里面的内容和D/S缓存里的内容合并来生成最终的象素颜色。OM阶段是确定象素是否可见(使用D/S缓存和合并最终象素颜色的最终阶段。

OM Stage API

These are the steps required to initialize and execute the output merger stage:

这是初始化和执行OM阶段的步骤:

Set Depth Stencil State

typedef struct D3D10_DEPTH_STENCIL_DESC {
    BOOL DepthEnable;
    D3D10_DEPTH_WRITE_MASK DepthWriteMask;
    D3D10_COMPARISON_FUNC DepthFunc;
    BOOL StencilEnable;
    UINT8 StencilReadMask;
    UINT8 StencilWriteMask;
    D3D10_DEPTH_STENCILOP_DESC FrontFace;
    D3D10_DEPTH_STENCILOP_DESC BackFace;
} D3D10_DEPTH_STENCIL_DESC, *LPD3D10_DEPTH_STENCIL_DESC;
 
 
HRESULT CreateDepthStencilState(
 const D3D10_DEPTH_STENCIL_DESC * pDepthStencilDesc,
 ID3D10DepthStencilState ** ppDepthStencilState
);
 
 
void OMSetDepthStencilState(
 ID3D10DepthStencilState * pDepthStencilState,
 UINT StencilRef
);

Set Blend State

There is only one set of blend controls, so that the same blend is applied to all rendertargets with blending enabled.

混合控制只有一个集合,因此对于所有打开blendRT都使用相同的混合操作。

Once created, a blend-state object or depth/stencil state object cannot be edited.

一旦创建,一个混合状态对象或者D/S对象就不能被编辑。

 
typedef struct D3D10_BLEND_DESC {
    BOOL AlphaToCoverageEnable;
    BOOL BlendEnable[8];
    D3D10_BLEND SrcBlend;
    D3D10_BLEND DestBlend;
    D3D10_BLEND_OP BlendOp;
    D3D10_BLEND SrcBlendAlpha;
    D3D10_BLEND DestBlendAlpha;
    D3D10_BLEND_OP BlendOpAlpha;
    UINT8 RenderTargetWriteMask[8];
} D3D10_BLEND_DESC, *LPD3D10_BLEND_DESC;
 
 
HRESULT CreateBlendState(
 const D3D10_BLEND_DESC * pBlendStateDesc,
 ID3D10BlendState ** ppBlendState
);
 
 
void OMSetBlendState(
 ID3D10BlendState * pBlendState,
 const FLOAT BlendFactor[4],
 UINT SampleMask
);
 
BlendFactor
[in] The blend factor. It is used as the blend color when the source or destination blend is specified as D3D10_BLEND_BLEND_FACTOR or D3D10_BLEND_INVBLEND_FACTOR.
混合参数:这会被当作混合颜色,如果源颜色和目标颜色被定义为D3D10_BLEND_BLEND_FACTORD3D10_BLEND_INVBLEND_FACTOR的话
SampleMask
[in] Can be used to enable or disable writes to multisample subsamples. It also applies to non-multisample rendering. By default, this should be set to 0xffffffff.
采样Mask:能用作使写入对子象素多采样是否生效。它也对非多彩样的绘制生效。默认的被设为0xffffffff
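
The two parameters can be illustrated with a stand-alone C++ sketch. The function names are hypothetical, the mapping "bit i of the mask controls subsample i" is an assumption, and the blend shown is only one configuration (source scaled by the blend factor, destination by its inverse, added together).

```cpp
#include <cassert>
#include <cstdint>

// Returns true when subsample `index` may be written under `sampleMask`.
// Assumption: bit i of the mask controls subsample i.
bool SampleWriteEnabled(std::uint32_t sampleMask, unsigned index) {
    assert(index < 32);
    return ((sampleMask >> index) & 1u) != 0;
}

// Blends one color channel using the application-supplied blend factor:
// result = src * factor + dest * (1 - factor).
float BlendWithFactor(float src, float dest, float factor) {
    return src * factor + dest * (1.0f - factor);
}
```

A mask of 0xffffffff leaves every subsample writable, which matches the default described above.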
 

Set Render Targets

    // Set the render target and depth stencil
    ID3D10RenderTargetView* pRTV = DXUTGetRenderTargetView();
    pd3dDevice->OMSetRenderTargets( 1, &pRTV, g_pDSV );

Rendertargets must all be the same type of resource. If multisample antialiasing is used, all bound rendertargets and depth buffers must have the same sample count. A pixel shader can simultaneously render to more than one rendertarget.

一次绘制的所有RT必须是相同类形的资源,如果使用多采样反走样,所有绑定的RT和深度缓存必须有相同的采样数量。PS能够同时绘制到多个RT上。

There can only be one depth-stencil buffer active. All bound views of depth-stencil buffers must be the same dimension and array size (as opposed to the resources being the same size).

一次绘制只有一个D/S缓存被激活。所有绑定的D/S缓存的读取方式必须是同样的维度和数组大小(和资源有相同的大小相反(指的是读取方式不同))。

Write masks control what data gets written to a rendertarget.

mask控制那些数据写入RT中。
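
A write mask can be pictured as one bit per color channel. The C++ sketch below is only an illustration: the ColorWrite bit values, the Color struct, and ApplyWriteMask are hypothetical, since the text does not list the actual mask values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-channel bits; the preview text does not list the real values.
enum ColorWrite : std::uint8_t { Red = 1, Green = 2, Blue = 4, Alpha = 8 };

struct Color { float r, g, b, a; };

// Writes only the channels of `src` that are enabled in `mask` into `dest`.
Color ApplyWriteMask(const Color& dest, const Color& src, std::uint8_t mask) {
    assert((mask & ~0xF) == 0);  // only the four channel bits are meaningful here
    Color out = dest;
    if (mask & Red)   out.r = src.r;
    if (mask & Green) out.g = src.g;
    if (mask & Blue)  out.b = src.b;
    if (mask & Alpha) out.a = src.a;
    return out;
}
```

Channels whose bit is clear keep whatever value the rendertarget already holds.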

When a buffer is used as a rendertarget, the pipeline does not support a depth/stencil buffer or any rendertarget arrays.

当缓存被用作RT,流水线就不支持D/SRT数组。

Resources

In Direct3D 10, resource views provide a general model for access to resources (textures, buffers, etc.) in memory. Views enable the notion of a typeless resource. A typeless resource is a resource whose texel/pixel/element type (uint, float, unorm, etc.) is interpreted only when it is bound as an input or output to the pipeline. Views expose many new capabilities for using resources, such as the ability to read back depth/stencil surfaces in the shader, generate dynamic cubemaps in a single pass, and render simultaneously to multiple slices of a volume.

D3D10中,资源视图提供了读取内存资源(纹理,缓存等)的一种通用模式。视图的作用在于读取无类型资源。所谓无类型资源是指资源类型能够在它绑定到流水线输入输出时被不同的阶段做不同的解释。视图提供了使用资源的许多新方式,比如在Shader中读会D/S缓存,单Pass生成动态的CubeMap,同时绘制一个体的多个表面等。

Direct3D 10 introduces a resource type that can be dynamically indexed within a shader: a texture array. This is an array of 1D or 2D textures. When a cubemap or volume texture is viewed as a rendertarget, the resource is interpreted as an array of 2D textures giving the shader access to each 2D texture in the resource.

D3D10剔除了一种资源类型能够在Shader里被动态索引:纹理数组。这是一个1维或2维纹理的数组。当CubeMap或者体纹理被当作RT时,这些资源被当作一个2维纹理数组,使Shader可以访问到资源中的每个2维纹理。

In Direct3D 10, to prevent any resource from interfering with the rate at which a pipeline stage processes data, any resource bound to a pipeline-stage output cannot:

D3D10,为了防止任何资源干扰流水线处理数据,绑定到流水线阶段的输出数据不能:

  • Be mapped for direct CPU access 被CPU直接映射访问
  • Be a destination for CopyResource or CopyRegions 不能作为CopyResource或CopyRegions的目标
  • Be used with UpdateSubresourceUP 不能被UpdateSubresourceUP调用

To view the contents of an output resource (with the CPU), copy it to a staging resource. 如果要(用CPU)查看输出资源的内容,请先把它拷贝到一个staging资源上。

For the best performance, the application will need to make a number of choices about how to create and use its resources. These choices are defined by resource type, whether you want to generate typeless or typed resources and several types of flags that determine what pipeline stages get to access the resource and how they will access it. These topics are covered in the following sections:

为了得到最好的性能,程序必须对如何创建和使用资源作出许多选择。这些选择在资源类型定义时确定,包括是否想生成无类型或者有类型的资源,几种确定哪个流水线阶段能够读取它和怎么读取它的类型标记。这个话题包括如下几节:

Resource Types

Several different resource types (arrangements of memory storage) are available for input or output by various pipeline stages. The available resource types are:

对于不同的流水线阶段输入输出,DX10提供了几种资源类型(对内存的管理)。它们是:

The resource type, in general, determines many characteristics, like where the resource may be bound to in the graphics pipeline, what the mip level behavior is, what the sampling behavior is, and other possible restrictions on the resource. Resources are comprised of one or more subresources.

资源类型通常确定了许多特性,比如资源在哪里被绑定到图形流水线上,Mip层次行为是什么,采样行为是什么,还有其他可能的对资源的限制。资源由一个或者多个子资源构成。

Buffer Resource

A buffer resource does not support mip levels or array slices; it contains a single subresource.

缓存资源不支持LOD Mipmap层次或者数组属性,它只有一个子资源。

When an unstructured buffer is bound to the graphics pipeline, its memory structure must be bound to the graphics pipeline along with it (providing element types and offsets for each element as well as an overall stride). A buffer resource can be bound at multiple places in the pipeline simultaneously if the stages are all reading the buffer. If a buffer is being written to, then the buffer may only be bound to one location in the pipeline during a Draw call.

当一个没有结构的缓存被绑定到图形流水线上时,它的内存结构必须同时绑定(对每个元素和所有条带都提供元素类型和偏移)。如果所有阶段都是读某个缓存的话,这个缓存资源能够同时被绑定到流水线的多个地方;如果缓存是要被写入的,那么它只能通过绘制调用被绑定到流水线的一个地方。

 

Figure 1.  Buffer Resource Architecture

A buffer has the following input binding behaviors:

缓存由如下的输入绑定属性:

  • D3D10_BIND_VERTEX_BUFFER - This Bind flag specifies that the Resource may be passed to IASetVertexBuffers.

这个绑定标记表示资源可以作为IASetVertexBuffers的参数

  • D3D10_BIND_INDEX_BUFFER - This Bind flag specifies that the Resource may be passed to IASetIndexBuffer. The allowed formats are R16_UINT or R32_UINT.

这个绑定标记表示资源可以送到IASetIndexBuffer,允许的格式是R16_UINTR32_UINT

  • D3D10_BIND_CONSTANT_BUFFER - This Bind flag specifies that the Resource may be passed to SetConstantBuffers. This flag restricts the Resource to only Buffer, prevents partial updates of the Resource, and restricts the Buffer Width (Byte Width) to be equal or less than 4096 * sizeof( R32G32B32A32 ) and be a multiple of sizeof( R32G32B32A32 ). Since such a Resource does not support partial updates, UpdateSubresource must update the full Buffer, and D3D10_MAP_WRITE_NO_OVERWRITE may not be utilized. Constant Input is read into a shader given an integer array index to fetch a single element; there is no filtering.

 这个绑定标记表示资源可以送到SetConstantBuffers。这个标记限制了资源只作为Buffer存在,阻止部分更新资源,限制缓存宽度(Byte宽)小于等于4096×sizeofR32G32B32A32),并且是sizeofR32G32B32A32)的倍数。因为这种资源不存在部分的更新操作,UpdateSubresource必须更新全部缓存,而且D3D10_MAP_WRITE_NO_OVERWRITE不能使用。常数输入被Shader通过整数数组索引读入单个元素,没有滤波。

  • D3D10_BIND_SHADER_RESOURCE - This Bind flag specifies that the Resource may be passed to one of the SetShaderResources calls. In addition, such a usage flag prevents D3D10_MAP_WRITE_NO_OVERWRITE from being utilized.

这个绑定标记表示资源可以送到SetShaderResources。同时这个标记不能使用D3D10_MAP_WRITE_NO_OVERWRITE
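
The size rule for constant buffers is plain arithmetic and can be checked before creating the resource. The C++ sketch below uses hypothetical names throughout; it assumes sizeof( R32G32B32A32 ) is 16 bytes (four 32-bit components) and, as a further assumption, rejects a width of zero.

```cpp
#include <cassert>

// Assumption: one R32G32B32A32 element is four 32-bit components = 16 bytes.
constexpr unsigned kElementSize = 16;
constexpr unsigned kMaxElements = 4096;

// Byte width needed for `elementCount` four-component constants.
unsigned ConstantBufferWidthFor(unsigned elementCount) {
    assert(elementCount >= 1 && elementCount <= kMaxElements);
    return elementCount * kElementSize;
}

// Checks the two restrictions from the text: at most 4096 elements,
// and a whole number of elements. Rejecting 0 is an extra assumption.
bool IsValidConstantBufferWidth(unsigned byteWidth) {
    return byteWidth > 0 &&
           byteWidth <= kMaxElements * kElementSize &&
           byteWidth % kElementSize == 0;
}
```

Under these assumptions the largest legal constant buffer is 65536 bytes.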

Texture Arrays

A texture array is a homogeneous array of 1D or 2D textures. The array is homogeneous in the sense that each texture has the same data format and dimensions (including mip levels). The entire array of textures is created atomically.

纹理数组是一个1维或2维纹理的的数组。数组中的每个纹理有着相同的数据格式和维度(包括Mip层次)。整个纹理数组创建过程是不可分割的。

A texture array can be dynamically indexed from shader code. This means that instead of incurring CPU overhead to switch textures at the device level in the API, you can switch textures in a shader. You can also bind a texture array to each of your multiple render targets with a view. A view allows the pipeline to interpret the texture array based on the attributes of the view. For instance, when you bind a view of a cube map or volume as a render target, it is interpreted as a texture2darray so you can access every array slice (or individual texture).

纹理数组能够被Shader程序动态索引。这意味着除了使用CPU在设备层次使用API切换纹理外,还可以在Shader中切换纹理。你能够使用视图把纹理数组绑定到每个RT上。视图允许流水线基于视图属性读取纹理。举个例子,当你把一个cubemap视图或者体视图作为RT,它会被当作2维纹理数组。因此逆能够读取每个数组元素(或者独立的纹理)。

Texture1D Array Resource

A Texture1D resource is a homogeneous array of 1D textures; that is, each texture has the same data format and dimensions (including mip levels). A texture resource may be typed or typeless. As illustrated by Figure 2, a texture resource is composed of sub-groups of mip slices, array slices, and subresources.

Texture1D资源是1维纹理的数组,每个纹理有相同的数据格式和维度(包括mip层次)。纹理资源可以是有类型或者无类型的。和图1中描述的那样,纹理资源由mip条组成的子组,数组条以及子资源构成。

Figure 2.  Texture1D Resource Architecture with Mip Levels

The figure illustrates a Texture1D resource with a single texture, with 3 mip levels. Each mip level is an array of texels. The array is addressable by the u vector (or texture coordinate). The size of the 1D array is called the texture width; a width of 5 would mean that the array contains 5 texels.

这幅图描述了一个一维纹理资源,有一张瓦那里,3Mip层次。每个Mip层次是一组纹元。数组被u向量(或者纹理坐标)取址。1D数组的宽度叫做纹理宽度。纹理宽度为5表示这个数组有5个纹元。

A texture and a particular mip level is called a subresource. Use a view to access any texture subresource. The top-most mip level is the largest level; each successive level is a power of 2 (on each side) smaller, until you get down to an array with one element in it. In this example, since the top-level texture width is 5 elements, there are two mip levels before the texture width is reduced to 1.

纹理和特定的mip层次叫做子资源。使用视图来读取各种纹理子资源。最高层的mip层次是最大的层次。每个后继层次每边是上层的一半大小,知道只剩一个元素为止。在这个例子中,最高层的宽度是5个元素,所以在纹理宽度减为1之前有两个mip层次。
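
The mip chain in this example can be reproduced with a few lines of C++. MipChainWidths is a hypothetical helper written for this illustration; it assumes each level is half the width of the previous one, rounded down, until the width reaches 1.

```cpp
#include <cassert>
#include <vector>

// Returns the width of every mip level, starting with the top-most level.
// Each level is half the previous width (rounded down) until the width is 1.
std::vector<unsigned> MipChainWidths(unsigned topWidth) {
    assert(topWidth > 0);
    std::vector<unsigned> widths;
    for (unsigned w = topWidth; ; w /= 2) {
        widths.push_back(w);
        if (w == 1) break;
    }
    return widths;
}
```

For a top-level width of 5 this gives the widths 5, 2, 1, which is the three-level texture shown in the figure.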

But a Texture1D resource is not just a single texture (with or without mip levels), it is an array of textures. This expands the basic texture layout to look like this:

但是Texture1D不是(有或者没有mip层次的)单个纹理。它是一个数组。这把基本的纹理层次扩展为这样:

 

Figure 3.  Texture1D Resource Array

Each texture occupies a different element in the array, and is called an array slice.

每个纹理在数组中占用不同的位置,叫做数组slice
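
Because every array slice carries the same number of mip levels, each (mip level, array slice) pair can be numbered with a single index. The C++ sketch below shows one such numbering; SubresourceIndex is a hypothetical name, and the assumption is that subresources are ordered slice by slice with the mip levels of one slice stored consecutively (the released Direct3D 10 headers provide a D3D10CalcSubresource helper with this arithmetic, but whether the preview matches is not confirmed here).

```cpp
#include <cassert>

// Flattens (mip level, array slice) into one subresource index.
// Assumption: all mip levels of slice 0 come first, then those of slice 1, etc.
unsigned SubresourceIndex(unsigned mipSlice, unsigned arraySlice, unsigned mipLevels) {
    assert(mipSlice < mipLevels);
    return mipSlice + arraySlice * mipLevels;
}
```

For an array of textures with 3 mip levels each, the top level of the second texture (array slice 1) would be subresource 3.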

A Texture1D resource object can be bound to the following pipeline objects:

Texture1D资源对象能够绑定到如下的流水线对象中:

Pipeline Stage

Description

Shader Resource Input

Shader输入资源

Use the load or sample intrinsic functions to read a shader resource view from within a shader. Address a Texture1D resource with 2 coordinates; the first is the texture address, the second is the array slice.

通过Shader使用load或者sample内置函数读取Shader资源视图,读取Texture1D资源有两个坐标,第一个是纹理地址,第二个是数组位置。

Rendertarget Output

RT输出

A Texture1D mip slice bound as a rendertarget can use an accompanying Texture1D depth/stencil resource of the same dimensions (including width and number of array slices). To choose the array slice of the resource in the geometry shader stage, declare a shader output variable with the renderTargetArrayIndex semantic (by default index 0 is chosen if no value is specified or if the value is out of range).

Texture1D Mip Slice可以和有相同维度的(包括宽度和数组大小)Texture1D D/S绑定到RT上。可以在GS中申明一个有着renderTargetArrayIndex语义的输出变量(如果没有定义任何值或者值超出范围的话,默认索引是0),然后通过选择数组Slice来选择资源。

Depth/Stencil Output

D/S输出

A Texture1D depth/stencil resource must have a resource format which contains a D component (such as D32_FLOAT) or a typeless format which can be converted to a format with a D component (such as R32_typeless).

Texture1D D/S资源资源格式必须有D通道(比如D32_FLOAT)或者能够转换到有D通道的无类型格式(比如R32_typeless

Texture2D Array Resource

A Texture2D resource is a homogeneous array of 2D textures; that is, each texture has the same data format and dimensions (including mip levels). It has a similar layout as the Texture1D resource except that the textures now contain 2D data. A Texture2D array resource therefore looks like this:

Texture2D资源是一个二维纹理的数组,数组中每个元素有相同的数据类型和维度(包括Mip层次)。Texture2D除了里面包含的是二维数据外其他特性都和Texture1D资源的层次相似。Texture2D数组资源图如下:

 

Figure 4.  Texture2D Array Resource Architecture

A Texture2D resource consists of sub-groups of mip slices, array slices, and subresources in order to refer to discrete components of the resource to accomplish certain operations. The decomposition for graphics pipeline usage is achieved through the usage of views for each stage of the pipeline.

Texture2D资源由Mip SliceArray Slice和子资源的子组构成,这样是为了能够得到纹理的每个元素执行相应操作。流水线不同阶段对它的使用通过每个流水线阶段视图的用法来得到。

Like other resources, a Texture2D must be qualified with a set of flags at creation indicating where in the graphics pipeline the resource may be bound. Naturally, the resource may be bound at more than one location in the pipeline, but the resource must've been created with the restrictions that each pipeline usage flag indicates. Sometimes pipeline usage flags have restrictions which conflict with each other, so such pipeline usage flags are mutually exclusive.

和其他资源一样,Texture2D必须在创建的时候使用一系列flag来描述它会被绑定到流水线的哪个阶段上。

Pipeline Stage

Description

Shader Resource Input

Shader输入资源

Use the load or sample intrinsic functions to read a shader resource view from within a shader.

Shader里使用loadsample内置函数来读Shader资源视图

Rendertarget Output

RT输出

A Texture2D mip slice bound as a rendertarget can use an accompanying Texture2D depth/stencil resource of the same dimensions (including width and number of array slices). To choose the array slice of the resource in the geometry shader stage, declare an output variable with the renderTargetArrayIndex semantic (by default index 0 is chosen if no value is specified or if the value is out of range).

Texture2D Mip Slice可以和有相同维度的(包括宽度和数组大小)Texture2D D/S绑定到RT上。可以在GS中申明一个有着renderTargetArrayIndex语义的输出变量(如果没有定义任何值或者值超出范围的话,默认索引是0),然后通过选择数组Slice来选择资源。

Depth/Stencil Output

D/S输出

A Texture2D depth/stencil resource must have a resource format which contains a D component (such as D32_FLOAT) or a typeless format which can be converted to a format with a D component (such as R32_typeless).

Texture2D D/S资源必须拥有D通道(比如D32_FLOAT)或者一个无类型的格式但能转换成有D通道的格式(比如R32_typeless

Texture3D Resource

A Texture3D resource is a 3D grid data layout that supports mipmaps; it is also known as a volume texture. The entire resource is created atomically. The memory for the entire resource need not be contiguous. As illustrated by the diagram and binding configurations, a Texture3D may be decomposed into sub-groups of mip slices, array slices, and subresources in order to refer to discrete components of the resource to accomplish certain operations. The decomposition for graphics pipeline usage is achieved through the usage of views for each stage of the pipeline.

Texture3D资源是一个3D网格数据层次,支持Mipmaps;它同时也是一个体纹理(有错别字)。整个资源需要同时创建,但它的内存区域不需要相邻。和图中表示以及绑定描述的一样,Texture3D能够由mip slicesarray slices和子资源的子组构成,这样能够得到资源的每个元素的值,完成对应操作。不同流水线阶段对资源的读取方式通过不同流水线阶段的资源视图用法来获得。

Pipeline Stage

Description

Shader Resource Input

Shader输入资源

Texture3D resources are addressed from the shader with a 3D coordinate.

可以在Shader里用三维纹理坐标获得

Rendertarget Output

RT输出

A Texture3D mip slice bound as a rendertarget output behaves identically to a Texture2D with n array slices, where n is the depth (3rd dimension) of the Texture3D. The particular z slice in the Texture3D to render to is chosen from the geometry shader stage by declaring a scalar component of output data as the System Interpreted Value renderTargetArrayIndex.

Texture3D可以绑定为RT的输出,对应有n个元素的Texture2D数组,其中nTexture3D的深度维度。从GS中通过使用renderTargetArrayIndex系统生成值申明输出数据来选定Texture3D特定的z slice

TextureCube Resource

A TextureCube resource has 6 faces, each of which is like a square Texture2D, including mipmaps. As illustrated by the diagram and binding configurations, a TextureCube may be decomposed into sub-groups of mip slices, array slices, and subresources in order to refer to discrete components of the resource to accomplish certain operations. The decomposition for graphics pipeline usage is achieved through the usage of views for each stage of the pipeline.

TextureCube资源有6个面,每个面是一个正方形的Texture2D,包括mipmaps。和图形以及绑定描述的一样,TextureCube由……子组构成,不同流水线阶段通过不同视图来读取。

Pipeline Stage

Description

Shader Input

Shader输入

TextureCube resources are addressed from the shader with a 3D vector pointing out from the center of the TextureCube.

TextureCube资源使用从cube中心到对应点3D向量取值。

Rendertarget Output

RT输出

When a TextureCube mip slice is bound as a rendertarget output, the TextureCube behaves identically to a Texture2D with 6 array slices.

TextureCube mip slice作为RT的输出,TextureCube作为6Texture2D数组处理

Depth/Stencil Output

D/S输出

A depth/stencil TextureCube resource must be paired with an equivalent sized rendertarget output TextureCube resource.

D/S TextureCube资源必须和RT Texture资源有相同的大小一起输出
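
Addressing a cube with a direction vector, and treating the same cube as six 2D slices, can be connected by a small C++ sketch that picks the face a vector points at. The CubeFace enum, the face order (+X, -X, +Y, -Y, +Z, -Z as slices 0 to 5), and the tie-breaking rule are assumptions of this illustration, not statements from the text.

```cpp
#include <cassert>
#include <cmath>

// Assumed face order when the cube is viewed as six array slices.
enum CubeFace { PositiveX = 0, NegativeX, PositiveY, NegativeY, PositiveZ, NegativeZ };

// Picks the face hit by a vector pointing out from the cube center:
// the axis with the largest magnitude wins (ties go to X, then Y).
CubeFace SelectCubeFace(float x, float y, float z) {
    float ax = std::fabs(x), ay = std::fabs(y), az = std::fabs(z);
    assert(ax + ay + az > 0.0f);  // the zero vector points at no face
    if (ax >= ay && ax >= az) return x >= 0.0f ? PositiveX : NegativeX;
    if (ay >= az) return y >= 0.0f ? PositiveY : NegativeY;
    return z >= 0.0f ? PositiveZ : NegativeZ;
}
```

The returned value doubles as the array slice index under the assumed face order.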

Resource Creation Flags

Resource creation flags specify how a resource is to be used, where the resource is allowed to bind, and which upload and download methods are available. The flags are broken up into these categories:

资源创建标记定义了资源怎么被使用,资源允许被绑定到哪里以及支持哪种输入和输出函数。这些标记分以下几个目录。

Resource Creation Usage Flags

A resource usage flag specifies how often your resource will be changing. This update frequency depends on how often a resource is expected to change relative to each rendered frame. For instance:

使用标记表示资源被改变的频率。更新频率依赖于用户希望资源相对每帧改变有多频繁,比如说:

  • Never - the contents never change. Once the resource is created, it cannot be changed.

从不改变。资源一旦创建就不改变。

  • Infrequently - less than once per frame

不经常改变:一帧改变小于一次

  • Frequently - once or more per frame. Frequently describes a resource whose contents are expected to change so frequently that the upload of resource data (by the CPU) is expected to be a bottleneck.

经常:一帧改变多次。经常改变说明这个资源内容频繁变动,CPU传送数据将会是瓶颈。

  • Staging - this is a special case for a resource that can copy its contents to another resource (or vice versa).

阶段性改变:这是一个特殊的例子,它能够把自己的内容拷贝到其他资源上。

Usage

Update Frequency

Limitations

D3D10_USAGE_DEFAULT

Infrequently (less than once per frame)

不频繁,小于一帧一次

This is the most likely usage setting.

  • Mapping: not allowed, the resource can only be changed with UpdateSubresource.
  • Binding flag: Any, none

这是使用最多的标记:

锁定:不允许,只能被UpdateSubresource改变

绑定标记:可以使用任何标记,或者不使用

D3D10_USAGE_DYNAMIC

Frequently

 

A dynamic resource is limited to one that contains a single subresource.

动态资源,被限定只能拥有一个子资源

  • Mapping: Use Map (the CPU writes directly to the resource) with the D3D10_CPU_ACCESS_WRITE flag. Optionally use D3D10_MAP_WRITE_DISCARD or D3D10_MAP_WRITE_NO_OVERWRITE.
  • 锁定:使用D3D10_CPU_ACCESS_WRITE标记可以锁定(用CPU直接写数据到资源)。可选择的使用D3D10_MAP_WRITE_DISCARD 或者D3D10_MAP_WRITE_NO_OVERWRITE.
  • Binding flag: at least one GPU input flag, GPU output flags are not allowed. You must use

绑定标记:至少一个GPU输入标记,不允许有GPU输出标记,你必须使用:

  • D3D10_MAP_WRITE_DISCARD and D3D10_MAP_WRITE_NO_OVERWRITE if the resource is a vertex or index buffer (uses D3D10_BIND_VERTEX_BUFFER or D3D10_BIND_INDEX_BUFFER).

当资源是顶点或者索引缓存时用:D3D10_MAP_WRITE_DISCARD D3D10_MAP_WRITE_NO_OVERWRITE

(即 使用D3D10_BIND_VERTEX_BUFFER or D3D10_BIND_INDEX_BUFFER绑定时)

D3D10_USAGE_IMMUTABLE

Never

  • Mapping: not allowed

锁定:不允许

  • Binding flag: at least one GPU input flag, GPU output flags are not allowed

绑定标记:至少一个GPU输入标记,不允许有GPU输出标记

D3D10_USAGE_STAGING

n/a

This resource cannot be bound to the pipeline directly; you may copy the contents of a resource to (or from) another resource that can be bound to the pipeline. Use this to download data from the GPU.

这个资源不能够被直接绑定到流水线上。你可以把它的内容拷入(或拷出)到其他能绑定到流水线上的资源。使用这个来获得GPU的数据。

  • Mapping: Use CopyResource or CopySubresource with either/both D3D10_CPU_ACCESS_WRITE and D3D10_CPU_ACCESS_READ to copy the contents of this resource to any resource with one of the other usage flags (which allows them to be bound to the pipeline).

锁定:使用CopyResource或者CopySubResourceD3D10_CPU_ACCESS_WRITE D3D10_CPU_ACCESS_READ标记(也可同时使用)把资源内容从其他资源中拷入或者拷出(这允许它们被绑定到流水线上)

  • Binding flag: None are valid

绑定标记:不允许

Resource Creation Binding Flags

Resources are bound to a pipeline stage through an API call. Resources may be bound at more than one location in the pipeline (even simultaneously within certain restrictions) as long as each resource satisfies any restrictions that pipeline stage has for resource properties (memory structure, usage flags, binding flags, cpu access flags).

资源通过API调用绑定到流水线上。资源可以绑定到流水线的一个或多个定方(同时有其他的限制),只要资源满速任何流水线阶段对资源属性的限制(内存结构,使用标记,绑定标记,CPU读取标记)

It is possible to bind a resource as an input and an output simultaneously, as long as the input view and the output view do not share the same subresources.

Binding a resource is a design choice affecting how that resource needs to interact with the GPU. If you can, try to design resources that will be reused for the same purpose; this will most likely result in better performance.

把资源同时绑定为输入和输出时可能的,但是输入视图和输出视图不能同时共享相同的子资源。绑定资源是流水线设计的选择,它决定了资源怎么和GPU交互。如果可以的话,尽量把有相同用途的资源重用,这会提高系统性能。
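
The rule that an input view and an output view must not share subresources can be expressed as an overlap test. The C++ sketch below is hypothetical: ViewRange describes a view as a range of mip levels and a range of array slices, and both views are assumed to refer to the same resource.

```cpp
#include <cassert>

// Hypothetical description of the subresources a view covers.
struct ViewRange {
    unsigned firstMip, mipCount;
    unsigned firstSlice, sliceCount;
};

// True when the half-open ranges [a0, a0+an) and [b0, b0+bn) intersect.
bool RangesOverlap(unsigned a0, unsigned an, unsigned b0, unsigned bn) {
    return a0 < b0 + bn && b0 < a0 + an;
}

// True when two views of the same resource touch at least one common subresource.
bool ViewsShareSubresource(const ViewRange& a, const ViewRange& b) {
    assert(a.mipCount > 0 && a.sliceCount > 0 && b.mipCount > 0 && b.sliceCount > 0);
    return RangesOverlap(a.firstMip, a.mipCount, b.firstMip, b.mipCount) &&
           RangesOverlap(a.firstSlice, a.sliceCount, b.firstSlice, b.sliceCount);
}
```

For example, reading mip level 0 of a cube while rendering to mip level 1 of the same cube would pass this check, because the two views cover different mip levels.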

For example, if a render target is to be used as a texture, consider how it will be updated. Suppose multiple primitives will be rendered to the rendertarget and then the rendertarget will be used as a texture. For this scenario, it may be faster having two resources: the first would be a rendertarget to render to, the second would be a shader resource to supply the texture to the shader. Each resource would specify a single binding flag (the first would use D3D10_BIND_RENDER_TARGET and the second would use D3D10_BIND_SHADER_RESOURCE). If there is some reason that you cannot have two resources, you can specify both bind flags for a single resource and you will need to accept the performance trade-off. Of course, the only way to actually understand the performance implication is to measure it.

举个例子说,如果RT被作为纹理使用,就要考虑它怎么被更新。假设多个图元会被绘制到RT上,然后RT被当作纹理使用。在这中情况下,使用两个资源会更快:第一个是一个被绘制的RT资源,第二个是一个Shader资源在Shader中读取纹理。每个资源定义了单独的绑定标记(第一个是D3D10_BIND_RENDER_TARGET,第二个是D3D10_BIND_SHADER_RESOURCE)。如果由于某些原因你不能拥有两个资源的话,你能够把两个标记设定到同一个资源上,但是这样会导致系统性能下降。当然,实际上理解性能的唯一方式就是去测量它。

The bind flags are broken into two groups, those that allow a resource to be bound to GPU inputs and those that allow a resource to be bound to GPU outputs.

绑定标记分两类,GPU输入标记和GPU输出标记

GPU Input Flags

Flag | Resource Type | API Call
D3D10_BIND_VERTEX_BUFFER | unstructured buffer | IASetVertexBuffers
D3D10_BIND_INDEX_BUFFER | unstructured buffer | IASetIndexBuffer
D3D10_BIND_CONSTANT_BUFFER | unstructured buffer | VSSetConstantBuffers, GSSetConstantBuffers, PSSetConstantBuffers
D3D10_BIND_SHADER_RESOURCE | shader resource (vertex buffer, index buffer, texture) | VSSetShaderResources, GSSetShaderResources, or PSSetShaderResources

D3D10_BIND_CONSTANT_BUFFER restricts the buffer width (in bytes) to be less than or equal to 4096 * sizeof( R32G32B32A32 ). The width must also be a multiple of sizeof( R32G32B32A32 ). Use UpdateSubresource to update the entire buffer; D3D10_MAP_WRITE_NO_OVERWRITE is not allowed.

D3D10_BIND_CONSTANT_BUFFER限制Buffer宽度(Byte)小于等于4096 * sizeof( R32G32B32A32 )。资源大小必须是sizeof( R32G32B32A32 )的倍数。使用UpdateSubresource更新整个Buffer;不允许用D3D10_MAP_WRITE_NO_OVERWRITE。

A resource created with D3D10_BIND_SHADER_RESOURCE may not use D3D10_MAP_WRITE_NO_OVERWRITE.

GPU Output Flags

Flag | Resource Type | API Call
D3D10_BIND_STREAM_OUTPUT | unstructured buffer | SOSetTargets
D3D10_BIND_RENDER_TARGET | any resource (or subresource) with a rendertarget view | OMSetRenderTargets using a render target parameter
D3D10_BIND_DEPTH_STENCIL | a Texture1D, Texture2D, or TextureCube resource (or subresource) with a depth stencil view | OMSetRenderTargets using a depth-stencil parameter

Resources bound to an output stage may not be:

绑定到输出阶段的资源不能是:

  • Mapped with ID3D10Buffer::Map or ID3D10TextureXXX::Map 使用ID3D10Buffer::Map or ID3D10TextureXXX::Map后的资源
  • Loaded with UpdateSubresource 使用UpdateSubresource载入的资源
  • Copied with CopyResource 使用CopyResource拷贝的资源

Resource Creation CPU Access Flags

These flags allow the CPU to read or write (or both) a resource. Reading or writing a resource requires that the resource be mapped (which is analogous to Lock in Direct3D 9), so that the resource cannot simultaneously be read and written to.

这个标记允许CPU读或写资源。读写资源需要资源被map,所以资源不能被同时读或者写。

Flag

Limitations

D3D10_MAP_READ

  • If the readable resource does not also have the D3D10_MAP_WRITE flag, then the application can only read from the memory address retrieved from Map.

如果可写的资源没有D3D10_MAP_WRITE标记,那么应用程序只能写从Map中得到的内存地址

D3D10_MAP_WRITE

  • If the writeable resource does not have the D3D10_MAP_READ bind flag also, then the application can only write to the memory address retrieved from Map

如果可写的资源没有.D3D10_MAP_READ,那么那么应用程序只能写从Map中得到的内存地址

Any resource that has either the read or write flag specified:

任何有读写标记的资源:

  • Cannot be updated with UpdateSubresource. 不能用UpdateSubresource更新
  • May not use any GPU output bind flag. 不能使用任何GPU输出绑定标记
  • Must use either the DYNAMIC or STAGING usage flag. 必须使用DYNAMIC 或者STAGING标记

Resource Creation Miscellaneous Flags

Flag

Limitations

D3D10_RESOURCE_MISC_MIPGEN

Use this flag to allow a shader resource view to use GenerateMips to generate mip maps; note that this requires that the shader resource view is also capable of being bound as a rendertarget (D3D10_BIND_RENDER_TARGET). You may not generate mip maps for a buffer resource.

使用这个标记允许Shader资源视图使用GenerateMips来创建Mipmaps。注意到这需要Shader资源视图同样能够作为RT被绑定(D3D10_BIND_RENDERTARGET)。不能够对缓存资源创建Mipmap

D3D10_RESOURCE_MISC_COPY_DESTINATION

Use this flag to allow a resource to be used as the destination of CopySubresourceRegion or CopyResource.

使用这个标记允许资源调用CopySubresourceRegionCopyResource(作为目标被拷入)

Every resource can be used with CopySubresourceRegion, and CopyResource (as a source). However, the primary advantage of not specifying this flag (when it could be used), is related to STAGING Resources and their interaction with DEVICEREMOVED. After DEVICEREMOVED, Map on STAGING Resources will fail with DEVICEREMOVED when the application specified COPY_DESTINATION. However, if the application will not use the STAGING Resource as a destination for Copy commands, it can continue to Map such Resources after DEVICEREMOVED. See the following table to determine which Resources can set this flag.

任何资源都能够用CopySubresourceRegion和CopyResource作为源被拷出。不使用这个标记(当它能够被使用时)的主要好处和STAGING资源以及它们和DEVICEREMOVED的交互相关。在DEVICEREMOVED后,当应用程序定义了COPY_DESTINATION时,锁定STAGING资源会失败。当然,如果应用程序不使用STAGING资源作为拷贝命令的目标,它就能够在DEVICEREMOVED后继续锁定资源。详见下表决定哪种资源能够设定这个标记。

Resource Flag Combinations

Resource Type and Usage | GPU Input Bind | GPU Output Bind | Map( READ and/or WRITE ) | Map( WRITE_DISCARD or WRITE_NOOVERWRITE ) | UpdateSubresource | Copy Dest
IMMUTABLE               | R              |                 |                          |                                           |                   |
DEFAULT (GPU Input)     | C              |                 |                          |                                           | E                 | C
DEFAULT (GPU Output)    | C              | R               |                          |                                           |                   |
DYNAMIC                 | R              |                 |                          | D                                         |                   | C
STAGING                 |                |                 | R                        |                                           |                   | C

  • R = Requires at least one bit set. 需要至少设定一个
  • C = Compatible to use. 能够兼容使用
  • D = Compatible, but WRITE_NOOVERWRITE may only be used if Bind flags are restricted to VERTEX_BUFFER and INDEX_BUFFER. 兼容使用,但是WRITE_NOOVERWRITE只能在绑定标记是VERTEX_BUFFER INDEX_BUFFER使用
  • E = Compatible, but CONSTANT_BUFFER prevents partial CPU updates. 兼容使用,但是CONSTANT_BUFFER限制部分CPU更新
  • empty = Cannot be used together. 不能同时使用
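
The combination table above can be encoded as a small lookup, which may be handy for checking a resource description before creating it. This is only a sketch: the enum and function names are hypothetical and are not Direct3D 10 types; the cell values are copied from the table.

```cpp
#include <cassert>

// Hypothetical names for illustration only; these are not D3D10 enums.
enum Usage { IMMUTABLE, DEFAULT_GPU_INPUT, DEFAULT_GPU_OUTPUT, DYNAMIC, STAGING, USAGE_COUNT };
enum Feature { GPU_INPUT_BIND, GPU_OUTPUT_BIND, MAP_READ_WRITE, MAP_DISCARD_NOOVERWRITE,
               UPDATE_SUBRESOURCE, COPY_DEST, FEATURE_COUNT };

// One character per table cell:
// 'R' required, 'C' compatible, 'D'/'E' compatible with a caveat, ' ' not allowed.
static const char kCombos[USAGE_COUNT][FEATURE_COUNT + 1] = {
    "R     ",  // IMMUTABLE
    "C   EC",  // DEFAULT (GPU Input)
    "CR    ",  // DEFAULT (GPU Output)
    "R  D C",  // DYNAMIC
    "  R  C",  // STAGING
};

// Returns the table cell for a usage/feature pair.
char comboCode(Usage usage, Feature feature) {
    assert(usage < USAGE_COUNT && feature < FEATURE_COUNT);
    return kCombos[usage][feature];
}

// True when the pair may be used together (any non-empty cell).
bool canCombine(Usage usage, Feature feature) {
    return comboCode(usage, feature) != ' ';
}
```

For example, canCombine(STAGING, MAP_READ_WRITE) is true, while canCombine(IMMUTABLE, COPY_DEST) is false.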

Resource Access and Views

In Direct3D 10, resources are accessed with a view, which is a mechanism for hardware interpretation of a resource in memory. A view allows a particular pipeline stage to access only the subresources it needs, in the representation desired by the application.

D3D10中,资源能够通过视图获取,这其实是一种硬件读取内存资源的机制。视图允许特定的流水线阶段只读取它需要的子资源来满足应用程序的需要。

A view supports the notion of a typeless resource - that is, you can create a resource that is of certain size but whose type is interpreted as a uint, float, unorm, etc. only when it is bound to a pipeline stage. This makes a resource re-interpretable by different pipeline stages.

视图支持无类型的资源,这就是说,你能构创建一定大小的资源,但是它的类型可以认为是整数的,浮点的或者归一的等等;这只有当它绑定到流水线阶段时才确定。这使得资源可以被不同的流水线阶段用不同的方式读取。

Here is an example of binding a TextureCube resource two different ways through two different views. (Note: a subresource cannot be bound as both input and output to the pipeline simultaneously.)

这里有个例子,把TextureCube资源用两种不同的视图绑定。(注意:子资源不能够被同时绑定到流水线的输入和输出)

Using a texturecube as a shader resource. The resource is then addressed as a cube-mapped texture that is filtered correctly across edges and corners by the sampler.

使用texturecube作为Shader资源。这个资源作为cubemap纹理取值,sampler在边缘和角点使用正确的滤波采样。

 

Figure 1.  Texture Cube Resource - Viewed as a texture for sampling

Create this view object by calling CreateShaderResourceView. Then set the view object to the pipeline stage (the particular shader) by calling SetTexture (VSSetTexture, PSSetTexture, GSSetTexture). Use an HLSL texture intrinsic function to sample the texture.

通过调用CreateShaderResourceView创建这个视图对象,然后调用SetTexture (VSSetTexture, PSSetTexture, GSSetTexture)把视图对象设定到流水线阶段上(指定的Shader)。使用HLSL纹理内置函数来采样纹理。

Using a texturecube as a render target. The resource can be viewed as an array of 2D textures (6 in this case) with mip levels (3 in this case) like this:

使用texturecube作为rT。资源能够被看作一个有mip层次(3个在这个例子里)2维纹理数组(6个在这个例子里),如下:

 

Figure 2.  Texture Cube Viewed as 6 2D Texture Faces

Create a view object for a rendertarget by calling CreateRenderTargetView. Then call OMSetTargets to set the rendertarget view to the pipeline. Render into the rendertargets by calling Draw and using the RenderTargetArrayIndex to index into the proper array slice (texture face, that is, +X, +Y, ...) in the view. You can use a subresource (a mip level, array slice combination) to bind to any array of subresources. So you could bind to the second mip level and update only this particular mip level if you wanted, like this:

通过调用CreateRenderTargetView创建RT的视图对象。然后调用OMSetTargets设定RT视图到流水线上。调用Draw来绘制RT,使用RenderTargetArrayIndex来选择视图中指定的数组元素(纹理面是+X,+Y,…)。你能够使用子资源(Mip层次,array slice)来绑定任何子资源的数组。所以你能够把第二层mip层次绑定,然后只更新这个mip层次。

 

Figure 3.  Views can access an array of Subresources

Differences between Direct3D 9 and Direct3D 10:

In Direct3D 10, you no longer bind a resource directly to the pipeline; instead, you create a view of a resource and then set the view to the pipeline. This allows validation and mapping in the runtime and driver to occur at view creation, minimizing type checking at bind time.

D3D10中,你不需要直接把资源绑定到流水线上,你只要创建一个资源的视图,然后把视图绑定到流水线上。这允许在视图创建时确认生效和运行时的匹配,减少绑定时间的类型检查。

New Resource Formats

Direct3D 10 offers new data compression formats for compressing high-dynamic range (HDR) lighting data, normal maps and heightfields to a fraction of their original size. These compression types include:

D3D10提供新的数据压缩类型来压缩高动态范围照明数据,法向图和高度场图。这些类型包括:

  • Shared-Exponent high-dynamic range (HDR) format (RGBE) 共享指数HDR格式(RGBE)
  • New Block-Compressed 1-2 channel UNORM/SNORM formats 新块状压缩1-2通道UNORM/SNORM格式

The block compression formats can be used for any of the 2D or 3D texture types ( Texture2D, Texture2DArray, Texture3D, or TextureCube) including mip-map surfaces. The block compression techniques require texture dimensions to be a multiple of 4 (since the implementation compresses on blocks of 4x4 texels). In the texture sampler, compressed formats are always decompressed before texture filtering.

块压缩格式能够用作任何包含mipmap2D3D纹理类型( Texture2D, Texture2DArray, Texture3D, or TextureCube)。块压缩技术需要纹理维度维4的倍数(因为压缩实现基于4x4纹元)。纹理采样时,压缩格式会在滤波前被解压。
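
The multiple-of-4 requirement described above can be checked before a texture is created. The following is a minimal sketch with hypothetical helper names; only the 4x4 block size comes from the text, and nothing here calls the Direct3D 10 API.

```cpp
#include <cassert>

// Block edge length in texels, as stated in the text (4x4 blocks).
const unsigned kBlockDim = 4;

// True when both dimensions are multiples of the 4x4 block size.
bool isBlockCompressible(unsigned width, unsigned height) {
    return width % kBlockDim == 0 && height % kBlockDim == 0;
}

// Number of 4x4 texel blocks needed to cover one 2D surface.
unsigned blockCount(unsigned width, unsigned height) {
    assert(isBlockCompressible(width, height));
    return (width / kBlockDim) * (height / kBlockDim);
}
```

A 256x128 surface passes the check and is covered by 64 x 32 = 2048 blocks; a 250x128 surface fails the check.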

Texture Sampling

Differences between Direct3D9 and Direct3D10:

In Direct3D9, samplers were bound to specific textures. Textures and samplers are now independent objects in Direct3D 10. The new templated texture object supports several texture sampling methods that take both the texture and the sampler as input parameters. For example:

D3D9中,采样绑定到稳定上。在D3D10中,纹理和采样现在是相互独立的对象。新的模板纹理对象支持几种纹理采样方法,把纹理和采样都作为输入参数,例子如下:

sampler MySamp;
Texture2D<float4> MyTex;
 
float4 main( float2 TexCoords[2] : TEXCOORD ) : SV_Target
{
    return MyTex.Sample( MySamp, TexCoords[0] );
}

See Shader Textures for more details.更多细节看Shader Textures

Shaders

At a very high level, data enters the graphics pipeline as a stream of primitives and is processed by as many as three shader stages:

在高层,数据作为图元流输入到图形流水线,被三个Shader阶段处理。

  • A vertex shader performs per-vertex processing such as transformations, skinning, displacement, and calculating per-vertex material attributes.

VS执行逐顶点的操作,比如变换,表面细节,displacement和计算逐顶点材质属性。

  • A geometry shader performs per-primitive processing such as material selection and silhouette-edge detection, and can generate new primitives for point sprite expansion, fin generation, shadow volume extrusion, and single pass rendering to multiple faces of a cube texture.

GS执行逐图元处理,比如材质选择,阴影边缘检测,为点精灵扩展生成新图元,表面凸起生成,Shadow Volume抽取和单pass绘制cube texture的多个表面。

  • A pixel shader performs per-pixel processing such as texture blending, lighting model computation, and per-pixel normal and/or environmental mapping.

PS执行逐象素处理,比如纹理混合,光照模型计算,逐象素法向和环境映照计算。

These stages are completely programmable via the High Level Shading Language (HLSL). All Direct3D 10 shaders are written in HLSL against the Shader Model 4.0 hardware target. Authored HLSL shaders can be compiled at author-time or at runtime, and set at runtime into the appropriate pipeline stage.

这些阶段可以完全通过HLSL可编程实现。所有的D3D ShaderHLSL编写到Shader Model4.0的硬件目标上。写完的HLSL Shader可以编写时编译或者在运行时编译,在运行时设置到对应的流水线阶段中。

Matrices and Vectors in HLSL

With HLSL, you can program shaders at an algorithm level. To understand the language, you will need to know how to declare variables and functions, use intrinsic functions, define custom data types and use semantics to connect shader arguments to other shaders and to the pipeline.

使用HLSL你可以在算法层次对Shader编程。为了了解这个语言,你需要知道怎么申明变量和函数,使用内置函数,定义自己的数据类型和使用语义在其他Shader和流水线之间连接Shader参数。

Once you learn how to author shaders in HLSL, you will need to learn about API calls so that you can: compile a shader for particular hardware, initialize shader constants, and initialize other pipeline state if necessary.

一旦你学会了怎么用HLSLShader,你需要知道关于ShaderAPI调用,这样你才能:在一定的硬件上编译Shader,初始化Shader常数和在需要的情况初始化其他流水线阶段。

HLSL Implements Per-Component Math Operations

HLSL uses two special types, a vector type and a matrix type to make programming 2D and 3D graphics easier. Each of these types contain more than one component; a vector contains up to four components, and a matrix contains up to 16 components. When vectors and matrices are used in standard HLSL equations, the math performed is designed to work per-component. For instance, HLSL implements this multiply:

HLSL使用两种特殊的类型:向量类型和矩阵类型,来使2D或者3D的编程更为容易。每种类型拥有多个通道。一个向量拥有最多4个通道,一个矩阵拥有最多16个通道。当向量和矩阵在标准HLSL等式中使用时,对它们执行的数学操作对每个通道都起作用。举个例子,HLSL实现这个乘法:

float4 v = a*b;

as a four-component multiply. The result is four scalars:

作为一个四通道的乘法,结果是四个数

float4 v = a*b;
 
v.x = a.x*b.x;
v.y = a.y*b.y;
v.z = a.z*b.z;
v.w = a.w*b.w;

This is four multiplications where each result is stored in a separate component of v. This is called a four-component multiply. HLSL uses component math which makes writing shaders very efficient.

这是四个乘法,每个乘法结果保存在v向量的每个通道中。这被叫做四通道乘法。HLSL使用通道乘法使Shader更为有效。

This is very different from a multiply which is typically implemented as a dot product which generates a single scalar:

这和点积乘法明显不同,点积只产生一个结果数:

v = a.x*b.x + a.y*b.y + a.z*b.z + a.w*b.w;
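
For readers coming from C++, the difference between the HLSL per-component multiply and a dot product can be modelled as follows. This is an illustrative sketch only: Float4, componentMul and dot4 are hypothetical names standing in for the HLSL float4 type, the * operator and the dot intrinsic.

```cpp
#include <cassert>

// Hypothetical stand-in for an HLSL float4; not a D3D10 type.
struct Float4 { float x, y, z, w; };

// Models HLSL's a*b: four independent multiplies, one per component.
Float4 componentMul(const Float4& a, const Float4& b) {
    Float4 v = { a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w };
    return v;
}

// A dot product collapses the same four products into a single scalar.
float dot4(const Float4& a, const Float4& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}
```

With a = (1,2,3,4) and b = (5,6,7,8), componentMul yields (5,12,21,32), while dot4 yields the single value 70.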

A matrix also uses per-component operations in HLSL:

矩阵在HLSL里一样使用逐通道操作

float3x3 mat1,mat2;
...
float3x3 mat3 = mat1*mat2;

The result is a per-component multiply of the two matrices (as opposed to a standard 3x3 matrix multiply). A per component matrix multiply yields this first term:

结果是对两个矩阵的每个通道做乘法(和标准3X3矩阵相乘相反)。逐通道的矩阵乘法表示如下:

mat3._m00 = mat1._m00 * mat2._m00;

This is different from a 3x3 matrix multiply which would yield this first term:

3x3的矩阵乘法不同,3x3的矩阵乘法用如下形式表示:

// First component of a 3x3 matrix multiply
mat3._m00 = mat1._m00 * mat2._m00 + 
            mat1._m01 * mat2._m10 + 
            mat1._m02 * mat2._m20;
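
The two kinds of matrix product can be compared side by side in plain C++. This is a sketch under the assumption that a float3x3 is modelled as a 3x3 array indexed m[row][column]; the type and function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for an HLSL float3x3, indexed m[row][column].
struct Float3x3 { float m[3][3]; };

// Models HLSL's mat1*mat2: each element is multiplied by its counterpart.
Float3x3 componentMul3x3(const Float3x3& a, const Float3x3& b) {
    Float3x3 r;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r.m[i][j] = a.m[i][j] * b.m[i][j];
    return r;
}

// A linear-algebra product: row i of a dotted with column j of b.
Float3x3 matrixMul3x3(const Float3x3& a, const Float3x3& b) {
    Float3x3 r;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            r.m[i][j] = 0.0f;
            for (int k = 0; k < 3; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
        }
    return r;
}
```

Multiplying any matrix by twice the identity shows the difference: the per-component version zeroes every off-diagonal element, while the matrix product doubles every element.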

Overloaded versions of the multiply intrinsic function handle cases where one operand is a vector and the other operand is a matrix. Such as: vector * vector, vector * matrix, matrix * vector, and matrix * matrix. For instance:

乘法内置函数的重载版本能够处理一个操作数是向量而另一个操作数是矩阵的情况。它们包括:向量乘向量,向量乘矩阵,矩阵乘向量和矩阵乘矩阵,比如说:

float4x3 World;
 
float4 main(float4 pos : SV_POSITION) : SV_POSITION
{
    float4 val;
    val.xyz = mul(pos,World);
    val.w = 0;
 
    return val;
}       

produces the same result as:

和这样的结果一样:

float4x3 World;
 
float4 main(float4 pos : SV_POSITION) : SV_POSITION
{
    float4 val;
    val.xyz = (float3) mul((float1x4)pos,World);
    val.w = 0;
 
    return val;
}       

This example casts the pos vector to a row vector using the (float1x4) cast. Changing a vector by casting, or swapping the order of the arguments supplied to multiply, is equivalent to transposing the matrix.

这个例子使用(float1x4)强制转换把位置向量转换为行向量。通过强制转换改变向量,或者交换乘法参数的顺序,和转置矩阵的作用是一样的。

Automatic cast conversion causes the multiply and dot intrinsic functions to return the same results as used here:

自动类型转换导致乘法和点积内置函数返回相同的结果。

{
 float4 val;
 return mul(val,val);
}

The result of the multiply is a 1x4 * 4x1 = 1x1 vector. This is equivalent to a dot product:

乘法结果是1x4 * 4x1 = 1x1向量,等于一个点积。

{
 float4 val;
 return dot(val,val);
}

which returns a single scalar value.

返回一个数。

The Vector Type

A vector is a data structure that contains between one and four components.

向量是拥有一个至4个通道的数据结构。

bool    bVector;   // scalar containing 1 Boolean
bool1   bVector;   // vector containing 1 Boolean
int1    iVector;   // vector containing 1 int
half2   hVector;   // vector containing 2 halfs
float3 fVector;   // vector containing 3 floats
double4 dVector;   // vector containing 4 doubles

The integer immediately following the data type is the number of components on the vector.

在数据类型后的整数表示这个向量的通道数。

Initializers can also be included in the declarations.

申明的同时也可以初始化。

bool    bVector = false;
int1    iVector = 1;
half2   hVector = { 0.2, 0.3 };
float3 fVector = { 0.2f, 0.3f, 0.4f };
double4 dVector = { 0.2, 0.3, 0.4, 0.5 };

Alternatively, the vector type can be used to make the same declarations:

向量类型也可以这样申明。

vector <bool,   1> bVector = false;
vector <int,    1> iVector = 1;
vector <half,   2> hVector = { 0.2, 0.3 };
vector <float, 3> fVector = { 0.2f, 0.3f, 0.4f };
vector <double, 4> dVector = { 0.2, 0.3, 0.4, 0.5 };

The vector type uses angle brackets to specify the type and number of components.

向量类型使用尖括号表示类型和通道数目

Vectors contain up to four components, each of which can be accessed using one of two naming sets:

向量拥有最多4个通道,每个通道可以使用两个命名集合获取。

  • The position set: x,y,z,w 位置集合:xyzw
  • The color set: r,g,b,a颜色集合:rgba

These statements both return the value in the third component.

这两句话同时返回第三个通道值。

// Given
float4 pos = float4(0,0,2,1);
 
pos.z    // value is 2
pos.b    // value is 2

Naming sets can use one or more components, but they cannot be mixed.

命名集合可以使用一个或多个通道,但它们不能被混合。

// Given
float4 pos = float4(0,0,2,1);
float2 temp;
 
temp = pos.xy; // valid
temp = pos.rg; // valid
 
temp = pos.xg; // NOT VALID because the position and color sets were mixed.

Specifying one or more vector components when reading components is called swizzling. For example:

当读向量时指定一个或多个通道读取叫做swizzling。举个例子:

float4 pos = float4(0,0,2,1);
float2 f_2D;
f_2D = pos.xy;   // read two components 
f_2D = pos.xz;   // read components in any order       
f_2D = pos.zx;
 
f_2D = pos.xx;   // components can be read more than once
f_2D = pos.yy;

Masking controls how many components are written.

Masking控制多少通道被写入

float4 pos = float4(0,0,2,1);
float4 f_4D;
f_4D    = pos;     // write four components          
 
f_4D.xz = pos.xz; // write two components        
f_4D.zx = pos.xz; // change the write order
 
f_4D.xzyw = pos.w; // write one component to more than one component
f_4D.wzyx = pos;

Assignments cannot be written to the same component more than once. So the left side of this statement is invalid:

对一个通道的赋值在一条指令中不能超过一次。所有左边的语句是错误的。

f_4D.xx = pos.xy;   // cannot write to the same destination components 

Also, the component name spaces cannot be mixed. This is an invalid component write:

通道命名空间也不能被混合。这是一个无效的通道写入。

f_4D.xg = pos.rgrg;    // invalid write: cannot mix component name spaces 
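
The swizzle and masking rules above (one to four components, a single naming set, and no repeated component on the write side) can be summarized as a small validator. This is a sketch with hypothetical function names; it models the rules stated in the text and is not part of any HLSL compiler.

```cpp
#include <cassert>
#include <cstring>

// Returns 0..3 for a component letter, or -1 if it is in neither naming set.
static int componentIndex(char c) {
    const char* position = "xyzw";
    const char* color = "rgba";
    if (c == '\0') return -1;
    if (const char* p = std::strchr(position, c)) return static_cast<int>(p - position);
    if (const char* p = std::strchr(color, c)) return static_cast<int>(p - color);
    return -1;
}

// True when the letter belongs to the color set (r,g,b,a).
static bool isColorLetter(char c) {
    return c != '\0' && std::strchr("rgba", c) != nullptr;
}

// Checks a swizzle such as "xy" or "rg": 1 to 4 letters, all from one naming set.
// When isWrite is true (a mask on the left side), no component may repeat.
bool isValidSwizzle(const char* s, bool isWrite) {
    std::size_t n = std::strlen(s);
    if (n < 1 || n > 4) return false;
    bool seen[4] = { false, false, false, false };
    for (std::size_t i = 0; i < n; ++i) {
        int idx = componentIndex(s[i]);
        if (idx < 0) return false;
        if (isColorLetter(s[i]) != isColorLetter(s[0])) return false;  // mixed naming sets
        if (isWrite && seen[idx]) return false;                        // duplicate write
        seen[idx] = true;
    }
    return true;
}
```

Under these rules "xx" is accepted as a read swizzle but rejected as a write mask, and "xg" is rejected in both cases.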

The Matrix Type

A matrix is a data structure that contains rows and columns of data. The data can be any of the scalar data types, however, every element of a matrix is the same data type. The number of rows and columns is specified with the row-by-column string that is appended to the data type.

矩阵是一个拥有行和列的数据结果。数据元素能够是任何数值类型,当然,每个元素必须是相同的数据类型。行列数在数据类型后用一个行列字符表示。

int1x1    iMatrix;   // integer matrix with 1 row, 1 column
int2x1    iMatrix;   // integer matrix with 2 rows, 1 column
...
int4x1    iMatrix;   // integer matrix with 4 rows, 1 column
...
int1x4    iMatrix;   // integer matrix with 1 row, 4 columns
double1x1 dMatrix;   // double matrix with 1 row, 1 column
double2x2 dMatrix;   // double matrix with 2 rows, 2 columns
double3x3 dMatrix;   // double matrix with 3 rows, 3 columns
double4x4 dMatrix;   // double matrix with 4 rows, 4 columns

The maximum number of rows or columns is 4; the minimum number is 1.

最大行列数是4,最小是1

A matrix can be initialized when it is declared:

矩阵可以在申明的时候初始化。

float2x2 fMatrix = { 0.0f, 0.1, // row 1
                     2.1f, 2.2f // row 2
                   };   

Or, the matrix type can be used to make the same declarations:

矩阵也可以这样申明。

matrix <float, 2, 2> fMatrix = { 0.0f, 0.1, // row 1
                                 2.1f, 2.2f // row 2
                               };

The matrix type uses the angle brackets to specify the type, the number of rows, and the number of columns. This example creates a floating-point matrix, with two rows and two columns. Any of the scalar data types can be used.

矩阵类型使用尖括号来表示数据类型,行数和列数。这个例子创建了一个浮点矩阵,有2行2列。任何数值类型都能被使用。

This declaration defines a matrix of half values (16-bit floating-point numbers) with two rows and three columns:

下面申明定义了一个2行3列的半浮点(16位浮点数)矩阵

matrix <half, 2, 3> fHalfMatrix;

A matrix contains values organized in rows and columns, which can be accessed using the structure operator "." followed by one of two naming sets:

矩阵组织位行和列,能够通过操作符.和一两个命名集合来读取矩阵元素值:

  • The zero-based row-column position:
    • _m00, _m01, _m02, _m03
    • _m10, _m11, _m12, _m13
    • _m20, _m21, _m22, _m23
    • _m30, _m31, _m32, _m33
  • The one-based row-column position:
    • _11, _12, _13, _14
    • _21, _22, _23, _24
    • _31, _32, _33, _34
    • _41, _42, _43, _44

Each naming set starts with an underscore followed by the row number and the column number. The zero-based convention also includes the letter "m" before the row and column number. Here's an example that uses the two naming sets to access a matrix:

每个命名集合由下划线开始,接下来是行号和列号。基于0的约定在行列号之前还包括字母m。这里是使用两种命名方式读取矩阵的例子:

// given
float2x2 fMatrix = { 1.0f, 1.1f, // row 1
                     2.0f, 2.1f // row 2
                   }; 
 
float f_1D;
f_1D = fMatrix._m00; // read the value in row 1, column 1: 1.0
f_1D = fMatrix._m11; // read the value in row 2, column 2: 2.1
 
f_1D = fMatrix._11; // read the value in row 1, column 1: 1.0
f_1D = fMatrix._22; // read the value in row 2, column 2: 2.1

Just like vectors, naming sets can use one or more components from either naming set.

// Given
float2x2 fMatrix = { 1.0f, 1.1f, // row 1
                     2.0f, 2.1f // row 2
                   };
float2 temp;
 
temp = fMatrix._m00_m11; // valid
temp = fMatrix._m11_m00; // valid
temp = fMatrix._11_22;   // valid
temp = fMatrix._22_11;   // valid
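
The relationship between the two naming sets can be made concrete with a small parser that maps either form to zero-based row and column indices. This is a sketch; parseMatrixComponent is a hypothetical helper, not an HLSL or Direct3D function.

```cpp
#include <cassert>
#include <cstring>

// Parses one component name from either naming set into zero-based indices.
// "_m12" (zero-based set) and "_23" (one-based set) both mean row 1, column 2.
// Returns false if the name is malformed or falls outside a 4x4 matrix;
// row and col are only meaningful when the function returns true.
bool parseMatrixComponent(const char* name, int& row, int& col) {
    std::size_t n = std::strlen(name);
    if (n < 3 || name[0] != '_') return false;
    bool zeroBased = (name[1] == 'm');
    const char* digits = name + (zeroBased ? 2 : 1);
    if (std::strlen(digits) != 2) return false;
    int base = zeroBased ? 0 : 1;
    row = digits[0] - '0' - base;
    col = digits[1] - '0' - base;
    return row >= 0 && row < 4 && col >= 0 && col < 4;
}
```

Both "_m00" and "_11" resolve to row 0, column 0, which is why the pairs of reads in the example above return the same values.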

A matrix can also be accessed using array access notation, which is a zero-based set of indices. Each index is inside of square brackets. A 4x4 matrix is accessed with the following indices:

矩阵能够使用数组方式使用基于0的索引读取,每个索引在中括号中。一个4x4矩阵可以用如下索引读取:

  • [0][0], [0][1], [0][2], [0][3]
  • [1][0], [1][1], [1][2], [1][3]
  • [2][0], [2][1], [2][2], [2][3]
  • [3][0], [3][1], [3][2], [3][3]

Here is an example of accessing a matrix:

这是读取矩阵的一个例子:

float2x2 fMatrix = { 1.0f, 1.1f, // row 1
                     2.0f, 2.1f // row 2
                   };
float temp;
 
temp = fMatrix[0][0]; // single component read
temp = fMatrix[0][1]; // single component read

Notice that the structure operator "." is not used to access an array. Array access notation cannot use swizzling to read more than one component.

注意到.不能用作读取数组。数组读取符号不能使用swizzling读取超过一个通道。

float2 temp;
temp = fMatrix[0][0]_[0][1] // invalid, cannot read two components

However, array accessing can read a multi-component vector.

当然,数组方式能够读取多个向量。

float2 temp;
float2x2 fMatrix;
temp = fMatrix[0]; // read the first row

As with vectors, reading more than one matrix component is called swizzling. More than one component can be assigned, assuming only one name space is used. These are all valid assignments:

和向量一样,读取多个矩阵通道叫做swizzling。假设只有一个名字空间被使用的话,多个通道能被赋值。下面都是有效的赋值:

// Given these variables
float4x4 worldMatrix = { 0,0,0,0, 1,1,1,1, 2,2,2,2, 3,3,3,3 };
float4x4 tempMatrix;
 
tempMatrix._m00_m11 = worldMatrix._m00_m11; // multiple components
tempMatrix._m00_m11 = worldMatrix._m13_m23;
 
tempMatrix._11_22_33 = worldMatrix._11_22_33; // any order on swizzles
tempMatrix._11_22_33 = worldMatrix._24_23_22;

Masking controls how many components are written.

Masking控制多少通道能被写入。

// Given
float4x4 worldMatrix = { 0,0,0,0, 1,1,1,1, 2,2,2,2, 3,3,3,3 };
float4x4 tempMatrix;
 
tempMatrix._m00_m11 = worldMatrix._m00_m11; // write two components
tempMatrix._m23_m00 = worldMatrix._m00_m11;

Assignments cannot be written to the same component more than once. So the left side of this statement is invalid:

不能在一条指令中对一个通道赋值多次。因此下面语句的左边是无效的。

// cannot write to the same component more than once
tempMatrix._m00_m00 = worldMatrix._m00_m11;

Also, the component name spaces cannot be mixed. This is an invalid component write:

当然,通道名字空间不能混合使用,这也是个无效的通道写入。

// Invalid: the left side mixes the one-based and zero-based naming sets
tempMatrix._11_m23 = worldMatrix._11_22; 

Matrix Ordering

Matrix packing order for uniform parameters is set to column-major by default. This means each column of the matrix is stored in a single constant register. On the other hand, a row-major matrix packs each row of the matrix in a single constant register. Matrix packing can be changed with the #pragma pack_matrix directive, or with the row_major or column_major keyword.

矩阵顺序默认为列对齐。这就是说矩阵的每一列保存在一个常数寄存器中。另外一种对齐方式行对齐把每一行保存在一个常数寄存器中。矩阵对齐方式能够通过#pragma pack_matrix指令,或者row_major、column_major关键字改变。

The data in a matrix is loaded into shader constant registers before a shader runs. There are two choices for how the matrix data is read: in row-major order or in column-major order. Column-major order means that each matrix column will be stored in a single constant register, and row-major order means that each row of the matrix will be stored in a single constant register. This is an important consideration for how many constant registers are used for a matrix.

矩阵中的数据在Shader运行前被载入到Shader的常数寄存器中。矩阵输入读取的方式有两种选择:行顺序读取或者列顺序读取。列顺序意味着每列矩阵都可以被保存到一个简单的常数寄存器中,行矩阵意味着每行都可以保存到一个常数寄存器中。这对于使用多少个常数寄存器来保存矩阵是非常重要的考虑因素。

A row-major matrix is laid out like this:

行顺序矩阵是这样的:

11  12  13  14
21  22  23  24
31  32  33  34
41  42  43  44

A column-major matrix is laid out like this:

列排序矩阵是这样的:

11  21  31  41
12  22  32  42
13  23  33  43
14  24  34  44

Row-major and column-major matrix ordering determine the order the matrix components are read from shader inputs. Once the data is written into constant registers, matrix order has no effect on how the data is used or accessed from within shader code. Also, matrices declared in a shader body do not get packed into constant registers.

 Row-major and column-major packing order has no influence on the packing order of constructors (which always follows row-major ordering).

行顺序和列顺序矩阵决定Shader输入读入矩阵通道的顺序。一旦数据被写入常数寄存器,矩阵的顺序就对数据如何被Shader代码使用和读取不起作用了。同时Shader体中申明的矩阵不会被打包到常数寄存器中。行顺序和列顺序对构造函数的打包顺序没有影响(构造函数一直使用行顺序)。

The order of the data in a matrix can be declared at compile time or the compiler will order the data at runtime for the most efficient use.

矩阵中数据的顺序可以在编译时申明或者编译器可以为了更有效的使用数据在运行时改变数据顺序。
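
The effect of packing order on register usage can be modelled in plain C++. The sketch below assumes a constant register holds four floats and uses hypothetical names (Register4, packMatrix); it illustrates the layout described above and does not reproduce how the runtime actually uploads constants.

```cpp
#include <cassert>

// Hypothetical model: one constant register holds four floats.
struct Register4 { float v[4]; };

// Packs a rows x cols matrix (m[row][col], each dimension 1..4) into registers.
// Column-major stores one column per register; row-major stores one row.
// Returns the number of registers used; unused slots are set to zero.
int packMatrix(const float m[4][4], int rows, int cols, bool columnMajor, Register4 out[4]) {
    assert(rows >= 1 && rows <= 4 && cols >= 1 && cols <= 4);
    int registers = columnMajor ? cols : rows;
    int perRegister = columnMajor ? rows : cols;
    for (int r = 0; r < 4; ++r)
        for (int i = 0; i < 4; ++i)
            out[r].v[i] = 0.0f;
    for (int r = 0; r < registers; ++r)
        for (int i = 0; i < perRegister; ++i)
            out[r].v[i] = columnMajor ? m[i][r] : m[r][i];
    return registers;
}
```

In this model a matrix with four rows and three columns (the shape of the float4x3 World matrix used earlier) occupies three registers when packed column-major and four when packed row-major, which is the register-count consideration the text refers to.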

Declaring HLSL Shader Variables

The simplest shader variable declaration includes a type and a variable name, such as this floating-point declaration:

最简单的Shader变量申明包括类型和变量明,比如浮点数的申明如下:

float fVar;

You can initialize a variable in the same statement.

你可以使用相同的语句初始化变量。

float fVar = 3.1f;

An array of variables can be declared,

数组能够这样申明

int iVar[3];

or declared and initialized in the same statement.

或者同时申明和初始化。

int iVar[3] = {1,2,3};

Here are a few declarations that demonstrate many of the characteristics of high-level shader language (HLSL) variables:

下面是一些申明的例子,描述了HLSL变量的一些其他特性:

float4 color;
uniform float4 position : SV_POSITION; 
const float4 lightDirection = {0,0,1,0};

Data declarations can use any valid type including:

数据申明能够使用的有效类型包括:

A shader can have top-level variables, arguments, and functions.

Shader能够拥有顶层变量,参数和函数

// top-level variable
float globalShaderVariable; 
 
// top-level function
void function(
in float4 position: SV_POSITION // top-level argument
              )
{
 float localShaderVariable; // local variable
 function2(...)
}
 
void function2()
{
 ...
}

Top-level variables are declared outside of all functions. Top-level arguments are parameters to a top-level function. A top-level function is any function called by the application (as opposed to a function that is called by another function).

顶层变量在函数外部申明。顶层参数是顶层函数的参数。顶层函数是程序调用的函数(和函数相反,函数是被其他函数调用的)

Basic Types

HLSL supports several basic scalar types:

bool

true or false (Boolean)

int

32-bit signed integer

half

16-bit floating point value

float

32-bit floating point value

double

64-bit floating point value

 

New for Direct3D 10: The following basic types have been added in Direct3D 10.

snorm4

IEEE 32 bit float in range -1 to 1 inclusive

unorm4

IEEE 32 bit float in range 0 to 1 inclusive

HLSL also supports the ASCII string type. There are no operations or states that accept strings. String parameters and annotations can, however, be queried by effects.

HLSL支持ASCII字符串类型。没有操作和状态接受字符串。字符串参数和注解可以被Effect读取。

Vector Type Syntax

A vector is a special data type that contains between one and four components. Every component of a vector must be of the same type.

Syntax

Type Number VariableName

where:

Type

The data type, which is one of the basic types.

Number

A positive integer that specifies the number of components. A vector must have at least one component, and can have no more than four components.

VariableName

An ASCII string that uniquely identifies the variable name.

Remarks

Here are some examples:

bool    bVector;   // scalar containing 1 Boolean
half2   hVector;   // vector containing 2 halfs
int1    iVector = 1;
float3 fVector = { 0.2f, 0.3f, 0.4f };

A vector can be declared using this syntax also:

vector <Type, Number> VariableName

Here are some examples:

vector <int,    1> iVector = 1;
vector <double, 4> dVector = { 0.2, 0.3, 0.4, 0.5 };

Matrix Type Syntax

A matrix is a special data type that contains between one and sixteen components. Every component of a matrix must be of the same type.

Syntax

TypeRxC VariableName

where:

TypeRxC

Identifies the data types, and number of components in each row and column. The type is one of the basic types. The number of rows and columns is a positive integer between 1 and 4.

VariableName

An ASCII string that uniquely identifies the variable name.

Remarks

Here are some examples:

int1x1    iMatrix;   // integer matrix with 1 row, 1 column
int4x1    iMatrix;   // integer matrix with 4 rows, 1 column
int1x4    iMatrix;   // integer matrix with 1 row, 4 columns
double3x3 dMatrix;   // double matrix with 3 rows, 3 columns
 
float2x2 fMatrix = { 0.0f, 0.1, // row 1
                     2.1f, 2.2f // row 2
                   };   

A matrix can be declared using this syntax also:

matrix <Type, Rows, Columns> VariableName

The matrix type uses the angle brackets to specify the type, the number of rows, and the number of columns. This example creates a floating-point matrix, with two rows and two columns. Any of the scalar data types can be used.

Here is an example:

matrix <float, 2, 2> fMatrix = { 0.0f, 0.1, // row 1
                                 2.1f, 2.2f // row 2
                               };

Shader Type Syntax

Use the following syntax to create a shader from within an effect file (.fx):

使用下面的语法可以从Effect文件(fx)中创建Shader

Syntax

ShaderType Compile( ShaderTarget, ShaderFunction );        

where:

ShaderType

A pointer to a vertex shader object, a geometry shader object or a pixel shader object.

一个指向VS、GS或PS对象的指针。

ShaderTarget

The compile target. For Direct3D 10, the following shader models are supported: vs_4_0, gs_4_0, ps_4_0.

编译对象,对于D3D10,下面的Shader Model被支持:vs_4_0, gs_4_0, ps_4_0

ShaderFunction

The name of the shader function; this is an ASCII string that uniquely identifies the name of a shader function within an effect file.

Shader函数的名字。是一个ASCII的字符定义了Effect文件中Shader函数的名字。

Example

A vertex shader (and geometry and pixel shader) can be set to the device from within a technique using the CompileShader function. Here is an example from Tutorial2:

VS(以及GS和PS)可以使用CompileShader函数设置到设备中。这里是练习2的例子

technique10 Render
{
    pass P0
    {
        SetVertexShader( CompileShader( vs_4_0, VS() ) );
        SetGeometryShader( NULL );
        SetPixelShader( CompileShader( ps_4_0, PS() ) );
    }
}

This example creates a vertex shader object and a pixel shader object, both compiled for shader model 4. The entry point function for the vertex shader is VS(); the corresponding function for the pixel shader is PS(). There is no geometry shader, so the pointer is set to NULL.

这个例子创建了一个VS和一个PS对象,用SM4编译。VS的入口函数是VS(),PS对应的函数叫PS()。因为没有GS,所以这个指针是空。

When the effect is created, the CompileShader function creates a shader object by calling D3D10CompileShader. The shader object created is then set in the device, as each of these methods maps to the following APIs:

Effect创建时,CompileShader函数调用D3D10CompileShader函数创建Shader对象。创建好的Shader对象被设置到设备中,每种Shader使用不同的API设置:

SetVertexShader      VSSetShader
SetGeometryShader    GSSetShader
SetPixelShader       PSSetShader

Sampler Type Syntax

Syntax

sampler SamplerName[Array_Index] [= Initializers];

where:

sampler

The sampler keyword must appear here.

Sampler的关键字

SamplerName[Array_Index]

An ASCII string that uniquely identifies the name of a variable within a shader. Each variable can be defined as a single variable or optionally as an array of variables. Array_Index is the optional array size and is specified as a positive integer greater than or equal to 1.

ASCII字符串唯一定义了Shader中的变量名。每个变量能够被定义为简单变量或者为一个数组。数组索引是可选的数组大小,为一个大于等于1的整数

Initializers

Initializers specify default sampler state. Initializers are optional. If initializers are used, they must appear within a statement block (delimited by {}) with the sampler_state keyword as shown here:

初始化定义了默认sampler状态。初始化是可选的,如果使用初始化,它必须出现在语句块中使用sampler_state关键字用大括号括住

= sampler_state 
{ 
  ...; // sampling state
 ...; // sampling state
};

Remarks

This sampler is initialized with linear filtering.

sampler s = sampler_state 
{ 
  texture = NULL; 
  mipfilter = LINEAR; 
};

User Defined Type Syntax

In addition to the built-in intrinsic data types, HLSL supports user-defined or custom types which follow this syntax:

Syntax

typedef [const] type id [index];

where:

[const]

Optional. This keyword explicitly marks the type as a constant.

type

Identifies the data type; must be one of the HLSL intrinsic data types.

id

Variable name

index

Optional. Indicates an array of values.

Remarks

For compatibility with DirectX 8 effects, the following types are automatically defined at super-global scope:

typedef int DWORD;
typedef float FLOAT; 
typedef vector <float, 4> VECTOR;
typedef matrix <float, 4, 4> MATRIX;
typedef string STRING;
typedef texture TEXTURE;
typedef pixelshader PIXELSHADER;
typedef vertexshader VERTEXSHADER;

These types are not case-insensitive.

For convenience, the following types are automatically defined at super-global scope. Note that the pound sign (#) represents an integer digit between 1 and 4.

typedef vector <bool, #> bool#;
typedef vector <int, #> int#;
typedef vector <half, #> half#;
typedef vector <float, #> float#;
typedef vector <double, #> double#;
 
typedef matrix <bool, #, #> bool#x#;
typedef matrix <int, #, #> int#x#;
typedef matrix <half, #, #> half#x#;
typedef matrix <float, #, #> float#x#;
typedef matrix <double, #, #> double#x#;

Using HLSL Shaders

The pipeline has three shader stages and each one gets its operating instructions from a shader written in HLSL. All Direct3D10 shaders are written in HLSL, targeting Shader Model 4.0.

流水线有3Shader阶段,每个有自己的从HLSL写成的Shader中导入的操作指令。所有的D3D10 Shader都由HLSL写成,面向SM4

Differences between Direct3D 9 and Direct3D 10:

Unlike in earlier Direct3D 9 Shader Models, where shaders could be authored in an intermediate assembly language, Shader Model 4.0 shaders are only authored in HLSL. Offline compilation of shaders into device-consumable bytecode is still supported, and recommended for most scenarios.

和从前的D3D9 SM能够直接用汇编来写不一样,SM4只能用HLSL编写。离线编译Shader为设备支持的二进制码依然被支持,并且在多数情况下推荐使用。

This example uses only a vertex shader. Because all shaders are built from the common shader core, learning how to use a vertex shader is very similar to using a geometry or pixel shader.

这个例子使用一个VS。因为所有的Shader都为通用Shader核心建立,VS的使用和PSGS都类似。

Once you have authored an HLSL shader (this example uses the vertex shader HLSLWithoutFX.vsh), you will need to prepare it for the particular pipeline stage that will use it. To do this you need to:

如果你已经有了一个编译好的HLSL Shader(这个例子使用一个VS HLSLWithoutFX.vsh),你需要把它设置到需要用它的流水线阶段上。这样做需要:

These steps need to be repeated for each shader in the pipeline.

Compile a Shader

The first step is to compile the shader, to check to see that you have coded the HLSL statements correctly. This is done by calling D3D10CompileShader and supplying it with several parameters as shown here:

编译Shader的第一步是检查你的HLSL编码的正确性。这个可以通过调用D3D10CompileShader来实现,并且输入几个必须的参数,如下:

    ID3D10Blob * pBlob = NULL;

    // Compile the vertex shader from the file
    D3D10CompileShader( strPath, strlen( strPath ), "HLSLWithoutFX.vsh", 
            NULL, NULL, "Ripple", "vs_4_0", dwShaderFlags, &pBlob, NULL );

This function takes the following parameters:

这个函数需要如下几个参数:

  • The name of the file (and the length of the name string in bytes) that contains the shader. This example uses only a vertex shader, in the file HLSLWithoutFX.vsh (the .vsh extension is an abbreviation for vertex shader).

包含Shader的文件名(文件名的Byte大小)。这个例子只使用了一个VS(在文件HLSLWithoutFX.vsh文件中,.vshVS文件的扩展名)

  • The shader function name. This example compiles a vertex shader from the Ripple function which takes a single input and returns an output struct (the function is from the HLSLWithoutFX sample):

Shader函数名。这个例子从Ripple函数中编译Shader,有一个简单的输入和一个输出结构(这个函数来自HLSLWithoutFX例子)

    VS_OUTPUT Ripple( in float2 vPosition : POSITION )

  • A pointer to all macros used by the shader. Use D3D10_SHADER_MACRO to help define your macros; simply create a name string that contains all the macro names (with each name separated by a space) and a definition string (with each macro body separated by a space). Both strings need to be NULL terminated.

一个指向Shader使用宏的指针。使用D3D10_SHADER_MACRO来帮助定义宏。简单的创建一个包含所有宏名的名字字符串(使用空格隔开每个名字)和一个定义字符串(使用空格隔开每个宏定义)。所有字符串需要以NULL结尾。

  • A pointer to any other files that you need included to get your shaders to compile. This uses the ID3D10Include interface which has two user-implemented methods: Open and Close. To make this work, you will need to implement the body of the Open and Close methods; in the Open method add the code you would use to open whatever include files you want, in the Close function add the code to close the files when you are done with them.

一个指向所有在Shader编译时需要包含的其他文件的指针,这要使用ID3D10Include接口,有两个用户实现函数:OpenClose。为了让它生效,你需要实现OpenClose函数。在Open函数中添加你想要打开的任何包含文件,在Close函数中添加关闭这些文件的代码。

  • The name of the shader function to compile. This shader compiles the Ripple function.

Shader函数编译名。这个Shader编译Ripple函数。

  • The shader profile to target when compiling. Since you can compile a function into a vertex, geometry, or pixel shader, the profile tells the compiler which type of shader and which shader model to compare the code against.

编译时针对的Shader profile的名字。因为你能够把一个函数编译为VSGS或者PSprofile告诉编译器你想编译哪种Shader,使用那种Shader模型来比较代码。

  • Shader compiler flags. These flags tell the compiler what information to put into the compiled output and how you want the output code optimized: for speed, for debug, etc. See Constants for a listing of the available flags. The sample contains some code you can use to set the compiler flag values for your project - this is mainly a question of whether or not you want to generate debug information.

Shader编译标记。这个标记告诉编译器编译器输出的是那种Shader代码和你希望怎么优化代码(速度优先还是调试优先)。Constants里有有效的标记。这个例子的一些标记你可以在你的程序里设置-这些标记主要针对你是否想生成编译信息的问题。

  • A pointer to the buffer that contains the compiled shader code. The buffer also contains any embedded debug and symbol table information requested by the compiler flags.

指向得到编译后的Shader代码的Buffer。这个Buffer同样拥有编译标记需要的嵌入式调试和符号表信息。

  • A pointer to a buffer that contains a listing of errors and warnings that were encountered during the compile, which are the same messages you would see in the debug output if you were running the debugger while compiling the shader. NULL is an acceptable value when you don't want the errors returned to a buffer.

一个指向拥有编译时错误和警告列表的Buffer。如果你打开编译Shader的编译器的话,这些信息你可以在编译输出中看见。当你不要要返回错误Buffer时,可以设为NULL

If the shader compiles successfully, a pointer to the shader code is returned as an ID3D10Blob interface. It is called the Blob interface because the pointer is to a location in memory that is made up of an array of DWORDs. The interface is provided so that you can get a pointer to the compiled shader, which you will need in the next step.

如果Shader编译成功,就会返回一个指向Shader代码的ID3D10Blob接口的指针。把它叫做Blob接口时因为指针指向内存的一块由DWORD数组组成的区域。这个接口让你在下一步中能够得到编译好的Shader的代码指针。

Create a Shader Object

Once the shader is compiled, call CreateVertexShader to create the shader object:

Shader编译好后,就可以用CreateVertexShader创建Shader对象

    ID3D10VertexShader * pVertexShader = NULL;
    ID3D10Blob * pBlob;   // filled in by D3D10CompileShader in the previous step
 
 
    // Create the vertex shader
    hr = pd3dDevice->CreateVertexShader( (DWORD*)pBlob->GetBufferPointer(),
                                            &pVertexShader );
 
    // Release the pointer to the compiled shader once you are done with it
    pBlob->Release();

To create the shader object, pass the pointer to the compiled shader into CreateVertexShader. Since you had to successfully compile the shader first, this call will almost certainly succeed, unless you have a memory problem on your machine.

创建Shader对象需要传入编译好的Shader指针到CreateVertexShader中。因为你预先编译好的Shader,这一步通常都会成功,除非你的机器有内存问题。

You can create as many shader objects as you like and simply keep pointers to them. This same mechanism works for geometry and pixel shaders assuming you match the shader profiles (when you call the compile method) to the interface names (when you call the create method).

你可以创建多个Shader对象然后保存它们的指针。这些机制同样对GSPS也有效,只要你把Shader Profile(当你在编译时调用的)和接口名字(当你在创建时调用的)对应起来。

Set the Shader Object

The last step is to set the shader on the pipeline stage. Since there are three shader stages in the pipeline, you will need to make three API calls, one for each stage.

最后一步是把Shader设置到绘制流水线中。因为绘制流水线中有三个Shader阶段,你需要针对每个阶段的三个API调用。

    // Set a vertex shader
    pd3dDevice->VSSetShader( g_pVS10 );

The call to VSSetShader takes the pointer to the vertex shader created in the previous step. This sets the shader in the device. The vertex shader stage is now initialized with its vertex shader code; all that remains is initializing any shader variables.

调用VSSetShader输入步骤1生成的VS。这一步把Shader设置到设备中。VS阶段使用这个VS代码初始化,剩下的就是初始化Shader变量。

Repeat for all 3 Shader Stages

Repeat this same set of steps to build any vertex or pixel shader, or even a geometry shader that outputs to the pixel shader.

重复相同的步骤来创建PSGS
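
The compile, create, and set steps use names that must agree for each stage. The following is a minimal C++ lookup sketch, not part of the SDK: ShaderStage, StageNames and namesFor are hypothetical helpers, and the geometry and pixel shader names (gs_4_0, ps_4_0, CreateGeometryShader, CreatePixelShader, GSSetShader, PSSetShader) are taken from the released Direct3D 10 API on the assumption that the preview uses the same names.

```cpp
#include <cassert>
#include <string>

// The three programmable stages described in the text.
enum class ShaderStage { Vertex, Geometry, Pixel };

// Names that belong together for one stage: the compile profile, the device
// method that creates the shader object, and the method that binds it.
struct StageNames {
    std::string profile;
    std::string createMethod;
    std::string setMethod;
};

// Look up the matching names for a stage.
inline StageNames namesFor(ShaderStage stage) {
    switch (stage) {
        case ShaderStage::Vertex:
            return {"vs_4_0", "CreateVertexShader", "VSSetShader"};
        case ShaderStage::Geometry:
            return {"gs_4_0", "CreateGeometryShader", "GSSetShader"};
        case ShaderStage::Pixel:
            return {"ps_4_0", "CreatePixelShader", "PSSetShader"};
    }
    assert(false && "unknown shader stage");
    return {};
}
```

Keeping the three names in one table makes it harder to compile with one profile and then create or bind the result through the interface of a different stage.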

Shader Features

This section describes improvements to the shader programming model and High-Level Shading Language (HLSL) in Direct3D 10.

这一节描述了SM的进一步知识和D3D10HLSL细节。

Common Shader Core

In Direct3D 10, all shader stages offer the same base functionality, which is implemented by the Shader Model 4.0 Common Shader Core. In addition to the base functionality, each of the three shader stages (vertex, geometry, and pixel) offers some functionality unique to that stage, such as the ability to generate new primitives from the geometry shader stage or to discard a specific pixel in the pixel shader stage. Here is a conceptual diagram of how data flows through a shader stage, and the relationship of the shader common core with shader memory resources:

D3D10中,所有的Shader阶段提供相同的基本功能,这些功能被SM4的通用Shader核心实现。在每个Shader阶段的基本功能外,每个Shader阶段又提供了本阶段独有的功能(比如在GS阶段创建新的图元或者在PS阶段取消绘制象素)。下面是一个Shader阶段的数据流以及Shader通用核心和Shader内存资源关系的概念图:

 

Figure 1.  Shader Common Core Flow Diagram

Input Data: Every shader processes numeric inputs from the previous stage in the pipeline. The vertex shader receives its inputs from the input assembler stage; other shaders receive their inputs from the previous shader stage. Additional inputs include system-generated values, which are consumable by the first unit in the pipeline to which they are applicable.

输入数据:每个Shader处理上一流水线阶段传入的多个输入数据。VS接收IA阶段的数据,其他Shader接收上一步Shader的数据。附加的输入包括系统生成值,被第一个能够识别它们的流水线阶段读取。

Output Data: Shaders generate output results to be passed on to the subsequent stage in the pipeline. In the case of the Geometry Shader, the amount of data output from a single invocation can vary. Some specific outputs are system-interpreted (examples include vertex position and rendertarget array index); the rest serve as generic data to be interpreted by the application.

输出数据:Shader输出结果送到下一个流水线阶段中。在GS中,一次调用数据输出的数量是变化的。一些特定的输出是系统读取的(比如顶点位置和RT数组索引),剩下的作为通用数据被应用程序读取。

Shader Code: Shaders can perform vector floating point and integer arithmetic operations, flow control operations, and read from memory. There is no instruction count limit for these shaders.

Shader核心:Shader能够执行浮点向量和整型算术操作,流控制操作和读取内存操作。Shader中因改没有指令数限制。

Samplers: Samplers define how to sample and filter textures. As many as 16 samplers can be bound to a given shader simultaneously.

SampersSamplers定义了如何采样和过滤纹理。一个Shader能够同时绑定16个纹理采样。

Textures: Textures can be filtered via samplers or read on a per-texel basis directly via the Load() HLSL intrinsic.

纹理:纹理能够通过Sampler过滤或者直接通过HLSL内置函数Load()逐纹元的读取

Buffers: Buffers can be read from memory on a per-element basis directly via the Load() HLSL intrinsic. They cannot be filtered. As many as 128 texture and buffer resources (combined) can be bound to a given shader simultaneously.

BuffersBuffers可以通过HLSLLoad()函数逐元素的从显存中读取。它们不能被滤波。128个纹理和Buffer资源(一起)可以同时被绑定到Shader上。

Constant Buffers: Constant buffers are buffer resources that are optimized for shader constant-variables. As many as 16 constant buffers can be bound to a shader stage simultaneously. They are designed for lower-latency access and more frequent update from the CPU. For this reason, additional size, layout, and access restrictions apply to constant buffers.

常数Buffer:常数Buffer是被Shader作为常数资源的Buffer16个常数Buffer能够同时绑定到一个Shader阶段上。它们能够快速读取,并且可以频繁被CPU更新。因为如此,常数Buffer有附加的大小,层次和设置限制。

Differences between Direct3D 9 and Direct3D 10:

In Direct3D 9, each shader unit had a single, small constant register file to store all constant shader variables. Accommodating all shaders with this limited constant space required frequent recycling of constants by the CPU.

D3D9中,每个Shader单元有一个很小的常数寄存器块存储常数Shader变量。为了让所有Shader能够使用这一块常数寄存器,CPU需要频繁循环更新它。

In Direct3D 10, constants are stored in immutable buffers in memory and are managed like any other resource. There is no limit to the number of constant buffers an application can create. By organizing constants into buffers by frequency of update and usage, the amount of bandwidth required to update constants to accommodate all shaders can be significantly reduced.

D3D10中,常数被存储在不变的显存缓存中,和其他资源一样管理。应用程序能够创建的常数缓存数量不限。通过把常数组织为一个缓存,用作频繁更新和使用,所有Shader调用它的带宽就被有效的降低了。

Integer and Bitwise Support

The common shader core provides a full set of IEEE-compliant 32-bit integer and bitwise operations. These operations enable a new class of algorithms in graphics hardware; examples include compression and packing techniques, FFTs, and bitfield program-flow control.

CSC支持完整支持的IEEE 32位整数操作。这些操作使图形硬件能够支持新一类的算法,比如压缩和打包技术,小波变换和位图程序流控制。

The int and uint data types in Direct3D 10 HLSL map to 32 bit integers in hardware.

D3D10中的intuint数据类型在硬件中对应32位整数。

Differences between Direct3D 9 and Direct3D 10:

In Direct3D 9 stream inputs marked as integer in HLSL were interpreted as floating-point. In Direct3D 10, stream inputs marked as integer are interpreted as a 32 bit integer.

D3D9中流输入的整数是作为浮点数处理。在D3D10中,流输入的整数被作为32位整数处理。

In addition, boolean values are now all bits set or all bits unset. Data converted to bool will be interpreted as TRUE if the value is not equal to 0.0f (both positive and negative zero are allowed to be FALSE) and FALSE otherwise.

同时,布尔值是所有bit都被设置或者所有bit都没有设置。不等于0.0f的值都是真(正0和负0都是假),其他都是假。
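
The boolean rule above can be emulated on the CPU. This is a small C++ sketch rather than HLSL; toShaderBool, fromShaderBool and the two constants are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kShaderTrue  = 0xFFFFFFFFu;  // all bits set
constexpr std::uint32_t kShaderFalse = 0x00000000u;  // all bits clear

// Convert a float to the all-bits boolean encoding.
// +0.0f and -0.0f both compare equal to 0.0f, so both become FALSE.
inline std::uint32_t toShaderBool(float value) {
    return (value != 0.0f) ? kShaderTrue : kShaderFalse;
}

// Convert an encoded boolean back to a C++ bool.
inline bool fromShaderBool(std::uint32_t bits) {
    assert(bits == kShaderTrue || bits == kShaderFalse);
    return bits == kShaderTrue;
}
```

Because TRUE has every bit set, an encoded boolean can be used directly as a mask in bitwise expressions.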

Bitwise operators

The common shader core supports the following bitwise operators:

CSC支持以下位操作:

Operator    Function
~           Logical Not
<<          Left Shift
>>          Right Shift
&           Logical And
|           Logical Or
^           Logical Xor
<<=         Left Shift Equal
>>=         Right Shift Equal
&=          And Equal
|=          Or Equal
^=          Xor Equal

Bitwise operators are defined to operate only on int and uint data types. Attempting to use bitwise operators on float or struct data types will result in an error. Bitwise operators follow the same precedence as C with regard to other operators.

位操作只对Int或者UInt数据类型有效。对浮点数或者结构数据使用位操作会导致错误。位操作和C的用法一样。
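
As an illustration of the packing techniques these operators enable, here is a hedged C++ sketch; the same shift, OR, and mask expressions are the kind that would be written on uint values in a shader. packRGBA8 and unpackChannel are hypothetical helper names, and the byte order (r in the lowest byte) is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstdint>

// Pack four 8-bit channels into one 32-bit value using shifts and ORs.
inline std::uint32_t packRGBA8(std::uint32_t r, std::uint32_t g,
                               std::uint32_t b, std::uint32_t a) {
    assert(r <= 255u && g <= 255u && b <= 255u && a <= 255u);
    return r | (g << 8) | (b << 16) | (a << 24);
}

// Recover one channel (0 = r, 1 = g, 2 = b, 3 = a) with a shift and a mask.
inline std::uint32_t unpackChannel(std::uint32_t packed, unsigned channel) {
    assert(channel < 4u);
    return (packed >> (8u * channel)) & 0xFFu;
}
```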

Binary Casts

A cast between an int and a float type converts the numeric value, following C rules for truncation of int data types. Casting a value from a float to an int and back to a float results in a lossy conversion, according to the defined precision of the target.

Intfloat类型的转换会按照C的规则进行,整数会截断浮点数的小数部分。把一个浮点数转换为整数再转换回来是一个有损耗的转换,和目标定义的精度有关。

Binary casts may also be performed using HLSL intrinsic functions. These cause the compiler to reinterpret the bit representation of a number as the target data type. Here are a few examples:

二进制装还可以使用HLSL内置函数进行。这会引起编译器重新分配数据的位表示到对应的数据类型中。下面是些例子:

asfloat()  // Input data is aliased to float
asint()    // Input data is aliased to int
asuint()   // Input data is aliased to uint
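
The difference between a numeric cast and a binary cast can be sketched on the CPU. The C++ below is an analogy, not HLSL: asUintBits, asFloatBits and truncateToInt are hypothetical names, and the sketch assumes the host uses 32-bit IEEE 754 floats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Bit-pattern reinterpretation, analogous to asuint().
inline std::uint32_t asUintBits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Bit-pattern reinterpretation, analogous to asfloat().
inline float asFloatBits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Numeric conversion: the fractional part is truncated toward zero.
inline int truncateToInt(float f) {
    assert(f > -2147483648.0f && f < 2147483648.0f);  // keep the cast defined
    return static_cast<int>(f);
}
```

A binary cast round-trips exactly because no bits change, whereas the numeric cast of 2.75 to int and back yields 2.0.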

Shader Constant Variables

In Direct3D 10, HLSL constant variables are stored in one or more buffer resources in memory.

D3D10中,HLSL常数变量能够存在一个或多个显存中的缓存资源中。

Shader constants can be organized into two types of buffers: constant buffers (cbuffers) and texture buffers (tbuffers). Constant buffers are optimized for shader-constant-variable usage: lower-latency access and more frequent update from the CPU. For this reason, additional size, layout, and access restrictions apply to these resources. Texture buffers utilize the texture access pipe and can have better performance for arbitrarily indexed data. Regardless of which type of resource you use, there is no limit to the number of cbuffers or tbuffers an application can create.

Shader常数能够组织乘两种类型的缓存:常数缓存(cbuffers)和纹理缓存(tbuffers)。常数缓存被优化做shader常数变量使用:低耗时读取和CPU频繁更新。因为如此,它有的附加的大小,层次的读取限制。纹理缓存为纹理读取设置,对于有绝对索引的读取有着更高的性能。不管用那种资源,应用程序对它们的个数没有限制。

Declaring a cbuffer or tbuffer in HLSL looks very much like a structure declaration in C. Define a variable in a constant buffer similar to the way a struct is defined:

HLSL里申明cbuffertbuffer看起来和C语言很类似。定义cbuffer变量和定义一个结构一样:

cbuffer name
{
 variable declaration;
 ...
};

where

  • name - the constant buffer name

名字:cbuffer的名字

  • variable declaration - any HLSL or effect non-object declaration (except texture and sampler)

变量申明:任何HLSL Effect非对象申明(除了纹理和Sampler

The register modifier can be applied to a cbuffer/tbuffer namespace, which overrides HLSL auto assignment and causes a user specified binding of the named cbuffer/tbuffer to a given constant buffer/texture slot.

寄存器记号能够添加到cbuffer/tbuffer的名字空间中,这会覆盖HLSL的自动赋值并导致用户自定义把对应名字的cbuffer/tbuffer设置到对应的cbuffer/tbuffer槽中。

Differences between Direct3D 9 and Direct3D 10:

Unlike the auto-allocation of constants in Direct3D 9 which did not perform packing and instead assigned each variable to a set of float4 registers, HLSL constant variables follow packing rules in Direct3D 10.

D3D9自动分配常数而不做打包不同只是把数据送入一个float4的寄存器不同,D3D10 HLSL常数变量遵循打包规则。

A cbuffer or tbuffer definition is not a declaration because there is no type checking done; the definition only acts as a mechanism for naming a constant/texture buffer. It is similar to a namespace in C++. All variables defined in constant and texture buffers must have unique names within their namespace. Any variable without a namespace defined is in the global namespace (called the $Globals constant buffer, which is defined in all shaders). The register specifier can be used for assignment within a cbuffer or tbuffer, in which case the assignment is relative to that cbuffer/tbuffer namespace. The register specification overrides the packing-derived location.

cbuffer/tbuffer定义并不是申明,因为没有做类型检测。定义的作用只是命名一个cbuffer/tbuffer,和C中的名字空间一样。所有定义的cbuffer/tbuufer变量必须在名字空间中有一个唯一的名字。没有名字空间的变量就是在全局名字空间的变量(叫做$Globals constant buffer,对所有Shader都有效)。寄存器标记能够对cbuffer/tbuffer赋值,赋值和它们的名字空间相关。寄存器标记重写了打包导出位置。

Organizing Constant Variables

To make efficient use of bandwidth and maximize performance, the application can organize its constant variables into buffers by frequency of update and usage. For instance, data that needs to be updated per object should be grouped into a different buffer than data used by all objects and materials in the scene.

为了有效的使用带宽和最大化性能,应用程序可以根据数据的更新和使用频率来组织缓存。举个例子,对每个对象都要更新的数据,就要和被场景中所有对象和材质使用的数据放在不同的缓存中。

 

Figure 1.  Binding Constant Buffers to Shaders

The first shader uses only two of the constant buffers. The second shader may use the same or different constant buffers.

第一个Shader使用两个常数Buffer,第二个Shader使用相同或者其他的常数Buffer

As another example:

cbuffer myObject
{
    float4x4 matWorld;
    float3   vObjectPosition;
    int      arrayIndex;
}
 
cbuffer myScene
{
    float3   vSunPosition;
    float4x4 matView;
}

This example declares two constant buffers and organizes the data in each based on their frequency of update: data that needs to be updated on a per-object basis (like a world matrix) is grouped into myObject which could be updated for each object. This is separate from data that characterizes a scene and is therefore likely to be updated much less often (when the scene changes).

这个例子申明了两个常数Buffer,并根据它们的更新频率在不同的地方组织数据:每个对象更新的数据(比如世界矩阵)在MyObject中,这和场景中公用的数据区分开来,因为它们不经常更新(只有等场景改变的时候)。
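
An application that fills these buffers itself needs CPU-side structures whose layout matches the packed layout. The C++ sketch below assumes the packing rule of the released HLSL compiler (variables are packed into 16-byte float4 registers and may not straddle a register boundary); the preview may differ. MyObjectCB, MySceneCB and registerCount are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// CPU-side mirror of cbuffer myObject.
struct MyObjectCB {
    float        matWorld[16];        // float4x4: four 16-byte registers
    float        vObjectPosition[3];  // float3: starts the fifth register
    std::int32_t arrayIndex;          // int: fits in the register's last slot
};

// CPU-side mirror of cbuffer myScene.
struct MySceneCB {
    float vSunPosition[3];  // float3: first three slots of register 0
    float padding0;         // explicit pad so the matrix starts on a register
    float matView[16];      // float4x4: four 16-byte registers
};

// Number of float4 registers needed for a buffer of the given byte size.
inline std::size_t registerCount(std::size_t bytes) {
    assert(bytes > 0);
    return (bytes + 15) / 16;
}
```

Under that assumption both buffers occupy five registers (80 bytes); in myScene the float4x4 cannot share a register with the float3, which is why the mirror carries an explicit padding member.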

Constant Buffers

A shader constant buffer is a specialized buffer resource. A single constant buffer can hold up to 4096 4-channel 32-bit elements. As many as 16 constant buffers can be bound to a shader stage simultaneously.

Shader常数Buffer是一个特殊的Buffer资源。单个常数Buffer能够保存最多40964通道32位元素。16个常数Buffer能够同时被绑定到Shader上。

A constant buffer is designed to minimize the overhead of setting shader constants. The Direct3D10 Effects system will manage updating of tbuffers and cbuffers for you. Alternatively, the buffer can be updated directly by the application via API calls: Map with D3D10_MAP_WRITE_DISCARD, or UpdateResourceUP. The application can also copy data from another buffer (including a buffer used as a render target or stream out target) into a CB. A constant buffer is the only resource that does not use a view to bind it to the pipeline, therefore, you cannot use a view to reinterpret the data.

cbuffer是为了减小设置Shader常数的时间消耗而设计的。D3D10 Effect系统会管理更新tbuffercbufferCbuufer同时也能直接通过API调用更新:使D3D10_MAP_WRITE_DISCARD map或者使用UpdateResourceUP。应用程序也能从其他buffer中拷贝数据(包括用作render target或者stream out的资源)到CB中。CB是唯一不用视图绑定到流水线上的资源,你不能使用视图来读取数据。

You can store up to 4096 4-component values in each constant buffer, and you can bind up to 16 constant buffers to a shader at the same time.

你能够在CB中存储最多40964通道数据,能同时绑定16CB到一个Shader上。
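
The limits quoted above translate into byte sizes as follows. This is a worked-arithmetic sketch in C++; the constant and function names are hypothetical, and only the numbers 4096, 4, 32 bits, and 16 come from the text.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxElementsPerBuffer = 4096;  // from the text
constexpr std::size_t kChannelsPerElement   = 4;     // 4-channel elements
constexpr std::size_t kBytesPerChannel      = 4;     // 32 bits each
constexpr std::size_t kMaxBoundBuffers      = 16;    // per shader stage

// Largest size of one constant buffer, in bytes.
constexpr std::size_t maxConstantBufferBytes() {
    return kMaxElementsPerBuffer * kChannelsPerElement * kBytesPerChannel;
}

// True if a buffer holding elementCount 4-channel elements fits the limit.
inline bool fitsInConstantBuffer(std::size_t elementCount) {
    assert(elementCount > 0);
    return elementCount <= kMaxElementsPerBuffer;
}
```

One constant buffer therefore holds at most 64 KB, and the 16 buffers bound to a stage together hold at most 1 MB of constants.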

Texture Buffers

A texture buffer is a buffer of shader constants that is read from texture loads (as opposed to a buffer load). Texture loads can have better performance for arbitrarily indexed data than constant buffer reads.

Tbuffer是作为纹理读入的Shader常量(和缓存读取不同)。纹理读取对于部分索引的数据比常数缓存读取有更好的性能。

Define a variable in a texture buffer similarly to a constant buffer:

定义变量tbuffercbuffer类似

tbuffer name
{
 variable declaration;
 ...
};

where

  • name - the texture buffer name 纹理buffer名字
  • variable declaration - any HLSL or effect non-object declaration (except texture or sampler) 变量申明

Designing a Geometry Shader

There are two key differences between geometry shaders and vertex and pixel shaders. They are:

GSVSPS有两个显著的区别

  • The way a geometry shader is declared GS申明的方式
  • Special HLSL intrinsic functions that only work with a geometry shader GS拥有的特殊的HLSL内置函数

Geometry Shader Function Declaration

There are three syntax differences:

语法上有三种不同:

  • Input parameter must also declare the input primitive type 输入参数使用输入图元类型定义
  • Output parameter must also declare the output primitive type 输出参数使用输出图元类型定义
  • The geometry shader must declare the output data size 要定义输出数据大小

Geometry Shader Input Primitive Type

A geometry shader requires that the input parameter add a primitive type to the declaration. For example, this could be a vertex shader declaration that takes an array of three float4 inputs and returns one float2:

GS需要输入参数添加图元类型的申明。举个例子,下面是个VS申明输入float3向量,输出float4向量:

float2 VS_main( float4 inputPos[3] )
{
 ...
}

The function body is not important at the moment; focus on the input parameter declaration. To make this an input parameter for a geometry shader, add the primitive type to the parameter declaration like this:

函数体并不重要,注意输入参数申明。对于GS的输入参数,需要在参数申明前加上图元类型,如下:

float2 GS_main( triangle float4 inputPos[3] )
{
 // this is a partially complete example
}

Notice the function has been renamed GS_main to indicate that it is going to be a geometry shader declaration. This example declares the input parameter as an array of float4s that contain input position data. The array of float4s makes up a triangle primitive, that is, a triangle from a triangle list or a triangle strip. The primitive types that are supported in a geometry shader include:

注意函数被重命名位GS_main表示这是个GS申明。这个例子申明输入参数是一个float4的数组,包含输入位置信息。这个浮点数组组成一个三角形图元,即三角链表或者三角条带。在GS中支持的图元类型有:

Primitive Type    Description
point             Point list
line              Line list or line strip
triangle          Triangle list or triangle strip
lineadj           Line list with adjacency or line strip with adjacency
triangleadj       Triangle list with adjacency or triangle strip with adjacency

The primitive type is only recognized by the compiler on a top-level function.

图元类型只能被高层函数的的编译器识别。

Geometry Shader Output Parameter Declaration

A geometry shader also requires that the output parameter declare its primitive type, and use a stream object to stream out the data. The stream object is declared as both an input and an output using the inout keyword. An example of just declaring the output parameter looks like this:

GS同样也需要申明输出参数的图元类型,使用流对象来输出数据。流对象使用inout关键字申明为既是输入数据也是输出数据。申明输出参数的例子如下:

inout TriangleStream<float2> outPos

This declares the input/output parameter outPos, that is of type float2, and whose primitive type is a triangle stream. Combining the previous input parameter declaration with this output parameter declaration we get:

这里申明了输入/输出参数outPos,类型为float2,图元类型是三角形条带。把输入参数和输出参数一起申明就是:

float2 GS_main( triangle float4 inputPos[3], inout TriangleStream<float2> outPos )
{
 // this is a partially complete example
}

These are the stream object types supported by HLSL:

HLSL支持的流对象类型是:

Stream Object Type    Description
PointStream           Stream out points
LineStream            Stream out lines
TriangleStream        Stream out triangles

Stream objects must be labeled with the inout keyword on all geometry shader functions regardless of whether they are top-level functions.

流对象在所有GS中必须用inout关键字标记,不管它是否是顶层函数。

Geometry Shader Maximum Output Data Declaration

A geometry shader must also declare the maximum number of vertices that it could generate. When the declared number of vertices has been reached by a geometry shader invocation, the geometry shader will terminate. Of course, the geometry shader statements may cause the shader to exit before it reaches this limit.

GS必须申明它能够生成的最大顶点数。当GS生成的顶点数据达到最大顶点数时,GS就会中止执行。当然,GS语句会在GS到达最大限制时跳出。

The maximum number is declared using the MaxVertexCount attribute type like this:

最大数目使用MaxVertexCount属性类型申明如下:

[MaxVertexCount(n)]

where n is the maximum number of vertices. This declaration replaces the return type of the function. So putting together the declarations for the input primitive type, the output primitive type and the maximum number of vertices looks like this:

n表示最大的顶点数。这个申明代替了函数的返回类型。所以把输入图元类型,输出图元类型和最大顶点数放在一起,申明如下:

[MaxVertexCount(n)]
GS_main( triangle float4 inputPos[3], inout TriangleStream<float2> outPos )
{
 // this is a complete GS declaration (except the body of the GS is empty)
}

The maximum number of 32-bit values that can be output from a geometry shader is 1024. So in this example, the geometry shader outputs a 2-component position, and therefore the number of vertices that can be output is (1024 / 2) = 512.

GS能够输出的最大的23位数时1024。举个例子,BS输出一个2通道的位置,那么它能够输出最大顶点数是1024/2
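
The same arithmetic applies to any output vertex layout. A minimal C++ sketch follows, with hypothetical names (kMaxOutputScalars, maxDeclarableVertices); it assumes every vertex component is one 32-bit value, as in the example above.

```cpp
#include <cassert>

constexpr unsigned kMaxOutputScalars = 1024;  // 32-bit values per invocation

// Largest vertex count a geometry shader may declare for a given vertex size,
// where scalarsPerVertex is the number of 32-bit components in one vertex.
inline unsigned maxDeclarableVertices(unsigned scalarsPerVertex) {
    assert(scalarsPerVertex > 0 && scalarsPerVertex <= kMaxOutputScalars);
    return kMaxOutputScalars / scalarsPerVertex;
}
```

For instance, a vertex made of a float4 position, a float3 normal, and a float2 texture coordinate has 9 components, so at most 113 such vertices fit in one invocation.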

Geometry Shader HLSL Methods

Almost all of the HLSL intrinsic functions can be used from any shader. In addition, there are new HLSL methods designed specifically for a particular shader type. In the case of a geometry shader, there are two special HLSL methods that help output data from a geometry shader:

几乎所有的HLSL内置函数都能任何Shader使用。但是有些特殊的函数是为特殊的Shader来设计的。对于GS有两个特殊的HLSL函数帮助GS输出数据:

HLSL Methods

  • Append (stream object types: PointStream, LineStream, TriangleStream) - Append output data to the output stream. The output data type must match the type specified in the template parameter to the stream object.

把输出数据添加的输出流上。输出数据类型必须和数据流对象模板参数的类型定义匹配。

  • RestartStrip (stream object types: LineStream, TriangleStream) - End the previous strip (if any) and start a new primitive strip.

结束从前的条带(如果有的话)并开始一个新的图元条带
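
The behavior of Append and RestartStrip can be pictured with a CPU stand-in. The class below is a hypothetical C++ sketch, not the HLSL stream object: it models reaching the declared maximum by rejecting further vertices, whereas the real shader invocation terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU stand-in for a TriangleStream: vertices are appended to the current
// strip, and restartStrip() closes it and begins a new one.
template <typename Vertex>
class TriangleStreamSketch {
public:
    explicit TriangleStreamSketch(std::size_t maxVertexCount)
        : maxVertexCount_(maxVertexCount) {
        assert(maxVertexCount > 0);
        strips_.emplace_back();
    }

    // Returns false once the declared maximum has been reached.
    bool append(const Vertex& v) {
        if (emitted_ >= maxVertexCount_) return false;
        strips_.back().push_back(v);
        ++emitted_;
        return true;
    }

    // End the current strip (if it has any vertices) and start a new one.
    void restartStrip() {
        if (!strips_.back().empty()) strips_.emplace_back();
    }

    // A strip of n vertices yields n - 2 triangles (none if n < 3).
    std::size_t triangleCount() const {
        std::size_t total = 0;
        for (const auto& strip : strips_)
            if (strip.size() >= 3) total += strip.size() - 2;
        return total;
    }

private:
    std::size_t maxVertexCount_;
    std::size_t emitted_ = 0;
    std::vector<std::vector<Vertex>> strips_;
};
```

Appending four vertices, restarting, and appending two more produces two triangles from the first strip and none from the second, because a strip needs at least three vertices.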

Shader System Values

A semantic is a string attached to a shader input or output that conveys information about the intended use of a parameter (see Shader Semantic Syntax). Semantics are required on all variables passed between shader stages.

语义是绑定在Shader输入输出参数上的字符串,包含Shader参数内部用途的信息(参见Shader Sematic Syntax)。所有在Shader阶段之间传送的参数都需要语义。

In general, data passed between pipeline stages is completely generic and is not uniquely interpreted by the system, that is, arbitrary semantics are allowed which have no special meaning. In certain cases, however, the hardware will perform a fixed function operation based on the semantic. Parameters which contain these special semantics are referred to as system values. All system value semantics begin with an SV_ prefix; a common example is SV_POSITION, which is interpreted by the rasterizer stage.

通常在流水线阶段之间传送的数据都是通用的,不会被系统读取,也就是说,任意语义允许没有任何意义。在这种情况下,硬件会执行FF基于语义操作。拥有特殊语义的参数被当作系统值。所有系统语义使用SV_开头,比如SV_POSITION被光栅化阶段读取。

The table below specifies system value semantics understood by Direct3D. The SV_ semantics are valid at other parts of the pipeline. For instance, SV_Position can be specified as an input to a vertex shader as well as an output. Pixel shaders can only write to parameters with the SV_Depth and SV_Target semantics.

下面的表定义了D3D能读取的系统值。SV语义对于其他流水线部分也是有效的。比如SV_Position能够作为VS的输入输出。PS只能写有SV_DepthSV_Target语义的参数。

System Value Semantic        Type
SV_ClipDistance[n]           float
SV_Depth                     float
SV_InstanceID                uint
SV_IsFrontFace               bool
SV_Position                  float4
SV_PrimitiveID               uint
SV_RenderTargetArrayIndex    uint
SV_VertexID                  uint
SV_ViewportArrayIndex        uint

System-generated values (SV_VertexID, SV_PrimitiveID, SV_InstanceID, SV_IsFrontFace) can only be input into the first active shader in the pipeline that can interpret the particular value; from that point on it is the responsibility of the shader to manually pass the values down to subsequent stages. SV_PrimitiveID, for example, is not interpretable by the Vertex Shader since a vertex can be a member of multiple primitives. Therefore, SV_PrimitiveID is first available to the Geometry Shader. If the Geometry Shader is inactive, it is available to the Pixel Shader.

系统生成值(SV_VertexID, SV_PrimitiveID, SV_InstanceID, SV_IsFrontFace)只能被流水线中第一个激活的并且能够读取对应值的Shader读取。因此Shader有责任把它的值在需要的情况下传到下一步。举个例子,SV_PrimitiveID不能被VS读取因为顶点可能是多个图元的组成部分,因此它只能被GS读取。如果GS没有被激活,那么它只能对PS有效。

Note to Direct3D 9 developers:

For Direct3D 9 targets, shader semantics must map to valid Direct3D 9 semantics. For backwards compatibility POSITION0 (and its variant names) is treated as SV_Position, COLOR is treated as SV_TARGET.

对于D3D9对象,Shader语义必须和有效的D3D9语义匹配。为了向上兼容,POSITION0被当作SV_POSITION,COLOR被当作SV_TARGET

Shader Signatures

In Direct3D 10, adjacent stages effectively share a register array, where the upstream stage writes data to specific locations in the register array and the downstream stage must read from the same locations. This allows for fast binding of shaders without the overhead of semantic resolution. The API mechanism for the two stages to share a common understanding of the linkage register locations is called a signature.

D3D10中,邻接的流水线阶段可以有效的分享寄存器数组,前一个阶段把数据写入寄存器数组特定的位置,后一阶段从相同的位置读取。这允许快速的绑定Shader而不需要语义转换的时间。两个阶段分享同一个相互了解的连接寄存器位置的API机制叫做信号。

Shaders in Direct3D 10 link to pipeline stages using a signature: input signatures are generated from an HLSL input declaration, and output signatures are likewise generated from an HLSL output declaration. An input signature is compatible with an output signature when the input signature is a subset of the output signature that matches its leading entries in both type and order. The most straightforward way to achieve this is to link corresponding shader inputs and outputs by the same structure type. For example, the first two of these signatures are compatible:

D3D10中的Shader使用信号连接流水线阶段;输入信号从HLSL输入申明输出信号从HLSL的输出申明。输入信号和输出信号兼容,输出信号必须是输入信号的一个子集(每个类型和类型的顺序要相同)。最直接的连接输入和输出的方式就是定义相同的数据结构,比如,这些信号是兼容的:

// Signature written by a vertex shader
struct VSOut
{
    float4 Pos      : SV_Position;
    float3 MyNormal : Normal;
    float2 MyTex    : Texcoord0;
};
 
// Signature read by a pixel shader
struct PSInWorks
{
    float4 Pos      : SV_Position;
    float3 MyNormal : Normal;
};
 
// Signature read by a pixel shader
//   incompatible signature - ordering is wrong
struct PSInFails
{
    float3 MyNormal : Normal;
    float4 Pos      : SV_Position;
};

PSInWorks is compatible with VSOut because PSInWorks is a compatible subset (that is, the first two entries in PSInWorks match both type and order with the first two entries in VSOut). However, PSInFails is incompatible because the ordering does not match that of VSOut.

PSInWordsVSOut兼容,因为PSInWorks是它的一个兼容子集(就是说,PSInWorks开头两个入口数据在类型和结构上都和VSOut的开头两个数据相同)。当然,PSInFail不兼容因为它的顺序和VSOut不同。
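
The compatibility rule illustrated by these structures can be written as a small check. The C++ below is a sketch of the rule as described here, not the runtime's actual validation: SignatureElement, Signature and isCompatible are hypothetical names, and semantics are compared as exact strings for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One entry of a signature: the HLSL type and the semantic attached to it.
struct SignatureElement {
    std::string type;
    std::string semantic;
};

using Signature = std::vector<SignatureElement>;

// An input signature is accepted when its elements match the leading
// elements of the upstream output signature in both type and order.
inline bool isCompatible(const Signature& output, const Signature& input) {
    if (input.size() > output.size()) return false;
    for (std::size_t i = 0; i < input.size(); ++i) {
        // Every variable passed between stages must carry a semantic.
        assert(!input[i].semantic.empty() && !output[i].semantic.empty());
        if (input[i].type != output[i].type ||
            input[i].semantic != output[i].semantic) {
            return false;
        }
    }
    return true;
}
```

Fed with the three structures above, the check accepts PSInWorks against VSOut and rejects PSInFails, because its first entry is a float3 Normal where VSOut starts with a float4 SV_Position.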

Shader Textures

In Direct3D 10, the samplers and textures are specified independently. In addition, textures are typed and have a format associated with them, which is the base data type of the data returned with load and sample methods. These new templated texture types cannot be used with the Direct3D 9 style texture intrinsic functions. Instead, there are several new HLSL intrinsic functions implemented for use with the templated texture objects.

D3D10中,samplerstextures是完全分开的。Textures有与之相关的类型和格式,是loadsample函数的基本数据类型。这些新的模板纹理不能使用D3D9类型的纹理内置函数。因此有几个新的HLSL内置函数来使用模板纹理对象。

Templated Texture Types

These are the texture object types:

  • Texture1D
  • Texture1DArray
  • Texture2D
  • Texture2DArray
  • TextureCube
  • Texture3D
  • Texture2DMS

Templated Texture Methods

To support the return type and separation of sampler state and texture, the textures are templated objects that have methods that provide load, sample, and resource functionality for the given texture dimension type. For convenience, the template type is implied to be float4 unless specified otherwise, with the exception of the MSAA texture objects that always need the type and sample count specified.

为了支持返回的数据类型和分开sampler和纹理,纹理被当作模板对象,对于给定的纹理维度提供loadsample和资源功能。为了方便期间,模板类型默认为float4除非自己另外定义。MSAA纹理对象一定需要外部指定的类型和采样数。

The texture template methods are:

Texture Template Method    Description
GetDimensions              Get the texture dimension for a specified mip level
Load                       Load data without any filtering or sampling
Sample                     Sample a texture
SampleLevel                Sample a texture object on the specified mip level
SampleGrad                 Sample a texture object using a gradient
SampleCmp                  Use a comparison value to compare against texture samples before filtering
SampleCmpLevelZero         Identical to SampleCmp, but only utilizes LOD zero

Comparison Filtering provides a basic building-block filtering operation that is useful for Percentage Closer Depth Filtering. See SampleCmp for details.

比较滤波提供基本的块状滤波,对于PCF深度滤波非常有效。

sampler MySamp;
Texture2D <float4> MyTex;
Texture2D MyTexImpliedFloat4;
 
float4 main(float2 TexCoords[2] : TEXCOORD) : SV_Target
{
    return MyTex.Sample(MySamp, TexCoords[0])
        + MyTexImpliedFloat4.Sample(MySamp, TexCoords[1]);
}
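
The comparison filtering mentioned above (compare first, filter afterwards) can be sketched on the CPU for a single 2x2 texel footprint. This C++ function is a hypothetical illustration; the comparison direction (a sample passes when the reference is less than or equal to the stored depth) and the bilinear weighting are assumptions, since in Direct3D they are configured by sampler state.

```cpp
#include <cassert>

// Compare-then-filter over a 2x2 texel footprint, the building block of
// percentage closer filtering. Each stored depth is first compared with the
// reference value; the 0/1 results are then blended bilinearly.
inline float compareThenFilter(const float depth[4],  // texels: 00, 10, 01, 11
                               float fracX, float fracY,
                               float reference) {
    assert(fracX >= 0.0f && fracX <= 1.0f);
    assert(fracY >= 0.0f && fracY <= 1.0f);

    // Assumed comparison: the sample passes when reference <= stored depth.
    float pass[4];
    for (int i = 0; i < 4; ++i)
        pass[i] = (reference <= depth[i]) ? 1.0f : 0.0f;

    const float top    = pass[0] + (pass[1] - pass[0]) * fracX;
    const float bottom = pass[2] + (pass[3] - pass[2]) * fracX;
    return top + (bottom - top) * fracY;
}
```

Filtering the comparison results instead of the depths gives a fractional coverage value, which is what softens shadow edges in percentage closer filtering.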

Texture Return Types

The template type is the same as the texture resource type (DXGI_FORMAT). In other words, it can be any of the following types:

模板类型和纹理资源类型一样,它可以是以下任何一种资源。

Texture Intrinsic Function Return Type    Description
float                                     IEEE 32 bit float
int                                       32 bit signed integer
unsigned int                              32 bit unsigned integer
snorm                                     IEEE 32 bit float in range -1 to 1 inclusive
unorm                                     IEEE 32 bit float in range 0 to 1 inclusive

A texture object return type can be any templated type, including a structure; however, it must have no more than 4 components. For instance, a float1 texture only returns one component.

纹理对象返回类型能够是任何模板类型包括结构。当然,必须小于四个通道,比如一个float1纹理只能返回一个通道。
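
To make the unorm and snorm ranges concrete, here is a hedged C++ sketch of how normalized integer texels are commonly mapped to floats. It assumes 8-bit storage and the usual divide-by-maximum convention; the function names are hypothetical and the text above does not specify the storage width.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Decode an 8-bit unorm value: 0 maps to 0.0 and 255 maps to 1.0.
inline float decodeUnorm8(std::uint8_t v) {
    return static_cast<float>(v) / 255.0f;
}

// Decode an 8-bit snorm value: 127 maps to 1.0, and both -127 and -128
// map to -1.0 so that the result stays inside [-1, 1].
inline float decodeSnorm8(std::int8_t v) {
    return std::max(static_cast<float>(v) / 127.0f, -1.0f);
}

// Encode a float in [0, 1] as an 8-bit unorm value, rounding to nearest.
inline std::uint8_t encodeUnorm8(float f) {
    assert(f >= 0.0f && f <= 1.0f);
    return static_cast<std::uint8_t>(f * 255.0f + 0.5f);
}
```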

Pixel Shader Interpolation Modifiers

The following interpolation usage specifiers can be placed on any input or output to a shader. They are ignored except as inputs to a function which is compiled to a pixel shader. If no interpolation modifier is present, the interpolation mode defaults to linear.

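The interpolation modifiers below can be placed on any shader input or output, but they are ignored everywhere except on pixel shader inputs. When no modifier is given, interpolation defaults to linear.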

Here is the list of interpolation types:

Modifier           Description
linear             Perform linear interpolation
centroid           Perform centroid interpolation (use this to improve anti-aliasing)
nointerpolation    Do not interpolate. Cannot be used with other interpolation modifiers. This modifier can only be used on uints and ints.
noperspective      Perform linear interpolation without perspective correction

Differences between Direct3D 9 and Direct3D 10:

The centroid interpolation type can no longer be attached to an input with a semantic (like this):

The centroid modifier can no longer be appended to the semantic of an input.

float4 TexturePointCentroidPS( float4 TexCoord : TEXCOORD0_centroid ) : COLOR0
{
    // Direct3D 9 style semantic suffix; no longer accepted in Direct3D 10
}

Similarly, you cannot apply centroid to a member of a structure:

Likewise, centroid cannot be applied to an individual member of a structure.

struct In
{
    centroid float4 Texcoord;   // not allowed on a single member (float4 is an assumed type)
};

If a struct uses a centroid modifier, centroid is applied to all members of the struct and overrides any other interpolation modifiers present.

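When a struct is declared with centroid interpolation, centroid applies to every member of the struct and overrides any interpolation modifier already present on those members.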
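For comparison with the Direct3D 9 semantic suffix, here is a minimal sketch that places the interpolation modifiers as keywords on individual pixel shader parameters. It follows the syntax of the released Direct3D 10 HLSL compiler, which may differ from this December 2005 preview; the parameter and semantic names are hypothetical.

```hlsl
float4 main(linear          float4 color    : COLOR0,     // default mode, written out explicitly
            centroid        float2 uv       : TEXCOORD0,  // sampled inside the covered area of the pixel
            nointerpolation uint   material : TEXCOORD1,  // passed through without interpolation
            noperspective   float2 screenUV : TEXCOORD2)  // interpolated linearly in screen space
            : SV_Target
{
    // The integer input selects a tint; the other inputs are only
    // blended in so that every parameter is visibly used.
    float tint = (material == 0u) ? 1.0 : 0.5;
    float3 rgb = color.rgb * tint
               + float3(uv, 0.0) * 0.1
               + float3(screenUV, 0.0) * 0.1;
    return float4(rgb, color.a);
}
```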

Improved Flow Control Management

In Direct3D 10 HLSL, an if statement can be preceded by either a branch or a flatten attribute. branch forces the compiler to emit a real branch; if branching is not possible (for example because the body contains a gradient instruction), the compiler reports an error. Likewise, flatten forces the compiler to flatten the if statement, evaluating both sides and then selecting the result. The compiler reports an error when it cannot flatten, for example when a stream emission or another flow control construct is present.

In D3D10 HLSL, an if statement may be prefixed with a branch or flatten attribute. branch makes the compiler generate an actual branch, and compilation fails when branching is impossible, as with gradient instructions. flatten makes the compiler flatten the if statement, that is, execute both sides and pick one result; compilation fails when flattening is impossible, as with stream output or other flow control constructs.

[branch] if(a) a = sqrt(a);

[flatten] if(a) a = sqrt(a);
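The gradient restriction on branch can be worked around by computing the gradients outside the conditional, as in the sketch below. This assumes the syntax and behavior of the released Direct3D 10 HLSL compiler, which may differ from this preview; DiffuseTex, LinearSamp, and the fade input are hypothetical names.

```hlsl
Texture2D    DiffuseTex;
SamplerState LinearSamp;

float4 main(float2 uv : TEXCOORD0, float fade : TEXCOORD1) : SV_Target
{
    // Gradients are taken before the branch, where all pixels still
    // execute the same instructions.
    float2 dx = ddx(uv);
    float2 dy = ddy(uv);

    float4 color = float4(0.0, 0.0, 0.0, 1.0);

    // SampleGrad uses the precomputed gradients, so the body contains
    // no implicit gradient instruction and a real branch is possible.
    [branch] if (fade > 0.0)
    {
        color = DiffuseTex.SampleGrad(LinearSamp, uv, dx, dy) * fade;
    }

    // A cheap conditional like this one is a natural candidate for flatten.
    [flatten] if (color.a < 0.5)
    {
        color.a = 0.5;
    }

    return color;
}
```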

A for statement can be preceded by either a loop or an unroll attribute. loop forces the compiler to emit a loop construct, while unroll forces the compiler to unroll the loop. unroll can optionally specify the maximum number of times the loop is to execute.

A for statement may be prefixed with a loop or unroll attribute. loop makes the compiler keep a real loop construct, and unroll makes it unroll the loop; unroll optionally takes the maximum number of iterations.

[unroll(3)] while(1) { VStream.Append(0); }
[loop] for(uint i = 0;i < 4; i++) { VStream.Append(0); }
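A slightly fuller sketch of the two loop attributes follows, again assuming the syntax of the released Direct3D 10 HLSL compiler rather than this preview. The constant buffer BlurParams and the names SourceTex, LinearSamp, TapCount, and TexelSize are hypothetical.

```hlsl
Texture2D    SourceTex;
SamplerState LinearSamp;

cbuffer BlurParams
{
    uint   TapCount;   // number of vertical taps, set at run time
    float2 TexelSize;  // size of one texel in texture coordinates
};

float4 main(float2 uv : TEXCOORD0) : SV_Target
{
    // The trip count is a compile-time constant, so the loop can be unrolled.
    float4 fixedSum = 0;
    [unroll] for (int i = 0; i < 4; i++)
    {
        fixedSum += SourceTex.Sample(LinearSamp, uv + float2(i, 0) * TexelSize);
    }

    // The trip count is only known at run time, so a real loop is kept.
    // SampleLevel is used because it needs no gradients inside the loop.
    float4 dynamicSum = 0;
    [loop] for (uint j = 0; j < TapCount; j++)
    {
        dynamicSum += SourceTex.SampleLevel(LinearSamp, uv + float2(0, j) * TexelSize, 0);
    }

    return fixedSum / 4 + dynamicSum / max(TapCount, 1u);
}
```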
posted on 2006-05-06 12:59  王大牛的幸福生活