Submitting Rendering Commands to the GPU and Waiting for Them to Finish
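Rendering commands issued on the CPU are not sent to the GPU hardware immediately. They first go into the driver's command buffer, and are sent to the GPU for execution only when that buffer fills up or when an explicit flush/submit function is called. Note: on PC a full buffer is sent to the GPU, whereas on mobile a full buffer alone does not trigger submission; commands are sent only when the pass ends or when such a function is called.
The submitting call returns as soon as the commands have been sent; it does not wait for the GPU to finish executing them. If the CPU needs to wait for the GPU, the developer has to add that synchronization explicitly.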
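The buffering described above can be illustrated with a small CPU-only model. This is only a sketch, not a real graphics API: `ToyDevice`, `Record`, `Flush`, `Finish` and `DemoFlushFinish` are hypothetical names, and a worker thread stands in for the GPU. `Flush()` hands the buffered commands over and returns at once, while `Finish()` also blocks until they have all run.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for a driver plus GPU (hypothetical, not a real graphics API).
class ToyDevice {
public:
    ToyDevice() : worker_([this] { Run(); }) {}
    ~ToyDevice() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        wake_.notify_all();
        worker_.join();
    }

    // Record a command in the CPU-side buffer; nothing executes yet.
    void Record(std::function<void()> cmd) { pending_.push_back(std::move(cmd)); }

    // Flush: hand the buffered commands to the "GPU" and return immediately.
    void Flush() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            for (auto& cmd : pending_) submitted_.push_back(std::move(cmd));
        }
        pending_.clear();
        wake_.notify_all();
    }

    // Finish: flush, then block until every submitted command has run.
    void Finish() {
        Flush();
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return submitted_.empty() && !busy_; });
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return quit_ || !submitted_.empty(); });
            if (submitted_.empty()) return;  // quit requested and nothing left
            auto cmd = std::move(submitted_.front());
            submitted_.pop_front();
            busy_ = true;
            lock.unlock();
            cmd();  // the "GPU" executes the command outside the lock
            lock.lock();
            busy_ = false;
            idle_.notify_all();
        }
    }

    std::vector<std::function<void()>> pending_;   // CPU-side command buffer
    std::deque<std::function<void()>> submitted_;  // commands handed to the "GPU"
    std::mutex mutex_;
    std::condition_variable wake_, idle_;
    bool busy_ = false;
    bool quit_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};

// Usage: two commands stay buffered until Finish() submits and waits for them.
int DemoFlushFinish() {
    int counter = 0;
    ToyDevice device;
    device.Record([&counter] { counter += 1; });
    device.Record([&counter] { counter += 2; });
    assert(counter == 0);  // still buffered on the CPU side
    device.Finish();
    return counter;
}
```

Calling `Flush()` instead of `Finish()` in the demo would give no guarantee about `counter` when the call returns, which is exactly why the real APIs below need an explicit wait.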
OpenGL
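glFlush(): sends the commands buffered on the CPU side to the GPU, empties the buffer, and returns as soon as they have been sent.
glFinish(): sends the commands buffered on the CPU side to the GPU, empties the buffer, and returns only after the GPU has finished executing them.
Note 1: glReadPixels() into client memory is an implicit synchronization point that behaves much like glFinish(): it does not return until the commands producing the pixels have completed. Without that guarantee, a texture read back into system memory could be incomplete. (When a pixel pack buffer is bound, the call can return before the transfer has finished.)
Note 2: With double buffering, SwapBuffers() performs an implicit flush. It exchanges the two buffers so that the back buffer that has been receiving rendering becomes visible, and the swap is ordered after all previously issued rendering commands; otherwise the displayed image would be incomplete. Whether the call also blocks the CPU until the GPU is done, as glFinish() does, depends on the driver and the vsync settings.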
Vulkan
Adding a CommandBuffer (a list of rendering commands) to a Queue:
    VkCommandBufferAllocateInfo allocInfo = {};
    allocInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
    allocInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
    allocInfo.commandPool = mCommandPool;
    allocInfo.commandBufferCount = 1;

    VkCommandBuffer commandBuffer;
    vkAllocateCommandBuffers(mDevice, &allocInfo, &commandBuffer);

    VkCommandBufferBeginInfo beginInfo = {};
    beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
    beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;

    vkBeginCommandBuffer(commandBuffer, &beginInfo);
    //-------------------
    // Do Something
    //-------------------
    vkEndCommandBuffer(commandBuffer);

    VkSubmitInfo submitInfo = {};
    submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submitInfo.commandBufferCount = 1;
    submitInfo.pCommandBuffers = &commandBuffer;

    vkQueueSubmit(mGraphicsQueue, 1, &submitInfo, VK_NULL_HANDLE); // VkQueue mGraphicsQueue
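Vulkan provides the following calls to wait for the GPU to finish:
    VkResult vkQueueWaitIdle(VkQueue queue);    // returns once all work submitted to the queue has completed
    VkResult vkDeviceWaitIdle(VkDevice device); // returns once all queues of the device are idle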
Metal
Committing a commandBuffer (a list of rendering commands) to an MTLCommandQueue:
    id <MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer]; // id <MTLCommandQueue> _commandQueue
    commandBuffer.label = @"MyCommand";

    MTLRenderPassDescriptor* renderPassDescriptor = view.currentRenderPassDescriptor; // MTKView * view
    if(renderPassDescriptor != nil)
    {
        id <MTLRenderCommandEncoder> renderEncoder = [commandBuffer renderCommandEncoderWithDescriptor:renderPassDescriptor];
        renderEncoder.label = @"MyRenderEncoder";

        [renderEncoder pushDebugGroup:@"DrawTriangle"];
        [renderEncoder setRenderPipelineState:_pipelineState]; // id <MTLRenderPipelineState> _pipelineState
        [renderEncoder setDepthStencilState:_depthState]; // id <MTLDepthStencilState> _depthState
        [renderEncoder setVertexBuffer:vertexBuffer offset:0 atIndex:0]; // id<MTLBuffer> vertexBuffer
        [renderEncoder drawPrimitives:MTLPrimitiveTypeTriangle vertexStart:0 vertexCount:3];
        [renderEncoder popDebugGroup];

        [renderEncoder endEncoding];

        [commandBuffer presentDrawable:view.currentDrawable];
    }

    [commandBuffer commit];
Metal waits for the GPU to finish as follows (call it after commit):
    [commandBuffer waitUntilCompleted]; // id <MTLCommandBuffer> commandBuffer
D3D11
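ID3D11DeviceContext::Flush() (FD3D11DeviceContext in Unreal Engine): sends the commands buffered on the CPU side to the GPU, empties the buffer, and returns as soon as they have been sent.
D3D11 has no Finish function. The following approach can be used to wait for the GPU to finish: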
    // Based on void FD3D11DynamicRHI::RHIBlockUntilGPUIdle() in
    // UnrealEngine\Engine\Source\Runtime\Windows\D3D11RHI\Private\D3D11Commands.cpp
    D3D11_QUERY_DESC Desc = {};
    Desc.Query = D3D11_QUERY_EVENT;

    TRefCountPtr<ID3D11Query> Query;
    VERIFYD3D11RESULT_EX(Direct3DDevice->CreateQuery(&Desc, Query.GetInitReference()), Direct3DDevice); // TRefCountPtr<FD3D11Device> Direct3DDevice

    // Issue the query as the last rendering command
    Direct3DDeviceIMContext->End(Query.GetReference()); // TRefCountPtr<FD3D11DeviceContext> Direct3DDeviceIMContext
    Direct3DDeviceIMContext->Flush();

    // Poll until the GPU has reached the event query
    for(;;)
    {
        BOOL EventComplete = false;
        Direct3DDeviceIMContext->GetData(Query.GetReference(), &EventComplete, sizeof(EventComplete), 0);
        if (EventComplete)
        {
            break;
        }
        else
        {
            FPlatformProcess::Sleep(0.005f);
        }
    }
D3D12
Adding a CommandList (a list of rendering commands) to a CommandQueue:
    // _countof is an MSVC macro: #define _countof(_Array) (sizeof(_Array) / sizeof(_Array[0]))
    ID3D12CommandList* cmdsLists[] = { mCommandList.Get() }; // Microsoft::WRL::ComPtr<ID3D12GraphicsCommandList> mCommandList
    mCommandQueue->ExecuteCommandLists(_countof(cmdsLists), cmdsLists); // Microsoft::WRL::ComPtr<ID3D12CommandQueue> mCommandQueue
D3D12 has no Finish function either. The following approach can be used to wait for the GPU to finish:
    Microsoft::WRL::ComPtr<ID3D12Device> md3dDevice;
    Microsoft::WRL::ComPtr<ID3D12CommandQueue> mCommandQueue;
    Microsoft::WRL::ComPtr<ID3D12Fence> m_fence;
    HANDLE m_fenceEvent;

    void Init()
    {
        // Create the fence with an initial value of 0
        md3dDevice->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&m_fence));
        // Create an auto-reset event object
        m_fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
    }

    void WaitGPUFinish()
    {
        // The new fence value must differ from the value returned by GetCompletedValue(), e.g. old value + 1.
        // This assumes WaitGPUFinish() is the only place that signals m_fence.
        const UINT64 NewFenceValue = m_fence->GetCompletedValue() + 1;
        // Append a signal to the CommandQueue. When the GPU reaches it, all earlier commands
        // have finished, and the fence is set to the new value.
        mCommandQueue->Signal(m_fence.Get(), NewFenceValue);
        // Ask the fence to fire m_fenceEvent once it reaches the new value
        m_fence->SetEventOnCompletion(NewFenceValue, m_fenceEvent);
        // Block until m_fenceEvent fires, i.e. until the CommandQueue has drained
        WaitForSingleObject(m_fenceEvent, INFINITE);
    }
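The fence handshake can also be modeled without any graphics API, which makes the signal/wait ordering easier to see. The sketch below is only an illustration under stated assumptions: `ToyFence`, `WaitFor` and `DemoFenceWait` are hypothetical names, a `std::thread` plays the role of the GPU, and `WaitFor` folds SetEventOnCompletion() plus WaitForSingleObject() into one call. It also keeps a CPU-owned counter for the next fence value instead of deriving it from the completed value, a common variation that stays valid even if several signals are in flight.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Toy model of a monotonically increasing fence (hypothetical, not ID3D12Fence).
class ToyFence {
public:
    // Called by the "GPU" when it reaches a signal command in its queue.
    void Signal(std::uint64_t value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (value > completed_) completed_ = value;  // the value never goes backwards
        }
        reached_.notify_all();
    }

    // Plays the role of GetCompletedValue().
    std::uint64_t CompletedValue() {
        std::lock_guard<std::mutex> lock(mutex_);
        return completed_;
    }

    // Blocks the calling (CPU) thread until the fence has reached `value`.
    void WaitFor(std::uint64_t value) {
        std::unique_lock<std::mutex> lock(mutex_);
        reached_.wait(lock, [this, value] { return completed_ >= value; });
    }

private:
    std::mutex mutex_;
    std::condition_variable reached_;
    std::uint64_t completed_ = 0;
};

// Usage: submit three batches of "GPU work", waiting for each one to finish.
std::uint64_t DemoFenceWait() {
    ToyFence fence;
    std::uint64_t nextValue = 0;  // CPU-owned counter, only ever increases
    int batchesDone = 0;

    for (int batch = 0; batch < 3; ++batch) {
        const std::uint64_t target = ++nextValue;
        std::thread gpu([&fence, &batchesDone, target] {
            ++batchesDone;         // work queued ahead of the signal
            fence.Signal(target);  // the fence reaches `target` only after the work
        });
        fence.WaitFor(target);     // the CPU blocks here until the "GPU" has caught up
        gpu.join();
    }
    return fence.CompletedValue();
}
```

Because the wait condition is `completed_ >= value` rather than equality, waiting on an older value after a newer one has been signaled returns immediately instead of blocking forever.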