Monday, December 31, 2018

Studying D3D12HelloFrameBuffering desktop project


DirectX-Graphics-Samples-master\Samples\Desktop\D3D12HelloWorld\src\HelloFrameBuffering

Shows optimal vsync waiting. Should investigate in detail.

Difference from HelloTriangle sample

 
D3D12HelloFrameBuffering.h
ComPtr<ID3D12CommandAllocator> m_commandAllocators[FrameCount];
Two command allocators. Note m_commandList is still one instance.
UINT64 m_fenceValues[FrameCount];
Two fence values are kept, one per frame. m_fence itself is a single instance.
void MoveToNextFrame();
Adds Signal() with m_fence to m_commandQueue, sets the completion event, and waits until the signal of the previous MoveToNextFrame() has been processed on the GPU. If the GPU has already reached the previous MoveToNextFrame() signal, this function does not block.
void WaitForGpu();
Adds Signal() with m_fence to m_commandQueue, sets the completion event, and waits until this same signal has been processed on the GPU. This function always blocks.

Double buffering of draw commands

 
There are two command allocators, so the command list is double buffered. One frame time of delay is acceptable.

m_fence is a single instance, while m_fenceValues holds two integer values, one per frame, to remember the previous fence value.

m_fenceValues[] is initialized with zeros in the constructor.

LoadAssets() increments m_fenceValues[0] twice, so its value becomes 2. m_fenceValues[1] remains 0.

On the first MoveToNextFrame()

m_fenceValues[0] is 2, so currentFenceValue := 2, and m_fence is passed to m_commandQueue->Signal() with that value.

m_frameIndex changes from 0 to 1

The function does not block, because m_fenceValues[1] is still 0.

m_fenceValues[1] := 3;

The GPU executes the command list and redraws buffer 0. When the GPU reaches the signal command, the completed value of m_fence becomes 2.

Second MoveToNextFrame() call

m_fenceValues[1] is 3, so currentFenceValue := 3, and m_fence is passed to m_commandQueue->Signal(). After the previous command list execution has finished, the GPU executes the command list and redraws buffer 1.

m_frameIndex changes from 1 to 0. If the GPU has not yet reached the Signal value of the previous MoveToNextFrame() (== 2), the function blocks until the m_fence value becomes 2.
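The fence value bookkeeping above can be replayed with a small simulation. This is only a sketch with no real Direct3D 12 calls: FakeGpu, FrameSync and Finish() are made-up names, the frame index is advanced by hand instead of asking the swap chain, and the way LoadAssets() reaches the value 2 (one increment after creating the fence, one inside WaitForGpu()) is my assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for m_fence plus m_commandQueue. It only tracks the
// value the "GPU" has completed and the newest value that was signaled.
struct FakeGpu {
    std::uint64_t completedValue = 0;  // what GetCompletedValue() would return
    std::uint64_t lastSignaled = 0;    // newest Signal() value in the queue

    void Signal(std::uint64_t value) {
        assert(value >= lastSignaled);  // fence values only grow
        lastSignaled = value;
    }
    // Blocking wait: the GPU runs until it has reached `value`.
    void WaitFor(std::uint64_t value) {
        if (completedValue < value) completedValue = value;
    }
    // The GPU catches up with everything that was queued (a fast GPU).
    void Finish() { completedValue = lastSignaled; }
};

struct FrameSync {
    static constexpr unsigned FrameCount = 2;
    std::array<std::uint64_t, FrameCount> fenceValues{};  // zero-initialized
    unsigned frameIndex = 0;
    FakeGpu gpu;

    // Assumed shape of the fence part of LoadAssets().
    void LoadAssets() {
        ++fenceValues[frameIndex];
        WaitForGpu();
    }

    // Signal, always wait for that same signal, then bump the value.
    void WaitForGpu() {
        gpu.Signal(fenceValues[frameIndex]);
        gpu.WaitFor(fenceValues[frameIndex]);
        ++fenceValues[frameIndex];
    }

    // Returns true if the CPU had to block.
    bool MoveToNextFrame() {
        const std::uint64_t currentFenceValue = fenceValues[frameIndex];
        gpu.Signal(currentFenceValue);
        frameIndex = (frameIndex + 1) % FrameCount;  // next back buffer
        bool blocked = false;
        if (gpu.completedValue < fenceValues[frameIndex]) {
            gpu.WaitFor(fenceValues[frameIndex]);
            blocked = true;
        }
        fenceValues[frameIndex] = currentFenceValue + 1;
        return blocked;
    }
};
```

In this model the GPU only makes progress when someone waits for it or when Finish() is called, so calling MoveToNextFrame() twice right after LoadAssets() shows the slow GPU case (the second call blocks until the fence reaches 2), while calling Finish() between the two calls shows the case where the second call returns immediately.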




Sunday, December 30, 2018

Studying D3D12HelloTexture desktop project


DirectX-Graphics-Samples-master\Samples\Desktop\D3D12HelloWorld\src\HelloTexture
Paints a 2D texture (checkerboard image) onto the triangle.

Difference from D3D12HelloTriangle

 
Took a diff with WinMerge to find the differences.

D3D12HelloTexture.h

static const UINT TextureWidth = 256;
static const UINT TextureHeight = 256;
static const UINT TexturePixelSize = 4;
struct Vertex
{
    XMFLOAT3 position;
    XMFLOAT2 uv; //< texture UV instead of vertex color.
};
ComPtr<ID3D12DescriptorHeap> m_srvHeap;
ComPtr<ID3D12Resource> m_texture;


Texture UV coordinates


DirectX traditionally uses right = U+, down = V+ for UV coordinates, with the origin at the top-left corner of the texture.

 Objects and their relations


Changes from the HelloTriangle example are highlighted in orange.



 

m_texture
  • width=256px, height=256px, pixel format=RGBA8888
  • mipmap levels=1
  • initialized in the COPY_DEST state (ready to receive the texture image)
  • UpdateSubresources() uploads the texture image from CPU memory to the texture in GPU memory
  • the resource state is then changed to PIXEL_SHADER_RESOURCE so that the pixel shader can read it
sampler
  • defines texture sampler behavior such as texture clamping and mipmap parameters.
  • m_rootSignature contains this sampler info.
m_srvHeap (Shader Resource View heap)
  • The pixel shader sees m_texture via m_srvHeap.
  • m_commandList connects m_srvHeap via these calls:
ID3D12DescriptorHeap* ppHeaps[] = { m_srvHeap.Get() };
m_commandList->SetDescriptorHeaps(_countof(ppHeaps), ppHeaps);
m_commandList->SetGraphicsRootDescriptorTable(
  0, m_srvHeap->GetGPUDescriptorHandleForHeapStart());
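To get a feeling for what ends up in m_texture and how a UV pair addresses it, here is a CPU-only sketch. It is not the sample's code: the 32-texel cell size, the black cell at the top-left, and the names GenerateCheckerboard() and SampleRed() are my assumptions, and the sampling is plain point sampling with clamping instead of a real sampler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned TextureWidth = 256;
constexpr unsigned TextureHeight = 256;
constexpr unsigned TexturePixelSize = 4;  // bytes per texel: R, G, B, A

// Assumed cell size: 8 x 8 cells of 32 x 32 texels each.
constexpr unsigned CellSize = 32;

// Builds tightly packed RGBA8888 texel data, row by row from the top.
std::vector<std::uint8_t> GenerateCheckerboard() {
    std::vector<std::uint8_t> data(
        static_cast<std::size_t>(TextureWidth) * TextureHeight * TexturePixelSize);
    for (unsigned y = 0; y < TextureHeight; ++y) {
        for (unsigned x = 0; x < TextureWidth; ++x) {
            const bool black = ((x / CellSize) + (y / CellSize)) % 2 == 0;
            const std::uint8_t c = black ? 0x00 : 0xff;
            const std::size_t i =
                (static_cast<std::size_t>(y) * TextureWidth + x) * TexturePixelSize;
            data[i + 0] = c;     // R
            data[i + 1] = c;     // G
            data[i + 2] = c;     // B
            data[i + 3] = 0xff;  // A
        }
    }
    return data;
}

// Point-samples the red channel at (u, v). u grows to the right and v grows
// downward from the top-left corner; values outside [0, 1] are clamped.
std::uint8_t SampleRed(const std::vector<std::uint8_t>& data, float u, float v) {
    auto toTexel = [](float t, unsigned size) {
        if (t < 0.0f) t = 0.0f;
        if (t > 1.0f) t = 1.0f;
        const unsigned p = static_cast<unsigned>(t * static_cast<float>(size));
        return p < size ? p : size - 1;  // t == 1 maps to the last texel
    };
    const unsigned x = toTexel(u, TextureWidth);
    const unsigned y = toTexel(v, TextureHeight);
    return data[(static_cast<std::size_t>(y) * TextureWidth + x) * TexturePixelSize];
}
```

With this layout, increasing only u from 0 walks to the right across the top row of cells, and increasing only v walks down the left column, which matches the right = U+, down = V+ convention.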
 

PipelineStateDesc.InputLayout is changed to input texture UV instead of vertex color.

 Shader code

 DirectX-Graphics-Samples-master\Samples\Desktop\D3D12HelloWorld\src\HelloTexture\shaders.hlsl
The vertex shader passes the input position and UV value through.
The pixel shader samples the texture to get the pixel color.


Saturday, December 29, 2018

Studying D3D12HelloTriangle and D3D12HelloBundles desktop projects

D3D12HelloTriangle






Difference from D3D12HelloWorld

Took a diff with WinMerge to find the differences.

D3D12HelloTriangle class has several new members
CD3DX12_VIEWPORT m_viewport;
CD3DX12_RECT m_scissorRect;
ComPtr<ID3D12RootSignature> m_rootSignature;
ComPtr<ID3D12PipelineState> m_pipelineState;
ComPtr<ID3D12Resource> m_vertexBuffer;
D3D12_VERTEX_BUFFER_VIEW m_vertexBufferView;
New struct declaration of triangle vertex
struct Vertex {
    XMFLOAT3 position;
    XMFLOAT4 color;
};


D3D12HelloTriangle.cpp objects and their relations

m_rootSignature

The root signature declares the resources the shaders can access, such as shader constants, descriptor tables and samplers.

In this sample it is nearly empty: the vertex buffer is specified by m_commandList->IASetVertexBuffers() rather than by the root signature, and the shaders do not use shader constants.

m_rootSignature is referenced by the pipeline state object, and the PSO uses it internally.

m_commandList also references m_rootSignature, but it seems this is not absolutely necessary in this sample.

m_viewport

specifies the viewport size (used to scale the image to fit the client area)

m_scissorRect

used for “scissoring”: pixels outside this rectangle are discarded, so a triangle that crosses the edge is cut off there instead of being drawn outside the area.

m_pipelineState

contains rendering pipeline parameters such as vertex shader, pixel shader, alpha blending, render target format and m_rootSignature

m_vertexBuffer

contains the position and color data of the triangle vertices.

data is placed on GPU memory.

m_vertexBufferView

struct to store GPU memory address of vertex buffer and its size info.

used by m_commandList
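What m_viewport and m_scissorRect do to a vertex can be written down as two small functions. This is a sketch of the standard Direct3D mapping as I understand it, not code from the sample: Viewport, Rect, Point, NdcToPixel() and PassesScissor() are plain stand-ins I made up instead of CD3DX12_VIEWPORT and CD3DX12_RECT, and depth is ignored.

```cpp
#include <cassert>

struct Viewport { float topLeftX, topLeftY, width, height; };
struct Rect { long left, top, right, bottom; };
struct Point { float x, y; };

// Maps normalized device coordinates (x and y in [-1, 1], y pointing up)
// to render target pixels (origin at the top-left, y pointing down).
Point NdcToPixel(const Viewport& vp, float ndcX, float ndcY) {
    return Point{
        vp.topLeftX + (ndcX + 1.0f) * 0.5f * vp.width,
        vp.topLeftY + (1.0f - ndcY) * 0.5f * vp.height};
}

// A pixel survives the scissor test only if it lies inside the rectangle.
// The right and bottom edges are exclusive.
bool PassesScissor(const Rect& r, long x, long y) {
    return x >= r.left && x < r.right && y >= r.top && y < r.bottom;
}
```

For a 1280x720 viewport starting at (0, 0), the NDC corner (-1, 1) lands on pixel (0, 0) and the NDC center (0, 0) lands on (640, 360). With a scissor rectangle of the same size, a pixel at x = 1280 is already outside and gets discarded.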

Shader code
 
Shader is a program that runs on GPU.

D3D12HelloTriangle sample contains shaders.hlsl

shaders.hlsl contains vertex shader VSMain() and pixel shader PSMain().

VSMain() processes one vertex: it takes the vertex position and color as arguments and sends them to the subsequent stage. For this triangle, VSMain() is called 3 times.

PSMain() is called for every pixel of the triangle. It receives the VSMain() return values, interpolated across the triangle, and calculates the pixel color.

The shader program is compiled to executable code in D3D12HelloTriangle::OnInit() by calling D3DCompileFromFile(), and those shader binaries are passed to the pipeline state object.
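Between VSMain() and PSMain() the rasterizer blends the three vertex outputs for each pixel. The sketch below shows that blending on the CPU with barycentric weights. It is a simplified model, not what the GPU literally runs: Float2, Float4, Edge() and InterpolateColor() are names I made up, the positions are plain 2D points, and perspective correction is left out.

```cpp
#include <cassert>

struct Float2 { float x, y; };
struct Float4 { float r, g, b, a; };

// Twice the signed area of triangle (a, b, c).
float Edge(Float2 a, Float2 b, Float2 c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Computes the color a pixel shader would receive at point p.
// Returns false if p is outside the triangle or the triangle has no area.
bool InterpolateColor(const Float2 pos[3], const Float4 col[3], Float2 p,
                      Float4& out) {
    const float area = Edge(pos[0], pos[1], pos[2]);
    if (area == 0.0f) return false;

    // Barycentric weights: each weight is the share of the vertex opposite
    // to the sub-triangle formed by p and the other two vertices.
    const float w0 = Edge(pos[1], pos[2], p) / area;
    const float w1 = Edge(pos[2], pos[0], p) / area;
    const float w2 = Edge(pos[0], pos[1], p) / area;
    if (w0 < 0.0f || w1 < 0.0f || w2 < 0.0f) return false;

    out = Float4{
        w0 * col[0].r + w1 * col[1].r + w2 * col[2].r,
        w0 * col[0].g + w1 * col[1].g + w2 * col[2].g,
        w0 * col[0].b + w1 * col[1].b + w2 * col[2].b,
        w0 * col[0].a + w1 * col[1].a + w2 * col[2].a};
    return true;
}
```

At a vertex the weight of that vertex is 1 and the others are 0, so the pixel gets exactly the vertex color. Inside the triangle the colors mix smoothly, which is where the gradient of the HelloTriangle output comes from.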




D3D12HelloBundles sample



DirectX-Graphics-Samples-master\Samples\Desktop\D3D12HelloWorld\src\HelloBundles

Shows efficient triangle drawing using bundles.


Difference from HelloTriangle sample

 
D3D12HelloBundles.h

ComPtr<ID3D12CommandAllocator> m_bundleAllocator;

ComPtr<ID3D12GraphicsCommandList> m_bundle;

D3D12HelloBundles.cpp

m_bundleAllocator is created as COMMAND_LIST_TYPE_BUNDLE

The m_bundle command list is created as COMMAND_LIST_TYPE_BUNDLE and records the pipeline setup and draw commands

m_commandList->ExecuteBundle() is called to execute the recorded pipeline setup and draw commands


Studying D3D12HelloWorld desktop project

Studying D3D12HelloWindow desktop project



D3D12HelloWorld

Main.cpp
  • Contains the WinMain() function, the entry point of the program.
  • Instantiates D3D12HelloWindow and passes it to the Win32Application::Run() function.
Win32Application.cpp
  • Handles Windows API calls such as CreateWindow() and the WindowProc() callback.
DXSample.cpp
  • DXSample::GetHardwareAdapter() calls D3D12CreateDevice to get ID3D12Device ptr.
  • ParseCommandLineArgs() reads argc and argv and decides whether to use WARP (software rasterizer) or not.
D3D12HelloWindow.cpp
  • Contains the interesting DirectX 12 code.
  • D3D12HelloWindow::OnInit() prepares DirectX 12 resources.
  • D3D12HelloWindow::OnRender() redraws the render buffer image and swaps the foreground render buffer with the background render buffer.

D3D12HelloWindow members


m_swapChain
  • Has two output render targets. Their size is the window size and the pixel format is RGBA8888.
  • m_swapChain->Present() is called to swap the front buffer and back buffer on the next display vblank event. This mechanism exists to prevent “screen tearing”, which happens when the render target being displayed is overwritten by a graphics redraw.
m_commandQueue
  • prepared in D3D12HelloWindow::OnInit()
  • associated with m_swapChain for force flush. (?)
  • in OnRender(), m_commandQueue->ExecuteCommandLists() is called.
  • m_commandQueue->Signal() is used to wait for the frame redraw to complete.
m_commandList
  • contains the draw calls to run on the GPU.
  • ClearRenderTargetView() with blue color specified.


Render target state management

It is necessary to change the render target state to RENDER_TARGET before drawing and revert it to the PRESENT state when drawing is finished.

m_commandList->ResourceBarrier() does this task.



Double Buffering




VBlank event waiting


The D3D12HelloWindow.cpp implementation is not optimal; D3D12HelloFrameBuffering is more efficient. So the VBlank event waiting code will be reviewed with D3D12HelloFrameBuffering.



Next: Draw a triangle!


https://yamamoto2002.blogspot.com/2018/12/studying-d3d12hellotriangle-project.html


Saturday, December 22, 2018

AMD Threadripper 2990WX and ECC memory

Does the AMD TR 2990WX support ECC memory? I tested it and found that it supports Unbuffered ECC memory but does not recognize Registered ECC memory at all.

Summary

  • Unbuffered ECC memory is recognized and it can be used as ECC memory. 
  • Registered ECC memory is not recognized at all and it cannot be used. Entire memory area does not show up.


Computer configuration

  • AMD TR 2990WX
  • ASUS Zenith Extreme motherboard.
  • Kingston KVR24E17D8/16 16GB unbuffered ECC DIMM x4, total 64GB

Checking error correction function of DRAM on Windows


PowerShell commands:
wmic os get caption,osarchitecture,version
wmic CPU get Name
wmic CPU get NumberOfCores,NumberOfLogicalProcessors
wmic bios get manufacturer,name,version
wmic memorychip get banklabel,manufacturer,partnumber,speed
$a = Get-WMIObject -Class "Win32_PhysicalMemoryArray"
Switch ($a.MemoryErrorCorrection) {
    0 {Write-Host "ECC Type: Reserved"}
    1 {Write-Host "ECC Type: Other"}
    2 {Write-Host "ECC Type: Unknown"}
    3 {Write-Host "ECC Type: None"}
    4 {Write-Host "ECC Type: Parity"}
    5 {Write-Host "ECC Type: Single-bit ECC"}
    6 {Write-Host "ECC Type: Multi-bit ECC"}
    7 {Write-Host "ECC Type: CRC"}
}

Result:
Caption                   OSArchitecture  Version
Microsoft Windows 10 Pro  64-bit          10.0.17763

Name
AMD Ryzen Threadripper 2990WX 32-Core Processor

NumberOfCores  NumberOfLogicalProcessors
32             64

Manufacturer              Name  Version
American Megatrends Inc.  1601  AMD - 3242016

BankLabel     Manufacturer  PartNumber        Speed
P0 CHANNEL A  Kingston      9965669-027.A00G  2666
P0 CHANNEL B  Kingston      9965669-027.A00G  2666
P0 CHANNEL C  Kingston      9965669-027.A00G  2666
P0 CHANNEL D  Kingston      9965669-027.A00G  2666

ECC Type: Multi-bit ECC

It seems ECC is working.


Note: AMD Threadripper 2990WX optimally works with 4x DDR4 2933 memory. I ordered wrong memory 😄

ECC error probability

I read somewhere that a datacenter operating 1000 PCs 24/7 observes 3 to 5 ECC soft error events (1 bit flipped and safely corrected) per month. In that case, ECC memory is a must-have feature.

At that rate, a single PC used 8 hours a day would see one ECC error event in 50 to 83 years of operation.
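The 50 to 83 years figure comes from the following arithmetic. It is a rough estimate that assumes errors are spread evenly over all machines and that the error rate is proportional to powered-on time; YearsPerEvent() is just a helper name I made up for this note.

```cpp
#include <cassert>

// Expected number of years until a single PC sees one ECC event, given the
// number of events a whole fleet of 24/7 machines sees per month.
double YearsPerEvent(double fleetEventsPerMonth, double fleetSize,
                     double hoursPerDay) {
    // Events per month for one machine that runs 24/7.
    const double perPcPerMonth = fleetEventsPerMonth / fleetSize;
    // Scale down by the fraction of the day the PC is actually on.
    const double dutyCycle = hoursPerDay / 24.0;
    const double perPcPerYear = perPcPerMonth * dutyCycle * 12.0;
    return 1.0 / perPcPerYear;
}
```

YearsPerEvent(5, 1000, 8) gives about 50 years and YearsPerEvent(3, 1000, 8) gives about 83 years.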

BIOS settings of memory scrubbing


On some motherboards, a BIOS parameter change is needed to generate machine check exception logs, so that ECC error correction events can be viewed in the Windows Event Viewer. It seems my motherboard does not have this menu item. I hope it is enabled by default.

The ASUS Zenith Extreme motherboard has a DRAM ECC enabled/disabled option. There is also a BIOS setting for scrubbing: periodically reading, checking and correcting 1-bit flip errors in the ECC memory area. I chose a 24-hour scrub interval.


Testing memory error using Memtest86 v8.0


Passed the test.

Following data is interesting:

CacheHierarchy Size   Speed
L1 cache       96KB   93.13GB/sec
L2 cache       512KB  55.13GB/sec
L3 cache       64MB   16.24GB/sec
DRAM           64GB   16.07GB/sec


(This is the result with DDR4-2400 memory.)



About registered ECC memory and TR 2990WX


I have Registered ECC memory (16GB DDR4-2333 RDIMM x8) for another computer, so I tested whether it works. It is not recognized: not just the ECC part of the memory, the entire memory is not recognized at all.

Conclusion


The combination of AMD TR 2990WX and ASUS Zenith Extreme recognizes Unbuffered ECC memory, and the ECC functionality seems to be working.

But it does not recognize Registered ECC memory.

Be careful to choose correct memory type for your computer!