
Free DX 12 Rift Engine Code

lamour42
Expert Protege
Hi,

If you want to write code for the Rift using DirectX 12, you might want to take
a look at the code I provided on GitHub: https://github.com/ClemensX/ShadedPath12.git

The sample engine is extremely limited in what it can draw: only lines!
But it may serve as a learning playground for DirectX 12 and Rift programming.

I find it fascinating how a bunch of simple lines suddenly become great if you can walk around them and view them from any direction when you wear the Rift!

The current state of the code is a first step towards porting my older DX 11 engine to DX 12.
Feel free to use any of the code in your own projects.

I want to express my gratitude to galopin, who came up with a detailed 8-step guide on how to combine
DirectX 12 with Oculus SDK rendering. See this thread: https://forums.oculus.com/viewtopic.php?f=20&t=25900
When I found out that calling the Oculus API ovr_CreateSwapTextureSetD3D11 on a
D3D11On12Device throws NullPointerExceptions, I would have given up had he not given this advice!

Some features of the code example:

  • Engine / Sample separation. Look at Sample1.cpp to see what you can currently do with this engine and see how it is done.

  • Oculus Rift support (head tracking and rendering). See vr.cpp

  • Post Effect Shader: Copy rendered frame to texture - Rift support is built on top of this feature

  • Use Threads to update GPU data. See LinesEffect::update()

  • Synchronize GPU and CPU via Fences

  • Free float camera - use WASD or arrow keys to navigate. Or just walk/turn/duck if you wear the Rift


Any feedback welcome.
56 REPLIES

cybereality
Grand Champion
Nice.

glaze
Honored Guest
Thanks! I'll probably learn from this codebase when adding Rift support into my engine's D3D12 renderer.

galopin
Heroic Explorer
Gratitude accepted 🙂

If only Oculus could add real support, integrating d3d12 queues and fences, plus manual management of the surface memory...

lamour42
Expert Protege


Added a geometry shader to draw 3D text at any world position. It can also draw a coordinate system.

The text is copied to the GPU with some positional data, then the geometry shader parses the text and produces all the lines necessary to draw the letters. Intended as a diagnostics tool.

Although it looks lame in the picture above, it is far more impressive when viewed with the Rift. It makes you want to touch the lines.

lamour42
Expert Protege


Added texture support.

The texture shader uploads DDS image files to the GPU, and a billboard shader draws them at a user-defined world position and size.

Texture support re-uses the DX12 DDSTextureLoader from Microsoft's MiniEngine, slightly changed to be easier to use outside MiniEngine.
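
For anyone wondering what "draws them at a user-defined world position and size" means geometrically, here is a minimal CPU-side sketch of one common way to build a camera-facing quad. It is not taken from ShadedPath12: the Vec3 type and the billboardCorners helper are hypothetical names, and the actual billboard shader may well compute this on the GPU in a different way. The sketch assumes camRight and camUp are unit-length camera axes in world space.

```cpp
#include <array>
#include <cassert>

// Minimal 3-component vector; a real engine would more likely use DirectXMath.
struct Vec3 { float x, y, z; };

inline Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 scale(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }

// Hypothetical helper (an assumption for illustration, not the engine's API).
// Returns the quad corners in the order:
// bottom-left, bottom-right, top-right, top-left.
inline std::array<Vec3, 4> billboardCorners(Vec3 center, float width, float height,
                                            Vec3 camRight, Vec3 camUp) {
    assert(width > 0.0f && height > 0.0f);
    // Half-extents along the camera's right and up axes, so the quad
    // always faces the viewer regardless of where it sits in the world.
    const Vec3 r = scale(camRight, width * 0.5f);
    const Vec3 u = scale(camUp, height * 0.5f);
    return {{
        sub(sub(center, r), u),
        sub(add(center, r), u),
        add(add(center, r), u),
        add(sub(center, r), u),
    }};
}
```

With a camera looking down the z axis (right = +x, up = +y), a 2 x 4 billboard centered at (1, 2, 3) spans from (0, 0, 3) to (2, 4, 3).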

blazespinnaker2
Explorer
This looks great. Just curious, have you done any perf testing compared to the tiny room demo?

lamour42
Expert Protege
No, I didn't compare performance to other examples. But performance is a big topic for me, and the Microsoft tools are very good at showing you the bottlenecks in your code.

Some remarks regarding performance:


  • I copied the approach of the Microsoft-provided DX12 examples: using 3 frames at the same time for rendering and synchronizing them with fences. Unfortunately the documentation on this topic lacks depth, so a lot of questions remain unanswered. I found it hard to come up with a system that really runs in parallel and doesn't limit access to your central objects. Certainly a topic that needs to be revisited.

  • Texture preloading. For a small framework like mine I think it is OK, even beneficial, to preload all textures during startup. It is just a lot easier when you know that all textures are already in GPU memory when you start rendering. Not something a big engine for big games could do, but for smaller applications I think it is the right way to go.

  • Threaded approach. I experimented a lot with threads for the 3D text shader. Since it is meant as a diagnostic tool, it doesn't matter if text changes show up in the world a few frames late. So when rendering, the input buffer that is already present on the GPU is simply reused. Only a few bytes with the current View/Projection matrix have to be copied to the GPU before rendering can start. A background thread updates a new GPU input buffer for all the text, and rendering just switches to the new buffer once it is ready. With this approach it doesn't really matter how much text you display (at least not until the text shader on the GPU becomes the bottleneck). In my example I display over 1000 lines of text without seeing any performance degradation at all. I still get several hundred frames per second in a window and a constant 75 fps in the Rift.

  • Rift optimizations. One advantage of starting with a completely new framework is that I do not have to pay attention to existing shader code and more traditional ways of rendering. In most existing engines, each shader updates its data for each frame in an update method, then renders in a draw method. When you draw for the Rift, you draw the two images, one for each eye, right after one another. The images are very similar, but obviously not exactly the same. There is a lot of overhead involved in going through all the update and drawing code twice. My shaders are designed so that as much duplicated work as possible is avoided. Basically, all setup (like input buffers and updating the world positions of your objects) is done only once. When it is time to issue the actual draw call on the GPU, the corrected Model/View/Projection matrix for the current eye is copied to the GPU and rendering starts.
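
To make the threaded approach above a bit more concrete, here is a rough CPU-only sketch of the "build a new buffer in the background, then switch" idea. It is not the code from LinesEffect::update(): TextBuffer, TextBufferSwap and updateInBackground are hypothetical names, a plain std::vector stands in for the GPU input buffer, and a mutex-protected shared_ptr swap stands in for whatever fence-based synchronization the real engine uses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GPU input buffer holding the text to draw.
struct TextBuffer {
    std::vector<std::string> lines;
};

// Holds the buffer the renderer currently uses. A new buffer is built
// off to the side and swapped in with one short pointer exchange.
class TextBufferSwap {
public:
    TextBufferSwap() : current_(std::make_shared<const TextBuffer>()) {}

    // Render thread: grab the current buffer. The returned pointer keeps
    // that buffer alive even if a newer one is published mid-frame.
    std::shared_ptr<const TextBuffer> acquire() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

    // Updater thread: the expensive fill happens outside the lock,
    // only the final pointer switch is done while holding it.
    void publish(std::vector<std::string> lines) {
        auto next = std::make_shared<TextBuffer>();
        next->lines = std::move(lines);
        std::lock_guard<std::mutex> lock(mutex_);
        current_ = std::move(next);
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const TextBuffer> current_;
};

// Builds `count` lines of text on a background thread and publishes them.
inline void updateInBackground(TextBufferSwap& text, std::size_t count) {
    assert(count > 0);
    std::thread worker([&text, count] {
        std::vector<std::string> lines;
        lines.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            lines.push_back("line " + std::to_string(i));
        }
        text.publish(std::move(lines));
    });
    // Joined here only to keep the sketch deterministic; a real engine
    // would keep rendering with the old buffer while the worker runs.
    worker.join();
}
```

The point of the design is that the render thread only ever pays for a pointer copy per frame, no matter how much text the background thread is preparing. A buffer acquired before the switch stays valid until the renderer lets go of it.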

galopin
Heroic Explorer
Overhead when rendering in DX12? That is because you are not thinking GPU 🙂

Yes, submitting work from the CPU can be lightning fast compared to DX11 even if you use DX12 in the old CPU-centric way, but its real strength lies in doing things differently, in a way that is more streamlined for the GPU.

My Oculus mode is broken right now, but the screenshots still show 256K objects with per-instance textures, rendered with a couple of dispatches, one ExecuteIndirect that contains 4096 draws when no culling or occlusion is performed (to emulate a collection of different objects; it should be one ExecuteIndirect per PSO, and my commands are made of one index buffer, two vertex buffers and a draw instanced), and a few draw calls for text, debug draws, GPU timers and blits.

The CPU cost of the app right now is near zero. If I look at the sky, I am still GPU bound at 0.8 ms, of which 0.6 ms is the draw indirect (it should be zero, but the driver cannot gain performance from a count buffer value smaller than the max argument count; NVIDIA needs to fix that, and AMD is just plain broken right now on ExecuteIndirect, no kidding). Imagine you culled 256K objects on the CPU: even with the right hierarchical structure, you would be way behind that.

In a real app, because DX12 is bindless, the number of unique draw calls issued from the CPU is lower than it would have been on DX11. For a stereo render you can imagine a lot of techniques. Mine is to double the groups in the ExecuteIndirect, add an extra root constant to the command signature to say left/right, and use that to select the proper view-projection-viewport matrix plus an extra clip plane between the two fake viewports (because VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation is false on NVIDIA; otherwise it would be even simpler, just a semantic to output from the VS).

The occlusion culling is the small red bar in the top left part of the screen. The red part on the right is the blit to the back buffer prior to text and GPU timers. The purple part just before it is the depth buffer pyramid for occlusion in the next frame. Most of my stuff is still rough and not optimal, and of course, you do not want to know how many millions of triangles are in these screenshots 🙂

full sized images

Only frustum culling:


With occlusion culling:


What was hidden:


The striped grey bar shows the milliseconds.

lamour42
Expert Protege
Yes, you get the feeling that once your stuff is on the GPU the speed is limitless. Here we look at 1 million billboard textures from inside the Rift, at a constant 75 FPS.