New version of the samples and tutorials based on KHR_ray_tracing

This commit is contained in:
mklefrancois 2020-03-31 17:51:08 +02:00
parent 2fd15056a2
commit b6402f0c09
271 changed files with 134108 additions and 2 deletions

File diff suppressed because one or more lines are too long. Size: 18 KiB
File diff suppressed because one or more lines are too long. Size: 13 KiB
Binary file not shown. Size: 18 KiB
Binary file not shown. Size: 64 KiB
BIN docs/Images/VkInstances.png Normal file. Binary file not shown. Size: 152 KiB
BIN docs/Images/animation1.gif Normal file. Binary file not shown. Size: 641 KiB
BIN docs/Images/animation2.gif Normal file. Binary file not shown. Size: 4.4 MiB
Binary file not shown. Size: 60 KiB
BIN docs/Images/anyhit.png Normal file. Binary file not shown. Size: 449 KiB
BIN docs/Images/callable.png Normal file. Binary file not shown. Size: 226 KiB
BIN docs/Images/manyhits.png Normal file. Binary file not shown. Size: 73 KiB
BIN docs/Images/manyhits2.png Normal file. Binary file not shown. Size: 4.1 KiB
BIN docs/Images/manyhits3.png Normal file. Binary file not shown. Size: 7.9 KiB
BIN docs/Images/manyhits4.png Normal file. Binary file not shown. Size: 9.7 KiB
Binary file not shown. Size: 366 KiB
Binary file not shown. Size: 380 KiB
BIN docs/Images/rayquery.png Normal file. Binary file not shown. Size: 146 KiB
BIN docs/Images/reflections.png Normal file. Binary file not shown. Size: 70 KiB
Binary file not shown. Size: 21 KiB
Binary file not shown. Size: 13 KiB
Binary file not shown. Size: 15 KiB
Binary file not shown. Size: 19 KiB
Binary file not shown. Size: 21 KiB
Binary file not shown. Size: 382 KiB
Binary file not shown. Size: 497 KiB

251
docs/NV_to_KHR.md.htm Normal file
View file

@ -0,0 +1,251 @@
<meta charset="utf-8" lang="en">
**Converting VK_NV_ray_tracing to VK_KHR_ray_tracing**
This document is a quick guide to what needs to change when converting an existing application
from the NV ray tracing extension to the KHR one.
# The Obvious
For most structures and enums, the `NV` suffix can simply be replaced with `KHR`.
Some examples:
NVIDIA | KHRONOS
-------------------------------|-----------------------------
VK_SHADER_STAGE_RAYGEN_BIT_NV | VK_SHADER_STAGE_RAYGEN_BIT_KHR
VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV | VK_SHADER_STAGE_CLOSEST_HIT_BIT_KHR
VK_GEOMETRY_INSTANCE_TRIANGLE_CULL_DISABLE_BIT_NV | VK_GEOMETRY_INSTANCE_TRIANGLE_CULL_DISABLE_BIT_KHR
VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV | VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_KHR
[Types and Flags]
NVIDIA | KHRONOS
-------------------------------|-----------------------------
VkWriteDescriptorSetAccelerationStructureNV | VkWriteDescriptorSetAccelerationStructureKHR
VkRayTracingShaderGroupCreateInfoNV | VkRayTracingShaderGroupCreateInfoKHR
VkRayTracingPipelineCreateInfoNV | VkRayTracingPipelineCreateInfoKHR
[Structures]
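The mechanical part of the renaming can be sketched as a small string helper. This is only an illustration: `nvToKhr` is a hypothetical function, not part of Vulkan or nvvkpp, and it assumes the identifier is one of those whose KHR counterpart differs only by its suffix. The result still has to be checked against the specification, since some NV structures have no direct KHR equivalent.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a Vulkan or nvvkpp function): swaps a trailing "NV"
// suffix for "KHR". Names that do not end in "NV" are returned unchanged.
std::string nvToKhr(const std::string& name)
{
  const std::string nv = "NV";
  if(name.size() >= nv.size() && name.compare(name.size() - nv.size(), nv.size(), nv) == 0)
  {
    return name.substr(0, name.size() - nv.size()) + "KHR";
  }
  return name;
}
```

A search-and-replace in the editor achieves the same result. The helper only makes explicit that the rule applies to the suffix and nothing else.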
# Handles versus Device Addresses -> memory allocations
With KHR, we no longer pass the buffer or a handle, but the `vk::DeviceAddress`.
First, the buffer has to be created with the `VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT` flag.
Similarly, the memory associated with this buffer has to be allocated with the `VK_MEMORY_ALLOCATE_DEVICE_ADDRESS_BIT` flag.
For the memory allocation, this could be done like this:
~~~~ C++
vk::MemoryAllocateFlagsInfo memFlagInfo;
memFlagInfo.setFlags(vk::MemoryAllocateFlagBits::eDeviceAddress);
vk::MemoryAllocateInfo memAlloc;
memAlloc.setPNext(&memFlagInfo);
// Allocate memory
~~~~
The buffer address could then be retrieved like this:
~~~~ C++
vk::DeviceAddress vertexAddress = m_device.getBufferAddress({model.vertexBuffer.buffer});
~~~~
# Where is GeometryNV?
The structure used to create a BLAS was replaced by several structures.
* `vk::AccelerationStructureCreateGeometryTypeInfoKHR`: describes how the acceleration structure is created. It indicates how large the structure could be.
* `vk::AccelerationStructureGeometryKHR`: the geometry to build, with the addresses of the vertices and indices
* `vk::AccelerationStructureBuildOffsetInfoKHR`: the number of elements to build and their offsets
Each of these three structures can be provided as an array, meaning that a BLAS can be a combination of multiple geometries.
The following example shows how they are filled. It returns a `nvvkpp::RaytracingBuilderKHR::Blas`, which holds vectors
of the above structures.
~~~~ C++
//--------------------------------------------------------------------------------------------------
// Converting a OBJ primitive to the ray tracing geometry used for the BLAS
//
nvvkpp::RaytracingBuilderKHR::Blas HelloVulkan::objectToVkGeometryKHR(const ObjModel& model)
{
// Setting up the creation info of acceleration structure
vk::AccelerationStructureCreateGeometryTypeInfoKHR asCreate;
asCreate.setGeometryType(vk::GeometryTypeKHR::eTriangles);
asCreate.setIndexType(vk::IndexType::eUint32);
asCreate.setVertexFormat(vk::Format::eR32G32B32Sfloat);
asCreate.setMaxPrimitiveCount(model.nbIndices / 3); // Nb triangles
asCreate.setMaxVertexCount(model.nbVertices);
asCreate.setAllowsTransforms(VK_FALSE); // No adding transformation matrices
// Building part
vk::DeviceAddress vertexAddress = m_device.getBufferAddress({model.vertexBuffer.buffer});
vk::DeviceAddress indexAddress = m_device.getBufferAddress({model.indexBuffer.buffer});
vk::AccelerationStructureGeometryTrianglesDataKHR triangles;
triangles.setVertexFormat(asCreate.vertexFormat);
triangles.setVertexData(vertexAddress);
triangles.setVertexStride(sizeof(VertexObj));
triangles.setIndexType(asCreate.indexType);
triangles.setIndexData(indexAddress);
triangles.setTransformData({});
// Setting up the build info of the acceleration
vk::AccelerationStructureGeometryKHR asGeom;
asGeom.setGeometryType(asCreate.geometryType);
asGeom.setFlags(vk::GeometryFlagBitsKHR::eOpaque);
asGeom.geometry.setTriangles(triangles);
// The primitive itself
vk::AccelerationStructureBuildOffsetInfoKHR offset;
offset.setFirstVertex(0);
offset.setPrimitiveCount(asCreate.maxPrimitiveCount);
offset.setPrimitiveOffset(0);
offset.setTransformOffset(0);
// Our blas is only one geometry, but could be made of many geometries
nvvkpp::RaytracingBuilderKHR::Blas blas;
blas.asGeometry.emplace_back(asGeom);
blas.asCreateGeometryInfo.emplace_back(asCreate);
blas.asBuildOffsetInfo.emplace_back(offset);
return blas;
}
~~~~
# Creating and building BLAS/TLAS
With the structures filled in, there are some similarities with the NVIDIA extension.
## BLAS
The construction of a BLAS will look like this:
~~~~ C++
vk::AccelerationStructureCreateInfoKHR asCreateInfo{{}, vk::AccelerationStructureTypeKHR::eBottomLevel};
asCreateInfo.setFlags(vk::BuildAccelerationStructureFlagBitsKHR::ePreferFastTrace);
asCreateInfo.setMaxGeometryCount((uint32_t)blas.asCreateGeometryInfo.size());
asCreateInfo.setPGeometryInfos(blas.asCreateGeometryInfo.data());
// Create an acceleration structure identifier and allocate memory to
// store the resulting structure data
blas.as = m_alloc.createAcceleration(asCreateInfo);
~~~~
When retrieving the memory requirements, there is a new parameter that indicates whether the build happens on the host or on the device.
~~~~ C++
vk::AccelerationStructureMemoryRequirementsInfoKHR memoryRequirementsInfo{
vk::AccelerationStructureMemoryRequirementsTypeKHR::eBuildScratch,
vk::AccelerationStructureBuildTypeKHR::eDevice, blas.as.accel};
~~~~
Building the acceleration structure requires passing a pointer to the array of `vk::AccelerationStructureGeometryKHR`.
~~~~ C++
const vk::AccelerationStructureGeometryKHR* pGeometry = blas.asGeometry.data();
vk::AccelerationStructureBuildGeometryInfoKHR bottomASInfo{vk::AccelerationStructureTypeKHR::eBottomLevel};
bottomASInfo.setFlags(flags);
bottomASInfo.setUpdate(VK_FALSE);
bottomASInfo.setSrcAccelerationStructure({});
bottomASInfo.setDstAccelerationStructure(blas.as.accel);
bottomASInfo.setGeometryArrayOfPointers(VK_FALSE);
bottomASInfo.setGeometryCount((uint32_t)blas.asGeometry.size());
bottomASInfo.setPpGeometries(&pGeometry);
bottomASInfo.setScratchData(scratchAddress);
~~~~
It is also necessary to create an array of pointers to the `vk::AccelerationStructureBuildOffsetInfoKHR` structures of the BLAS.
~~~~ C++
// Pointers of offset
std::vector<const vk::AccelerationStructureBuildOffsetInfoKHR*> pBuildOffset(blas.asBuildOffsetInfo.size());
for(size_t i = 0; i < blas.asBuildOffsetInfo.size(); i++)
pBuildOffset[i] = &blas.asBuildOffsetInfo[i];
~~~~
## TLAS
The same structures are now used to build the top-level, using instances as the type of geometry.
For example, here is how the AS for an array of instances can be created:
~~~~ C++
vk::AccelerationStructureCreateGeometryTypeInfoKHR geometryCreate{vk::GeometryTypeKHR::eInstances};
geometryCreate.setMaxPrimitiveCount(static_cast<uint32_t>(instances.size()));
geometryCreate.setAllowsTransforms(VK_TRUE);
vk::AccelerationStructureCreateInfoKHR asCreateInfo{{}, vk::AccelerationStructureTypeKHR::eTopLevel};
asCreateInfo.setFlags(flags);
asCreateInfo.setMaxGeometryCount(1);
asCreateInfo.setPGeometryInfos(&geometryCreate);
// Create the acceleration structure object and allocate the memory
// required to hold the TLAS data
m_tlas.as = m_alloc.createAcceleration(asCreateInfo);
~~~~
Also, there is now a structure to hold the instances: `vk::AccelerationStructureInstanceKHR`.
You will need to fill an array with all the information, create a buffer, and use
its address to set the geometry.
~~~~ C++
// Allocate the instance buffer and copy its contents from host to device
// memory
m_instBuffer = m_alloc.createBuffer(cmdBuf, geometryInstances,
vk::BufferUsageFlagBits::eRayTracingKHR | vk::BufferUsageFlagBits::eShaderDeviceAddress);
m_debug.setObjectName(m_instBuffer.buffer, "TLASInstances");
vk::DeviceAddress instanceAddress = m_device.getBufferAddress(m_instBuffer.buffer);
vk::AccelerationStructureGeometryKHR topASGeometry{vk::GeometryTypeKHR::eInstances};
topASGeometry.geometry.instances.setArrayOfPointers(VK_FALSE);
topASGeometry.geometry.instances.setData(instanceAddress);
~~~~
# Calling TraceRaysKHR
This is very close to the NVIDIA version. The difference is that instead of passing buffers, offsets and strides
for each stage, we have to fill a `vk::StridedBufferRegionKHR` structure for each stage. It holds
the same parameters: buffer, offset, stride and SBT size.
Example:
~~~~ C++
vk::DeviceSize sbtSize = progSize * (vk::DeviceSize)m_rtShaderGroups.size();
const vk::StridedBufferRegionKHR raygenShaderBindingTable = {m_rtSBTBuffer.buffer, rayGenOffset,
progSize, sbtSize};
const vk::StridedBufferRegionKHR missShaderBindingTable = {m_rtSBTBuffer.buffer, missOffset,
progSize, sbtSize};
const vk::StridedBufferRegionKHR hitShaderBindingTable = {m_rtSBTBuffer.buffer, hitGroupOffset,
progSize, sbtSize};
const vk::StridedBufferRegionKHR callableShaderBindingTable;
cmdBuf.traceRaysKHR(&raygenShaderBindingTable, &missShaderBindingTable, &hitShaderBindingTable,
&callableShaderBindingTable, //
m_size.width, m_size.height, 1); //
~~~~
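The offsets used in the example above are not defined in this excerpt. The following sketch shows one way they could be derived. It assumes the shader groups are stored in the order raygen, miss, hit, that every entry occupies `progSize` bytes, and that `progSize` already satisfies the device's alignment requirements. `Region`, `SbtLayout` and `makeSbtLayout` are hypothetical names, and the buffer handle is left out so the code needs no Vulkan headers.

```cpp
#include <cassert>
#include <cstdint>

// Plain stand-in for the offset, stride and size stored in
// vk::StridedBufferRegionKHR (the buffer handle is omitted).
struct Region
{
  std::uint64_t offset;
  std::uint64_t stride;
  std::uint64_t size;
};

struct SbtLayout
{
  Region raygen;
  Region miss;
  Region hit;
};

// Assumed layout: raygen groups first, then miss groups, then hit groups,
// each entry occupying progSize bytes.
SbtLayout makeSbtLayout(std::uint64_t progSize, std::uint32_t raygenCount,
                        std::uint32_t missCount, std::uint32_t hitCount)
{
  const std::uint64_t groupCount = std::uint64_t(raygenCount) + missCount + hitCount;
  const std::uint64_t sbtSize    = progSize * groupCount;

  const std::uint64_t rayGenOffset   = 0;
  const std::uint64_t missOffset     = rayGenOffset + progSize * raygenCount;
  const std::uint64_t hitGroupOffset = missOffset + progSize * missCount;

  // As in the example above, every region reports the size of the whole SBT.
  SbtLayout layout;
  layout.raygen = {rayGenOffset, progSize, sbtSize};
  layout.miss   = {missOffset, progSize, sbtSize};
  layout.hit    = {hitGroupOffset, progSize, sbtSize};
  return layout;
}
```

With one raygen group, two miss groups, one hit group and a `progSize` of 32 bytes, this places the miss region at byte 32 and the hit region at byte 96 of a 128-byte table.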
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>

2
docs/README.md Normal file
View file

@ -0,0 +1,2 @@

# Start [Ray Tracing Tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial/)

BIN
docs/files/shaders.zip Normal file

Binary file not shown.

Binary file not shown.

12
docs/index.html Normal file
View file

@ -0,0 +1,12 @@
<meta charset="utf-8">
(insert vkrt_tutorial.md.htm here)
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>

45
docs/setup.md.html Normal file
View file

@ -0,0 +1,45 @@
 <meta charset="utf-8">
# Environment Setup
To get support for `VK_NV_ray_tracing`, please install an [NVIDIA driver](http://www.nvidia.com/Download/index.aspx?lang=en-us)
with version 440.97 or later, and the [Vulkan SDK](http://vulkan.lunarg.com/sdk/home) version 1.1.126.0 or later.
This tutorial is a modification of [`ray_tracing__simple`](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing__simple), which is the result of the ray tracing tutorial.
All following instructions are based on the modification of this project.
Besides the current repository, you will also need to clone or download the following repositories:
* [shared_sources](https://github.com/nvpro-samples/shared_sources): The primary framework that all samples depend on.
* [shared_external](https://github.com/nvpro-samples/shared_external): Third party libraries that are provided pre-compiled, mostly for Windows x64 / MSVC.
The directory structure should look like this:
**********************************************************
* \
* |
* +-- 📂 shared_external
* |
* +-- 📂 shared_sources
* |
* +-- 📂 vk_raytracing_tutorial
* | |
* | +-- 📂 ray_tracing__simple (<-- Start here)
* | |
* | +-- 📂 ray_tracing_...
* | ⋮
* |
* ⋮
**********************************************************
!!! Warning
**Run CMake** in vk_raytracing_tutorial.
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>

View file

@ -0,0 +1,587 @@
<meta charset="utf-8" lang="en">
**NVIDIA Vulkan Ray Tracing Tutorial**
**Animation**
<small>Authors: [Martin-Karl Lefrançois](https://devblogs.nvidia.com/author/mlefrancois/), Neil Bickford </small>
# Animation
![](Images/animation2.gif)
This is an extension of the [Vulkan ray tracing tutorial](vkrt_tutorial.md.htm).
We will discuss two animation methods: animating only the transformation matrices, and animating the geometry itself.
(insert setup.md.html here)
# Animating the Matrices
This first example shows how we can update the matrices used for instances in the TLAS.
## Creating a Scene
In main.cpp we can create a new scene with a ground plane and 21 instances of the Wuson model, by replacing the
`helloVk.loadModel` calls in `main()`. The code below creates all of the instances
at the same position, but we will displace them later in the animation function. If you run the example,
you will find that the rendering is considerably slow, because the geometries are exactly at the same position
and the acceleration structure does not deal with this well.
~~~~ C++
helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths),
nvmath::scale_mat4(nvmath::vec3f(2.f, 1.f, 2.f)));
helloVk.loadModel(nvh::findFile("media/scenes/wuson.obj", defaultSearchPaths));
HelloVulkan::ObjInstance inst = helloVk.m_objInstance.back();
for(int i = 0; i < 20; i++)
helloVk.m_objInstance.push_back(inst);
~~~~
## Animation Function
We want to have all of the Wuson models running in a circle, and we will first modify the rasterizer to handle this.
Animating the transformation matrices will be done entirely on the CPU, and we will copy the computed transformation to the GPU.
In the next example, the animation will be done on the GPU using a compute shader.
Add the declaration of the animation to the `HelloVulkan` class.
~~~~ C++
void animationInstances(float time);
~~~~
The first part computes the transformations for all of the Wuson models, placing each one behind another.
~~~~ C++
void HelloVulkan::animationInstances(float time)
{
const int32_t nbWuson = static_cast<int32_t>(m_objInstance.size() - 1);
const float deltaAngle = 6.28318530718f / static_cast<float>(nbWuson);
const float wusonLength = 3.f;
const float radius = wusonLength / (2.f * sin(deltaAngle / 2.0f));
const float offset = time * 0.5f;
for(int i = 0; i < nbWuson; i++)
{
int wusonIdx = i + 1;
ObjInstance& inst = m_objInstance[wusonIdx];
inst.transform = nvmath::rotation_mat4_y(i * deltaAngle + offset)
* nvmath::translation_mat4(radius, 0.f, 0.f);
inst.transformIT = nvmath::transpose(nvmath::invert(inst.transform));
}
~~~~
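The `radius` expression is the chord-length formula: models spaced by `deltaAngle` on a circle are `wusonLength` apart, measured in a straight line, when `radius = wusonLength / (2 * sin(deltaAngle / 2))`. The standalone sketch below checks this numerically. `ringRadius` and `neighbourDistance` are hypothetical helper names, not part of the tutorial code.

```cpp
#include <cassert>
#include <cmath>

// Radius of the circle on which `count` evenly spaced models end up
// `spacing` units apart (straight-line distance between neighbours).
float ringRadius(int count, float spacing)
{
  assert(count >= 2);
  const float deltaAngle = 6.28318530718f / static_cast<float>(count);
  return spacing / (2.f * std::sin(deltaAngle / 2.f));
}

// Straight-line distance between the model at angle 0 and the one at deltaAngle.
float neighbourDistance(int count, float radius)
{
  const float deltaAngle = 6.28318530718f / static_cast<float>(count);
  const float dx = radius * std::cos(deltaAngle) - radius;
  const float dz = radius * std::sin(deltaAngle);
  return std::sqrt(dx * dx + dz * dz);
}
```

For 20 models and a length of 3 units the radius comes out at roughly 9.59, and `neighbourDistance` returns the 3 units again up to floating-point error.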
Next, we update the buffer that describes the scene, which is used by the rasterizer to set each object's position, and also by the ray tracer to compute shading normals.
~~~~ C++
// Update the buffer
vk::DeviceSize bufferSize = m_objInstance.size() * sizeof(ObjInstance);
nvvkBuffer stagingBuffer = m_alloc.createBuffer(bufferSize, vk::BufferUsageFlagBits::eTransferSrc,
vk::MemoryPropertyFlagBits::eHostVisible);
// Copy data to staging buffer
auto* gInst = m_alloc.map(stagingBuffer);
memcpy(gInst, m_objInstance.data(), bufferSize);
m_alloc.unmap(stagingBuffer);
// Copy staging buffer to the Scene Description buffer
nvvkpp::SingleCommandBuffer genCmdBuf(m_device, m_graphicsQueueIndex);
vk::CommandBuffer cmdBuf = genCmdBuf.createCommandBuffer();
cmdBuf.copyBuffer(stagingBuffer.buffer, m_sceneDesc.buffer, vk::BufferCopy(0, 0, bufferSize));
m_debug.endLabel(cmdBuf);
genCmdBuf.flushCommandBuffer(cmdBuf);
m_alloc.destroy(stagingBuffer);
}
~~~~
<script type="preformatted">
!!! Note:
We could have used `cmdBuf.updateBuffer<ObjInstance>(m_sceneDesc.buffer, 0, m_objInstance)` to
update the buffer, but this function only works for updates of at most 65,536 bytes. If we had 2000 Wuson models, this
call wouldn't work.
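A minimal sketch of that size check follows. It assumes the limits that `vkCmdUpdateBuffer` places on its data size: greater than zero, at most 65,536 bytes and a multiple of 4. `fitsUpdateBuffer` is a hypothetical helper, and the instance size has to be supplied by the caller, for example as `sizeof(ObjInstance)`.

```cpp
#include <cassert>
#include <cstddef>

// Upper bound on the number of bytes a single vkCmdUpdateBuffer call may write.
constexpr std::size_t kMaxUpdateBufferBytes = 65536;

// Returns true when `instanceCount` instances of `instanceSize` bytes each
// could be uploaded with one updateBuffer call instead of a staging buffer.
bool fitsUpdateBuffer(std::size_t instanceCount, std::size_t instanceSize)
{
  const std::size_t bytes = instanceCount * instanceSize;
  return bytes > 0 && bytes <= kMaxUpdateBufferBytes && bytes % 4 == 0;
}
```

With an assumed instance size of 144 bytes, the 21 instances of this scene would fit (3,024 bytes), whereas 2000 instances (288,000 bytes) would not.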
## Loop Animation
In `main()`, just before the main loop, add a variable to hold the start time.
We will use this time in our animation function.
~~~~ C++
auto start = std::chrono::system_clock::now();
~~~~
Inside the `while` loop, just before calling `appBase.prepareFrame()`, invoke the animation function.
~~~~ C++
std::chrono::duration<float> diff = std::chrono::system_clock::now() - start;
helloVk.animationInstances(diff.count());
~~~~
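As a possible variation, and not what the tutorial code uses, the elapsed time could be measured with `std::chrono::steady_clock`. That clock is monotonic, so the animation time cannot jump if the system clock is adjusted while the application runs. `secondsSince` is a hypothetical helper name.

```cpp
#include <cassert>
#include <chrono>

// Seconds elapsed since `start`, as a float that can be passed to the
// animation functions.
float secondsSince(std::chrono::steady_clock::time_point start)
{
  const std::chrono::duration<float> diff = std::chrono::steady_clock::now() - start;
  return diff.count();
}
```

In this variant, `start` would be initialized with `std::chrono::steady_clock::now()` before the main loop, and the loop would call `helloVk.animationInstances(secondsSince(start))`.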
If you run the application, the Wuson models will be running in a circle when using the rasterizer, but
they will still be at their original positions in the ray traced version. We will need to update the TLAS for this.
# Update TLAS
Since we want to update the transformation matrices in the TLAS, we need to keep some of the objects used to create it.
First, move the vector of `nvvkpp::RaytracingBuilder::Instance` objects from `HelloVulkan::createTopLevelAS()` to the
`HelloVulkan` class.
~~~~ C++
std::vector<nvvkpp::RaytracingBuilder::Instance> m_tlas;
~~~~
Make sure to rename it to `m_tlas`, instead of `tlas`.
One important point is that we need to set the TLAS build flags to allow updates, by adding the `vk::BuildAccelerationStructureFlagBitsKHR::eAllowUpdate` flag.
This is required; otherwise the TLAS cannot be updated.
~~~~ C++
void HelloVulkan::createTopLevelAS()
{
m_tlas.reserve(m_objInstance.size());
for(int i = 0; i < static_cast<int>(m_objInstance.size()); i++)
{
nvvkpp::RaytracingBuilder::Instance rayInst;
rayInst.transform = m_objInstance[i].transform; // Position of the instance
rayInst.instanceId = i; // gl_InstanceID
rayInst.blasId = m_objInstance[i].objIndex;
rayInst.hitGroupId = m_objInstance[i].hitgroup;
rayInst.flags = vk::GeometryInstanceFlagBitsKHR::eTriangleCullDisable;
m_tlas.emplace_back(rayInst);
}
m_rtBuilder.buildTlas(m_tlas, vk::BuildAccelerationStructureFlagBitsKHR::ePreferFastTrace
| vk::BuildAccelerationStructureFlagBitsKHR::eAllowUpdate);
}
~~~~
Back in `HelloVulkan::animationInstances()`, we need to copy the new computed transformation
matrices to the vector of `nvvkpp::RaytracingBuilder::Instance` objects.
At the end of the `for` loop, add:
~~~~ C++
nvvkpp::RaytracingBuilder::Instance& tinst = m_tlas[wusonIdx];
tinst.transform = inst.transform;
~~~~
The last point is to call the update at the end of the function.
~~~~ C++
m_rtBuilder.updateTlasMatrices(m_tlas);
~~~~
![](Images/animation1.gif)
## nvvkpp::RaytracingBuilder::updateTlasMatrices (Implementation)
We currently use `nvvkpp::RaytracingBuilder` to update the matrices for convenience, but
this could be done more efficiently if one kept some of the buffer and memory references. Using a
memory allocator, such as the one described in the [Many Objects Tutorial](vkrt_tuto_instances.md.htm),
could also be an alternative for avoiding multiple reallocations. Here's the implementation of `nvvkpp::RaytracingBuilder::updateTlasMatrices`.
### Staging Buffer
As in the rasterizer, the data needs to be staged before it can be copied to the buffer used for
building the TLAS.
~~~~ C++
void updateTlasMatrices(const std::vector<Instance>& instances)
{
VkDeviceSize bufferSize = instances.size() * sizeof(vk::AccelerationStructureInstanceKHR);
// Create a staging buffer on the host to upload the new instance data
nvvkBuffer stagingBuffer = m_alloc.createBuffer(bufferSize, vk::BufferUsageFlagBits::eTransferSrc,
#if defined(ALLOC_VMA)
VmaMemoryUsage::VMA_MEMORY_USAGE_CPU_TO_GPU
#else
vk::MemoryPropertyFlagBits::eHostVisible | vk::MemoryPropertyFlagBits::eHostCoherent
#endif
);
// Copy the instance data into the staging buffer
auto* gInst = reinterpret_cast<vk::AccelerationStructureInstanceKHR*>(m_alloc.map(stagingBuffer));
for(int i = 0; i < instances.size(); i++)
{
gInst[i] = instanceToVkGeometryInstanceKHR(instances[i]);
}
m_alloc.unmap(stagingBuffer);
~~~~
### Scratch Memory
Building the TLAS always needs scratch memory, and so we need to request it. If
we hadn't set the `eAllowUpdate` flag, the returned size would be zero and the rest of the code
would fail.
~~~~ C++
// Compute the amount of scratch memory required by the AS builder to update the TLAS
vk::AccelerationStructureMemoryRequirementsInfoKHR memoryRequirementsInfo{
vk::AccelerationStructureMemoryRequirementsTypeKHR::eUpdateScratch,
vk::AccelerationStructureBuildTypeKHR::eDevice, m_tlas.as.accel};
vk::DeviceSize scratchSize =
m_device.getAccelerationStructureMemoryRequirementsKHR(memoryRequirementsInfo).memoryRequirements.size;
// Allocate the scratch buffer
nvvkBuffer scratchBuffer = m_alloc.createBuffer(scratchSize, vk::BufferUsageFlagBits::eRayTracingKHR
| vk::BufferUsageFlagBits::eShaderDeviceAddress);
vk::DeviceAddress scratchAddress = m_device.getBufferAddress({scratchBuffer.buffer});
~~~~
### Update the Buffer
In a new command buffer, we copy the staging buffer to the device buffer and
add a barrier to make sure the memory finishes copying before updating the TLAS.
~~~~ C++
// Update the instance buffer on the device side and build the TLAS
nvvkpp::SingleCommandBuffer genCmdBuf(m_device, m_queueIndex);
vk::CommandBuffer cmdBuf = genCmdBuf.createCommandBuffer();
cmdBuf.copyBuffer(stagingBuffer.buffer, m_instBuffer.buffer, vk::BufferCopy(0, 0, bufferSize));
vk::DeviceAddress instanceAddress = m_device.getBufferAddress(m_instBuffer.buffer);
// Make sure the copy to the instance buffer has finished before triggering the
// acceleration structure build
vk::MemoryBarrier barrier(vk::AccessFlagBits::eTransferWrite, vk::AccessFlagBits::eAccelerationStructureWriteKHR);
cmdBuf.pipelineBarrier(vk::PipelineStageFlagBits::eTransfer, vk::PipelineStageFlagBits::eAccelerationStructureBuildKHR,
vk::DependencyFlags(), {barrier}, {}, {});
~~~~
### Update Acceleration Structure
We update the TLAS using the same acceleration structure for source and
destination to update it in place, and using the VK_TRUE parameter to trigger the update.
~~~~ C++
vk::AccelerationStructureGeometryKHR topASGeometry{vk::GeometryTypeKHR::eInstances};
topASGeometry.geometry.instances.arrayOfPointers = VK_FALSE;
topASGeometry.geometry.instances.data = instanceAddress;
const vk::AccelerationStructureGeometryKHR* pGeometry = &topASGeometry;
vk::AccelerationStructureBuildGeometryInfoKHR topASInfo;
topASInfo.setFlags(m_tlas.flags);
topASInfo.setUpdate(VK_TRUE);
topASInfo.setSrcAccelerationStructure(m_tlas.as.accel);
topASInfo.setDstAccelerationStructure(m_tlas.as.accel);
topASInfo.setGeometryArrayOfPointers(VK_FALSE);
topASInfo.setGeometryCount(1);
topASInfo.setPpGeometries(&pGeometry);
topASInfo.setScratchData(scratchAddress);
uint32_t nbInstances = (uint32_t)instances.size();
vk::AccelerationStructureBuildOffsetInfoKHR buildOffsetInfo = {nbInstances, 0, 0, 0};
const vk::AccelerationStructureBuildOffsetInfoKHR* pBuildOffsetInfo = &buildOffsetInfo;
// Update the acceleration structure. Note the VK_TRUE parameter to trigger the update,
// and the existing TLAS being passed and updated in place
cmdBuf.buildAccelerationStructureKHR(1, &topASInfo, &pBuildOffsetInfo);
genCmdBuf.flushCommandBuffer(cmdBuf);
~~~~
### Cleanup
Finally, we release all temporary buffers.
~~~~ C++
m_alloc.destroy(scratchBuffer);
m_alloc.destroy(stagingBuffer);
}
~~~~
# BLAS Animation
In the previous chapter, we updated the transformation matrices. In this one we will modify vertices in a compute shader.
## Adding a Sphere
In this chapter, we will animate a sphere. In `main.cpp`, set up the scene like this:
~~~~ C++
helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths),
nvmath::scale_mat4(nvmath::vec3f(2.f, 1.f, 2.f)));
helloVk.loadModel(nvh::findFile("media/scenes/wuson.obj", defaultSearchPaths));
HelloVulkan::ObjInstance inst = helloVk.m_objInstance.back();
for(int i = 0; i < 5; i++)
helloVk.m_objInstance.push_back(inst);
helloVk.loadModel(nvh::findFile("media/scenes/sphere.obj", defaultSearchPaths));
~~~~
Because we now have a new instance, we have to adjust the calculation of the number of Wuson models in `HelloVulkan::animationInstances()`.
~~~~ C++
const int32_t nbWuson = static_cast<int32_t>(m_objInstance.size() - 2);
~~~~
## Compute Shader
The compute shader will update the vertices in-place.
Add all of the following members to the `HelloVulkan` class:
~~~~ C++
void createCompDescriptors();
void updateCompDescriptors(nvvkBuffer& vertex);
void createCompPipelines();
std::vector<vk::DescriptorSetLayoutBinding> m_compDescSetLayoutBind;
vk::DescriptorPool m_compDescPool;
vk::DescriptorSetLayout m_compDescSetLayout;
vk::DescriptorSet m_compDescSet;
vk::Pipeline m_compPipeline;
vk::PipelineLayout m_compPipelineLayout;
~~~~
The compute shader will work on a single `VertexObj` buffer.
~~~~ C++
void HelloVulkan::createCompDescriptors()
{
m_compDescSetLayoutBind.emplace_back(vk::DescriptorSetLayoutBinding(
0, vk::DescriptorType::eStorageBuffer, 1, vk::ShaderStageFlagBits::eCompute));
m_compDescSetLayout = nvvkpp::util::createDescriptorSetLayout(m_device, m_compDescSetLayoutBind);
m_compDescPool = nvvkpp::util::createDescriptorPool(m_device, m_compDescSetLayoutBind, 1);
m_compDescSet = nvvkpp::util::createDescriptorSet(m_device, m_compDescPool, m_compDescSetLayout);
}
~~~~
`updateCompDescriptors` will point the descriptor at the buffer of `VertexObj` objects to which the animation will be applied.
~~~~ C++
void HelloVulkan::updateCompDescriptors(nvvkBuffer& vertex)
{
std::vector<vk::WriteDescriptorSet> writes;
vk::DescriptorBufferInfo dbiUnif{vertex.buffer, 0, VK_WHOLE_SIZE};
writes.emplace_back(
nvvkpp::util::createWrite(m_compDescSet, m_compDescSetLayoutBind[0], &dbiUnif));
m_device.updateDescriptorSets(static_cast<uint32_t>(writes.size()), writes.data(), 0, nullptr);
}
~~~~
The compute pipeline will consist of a simple shader and a push constant, which will be used
to set the animation time.
~~~~ C++
void HelloVulkan::createCompPipelines()
{
// pushing time
vk::PushConstantRange push_constants = {vk::ShaderStageFlagBits::eCompute, 0, sizeof(float)};
vk::PipelineLayoutCreateInfo layout_info{{}, 1, &m_compDescSetLayout, 1, &push_constants};
m_compPipelineLayout = m_device.createPipelineLayout(layout_info);
vk::ComputePipelineCreateInfo computePipelineCreateInfo{{}, {}, m_compPipelineLayout};
computePipelineCreateInfo.stage =
nvvkpp::util::loadShader(m_device,
nvh::loadFile("shaders/anim.comp.spv", true, defaultSearchPaths),
vk::ShaderStageFlagBits::eCompute);
m_compPipeline = m_device.createComputePipelines({}, computePipelineCreateInfo, nullptr)[0];
m_device.destroy(computePipelineCreateInfo.stage.module);
}
~~~~
Finally, destroy the resources in `HelloVulkan::destroyResources()`:
~~~~ C++
m_device.destroy(m_compDescPool);
m_device.destroy(m_compDescSetLayout);
m_device.destroy(m_compPipeline);
m_device.destroy(m_compPipelineLayout);
~~~~
## `anim.comp`
The compute shader will be simple. We need to add a new shader file, `anim.comp`, to the `shaders` filter in the solution.
This will move each vertex up and down over time.
~~~~ C++
#version 460
#extension GL_ARB_separate_shader_objects : enable
#extension GL_EXT_scalar_block_layout : enable
#extension GL_GOOGLE_include_directive : enable
#include "wavefront.glsl"
layout(binding = 0, scalar) buffer Vertices
{
Vertex v[];
}
vertices;
layout(push_constant) uniform shaderInformation
{
float iTime;
}
pushc;
void main()
{
Vertex v0 = vertices.v[gl_GlobalInvocationID.x];
// Compute vertex position
const float PI = 3.14159265;
const float signY = (v0.pos.y >= 0 ? 1 : -1);
const float radius = length(v0.pos.xz);
const float argument = pushc.iTime * 4 + radius * PI;
const float s = sin(argument);
v0.pos.y = signY * abs(s) * 0.5;
// Compute normal
if(radius == 0.0f)
{
v0.nrm = vec3(0.0f, signY, 0.0f);
}
else
{
const float c = cos(argument);
const float xzFactor = -PI * s * c;
const float yFactor = 2.0f * signY * radius * abs(s);
v0.nrm = normalize(vec3(v0.pos.x * xzFactor, yFactor, v0.pos.z * xzFactor));
}
vertices.v[gl_GlobalInvocationID.x] = v0;
}
~~~~
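To check the math of the shader without a GPU, the same computation can be mirrored on the CPU. The sketch below is a line-by-line C++ transcription of `main()` above. `Vec3` and the reduced `Vertex` are stand-ins that keep only the two fields the shader modifies (the real layout comes from `wavefront.glsl`), and `animateVertex` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

struct Vec3
{
  float x, y, z;
};

// Reduced stand-in for the shader's Vertex: only position and normal.
struct Vertex
{
  Vec3 pos;
  Vec3 nrm;
};

// CPU transcription of anim.comp for a single vertex.
Vertex animateVertex(Vertex v0, float iTime)
{
  const float PI       = 3.14159265f;
  const float signY    = (v0.pos.y >= 0.f ? 1.f : -1.f);
  const float radius   = std::sqrt(v0.pos.x * v0.pos.x + v0.pos.z * v0.pos.z);
  const float argument = iTime * 4.f + radius * PI;
  const float s        = std::sin(argument);
  v0.pos.y             = signY * std::fabs(s) * 0.5f;

  if(radius == 0.0f)
  {
    v0.nrm = {0.0f, signY, 0.0f};
  }
  else
  {
    const float c        = std::cos(argument);
    const float xzFactor = -PI * s * c;
    const float yFactor  = 2.0f * signY * radius * std::fabs(s);
    const Vec3  n        = {v0.pos.x * xzFactor, yFactor, v0.pos.z * xzFactor};
    // Like GLSL normalize(), this does not guard against a zero-length vector.
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    v0.nrm          = {n.x / len, n.y / len, n.z / len};
  }
  return v0;
}
```

For a vertex at `(0.5, 1, 0)` and `iTime = 0`, the argument is π/2, so the height becomes 0.5 and the normal points almost exactly along +Y.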
## Animating the Object
First add the declaration of the animation function in `HelloVulkan`:
~~~~ C++
void animationObject(float time);
~~~~
The implementation only pushes the current time and calls the compute shader (`dispatch`).
~~~~ C++
void HelloVulkan::animationObject(float time)
{
ObjModel& model = m_objModel[2];
updateCompDescriptors(model.vertexBuffer);
nvvkpp::SingleCommandBuffer genCmdBuf(m_device, m_graphicsQueueIndex);
vk::CommandBuffer cmdBuf = genCmdBuf.createCommandBuffer();
cmdBuf.bindPipeline(vk::PipelineBindPoint::eCompute, m_compPipeline);
cmdBuf.bindDescriptorSets(vk::PipelineBindPoint::eCompute, m_compPipelineLayout, 0,
{m_compDescSet}, {});
cmdBuf.pushConstants(m_compPipelineLayout, vk::ShaderStageFlagBits::eCompute, 0, sizeof(float),
&time);
cmdBuf.dispatch(model.nbVertices, 1, 1);
genCmdBuf.flushCommandBuffer(cmdBuf);
}
~~~~
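The dispatch above launches one workgroup per vertex, which matches a shader that does not declare a larger local size. A common alternative, shown here only as a sketch, is to use larger workgroups and round the group count up. The local size of 64 mentioned below is a hypothetical value: the shader would then need a matching `layout(local_size_x = 64) in;` declaration and a bounds check on `gl_GlobalInvocationID.x`, neither of which is in the listing above. `groupCount` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Number of workgroups needed to cover `elementCount` elements when each
// workgroup processes `localSize` of them (rounded up).
std::uint32_t groupCount(std::uint32_t elementCount, std::uint32_t localSize)
{
  assert(localSize > 0);
  return elementCount / localSize + (elementCount % localSize != 0 ? 1u : 0u);
}
```

The call would then read `cmdBuf.dispatch(groupCount(model.nbVertices, 64), 1, 1);`. With a local size of 1 the helper returns the vertex count unchanged, which corresponds to the dispatch used in this tutorial.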
## Invoking Animation
In `main.cpp`, after the other resource creation functions, add the creation functions for the compute shader.
~~~~ C++
helloVk.createCompDescriptors();
helloVk.createCompPipelines();
~~~~
In the rendering loop, after the call to `animationInstances`, call the object animation function.
~~~~ C++
helloVk.animationObject(diff.count());
~~~~
!!! Note
At this point, the object should be animated when using the rasterizer, but should still be immobile when using the ray tracer.
## Update BLAS
In `nvvkpp::RaytracingBuilder` in `raytrace_vkpp.hpp`, we can add a function to update a BLAS whose vertex buffer was previously updated. This function is very similar to the one used for instances, but in this case, there is no buffer transfer to do.
~~~~ C++
//--------------------------------------------------------------------------------------------------
// Refit the BLAS from updated buffers
//
void updateBlas(uint32_t blasIdx)
{
Blas& blas = m_blas[blasIdx];
// Compute the amount of scratch memory required by the AS builder to update the BLAS
vk::AccelerationStructureMemoryRequirementsInfoKHR memoryRequirementsInfo{
vk::AccelerationStructureMemoryRequirementsTypeKHR::eUpdateScratch,
vk::AccelerationStructureBuildTypeKHR::eDevice, blas.as.accel};
vk::DeviceSize scratchSize =
m_device.getAccelerationStructureMemoryRequirementsKHR(memoryRequirementsInfo).memoryRequirements.size;
// Allocate the scratch buffer
nvvkBuffer scratchBuffer = m_alloc.createBuffer(scratchSize, vk::BufferUsageFlagBits::eRayTracingKHR
| vk::BufferUsageFlagBits::eShaderDeviceAddress);
vk::DeviceAddress scratchAddress = m_device.getBufferAddress({scratchBuffer.buffer});
const vk::AccelerationStructureGeometryKHR* pGeometry = blas.asGeometry.data();
vk::AccelerationStructureBuildGeometryInfoKHR asInfo{vk::AccelerationStructureTypeKHR::eBottomLevel};
asInfo.setFlags(blas.flags);
asInfo.setUpdate(VK_TRUE);
asInfo.setSrcAccelerationStructure(blas.as.accel);
asInfo.setDstAccelerationStructure(blas.as.accel);
asInfo.setGeometryArrayOfPointers(VK_FALSE);
asInfo.setGeometryCount((uint32_t)blas.asGeometry.size());
asInfo.setPpGeometries(&pGeometry);
asInfo.setScratchData(scratchAddress);
std::vector<const vk::AccelerationStructureBuildOffsetInfoKHR*> pBuildOffset(blas.asBuildOffsetInfo.size());
for(size_t i = 0; i < blas.asBuildOffsetInfo.size(); i++)
pBuildOffset[i] = &blas.asBuildOffsetInfo[i];
// Record the BLAS update on a one-time command buffer; no buffer transfer is needed here
nvvkpp::SingleCommandBuffer genCmdBuf(m_device, m_queueIndex);
vk::CommandBuffer cmdBuf = genCmdBuf.createCommandBuffer();
// Update the acceleration structure. Note the VK_TRUE parameter to trigger the update,
// and the existing BLAS being passed and updated in place
cmdBuf.buildAccelerationStructureKHR(asInfo, pBuildOffset);
genCmdBuf.flushCommandBuffer(cmdBuf);
m_alloc.destroy(scratchBuffer);
}
~~~~
The `updateBlas` function above uses the geometry information stored in `m_blas`.
To be able to reuse this information, we need to keep the `nvvkpp::RaytracingBuilderKHR::Blas` objects
that were used to create the BLAS.
Move the `nvvkpp::RaytracingBuilderKHR::Blas` vector from `HelloVulkan::createBottomLevelAS()` to the `HelloVulkan` class, renaming it to `m_blas`.
~~~~ C++
std::vector<nvvkpp::RaytracingBuilderKHR::Blas> m_blas;
~~~~
As with the TLAS, the BLAS needs to allow updates. We will also enable the
`VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_BUILD_BIT_KHR` flag, which indicates that the given
acceleration structure build should prioritize build time over trace performance.
~~~~ C++
void HelloVulkan::createBottomLevelAS()
{
// BLAS - Storing each primitive in a geometry
m_blas.reserve(m_objModel.size());
for(const auto & obj : m_objModel)
{
auto blas = objectToVkGeometryKHR(obj);
// We could add more geometry in each BLAS, but we add only one for now
m_blas.push_back(blas);
}
m_rtBuilder.buildBlas(m_blas, vk::BuildAccelerationStructureFlagBitsKHR::eAllowUpdate
| vk::BuildAccelerationStructureFlagBitsKHR::ePreferFastBuild);
}
~~~~
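Conceptually, an update (refit) keeps the hierarchy of the acceleration structure and only recomputes its bounding volumes from the moved vertices, which is why it is cheaper than a rebuild. How a driver actually implements this is opaque, so the following is only a CPU-side illustration with a hypothetical node layout (`Node`, `Box`, and `refit` are not part of the tutorial code):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };
struct Box  { Vec3 lo, hi; };

// Smallest box containing both inputs
inline Box merge(const Box& a, const Box& b)
{
  return {{std::min(a.lo.x, b.lo.x), std::min(a.lo.y, b.lo.y), std::min(a.lo.z, b.lo.z)},
          {std::max(a.hi.x, b.hi.x), std::max(a.hi.y, b.hi.y), std::max(a.hi.z, b.hi.z)}};
}

// Hypothetical node layout: a leaf owns a vertex range, an inner node owns two children
struct Node
{
  int left = -1, right = -1;  // child indices, -1 marks a leaf
  int firstVertex = 0;        // leaf only
  int vertexCount = 0;        // leaf only
  Box bounds{};
};

// Refit: the tree topology is untouched, only the bounds are recomputed.
// Assumes children are stored after their parent, so a reverse sweep is bottom-up.
inline void refit(std::vector<Node>& nodes, const std::vector<Vec3>& vertices)
{
  for(int i = static_cast<int>(nodes.size()) - 1; i >= 0; --i)
  {
    Node& n = nodes[i];
    if(n.left < 0)
    {
      assert(n.vertexCount > 0);
      Box b{vertices[n.firstVertex], vertices[n.firstVertex]};
      for(int v = 1; v < n.vertexCount; ++v)
      {
        const Vec3& p = vertices[n.firstVertex + v];
        b = merge(b, Box{p, p});
      }
      n.bounds = b;
    }
    else
    {
      n.bounds = merge(nodes[n.left].bounds, nodes[n.right].bounds);
    }
  }
}
```

Because the topology never changes, a refitted structure can become less efficient to trace if the vertices move far from their original positions. That trade-off is the reason to prefer fast builds for geometry that changes every frame.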
Finally, we can add a line at the end of `HelloVulkan::animationObject()` to update the BLAS.
~~~~ C++
m_rtBuilder.updateBlas(2);
~~~~
![](Images/animation2.gif)
</script>
# Final Code
You can find the final code in the folder [ray_tracing_animation](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing_animation)
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>

@ -0,0 +1,250 @@
<meta charset="utf-8" lang="en">
**NVIDIA Vulkan Ray Tracing Tutorial**
**Anyhit Shaders**
<small>Authors: [Martin-Karl Lefrançois](https://devblogs.nvidia.com/author/mlefrancois/), Neil Bickford </small>
![](Images/anyhit.png)
This is an extension of the Vulkan ray tracing [tutorial](vkrt_tutorial.md.htm).
Like closest hit shaders, any hit shaders operate on intersections between rays and geometry. However, the any hit
shader will be executed for all hits along the ray. The closest hit shader will then be invoked on the closest accepted
intersection.
The any hit shader can be useful for discarding intersections, such as for alpha cutouts, but can also be
used for simple transparency. In this example we will show what is needed to add this shader type and to create a
transparency effect.
!!! Note Note
This example is based on many elements from the [Antialiasing Tutorial](vkrt_tuto_jitter_cam.md.htm).
(insert setup.md.html here)
# Any Hit
## `raytrace.rahit`
Create a new shader file `raytrace.rahit` and rerun CMake to have it added to the solution.
This shader starts like `raytrace.rchit`, but uses less information.
~~~~ C++
#version 460
#extension GL_NV_ray_tracing : require
#extension GL_EXT_nonuniform_qualifier : enable
#extension GL_EXT_scalar_block_layout : enable
#extension GL_GOOGLE_include_directive : enable
#include "random.glsl"
#include "raycommon.glsl"
#include "wavefront.glsl"
// clang-format off
layout(location = 0) rayPayloadInNV hitPayload prd;
layout(binding = 2, set = 1, scalar) buffer ScnDesc { sceneDesc i[]; } scnDesc;
layout(binding = 4, set = 1) buffer MatIndexColorBuffer { int i[]; } matIndex[];
layout(binding = 5, set = 1, scalar) buffer Vertices { Vertex v[]; } vertices[];
layout(binding = 6, set = 1) buffer Indices { uint i[]; } indices[];
layout(binding = 1, set = 1, scalar) buffer MatColorBufferObject { WaveFrontMaterial m[]; } materials[];
// clang-format on
~~~~
!!! Note
You can find the source of `random.glsl` in the Antialiasing Tutorial [here](../ray_tracing_jitter_cam/README.md#toc1.1).
For the any hit shader, we need to know which material we hit, and whether that material supports transparency. If it is
opaque, we simply return, which means that the hit will be accepted.
~~~~ C++
void main()
{
// Object of this instance
uint objId = scnDesc.i[gl_InstanceID].objId;
// Indices of the triangle
uint ind = indices[objId].i[3 * gl_PrimitiveID + 0];
// Vertex of the triangle
Vertex v0 = vertices[objId].v[ind.x];
// Material of the object
int matIdx = matIndex[objId].i[gl_PrimitiveID];
WaveFrontMaterial mat = materials[objId].m[matIdx];
if (mat.illum != 4)
return;
~~~~
Now we will apply transparency:
~~~~ C++
if (mat.dissolve == 0.0)
ignoreIntersectionNV();
else if(rnd(prd.seed) > mat.dissolve)
ignoreIntersectionNV();
}
~~~~
As you can see, we are using a random number generator to determine whether the ray hits or ignores the object: a hit is
kept with probability `mat.dissolve`. If we accumulate enough rays, the final result will converge to the expected level of transparency.
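To see why the random test converges, the decision made by the shader can be replayed on the CPU. This is a minimal sketch: `lcg` and `rnd` are assumed to follow the usual linear congruential pattern of `random.glsl` (which is not shown here), and `acceptHit` and `acceptedFraction` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Linear congruential generator returning 24 random bits (assumed to match random.glsl)
inline uint32_t lcg(uint32_t& prev)
{
  prev = 1664525u * prev + 1013904223u;
  return prev & 0x00FFFFFFu;
}

// Uniform float in [0, 1)
inline float rnd(uint32_t& seed)
{
  return static_cast<float>(lcg(seed)) / static_cast<float>(0x01000000);
}

// Same decision as the any hit shader: true means the hit is accepted
inline bool acceptHit(int illum, float dissolve, uint32_t& seed)
{
  if(illum != 4)
    return true;  // opaque material: always accept
  if(dissolve == 0.0f)
    return false;  // fully transparent: always ignore
  return !(rnd(seed) > dissolve);
}

// Fraction of accepted hits over `samples` rays hitting the same material
inline float acceptedFraction(int illum, float dissolve, uint32_t seed, int samples)
{
  assert(samples > 0);
  int accepted = 0;
  for(int i = 0; i < samples; ++i)
    accepted += acceptHit(illum, dissolve, seed) ? 1 : 0;
  return static_cast<float>(accepted) / static_cast<float>(samples);
}
```

For a material with `illum 4` and `d 0.5`, about half of the rays are stopped by the surface and half pass through, so the average over many samples looks like a 50% transparent object.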
## `raycommon.glsl`
The random `seed` also needs to be passed in the ray payload.
In `raycommon.glsl`, add the seed:
~~~~ C++
struct hitPayload
{
vec3 hitValue;
uint seed;
};
~~~~
## Adding Any Hit to `createRtPipeline`
The any hit shader will be part of the hit shader group. Currently, the hit shader group only contains the closest hit shader.
In `createRtPipeline()`, after loading `raytrace.rchit.spv`, load `raytrace.rahit.spv`:
~~~~ C++
vk::ShaderModule ahitSM =
nvvkpp::util::createShaderModule(m_device, //
nvh::loadFile("shaders/raytrace.rahit.spv", true, paths));
~~~~
Add the any hit shader to the hit group:
~~~~ C++
stages.push_back({{}, vk::ShaderStageFlagBits::eClosestHitKHR, chitSM, "main"});
hg.setClosestHitShader(static_cast<uint32_t>(stages.size() - 1));
stages.push_back({{}, vk::ShaderStageFlagBits::eAnyHitKHR, ahitSM, "main"});
hg.setAnyHitShader(static_cast<uint32_t>(stages.size() - 1));
m_rtShaderGroups.push_back(hg);
~~~~
and at the end, delete it:
~~~~ C++
m_device.destroy(ahitSM);
~~~~
## Giving the Any Hit Shader Access to the Buffers
In `createDescriptorSetLayout()`, we need to allow the any hit shader to access some buffers.
This is the case for the material and scene description buffers:
~~~~ C++
// Materials (binding = 1)
m_descSetLayoutBind.emplace_back(
vkDS(1, vkDT::eStorageBuffer, nbObj,
vkSS::eVertex | vkSS::eFragment | vkSS::eClosestHitKHR | vkSS::eAnyHitKHR));
// Scene description (binding = 2)
m_descSetLayoutBind.emplace_back( //
vkDS(2, vkDT::eStorageBuffer, 1,
vkSS::eVertex | vkSS::eFragment | vkSS::eClosestHitKHR | vkSS::eAnyHitKHR));
~~~~
and also for the vertex, index and material index buffers:
~~~~ C++
// Material indices (binding = 4)
m_descSetLayoutBind.emplace_back( //
vkDS(4, vkDT::eStorageBuffer, nbObj,
vkSS::eFragment | vkSS::eClosestHitKHR | vkSS::eAnyHitKHR));
// Storing vertices (binding = 5)
m_descSetLayoutBind.emplace_back( //
vkDS(5, vkDT::eStorageBuffer, nbObj, vkSS::eClosestHitKHR | vkSS::eAnyHitKHR));
// Storing indices (binding = 6)
m_descSetLayoutBind.emplace_back( //
vkDS(6, vkDT::eStorageBuffer, nbObj, vkSS::eClosestHitKHR | vkSS::eAnyHitKHR));
~~~~
## Opaque Flag
In the example, when creating `VkAccelerationStructureGeometryKHR` objects, we set their flags to `vk::GeometryFlagBitsKHR::eOpaque`. However, this flag prevents the any hit shader from being invoked.
We could remove all of the flags, but then another issue could arise: the any hit shader could be called multiple times for the same triangle. To have the any hit shader process only one hit per triangle, set the `eNoDuplicateAnyHitInvocation` flag:
~~~~ C++
geometry.setFlags(vk::GeometryFlagBitsKHR::eNoDuplicateAnyHitInvocation);
~~~~
## `raytrace.rgen`
If you have done the previous [Jitter Camera/Antialiasing](../ray_tracing_jitter_cam) tutorial,
you will need just a few changes.
First, `seed` will need to be available in the any hit shader, which is why we added it to the `hitPayload` structure.
Change the local `seed` to `prd.seed` everywhere.
~~~~ C++
prd.seed = tea(gl_LaunchIDNV.y * gl_LaunchSizeNV.x + gl_LaunchIDNV.x, pushC.frame);
~~~~
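The seed combines a linear pixel index with the frame number, so every pixel starts each frame with its own random sequence. The following C++ port is a sketch under the assumption that `tea` in `random.glsl` is the common 16-round Tiny Encryption Algorithm hash; `pixelSeed` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// 16-round TEA-based hash (assumed to match `tea` in random.glsl)
inline uint32_t tea(uint32_t val0, uint32_t val1)
{
  uint32_t v0 = val0, v1 = val1, s0 = 0;
  for(uint32_t n = 0; n < 16; n++)
  {
    s0 += 0x9e3779b9u;
    v0 += ((v1 << 4) + 0xa341316cu) ^ (v1 + s0) ^ ((v1 >> 5) + 0xc8013ea4u);
    v1 += ((v0 << 4) + 0xad90777du) ^ (v0 + s0) ^ ((v0 >> 5) + 0x7e95761eu);
  }
  return v0;
}

// Same indexing as the ray generation shader: one linear index per pixel, hashed with the frame
inline uint32_t pixelSeed(uint32_t x, uint32_t y, uint32_t width, uint32_t frame)
{
  assert(x < width);
  return tea(y * width + x, frame);
}
```

Since the frame number is part of the hash input, the same pixel draws a different sequence on each frame, which is what lets accumulated frames average out the noise.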
As an optimization, the `traceNV` call was using the `gl_RayFlagsOpaqueNV` flag. But
this will skip the any hit shader, so change it to:
~~~~ C++
uint rayFlags = gl_RayFlagsNoneNV;
~~~~
## `raytrace.rchit`
Similarly, in the closest hit shader, change the flag of the shadow ray to `gl_RayFlagsSkipClosestHitShaderNV`: we want to enable the any hit and miss shaders, but we still don't care
about the closest hit shader for shadow rays. This will enable transparent shadows.
~~~~ C++
uint flags = gl_RayFlagsSkipClosestHitShaderNV;
~~~~
# Scene and Model
For a more interesting scene, you can replace the `helloVk.loadModel` calls in `main()` with the following scene:
~~~~ C++
helloVk.loadModel(nvh::findFile("media/scenes/wuson.obj", defaultSearchPaths));
helloVk.loadModel(nvh::findFile("media/scenes/sphere.obj", defaultSearchPaths),
nvmath::scale_mat4(nvmath::vec3f(1.5f))
* nvmath::translation_mat4(nvmath::vec3f(0.0f, 1.0f, 0.0f)));
helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths));
~~~~
## OBJ Materials
By default, all objects are opaque, so you will need to change the material description.
Edit the first few lines of `media/scenes/wuson.mtl` and `media/scenes/sphere.mtl` to use a new illumination model (4) with a dissolve value of 0.5:
~~~~ C++
newmtl default
illum 4
d 0.5
...
~~~~
# Accumulation
As mentioned earlier, for the effect to work, we need to accumulate frames over time. Please implement the following from [Jitter Camera/Antialiasing](vkrt_tuto_jitter_cam.md):
* [Frame Number](vkrt_tuto_jitter_cam.md.htm#toc1.2)
* [Storing or Updating](vkrt_tuto_jitter_cam.md.htm#toc1.4)
* [Application Frame Update](vkrt_tuto_jitter_cam.md.htm#toc2)
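The accumulation itself is a running average of the frames rendered so far. A minimal CPU-side sketch, assuming the image is blended with a weight of `1 / (frame + 1)` as in the antialiasing tutorial; `Color`, `mix`, and `accumulate` are hypothetical names used only for illustration:

```cpp
#include <cassert>

struct Color { float r, g, b; };

// Linear interpolation, same meaning as GLSL mix()
inline Color mix(const Color& a, const Color& b, float t)
{
  return {a.r * (1.0f - t) + b.r * t, a.g * (1.0f - t) + b.g * t, a.b * (1.0f - t) + b.b * t};
}

// Blend the new sample into the stored pixel so that it holds the mean of all frames so far
inline Color accumulate(const Color& stored, const Color& sample, int frame)
{
  assert(frame >= 0);
  if(frame == 0)
    return sample;  // first frame: overwrite whatever was stored
  const float a = 1.0f / static_cast<float>(frame + 1);
  return mix(stored, sample, a);
}
```

If a pixel alternates between a fully blocked ray (0) and a fully passing ray (1), the stored value settles around 0.5, which is the transparency effect the any hit shader is aiming for. The frame counter has to be reset whenever the camera or the scene changes.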
# Final Code
You can find the final code in the folder [ray_tracing_anyhit](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing_anyhit)
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>

@ -0,0 +1,203 @@
<meta charset="utf-8" lang="en">
**NVIDIA Vulkan Ray Tracing Tutorial**
**Callable Shaders**
<small>Author: [Martin-Karl Lefrançois](https://devblogs.nvidia.com/author/mlefrancois/)</small>
![](Images/callable.png)
This is an extension of the Vulkan ray tracing [tutorial](vkrt_tutorial.md.htm).
Ray tracing allows the use of [callable shaders](https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/chap8.html#shaders-callable)
from the ray-generation, closest-hit, miss, or another callable shader stage.
It is similar to an indirect function call, without having to link those shaders with the executable program.
(insert setup.md.html here)
# Data Storage
A callable shader can only access the data passed to it by the parent stage. Only one structure is passed at a time, and it should be declared like a payload.
In the parent stage, it is declared with the `callableDataNV` storage qualifier, for example:
~~~~ C++
layout(location = 0) callableDataNV rayLight cLight;
~~~~
where the `rayLight` struct is defined in a shared file.
~~~~ C++
struct rayLight
{
vec3 inHitPosition;
float outLightDistance;
vec3 outLightDir;
float outIntensity;
};
~~~~
And in the incoming callable shader, you must use the `callableDataInNV` storage qualifier.
~~~~ C++
layout(location = 0) callableDataInNV rayLight cLight;
~~~~
# Execution
To execute one of the callable shaders, the parent stage needs to call `executeCallableNV`.
The first parameter is the SBT record index, and the second one corresponds to the `location` index.
For example:
~~~~ C++
executeCallableNV(pushC.lightType, 0);
~~~~
# Adding Callable Shaders to the SBT
## Create Shader Modules
In `HelloVulkan::createRtPipeline()`, immediately after adding the closest-hit shader, we will add
three callable shaders, one for each type of light.
~~~~ C++
// Callable shaders
vk::RayTracingShaderGroupCreateInfoKHR callGroup{vk::RayTracingShaderGroupTypeKHR::eGeneral,
VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR,
VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR};
vk::ShaderModule call0 =
nvvkpp::util::createShaderModule(m_device,
nvh::loadFile("shaders/light_point.rcall.spv", true, paths));
vk::ShaderModule call1 =
nvvkpp::util::createShaderModule(m_device,
nvh::loadFile("shaders/light_spot.rcall.spv", true, paths));
vk::ShaderModule call2 =
nvvkpp::util::createShaderModule(m_device,
nvh::loadFile("shaders/light_inf.rcall.spv", true, paths));
stages.push_back({{}, vk::ShaderStageFlagBits::eCallableKHR, call0, "main"});
callGroup.setGeneralShader(static_cast<uint32_t>(stages.size() - 1));
m_rtShaderGroups.push_back(callGroup);
stages.push_back({{}, vk::ShaderStageFlagBits::eCallableKHR, call1, "main"});
callGroup.setGeneralShader(static_cast<uint32_t>(stages.size() - 1));
m_rtShaderGroups.push_back(callGroup);
stages.push_back({{}, vk::ShaderStageFlagBits::eCallableKHR, call2, "main"});
callGroup.setGeneralShader(static_cast<uint32_t>(stages.size() - 1));
m_rtShaderGroups.push_back(callGroup);
~~~~
And at the end of the function, delete the shaders.
~~~~ C++
m_device.destroy(call0);
m_device.destroy(call1);
m_device.destroy(call2);
~~~~
### Shaders
Here is the source of all the shaders:
* [light_point.rcall](https://github.com/nvpro-samples/vk_raytracing_tutorial/blob/master/ray_tracing_callable/shaders/light_point.rcall)
* [light_spot.rcall](https://github.com/nvpro-samples/vk_raytracing_tutorial/blob/master/ray_tracing_callable/shaders/light_spot.rcall)
* [light_inf.rcall](https://github.com/nvpro-samples/vk_raytracing_tutorial/blob/master/ray_tracing_callable/shaders/light_inf.rcall)
## Passing Callable to traceRaysKHR
In `HelloVulkan::raytrace()`, we have to specify where the callable shaders start. Since they were added after the hit shader, the SBT has the following layout:
********************
* +---------+
* | ray-gen |
* +---------+
* | miss0 |
* | miss1 |
* +---------+
* | hit0 |
* +---------+
* | call0 |
* | call1 |
* | call2 |
* +---------+
********************
Therefore, the callable shaders start at `4 * progSize`, after the ray-gen, the two miss, and the hit entries:
~~~~ C++
vk::DeviceSize callableGroupOffset = 4u * progSize; // Jump over the previous shaders
vk::DeviceSize callableGroupStride = progSize;
~~~~
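The factor of 4 comes from counting the entries that precede the callable shaders in the table. As a hedged sketch, the offsets of all four regions can be derived from the number of groups of each kind, assuming every entry occupies `progSize` bytes; `SbtGroupCounts`, `SbtOffsets`, and `computeSbtOffsets` are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

struct SbtGroupCounts { uint32_t raygen, miss, hit, callable; };
struct SbtOffsets     { uint64_t raygen, miss, hit, callable, totalSize; };

// Byte offset of each region when groups are stored back to back,
// in the order ray-gen, miss, hit, callable
inline SbtOffsets computeSbtOffsets(const SbtGroupCounts& counts, uint64_t progSize)
{
  assert(progSize > 0);
  SbtOffsets o{};
  o.raygen    = 0;
  o.miss      = o.raygen + counts.raygen * progSize;
  o.hit       = o.miss + counts.miss * progSize;
  o.callable  = o.hit + counts.hit * progSize;
  o.totalSize = o.callable + counts.callable * progSize;
  return o;
}
```

With one ray-gen, two miss, one hit, and three callable groups, the callable region starts at `4 * progSize` and the whole table spans `7 * progSize` bytes. Deriving the offsets this way avoids having to update a hard-coded constant when a shader group is added.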
Then we can call `traceRaysKHR`
~~~~ C++
const vk::StridedBufferRegionKHR callableShaderBindingTable = {
m_rtSBTBuffer.buffer, callableGroupOffset, progSize, sbtSize};
cmdBuf.traceRaysKHR(&raygenShaderBindingTable, &missShaderBindingTable, &hitShaderBindingTable,
&callableShaderBindingTable, //
m_size.width, m_size.height, 1); //
~~~~
# Calling the Callable Shaders
In the closest-hit shader, instead of using an if-else chain, we can now directly call the right shader based on the type of light.
~~~~ C++
cLight.inHitPosition = worldPos;
//#define DONT_USE_CALLABLE
#if defined(DONT_USE_CALLABLE)
// Point light
if(pushC.lightType == 0)
{
vec3 lDir = pushC.lightPosition - cLight.inHitPosition;
float lightDistance = length(lDir);
cLight.outIntensity = pushC.lightIntensity / (lightDistance * lightDistance);
cLight.outLightDir = normalize(lDir);
cLight.outLightDistance = lightDistance;
}
else if(pushC.lightType == 1)  // Spot light
{
vec3 lDir = pushC.lightPosition - cLight.inHitPosition;
cLight.outLightDistance = length(lDir);
cLight.outIntensity =
pushC.lightIntensity / (cLight.outLightDistance * cLight.outLightDistance);
cLight.outLightDir = normalize(lDir);
float theta = dot(cLight.outLightDir, normalize(-pushC.lightDirection));
float epsilon = pushC.lightSpotCutoff - pushC.lightSpotOuterCutoff;
float spotIntensity = clamp((theta - pushC.lightSpotOuterCutoff) / epsilon, 0.0, 1.0);
cLight.outIntensity *= spotIntensity;
}
else // Directional light
{
cLight.outLightDir = normalize(-pushC.lightDirection);
cLight.outIntensity = 1.0;
cLight.outLightDistance = 10000000;
}
#else
executeCallableNV(pushC.lightType, 0);
#endif
~~~~
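On the CPU, the closest analogy to `executeCallableNV` is a call through a table of function pointers, where the table index plays the role of the SBT record index. The sketch below ports the three branches above to plain C++ for illustration only; `LightParams` stands in for the push constants, and `kLightTable` and `executeCallable` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3  sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3  neg(const Vec3& a) { return {-a.x, -a.y, -a.z}; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline float length(const Vec3& a) { return std::sqrt(dot(a, a)); }
inline Vec3  normalize(const Vec3& a) { const float l = length(a); return {a.x / l, a.y / l, a.z / l}; }

// Mirrors the `rayLight` struct shared between the parent stage and the callables
struct RayLight
{
  Vec3  inHitPosition;
  float outLightDistance;
  Vec3  outLightDir;
  float outIntensity;
};

// Stand-in for the push constants read by the shaders
struct LightParams
{
  Vec3  position;
  Vec3  direction;
  float intensity;
  float spotCutoff;
  float spotOuterCutoff;
};

using LightFn = void (*)(const LightParams&, RayLight&);

inline void lightPoint(const LightParams& p, RayLight& l)
{
  const Vec3 lDir    = sub(p.position, l.inHitPosition);
  l.outLightDistance = length(lDir);
  l.outIntensity     = p.intensity / (l.outLightDistance * l.outLightDistance);
  l.outLightDir      = normalize(lDir);
}

inline void lightSpot(const LightParams& p, RayLight& l)
{
  lightPoint(p, l);  // same falloff as the point light, then the cone attenuation
  const float theta   = dot(l.outLightDir, normalize(neg(p.direction)));
  const float epsilon = p.spotCutoff - p.spotOuterCutoff;
  l.outIntensity *= std::min(std::max((theta - p.spotOuterCutoff) / epsilon, 0.0f), 1.0f);
}

inline void lightInfinite(const LightParams& p, RayLight& l)
{
  l.outLightDir      = normalize(neg(p.direction));
  l.outIntensity     = 1.0f;
  l.outLightDistance = 10000000.0f;
}

// One entry per callable record, in the same order as call0, call1, call2
static const LightFn kLightTable[3] = {lightPoint, lightSpot, lightInfinite};

inline void executeCallable(int lightType, const LightParams& p, RayLight& l)
{
  assert(lightType >= 0 && lightType < 3);
  kLightTable[lightType](p, l);
}
```

The caller fills `inHitPosition`, invokes the entry selected by the light type, and reads the `out*` fields back, just as the closest-hit shader does with `cLight`. Adding a new light type then means adding one function and one table entry rather than another branch.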
# Final Code
You can find the final code in the folder [ray_tracing_callable](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing_callable)
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>

@ -0,0 +1,108 @@
## [Jitter Camera (Anti-Aliasing)](vkrt_tuto_jitter_cam.md.htm)
Anti-aliases the image by accumulating small variations of rays over time.
* Random ray direction generation
* Read/write/accumulate final image
![Antialiasing](Images/antialiasing.png height= "300px")
## [Handle Thousands of Objects](vkrt_tuto_instances.md.htm)
The current example allocates memory for each object, each of which has several buffers.
This shows how to get around Vulkan's limits on the total number of memory allocations by using a memory allocator.
* Extend the limit of 4096 memory allocations
* Using memory allocators: DMA, VMA
![20000 'unique' objects](Images/VkInstances.png height= "300px")
## [Any Hit Shader (Transparency)](vkrt_tuto_anyhit.md.htm)
Implements transparent materials by adding a new shader to the Hit group and using the material
information to discard hits over time.
* Adding anyhit (.rahit) to the ray tracing pipeline
* Randomly letting the ray hit or not, which produces simple transparency
![One usage of anyhit shader](Images/anyhit.png height= "300px")
## [Reflections](vkrt_tuto_reflection.md.htm)
Reflections can be implemented by shooting new rays from the closest hit shader, or by iteratively shooting them from
the raygen shader. This example shows the limitations and differences of these implementations.
* Calling traceNV() from the closest hit shader (recursive)
* Adding more data to the ray payload to continue the ray from the raygen shader.
![Hundreds of Reflections](Images/reflections.png height= "300px")
## [Multiple Closest Hits Shader and Shader Records](vkrt_tuto_manyhits.md.htm)
Explains how to add more closest hit shaders, choose which instance uses which shader, and add data per SBT that can be
retrieved in the shader, and more.
* One closest hit shader per object
* Sharing closest hit shaders for some objects
* Passing shader record to closest hit shader
![Different Closest Hit and Shader Record](Images/manyhits.png height= "300px")
## [Animation](vkrt_tuto_animation.md.htm)
This tutorial shows how to animate the transformation matrices of the instances (TLAS)
and the vertices of an object (BLAS) in a compute shader.
* Refit of top level acceleration structure
* Refit of bottom level acceleration structure
![TLAS and BLAS Animation](Images/animation2.gif height= "300px")
## [Intersection Shader](vkrt_tuto_intersection.md.html)
Adding thousands of implicit primitives and using an intersection shader to render spheres and cubes. The tutorial
explains what is needed to get procedural hit group working.
* Intersection Shader
* Sphere intersection
* Axis aligned bounding box intersection
![Intersection Shader with Spheres and Cubes](Images/ray_tracing_intersection.png height= "300px")
## [Callable Shader](vkrt_tuto_callable.md.html)
Replacing if/else with callable shaders. The lighting code is executed in separate callable shaders instead of being part of the main shader code.
* Adding multiple callable shaders
* Calling executeCallableNV from the closest hit shader
![Infinite | Spot | Point from callable shaders](Images/callable.png height= "300px")
## [Ray Query](vkrt_tuto_rayquery.md.html)
Invoking ray intersection queries directly from the fragment shader to cast shadow rays.
* Ray tracing directly from the fragment shader
![Ray Query](Images/rayquery.png height= "300px")
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://casual-effects.com/markdeep/latest/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>

@ -0,0 +1,308 @@
<meta charset="utf-8" lang="en">
**NVIDIA Vulkan Ray Tracing Tutorial**
**Instances**
<small>Authors: [Martin-Karl Lefrançois](https://devblogs.nvidia.com/author/mlefrancois/), Neil Bickford </small>
![](Images/VkInstances.png)
This is an extension of the Vulkan ray tracing [tutorial](vkrt_tutorial.md.htm).
Ray tracing can easily handle having many object instances at once. For instance, a top level acceleration structure can
have many different instances of a bottom level acceleration structure. However, when we have many different objects, we
can run into problems with memory allocation. Many Vulkan implementations support no more than 4096 allocations, while
our current application creates 4 allocations per object (Vertex, Index, and Material), then one for the BLAS. That
means we are hitting the limit with just above 1000 objects.
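The budget above can be checked with a quick calculation. This is a minimal sketch assuming four allocations per object and a limit of 4096; `maxObjects` and its `reserved` parameter (allocations already used by the rest of the application) are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// How many objects fit under the allocation limit when each object
// needs `allocationsPerObject` separate device memory allocations
inline uint32_t maxObjects(uint32_t maxAllocations, uint32_t allocationsPerObject, uint32_t reserved = 0)
{
  assert(allocationsPerObject > 0);
  if(reserved >= maxAllocations)
    return 0;
  return (maxAllocations - reserved) / allocationsPerObject;
}
```

With a limit of 4096 and four allocations per object, at most 1024 objects fit, and fewer once the allocations made by the rest of the application are taken into account.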
(insert setup.md.html here)
# Many Instances
First, let's see what the scene looks like when we have just a few objects with many instances.
In `main.cpp`, add the following include:
~~~~ C++
#include <random>
~~~~
Then replace the calls to `helloVk.loadModel` in `main()` with:
~~~~ C++
// Creation of the example
helloVk.loadModel(nvh::findFile("media/scenes/cube.obj", defaultSearchPaths));
helloVk.loadModel(nvh::findFile("media/scenes/cube_multi.obj", defaultSearchPaths));
helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths));
std::random_device rd; // Will be used to obtain a seed for the random number engine
std::mt19937 gen(rd()); // Standard mersenne_twister_engine seeded with rd()
std::normal_distribution<float> dis(1.0f, 1.0f);
std::normal_distribution<float> disn(0.05f, 0.05f);
for(int n = 0; n < 2000; ++n)
{
HelloVulkan::ObjInstance inst;
inst.objIndex = n % 2;
inst.txtOffset = 0;
float scale = fabsf(disn(gen));
nvmath::mat4f mat =
nvmath::translation_mat4(nvmath::vec3f{dis(gen), 2.0f + dis(gen), dis(gen)});
mat = mat * nvmath::rotation_mat4_x(dis(gen));
mat = mat * nvmath::scale_mat4(nvmath::vec3f(scale));
inst.transform = mat;
inst.transformIT = nvmath::transpose(nvmath::invert((inst.transform)));
helloVk.m_objInstance.push_back(inst);
}
~~~~
!!! Note:
This will create 3 models (OBJ) and their instances, and then add 2000 instances
distributed between green cubes and cubes with one color per face.
# Many Objects
Instead of creating many instances, create many objects.
Remove the previous code and replace it with the following
~~~~ C++
// Creation of the example
std::random_device rd; //Will be used to obtain a seed for the random number engine
std::mt19937 gen(rd()); //Standard mersenne_twister_engine seeded with rd()
std::normal_distribution<float> dis(1.0f, 1.0f);
std::normal_distribution<float> disn(0.05f, 0.05f);
for(int n = 0; n < 2000; ++n)
{
helloVk.loadModel(nvh::findFile("media/scenes/cube_multi.obj", defaultSearchPaths));
HelloVulkan::ObjInstance& inst = helloVk.m_objInstance.back();
float scale = fabsf(disn(gen));
nvmath::mat4f mat =
nvmath::translation_mat4(nvmath::vec3f{dis(gen), 2.0f + dis(gen), dis(gen)});
mat = mat * nvmath::rotation_mat4_x(dis(gen));
mat = mat * nvmath::scale_mat4(nvmath::vec3f(scale));
inst.transform = mat;
inst.transformIT = nvmath::transpose(nvmath::invert((inst.transform)));
}
helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths));
~~~~
The example might still work, but the console will print the following error after loading 1363 objects. All other objects allocated after the 1363rd will fail to be displayed.
!!! Error
Error: VUID_Undefined
Number of currently valid memory objects is not less than the maximum allowed (4096).
- Object[0] - Type Device
!!! Note:
This is the best case; the application can run out of memory and crash if substantially more objects are created (e.g. 20,000)
# Device Memory Allocator (DMA)
It is possible to use a memory allocator to fix this issue.
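The idea behind such an allocator is to request a few large blocks of device memory and to hand out aligned sub-ranges of them to the individual buffers. The sketch below is a deliberately simplified bump allocator that only counts how many blocks would be needed. It is not how DMA or VMA are implemented, and `BlockSubAllocator` and `SubAllocation` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SubAllocation
{
  uint32_t block;   // index of the large block the range lives in
  uint64_t offset;  // byte offset inside that block
  uint64_t size;
};

class BlockSubAllocator
{
public:
  explicit BlockSubAllocator(uint64_t blockSize)
      : m_blockSize(blockSize)
  {
    assert(blockSize > 0);
  }

  // Returns an aligned range; `alignment` must be a power of two
  SubAllocation allocate(uint64_t size, uint64_t alignment)
  {
    assert(size > 0 && size <= m_blockSize);
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    if(!m_used.empty())
    {
      const uint64_t offset = (m_used.back() + alignment - 1) & ~(alignment - 1);
      if(offset + size <= m_blockSize)
      {
        m_used.back() = offset + size;
        return {static_cast<uint32_t>(m_used.size() - 1), offset, size};
      }
    }
    // The current block is full: this stands in for one real device memory allocation
    m_used.push_back(size);
    return {static_cast<uint32_t>(m_used.size() - 1), 0, size};
  }

  std::size_t deviceAllocationCount() const { return m_used.size(); }

private:
  uint64_t              m_blockSize;
  std::vector<uint64_t> m_used;  // bytes consumed in each block
};
```

Thousands of small buffers then map to a handful of device allocations, which keeps the application well below the limit. A real allocator additionally has to free and reuse ranges and to keep separate blocks per memory type, which this sketch leaves out.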
## `hello_vulkan.h`
In `hello_vulkan.h`, add the following defines at the top of the file to indicate which allocator to use
~~~~ C++
// #VKRay
//#define ALLOC_DEDICATED
#define ALLOC_DMA
~~~~
Replace the definition of buffers and textures and include the right allocator.
~~~~ C++
#if defined(ALLOC_DEDICATED)
#include "nvvkpp/allocator_dedicated_vkpp.hpp"
using nvvkBuffer = nvvkpp::BufferDedicated;
using nvvkTexture = nvvkpp::TextureDedicated;
#elif defined(ALLOC_DMA)
#include "nvvkpp/allocator_dma_vkpp.hpp"
using nvvkBuffer = nvvkpp::BufferDma;
using nvvkTexture = nvvkpp::TextureDma;
#endif
~~~~
And do the same for the allocator
~~~~ C++
#if defined(ALLOC_DEDICATED)
nvvkpp::AllocatorDedicated m_alloc; // Allocator for buffer, images, acceleration structures
#elif defined(ALLOC_DMA)
nvvkpp::AllocatorDma m_alloc; // Allocator for buffer, images, acceleration structures
nvvk::DeviceMemoryAllocator m_dmaAllocator;
#endif
~~~~
## `hello_vulkan.cpp`
In the source file there are also a few changes to make.
DMA needs to be initialized, which will be done in the `setup()` function:
~~~~ C++
#if defined(ALLOC_DEDICATED)
m_alloc.init(device, physicalDevice);
#elif defined(ALLOC_DMA)
m_dmaAllocator.init(device, physicalDevice);
m_alloc.init(device, &m_dmaAllocator);
#endif
~~~~
When using DMA, memory buffer mapping is done through the DMA interface (instead of the `VkDevice`). Therefore, change the lines at the end of `updateUniformBuffer()` to:
~~~~ C++
#if defined(ALLOC_DEDICATED)
void* data = m_device.mapMemory(m_cameraMat.allocation, 0, sizeof(CameraMatrices));
memcpy(data, &ubo, sizeof(ubo));
m_device.unmapMemory(m_cameraMat.allocation);
#elif defined(ALLOC_DMA)
void* data = m_dmaAllocator.map(m_cameraMat.allocation);
memcpy(data, &ubo, sizeof(ubo));
m_dmaAllocator.unmap(m_cameraMat.allocation);
#endif
~~~~
The `RaytracingBuilder` was made to allow various allocators, but we still need to pass the right one in its setup function. Change the last line of `initRayTracing()` to:
~~~~ C++
#if defined(ALLOC_DEDICATED)
m_rtBuilder.setup(m_device, m_physicalDevice, m_graphicsQueueIndex);
#elif defined(ALLOC_DMA)
m_rtBuilder.setup(m_device, m_dmaAllocator, m_graphicsQueueIndex);
#endif
~~~~
## Destruction
The DMA allocator needs to be released in `HelloVulkan::destroyResources()` after the last `m_alloc.destroy`.
~~~~ C++
#if defined(ALLOC_DMA)
m_dmaAllocator.deinit();
#endif
~~~~
# Result
Instead of thousands of allocations, our example will have only 14 allocations. Note that some of these allocations are allocated by Dear ImGui, and not by DMA. These are the 14 objects with blue borders below:
![Memory](Images/VkInstanceNsight1.png)
Finally, here is the Vulkan Device Memory view from Nsight Graphics:
![VkMemory](Images/VkInstanceNsight2.png)
# VMA: Vulkan Memory Allocator
We can also modify the code to use the [Vulkan Memory Allocator](https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator) from AMD.
Download [vk_mem_alloc.h](https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator/blob/master/src/vk_mem_alloc.h) from GitHub and add this to the `current` folder.
There is already a variation of the allocator for VMA, which is located under [nvpro-samples](https://github.com/nvpro-samples/shared_sources/tree/master/nvvkpp). This allocator has the same simple interface as the `AllocatorDedicated` class in `allocator_dedicated_vkpp.hpp`, but will use VMA for memory management.
VMA can use dedicated allocations, and we enable them here, so you need to add the following extension to the
creation of the context in `main.cpp`.
~~~~ C++
contextInfo.addDeviceExtension(VK_KHR_BIND_MEMORY_2_EXTENSION_NAME);
~~~~
## hello_vulkan.h
Follow the changes done before and add the following
~~~~ C++
#define ALLOC_VMA
~~~~
~~~~ C++
#elif defined(ALLOC_VMA)
#include "nvvkpp/allocator_vma_vkpp.hpp"
using nvvkBuffer = nvvkpp::BufferVma;
using nvvkTexture = nvvkpp::TextureVma;
~~~~
~~~~ C++
#elif defined(ALLOC_VMA)
nvvkpp::AllocatorVma m_alloc; // Allocator for buffer, images, acceleration structures
VmaAllocator m_vmaAllocator;
~~~~
## hello_vulkan.cpp
First, the following should only be defined once in the entire program, and it should be defined before `#include "hello_vulkan.h"`:
~~~~ C++
#define VMA_IMPLEMENTATION
~~~~
In `setup()`
~~~~ C++
#elif defined(ALLOC_VMA)
VmaAllocatorCreateInfo allocatorInfo = {};
allocatorInfo.physicalDevice = physicalDevice;
allocatorInfo.device = device;
allocatorInfo.flags |=
VMA_ALLOCATOR_CREATE_KHR_DEDICATED_ALLOCATION_BIT | VMA_ALLOCATOR_CREATE_KHR_BIND_MEMORY2_BIT;
vmaCreateAllocator(&allocatorInfo, &m_vmaAllocator);
m_alloc.init(device, m_vmaAllocator);
~~~~
In `updateUniformBuffer()`
~~~~ C++
#elif defined(ALLOC_VMA)
void* data;
vmaMapMemory(m_vmaAllocator, m_cameraMat.allocation, &data);
memcpy(data, &ubo, sizeof(ubo));
vmaUnmapMemory(m_vmaAllocator, m_cameraMat.allocation);
~~~~
In `destroyResources()`
~~~~ C++
#elif defined(ALLOC_VMA)
vmaDestroyAllocator(m_vmaAllocator);
~~~~
In `initRayTracing()`
~~~~ C++
#elif defined(ALLOC_VMA)
m_rtBuilder.setup(m_device, m_vmaAllocator, m_graphicsQueueIndex);
~~~~
Additionally, VMA has its own usage flags, so since `VMA_MEMORY_USAGE_CPU_TO_GPU` maps to `vkMP::eHostVisible` and `vkMP::eHostCoherent`, change the call to `m_alloc.createBuffer` in `HelloVulkan::createUniformBuffer()` to
~~~~ C++
m_cameraMat = m_alloc.createBuffer(sizeof(CameraMatrices), vkBU::eUniformBuffer,
#if defined(ALLOC_DEDICATED) || defined(ALLOC_DMA)
vkMP::eHostVisible | vkMP::eHostCoherent
#elif defined(ALLOC_VMA)
VMA_MEMORY_USAGE_CPU_TO_GPU
#endif
);
~~~~
# Final Code
You can find the final code in the folder [ray_tracing_instances](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing_instances)
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>


@ -0,0 +1,564 @@
<meta charset="utf-8" lang="en">
**NVIDIA Vulkan Ray Tracing Tutorial**
**Intersection Shader**
<small>Author: [Martin-Karl Lefrançois](https://devblogs.nvidia.com/author/mlefrancois/)</small>
![](Images/ray_tracing_intersection.png)
# Introduction
This tutorial chapter shows how to use an intersection shader to render different primitives with different materials.
This is an extension of the Vulkan ray tracing [tutorial](vkrt_tutorial.md.htm).
(insert setup.md.html here)
## High Level Implementation
At a high level, we will:
* Add 2,000,000 axis-aligned bounding boxes to a BLAS
* Add two materials
* Render every other primitive as a sphere and the rest as cubes, assigning the two materials alternately
To do this, we will need to:
* Add an intersection shader (.rint)
* Add a new closest hit shader (.chit)
* Create `VkAccelerationStructureGeometryKHR` from `VkAccelerationStructureGeometryAabbsDataKHR`
## Creating all spheres
In the HelloVulkan class, we will add the structures we will need. First the structure that defines a sphere.
~~~~ C++
struct Sphere
{
nvmath::vec3f center;
float radius;
};
~~~~
Then we need the Aabb structure, which holds the bounding box of each sphere and is also used to create the BLAS (`VK_GEOMETRY_TYPE_AABBS_KHR`).
~~~~ C++
struct Aabb
{
nvmath::vec3f minimum;
nvmath::vec3f maximum;
};
~~~~
All the information will need to be held in buffers, which will be available to the shaders.
<script type="preformatted">
~~~~ C++
std::vector<Sphere> m_spheres; // All spheres
nvvkBuffer m_spheresBuffer; // Buffer holding the spheres
nvvkBuffer m_spheresAabbBuffer; // Buffer of all Aabb
nvvkBuffer m_spheresMatColorBuffer; // Multiple materials
nvvkBuffer m_spheresMatIndexBuffer; // Define which sphere uses which material
~~~~
Finally, there are two functions: one to create the spheres, and one to create the `VkAccelerationStructureGeometryKHR` for the BLAS.
~~~~ C++
void createSpheres();
nvvkpp::RaytracingBuilderKHR::Blas sphereToVkGeometryKHR();
~~~~
The following implementation creates 2,000,000 spheres with random positions and radii. It builds the Aabb of each sphere from its definition and creates two materials, which are assigned alternately to the objects. All the created data is then uploaded to Vulkan buffers so that it can be accessed by the intersection and closest hit shaders.
~~~~ C++
//--------------------------------------------------------------------------------------------------
// Creating all spheres
//
void HelloVulkan::createSpheres()
{
std::random_device rd{};
std::mt19937 gen{rd()};
std::normal_distribution<float> xzd{0.f, 5.f};
std::normal_distribution<float> yd{3.f, 1.f};
std::uniform_real_distribution<float> radd{.05f, .2f};
// All spheres
Sphere s;
for(uint32_t i = 0; i < 2000000; i++)
{
s.center = nvmath::vec3f(xzd(gen), yd(gen), xzd(gen));
s.radius = radd(gen);
m_spheres.emplace_back(s);
}
// Axis aligned bounding box of each sphere
std::vector<Aabb> aabbs;
for(const auto& s : m_spheres)
{
Aabb aabb;
aabb.minimum = s.center - nvmath::vec3f(s.radius);
aabb.maximum = s.center + nvmath::vec3f(s.radius);
aabbs.emplace_back(aabb);
}
// Creating two materials
MatrialObj mat;
mat.diffuse = vec3f(0, 1, 1);
std::vector<MatrialObj> materials;
std::vector<int> matIdx;
materials.emplace_back(mat);
mat.diffuse = vec3f(1, 1, 0);
materials.emplace_back(mat);
// Assign a material to each sphere
for(size_t i = 0; i < m_spheres.size(); i++)
{
matIdx.push_back(i % 2);
}
// Creating all buffers
using vkBU = vk::BufferUsageFlagBits;
nvvkpp::SingleCommandBuffer genCmdBuf(m_device, m_graphicsQueueIndex);
auto cmdBuf = genCmdBuf.createCommandBuffer();
m_spheresBuffer = m_alloc.createBuffer(cmdBuf, m_spheres, vkBU::eStorageBuffer);
m_spheresAabbBuffer = m_alloc.createBuffer(cmdBuf, aabbs, vkBU::eShaderDeviceAddress); // Address is queried for the BLAS
m_spheresMatIndexBuffer = m_alloc.createBuffer(cmdBuf, matIdx, vkBU::eStorageBuffer);
m_spheresMatColorBuffer = m_alloc.createBuffer(cmdBuf, materials, vkBU::eStorageBuffer);
genCmdBuf.flushCommandBuffer(cmdBuf);
// Debug information
m_debug.setObjectName(m_spheresBuffer.buffer, "spheres");
m_debug.setObjectName(m_spheresAabbBuffer.buffer, "spheresAabb");
m_debug.setObjectName(m_spheresMatColorBuffer.buffer, "spheresMat");
m_debug.setObjectName(m_spheresMatIndexBuffer.buffer, "spheresMatIdx");
}
~~~~
Do not forget to destroy the buffers in `destroyResources()`
~~~~ C++
m_alloc.destroy(m_spheresBuffer);
m_alloc.destroy(m_spheresAabbBuffer);
m_alloc.destroy(m_spheresMatColorBuffer);
m_alloc.destroy(m_spheresMatIndexBuffer);
~~~~
We need a new bottom-level acceleration structure (BLAS) to hold the implicit primitives. For efficiency, and since all those primitives are static, they will all be added to a single BLAS.
What changes compared to triangle primitives is the Aabb data (see the Aabb structure) and the geometry type (`VK_GEOMETRY_TYPE_AABBS_KHR`).
~~~~ C++
//--------------------------------------------------------------------------------------------------
// Returning the ray tracing geometry used for the BLAS, containing all spheres
//
nvvkpp::RaytracingBuilderKHR::Blas HelloVulkan::sphereToVkGeometryKHR()
{
vk::AccelerationStructureCreateGeometryTypeInfoKHR asCreate;
asCreate.setGeometryType(vk::GeometryTypeKHR::eAabbs);
asCreate.setMaxPrimitiveCount((uint32_t)m_spheres.size()); // Nb AABBs (one per sphere)
asCreate.setIndexType(vk::IndexType::eNoneKHR);
asCreate.setVertexFormat(vk::Format::eUndefined);
asCreate.setMaxVertexCount(0);
asCreate.setAllowsTransforms(VK_FALSE); // No adding transformation matrices
vk::DeviceAddress dataAddress = m_device.getBufferAddress({m_spheresAabbBuffer.buffer});
vk::AccelerationStructureGeometryAabbsDataKHR aabbs;
aabbs.setData(dataAddress);
aabbs.setStride(sizeof(Aabb));
// Setting up the build info of the acceleration
vk::AccelerationStructureGeometryKHR asGeom;
asGeom.setGeometryType(asCreate.geometryType);
asGeom.setFlags(vk::GeometryFlagBitsKHR::eOpaque);
asGeom.geometry.setAabbs(aabbs);
vk::AccelerationStructureBuildOffsetInfoKHR offset;
offset.setFirstVertex(0);
offset.setPrimitiveCount(asCreate.maxPrimitiveCount);
offset.setPrimitiveOffset(0);
offset.setTransformOffset(0);
nvvkpp::RaytracingBuilderKHR::Blas blas;
blas.asGeometry.emplace_back(asGeom);
blas.asCreateGeometryInfo.emplace_back(asCreate);
blas.asBuildOffsetInfo.emplace_back(offset);
return blas;
}
~~~~
## Setting up the scene
In `main.cpp`, where we are loading the OBJ model, we can replace it with
~~~~ C++
// Creation of the example
helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths));
helloVk.createSpheres();
~~~~
**Note**: it is possible to have more OBJ models, but the spheres will need to be added after all of them.
The scene will be large, so it is better to move the camera farther out:
~~~~ C++
CameraManip.setLookat(nvmath::vec3f(20, 20, 20), nvmath::vec3f(0, 1, 0), nvmath::vec3f(0, 1, 0));
~~~~
## Acceleration structures
### BLAS
The function `createBottomLevelAS()` creates one BLAS per OBJ. The following modification adds a new BLAS containing the Aabbs of all spheres.
~~~~ C++
void HelloVulkan::createBottomLevelAS()
{
// BLAS - Storing each primitive in a geometry
std::vector<nvvkpp::RaytracingBuilderKHR::Blas> allBlas;
allBlas.reserve(m_objModel.size());
for(const auto& obj : m_objModel)
{
auto blas = objectToVkGeometryKHR(obj);
// We could add more geometry in each BLAS, but we add only one for now
allBlas.emplace_back(blas);
}
// Spheres
{
auto blas = sphereToVkGeometryKHR();
allBlas.emplace_back(blas);
}
m_rtBuilder.buildBlas(allBlas, vk::BuildAccelerationStructureFlagBitsKHR::ePreferFastTrace);
}
~~~~
### TLAS
Similarly, in `createTopLevelAS()`, the top-level acceleration structure needs a reference to the BLAS of the spheres. We set the instanceID and blasID to the last element, which is why the sphere BLAS must be added after everything else.
The `hitGroupId` is set to 1 instead of 0. We need a new hit group for the implicit primitives because attributes such as the normal are not provided as they are for triangle primitives, so we have to compute them ourselves.
Just before building the TLAS, we need to add the following:
~~~~ C++
// Add the blas containing all spheres
{
nvvkpp::RaytracingBuilderKHR::Instance rayInst;
rayInst.transform = m_objInstance[0].transform; // Position of the instance
rayInst.instanceId = static_cast<uint32_t>(tlas.size()); // gl_InstanceID
rayInst.blasId = static_cast<uint32_t>(m_objModel.size());
rayInst.hitGroupId = 1; // Hit group 1: the procedural hit group used by the spheres
rayInst.flags = vk::GeometryInstanceFlagBitsKHR::eTriangleCullDisable;
tlas.emplace_back(rayInst);
}
~~~~
## Descriptors
To access the newly created buffers holding all the spheres and materials, some changes are required to the descriptors.
In the function `createDescriptorSetLayout()`, the material and material index bindings need one extra entry (`nbObj + 1`) for the spheres.
~~~~ C++
// Materials (binding = 1)
m_descSetLayoutBind.emplace_back(vkDS(1, vkDT::eStorageBuffer, nbObj + 1,
vkSS::eVertex | vkSS::eFragment | vkSS::eClosestHitKHR));
// Materials Index (binding = 4)
m_descSetLayoutBind.emplace_back(
vkDS(4, vkDT::eStorageBuffer, nbObj + 1, vkSS::eFragment | vkSS::eClosestHitKHR));
~~~~
And the new buffer holding the spheres
~~~~ C++
// Storing spheres (binding = 7)
m_descSetLayoutBind.emplace_back( //
vkDS(7, vkDT::eStorageBuffer, 1, vkSS::eClosestHitKHR | vkSS::eIntersectionKHR));
~~~~
The function `updateDescriptorSet()`, which writes the values of the buffers, also needs to be modified.
After the loop over all models, let's add the new material and material index buffers.
~~~~ C++
for(auto& model : m_objModel)
{
dbiMat.emplace_back(model.matColorBuffer.buffer, 0, VK_WHOLE_SIZE);
dbiMatIdx.emplace_back(model.matIndexBuffer.buffer, 0, VK_WHOLE_SIZE);
dbiVert.emplace_back(model.vertexBuffer.buffer, 0, VK_WHOLE_SIZE);
dbiIdx.emplace_back(model.indexBuffer.buffer, 0, VK_WHOLE_SIZE);
}
dbiMat.emplace_back(m_spheresMatColorBuffer.buffer, 0, VK_WHOLE_SIZE);
dbiMatIdx.emplace_back(m_spheresMatIndexBuffer.buffer, 0, VK_WHOLE_SIZE);
~~~~
Then write the buffer for the spheres
~~~~ C++
vk::DescriptorBufferInfo dbiSpheres{m_spheresBuffer.buffer, 0, VK_WHOLE_SIZE};
writes.emplace_back(nvvkpp::util::createWrite(m_descSet, m_descSetLayoutBind[7], &dbiSpheres));
~~~~
## Intersection Shader
The intersection shader is added to a hit group of type `VK_RAY_TRACING_SHADER_GROUP_TYPE_PROCEDURAL_HIT_GROUP_KHR`. In our example, we already have a hit group for triangles with an associated closest hit shader. We will add a new one, which becomes hit group ID 1 (see the TLAS section).
Here is what the two hit groups look like:
~~~~ C++
// Hit Group0 - Closest Hit
vk::ShaderModule chitSM =
nvvkpp::util::createShaderModule(m_device, //
nvh::loadFile("shaders/raytrace.rchit.spv", true, paths));
{
vk::RayTracingShaderGroupCreateInfoKHR hg{vk::RayTracingShaderGroupTypeKHR::eTrianglesHitGroup,
VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR,
VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR};
stages.push_back({{}, vk::ShaderStageFlagBits::eClosestHitKHR, chitSM, "main"});
hg.setClosestHitShader(static_cast<uint32_t>(stages.size() - 1));
m_rtShaderGroups.push_back(hg);
}
// Hit Group1 - Closest Hit + Intersection (procedural)
vk::ShaderModule chit2SM =
nvvkpp::util::createShaderModule(m_device, //
nvh::loadFile("shaders/raytrace2.rchit.spv", true, paths));
vk::ShaderModule rintSM =
nvvkpp::util::createShaderModule(m_device, //
nvh::loadFile("shaders/raytrace.rint.spv", true, paths));
{
vk::RayTracingShaderGroupCreateInfoKHR hg{vk::RayTracingShaderGroupTypeKHR::eProceduralHitGroup,
VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR,
VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR};
stages.push_back({{}, vk::ShaderStageFlagBits::eClosestHitKHR, chit2SM, "main"});
hg.setClosestHitShader(static_cast<uint32_t>(stages.size() - 1));
stages.push_back({{}, vk::ShaderStageFlagBits::eIntersectionKHR, rintSM, "main"});
hg.setIntersectionShader(static_cast<uint32_t>(stages.size() - 1));
m_rtShaderGroups.push_back(hg);
}
~~~~
And destroy the two shaders at the end
~~~~ C++
m_device.destroy(chit2SM);
m_device.destroy(rintSM);
~~~~
### raycommon.glsl
To share the structure of the data across the shaders, we can add the following to `raycommon.glsl`
~~~~ C++
struct Sphere
{
vec3 center;
float radius;
};
struct Aabb
{
vec3 minimum;
vec3 maximum;
};
#define KIND_SPHERE 0
#define KIND_CUBE 1
~~~~
### raytrace.rint
The intersection shader `raytrace.rint` needs to be added to the shader directory, and CMake must be rerun so that it is added to the project. The shader is called every time a ray hits one of the Aabbs of the scene. Note that no Aabb information can be retrieved in the intersection shader, nor is it possible to get the hit point that the ray tracer may have computed on the GPU.
The only information we have is that one of the Aabbs was hit, and `gl_PrimitiveID` tells us which one it was. Then, with the information stored in the buffer, we can retrieve the geometry information of the sphere.
We first declare the extensions and include the common files.
~~~~ C++
#version 460
#extension GL_NV_ray_tracing : require
#extension GL_EXT_nonuniform_qualifier : enable
#extension GL_EXT_scalar_block_layout : enable
#extension GL_GOOGLE_include_directive : enable
#include "raycommon.glsl"
#include "wavefront.glsl"
~~~~
Then we **must** add the following, otherwise the intersection shader will not report any hit.
~~~~ C++
hitAttributeNV vec3 HitAttribute;
~~~~
The following buffer holds the definition of all spheres, which we can index using `gl_PrimitiveID`.
~~~~ C++
layout(binding = 7, set = 1, scalar) buffer allSpheres_
{
Sphere i[];
}
allSpheres;
~~~~
We will implement two intersection methods against the incoming ray.
~~~~ C++
struct Ray
{
vec3 origin;
vec3 direction;
};
~~~~
The sphere intersection
~~~~ C++
// Ray-Sphere intersection
// http://viclw17.github.io/2018/07/16/raytracing-ray-sphere-intersection/
float hitSphere(const Sphere s, const Ray r)
{
vec3 oc = r.origin - s.center;
float a = dot(r.direction, r.direction);
float b = 2.0 * dot(oc, r.direction);
float c = dot(oc, oc) - s.radius * s.radius;
float discriminant = b * b - 4 * a * c;
if(discriminant < 0)
{
return -1.0;
}
else
{
return (-b - sqrt(discriminant)) / (2.0 * a);
}
}
~~~~
And the axis aligned bounding box intersection
~~~~ C++
// Ray-AABB intersection
float hitAabb(const Aabb aabb, const Ray r)
{
vec3 invDir = 1.0 / r.direction;
vec3 tbot = invDir * (aabb.minimum - r.origin);
vec3 ttop = invDir * (aabb.maximum - r.origin);
vec3 tmin = min(ttop, tbot);
vec3 tmax = max(ttop, tbot);
float t0 = max(tmin.x, max(tmin.y, tmin.z));
float t1 = min(tmax.x, min(tmax.y, tmax.z));
return t1 > max(t0, 0.0) ? t0 : -1.0;
}
~~~~
Both return -1 if there is no hit; otherwise, they return the distance from the origin of the ray.
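To experiment with the two intersection tests outside of a shader, here is a minimal CPU-side sketch in C++ that mirrors the same math. The `Vec3` type and its helpers are hypothetical stand-ins for the GLSL built-ins, and this version of the slab test assumes that no component of the ray direction is zero.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical stand-in for the GLSL vec3 type.
struct Vec3
{
  float x, y, z;
};

inline Vec3  operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere
{
  Vec3  center;
  float radius;
};

struct Aabb
{
  Vec3 minimum;
  Vec3 maximum;
};

struct Ray
{
  Vec3 origin;
  Vec3 direction;
};

// Same quadratic as the shader: nearest root, or -1 when the ray misses.
float hitSphere(const Sphere& s, const Ray& r)
{
  const Vec3  oc           = r.origin - s.center;
  const float a            = dot(r.direction, r.direction);
  const float b            = 2.0f * dot(oc, r.direction);
  const float c            = dot(oc, oc) - s.radius * s.radius;
  const float discriminant = b * b - 4.0f * a * c;
  if(discriminant < 0.0f)
    return -1.0f;
  return (-b - std::sqrt(discriminant)) / (2.0f * a);
}

// Slab test: entry distance t0, or -1 when the three slabs do not overlap.
// Assumption of this sketch: every direction component is non-zero.
float hitAabb(const Aabb& box, const Ray& r)
{
  const float org[3] = {r.origin.x, r.origin.y, r.origin.z};
  const float dir[3] = {r.direction.x, r.direction.y, r.direction.z};
  const float lo[3]  = {box.minimum.x, box.minimum.y, box.minimum.z};
  const float hi[3]  = {box.maximum.x, box.maximum.y, box.maximum.z};

  float t0 = -std::numeric_limits<float>::infinity();
  float t1 = std::numeric_limits<float>::infinity();
  for(int i = 0; i < 3; i++)
  {
    const float tbot = (lo[i] - org[i]) / dir[i];
    const float ttop = (hi[i] - org[i]) / dir[i];
    t0               = std::max(t0, std::min(tbot, ttop));  // latest entry
    t1               = std::min(t1, std::max(tbot, ttop));  // earliest exit
  }
  return t1 > std::max(t0, 0.0f) ? t0 : -1.0f;
}
```

For a ray starting at (-5, -5, -5) with direction (1, 1, 1), a unit cube around the origin is entered at `t = 4` and a unit sphere slightly later. Because the direction is not normalized, the distances are expressed in multiples of the direction's length, exactly as in the shader.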
Retrieving the ray is straightforward:
~~~~ C++
void main()
{
Ray ray;
ray.origin = gl_WorldRayOriginNV;
ray.direction = gl_WorldRayDirectionNV;
~~~~
And getting the information about the geometry enclosed in the Aabb can be done like this.
~~~~ C++
// Sphere data
Sphere sphere = allSpheres.i[gl_PrimitiveID];
~~~~
Now we just need to know if we will hit a sphere or a cube.
~~~~ C++
float tHit = -1;
int hitKind = gl_PrimitiveID % 2 == 0 ? KIND_SPHERE : KIND_CUBE;
if(hitKind == KIND_SPHERE)
{
// Sphere intersection
tHit = hitSphere(sphere, ray);
}
else
{
// AABB intersection
Aabb aabb;
aabb.minimum = sphere.center - vec3(sphere.radius);
aabb.maximum = sphere.center + vec3(sphere.radius);
tHit = hitAabb(aabb, ray);
}
~~~~
Intersection information is reported using `reportIntersectionNV`, with a distance from the origin and a second argument (hitKind) that can be used to differentiate the primitive type.
~~~~ C++
// Report hit point
if(tHit > 0)
reportIntersectionNV(tHit, hitKind);
}
~~~~
The shader can be found [here](shaders/raytrace.rint)
### raytrace2.rchit
The new closest hit shader can be found [here](shaders/raytrace2.rchit).
This shader is almost identical to the original `raytrace.rchit`, but since the primitive is implicit, we only need to compute the normal for the primitive that was hit.
We retrieve the world position from the ray and `gl_HitTNV`, which was set in the intersection shader.
~~~~ C++
vec3 worldPos = gl_WorldRayOriginNV + gl_WorldRayDirectionNV * gl_HitTNV;
~~~~
The sphere information is retrieved the same way as in the `raytrace.rint` shader.
~~~~ C++
Sphere instance = allSpheres.i[gl_PrimitiveID];
~~~~
Then we compute the normal, as for a sphere.
~~~~ C++
// Computing the normal at hit position
vec3 normal = normalize(worldPos - instance.center);
~~~~
To know whether we have intersected a cube rather than a sphere, we use `gl_HitKindNV`, which was set by the second argument of `reportIntersectionNV`.
When the hit is a cube, we set the normal to the major axis.
~~~~ C++
// Computing the normal for a cube if the hit intersection was reported as 1
if(gl_HitKindNV == KIND_CUBE) // Aabb
{
vec3 absN = abs(normal);
float maxC = max(max(absN.x, absN.y), absN.z);
normal = (maxC == absN.x) ?
vec3(sign(normal.x), 0, 0) :
(maxC == absN.y) ? vec3(0, sign(normal.y), 0) : vec3(0, 0, sign(normal.z));
}
~~~~
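The major-axis selection can be checked on the CPU with a small C++ sketch. `Vec3`, `signOf`, and `cubeNormal` are hypothetical names used only for this illustration. The sketch skips the normalization done in the shader, since scaling the vector uniformly does not change which component is the largest.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical stand-in for the GLSL vec3 type.
struct Vec3
{
  float x, y, z;
};

// GLSL-style sign(): -1, 0 or +1.
inline float signOf(float v)
{
  return static_cast<float>((v > 0.0f) - (v < 0.0f));
}

// Snap the direction from the cube center to the hit point onto its dominant axis.
Vec3 cubeNormal(Vec3 hitPos, Vec3 center)
{
  const Vec3  d{hitPos.x - center.x, hitPos.y - center.y, hitPos.z - center.z};
  const float ax   = std::fabs(d.x);
  const float ay   = std::fabs(d.y);
  const float az   = std::fabs(d.z);
  const float maxC = std::max(std::max(ax, ay), az);
  if(maxC == ax)
    return {signOf(d.x), 0.0f, 0.0f};
  if(maxC == ay)
    return {0.0f, signOf(d.y), 0.0f};
  return {0.0f, 0.0f, signOf(d.z)};
}
```

A hit on the +X face of a cube centered at the origin, for example at (1, 0.2, -0.3), yields the normal (1, 0, 0).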
# Final Code
You can find the final code in the folder [ray_tracing_intersection](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing_intersection)
</script>
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>


@ -0,0 +1,295 @@
<meta charset="utf-8" lang="en">
**NVIDIA Vulkan Ray Tracing Tutorial**
**Antialiasing**
![](Images/antialiasing.png)
# Introduction
This is an extension of the Vulkan ray tracing [tutorial](vkrt_tutorial.md.htm).
In this extension, we will implement antialiasing by jittering the offset of each ray for each pixel over time, instead of always shooting each ray from the middle of its pixel.
(insert setup.md.html here)
## Random Functions
We will use some simple functions for random number generation, which suffice for this example.
Create a new shader file `random.glsl` with the following code. Add it to the `shaders` directory and rerun CMake, and include this new file in `raytrace.rgen`:
~~~~ C++
// Generate a random unsigned int from two unsigned int values, using 16 pairs
// of rounds of the Tiny Encryption Algorithm. See Zafar, Olano, and Curtis,
// "GPU Random Numbers via the Tiny Encryption Algorithm"
uint tea(uint val0, uint val1)
{
uint v0 = val0;
uint v1 = val1;
uint s0 = 0;
for(uint n = 0; n < 16; n++)
{
s0 += 0x9e3779b9;
v0 += ((v1 << 4) + 0xa341316c) ^ (v1 + s0) ^ ((v1 >> 5) + 0xc8013ea4);
v1 += ((v0 << 4) + 0xad90777d) ^ (v0 + s0) ^ ((v0 >> 5) + 0x7e95761e);
}
return v0;
}
// Generate a random unsigned int in [0, 2^24) given the previous RNG state
// using the Numerical Recipes linear congruential generator
uint lcg(inout uint prev)
{
uint LCG_A = 1664525u;
uint LCG_C = 1013904223u;
prev = (LCG_A * prev + LCG_C);
return prev & 0x00FFFFFF;
}
// Generate a random float in [0, 1) given the previous RNG state
float rnd(inout uint prev)
{
return (float(lcg(prev)) / float(0x01000000));
}
~~~~
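The same generators can be ported to C++ to inspect their behavior on the CPU, for instance to confirm that `rnd` stays in [0, 1). This is a sketch rather than part of the sample; it relies on `std::uint32_t` arithmetic wrapping modulo 2^32, which matches GLSL `uint`.

```cpp
#include <cstdint>

// CPU port of tea(): 16 pairs of rounds of the Tiny Encryption Algorithm.
std::uint32_t tea(std::uint32_t val0, std::uint32_t val1)
{
  std::uint32_t v0 = val0;
  std::uint32_t v1 = val1;
  std::uint32_t s0 = 0;
  for(std::uint32_t n = 0; n < 16; n++)
  {
    s0 += 0x9e3779b9u;
    v0 += ((v1 << 4) + 0xa341316cu) ^ (v1 + s0) ^ ((v1 >> 5) + 0xc8013ea4u);
    v1 += ((v0 << 4) + 0xad90777du) ^ (v0 + s0) ^ ((v0 >> 5) + 0x7e95761eu);
  }
  return v0;
}

// CPU port of lcg(): advances the state and returns its low 24 bits.
std::uint32_t lcg(std::uint32_t& prev)
{
  prev = 1664525u * prev + 1013904223u;
  return prev & 0x00FFFFFFu;
}

// CPU port of rnd(): a float in [0, 1) built from the 24-bit LCG output.
float rnd(std::uint32_t& prev)
{
  return static_cast<float>(lcg(prev)) / static_cast<float>(0x01000000);
}
```

The 24-bit mask is what keeps the result strictly below 1: the largest value returned by `lcg` is 2^24 - 1, which a `float` represents exactly, so dividing by 2^24 never rounds up to 1.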
## Frame Number
Since our jittered samples will be accumulated across frames, we need to know which frame we are currently rendering. A frame number of 0 indicates that accumulation starts over, and for larger frame numbers we accumulate the data.
Note that the uniform image is read/write, which makes it possible to accumulate previous frames.
In `raytrace.rgen`, add the push constant block from `raytrace.rchit`, adding a new `frame` member:
~~~~ C++
layout(push_constant) uniform Constants
{
vec4 clearColor;
vec3 lightPosition;
float lightIntensity;
int lightType;
int frame;
}
pushC;
~~~~
Also add this frame member to the `RtPushConstant` struct in `hello_vulkan.h`:
~~~~ C++
struct RtPushConstant
{
nvmath::vec4f clearColor;
nvmath::vec3f lightPosition;
float lightIntensity;
int lightType;
int frame{0};
} m_rtPushConstants;
~~~~
## Random and Jitter
In `raytrace.rgen`, at the beginning of `main()`, initialize the random seed:
~~~~ C++
// Initialize the random number
uint seed = tea(gl_LaunchIDNV.y * gl_LaunchSizeNV.x + gl_LaunchIDNV.x, pushC.frame);
~~~~
Then we need two random numbers to vary the X and Y inside the pixel, except for frame 0, where we always shoot
in the center.
~~~~ C++
float r1 = rnd(seed);
float r2 = rnd(seed);
// Subpixel jitter: send the ray through a different position inside the pixel
// each time, to provide antialiasing.
vec2 subpixel_jitter = pushC.frame == 0 ? vec2(0.5f, 0.5f) : vec2(r1, r2);
~~~~
Now we only need to change how we compute the pixel center:
~~~~ C++
const vec2 pixelCenter = vec2(gl_LaunchIDNV.xy) + subpixel_jitter;
~~~~
## Storing or Updating
At the end of `main()`, if the frame number is equal to 0, we write directly to the image.
Otherwise, we combine the new image with the previous `frame` frames.
~~~~ C++
// Do accumulation over time
if(pushC.frame > 0)
{
float a = 1.0f / float(pushC.frame + 1);
vec3 old_color = imageLoad(image, ivec2(gl_LaunchIDNV.xy)).xyz;
imageStore(image, ivec2(gl_LaunchIDNV.xy), vec4(mix(old_color, prd.hitValue, a), 1.f));
}
else
{
// First frame, replace the value in the buffer
imageStore(image, ivec2(gl_LaunchIDNV.xy), vec4(prd.hitValue, 1.f));
}
~~~~
# Application Frame Update
We need to increment the current rendering frame, but we also need to reset it when something in the
scene is changing.
Add two new functions to the `HelloVulkan` class:
~~~~ C++
void resetFrame();
void updateFrame();
~~~~
The implementation of `updateFrame` resets the frame counter if the camera has changed; otherwise, it increments the frame counter.
~~~~ C++
//--------------------------------------------------------------------------------------------------
// If the camera matrix has changed, resets the frame.
// otherwise, increments frame.
//
void HelloVulkan::updateFrame()
{
static nvmath::mat4f refCamMatrix;
auto& m = CameraManip.getMatrix();
if(memcmp(&refCamMatrix.a00, &m.a00, sizeof(nvmath::mat4f)) != 0)
{
resetFrame();
refCamMatrix = m;
}
m_rtPushConstants.frame++;
}
~~~~
Since `resetFrame` will be called before `updateFrame` increments the frame counter, `resetFrame` will set the frame counter to -1:
~~~~ C++
void HelloVulkan::resetFrame()
{
m_rtPushConstants.frame = -1;
}
~~~~
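The interplay between the two functions can be seen in isolation with the following sketch. `FrameCounter` is a hypothetical stand-in for the relevant part of `HelloVulkan`, and the `cameraChanged` flag replaces the `memcmp` on the camera matrix.

```cpp
// Hypothetical stand-in for the frame bookkeeping in HelloVulkan.
struct FrameCounter
{
  int frame{0};

  // Set to -1 so that the increment in updateFrame() lands on frame 0.
  void resetFrame() { frame = -1; }

  // `cameraChanged` stands for the camera matrix comparison in this sketch.
  void updateFrame(bool cameraChanged)
  {
    if(cameraChanged)
      resetFrame();
    frame++;
  }
};
```

Calling `updateFrame(true)` always leaves `frame` at 0, so the shader overwrites the image, and every following `updateFrame(false)` moves on to frames 1, 2, and so on, which are accumulated.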
At the beginning of `HelloVulkan::raytrace`, call
~~~~ C++
updateFrame();
~~~~
The application will now antialias the image when ray tracing is enabled.
## Resetting Frame on UI Change
The frame number should also be reset when any parts of the scene change, such as the light direction or the background color. In `renderUI()` in `main.cpp`, check for UI changes and reset the frame number when they happen:
~~~~ C++
void renderUI(HelloVulkan& helloVk)
{
static int item = 1;
bool changed = false;
if(ImGui::Combo("Up Vector", &item, "X\0Y\0Z\0\0"))
{
nvmath::vec3f pos, eye, up;
CameraManip.getLookat(pos, eye, up);
up = nvmath::vec3f(item == 0, item == 1, item == 2);
CameraManip.setLookat(pos, eye, up);
changed = true;
}
changed |=
ImGui::SliderFloat3("Light Position", &helloVk.m_pushConstant.lightPosition.x, -20.f, 20.f);
changed |=
ImGui::SliderFloat("Light Intensity", &helloVk.m_pushConstant.lightIntensity, 0.f, 100.f);
changed |= ImGui::RadioButton("Point", &helloVk.m_pushConstant.lightType, 0);
ImGui::SameLine();
changed |= ImGui::RadioButton("Infinite", &helloVk.m_pushConstant.lightType, 1);
if(changed)
helloVk.resetFrame();
}
~~~~
We also need to check for UI changes inside the main loop inside `main()`:
~~~~ C++
bool changed = false;
// Edit 3 floats representing a color
changed |= ImGui::ColorEdit3("Clear color", reinterpret_cast<float*>(&clearColor));
// Switch between raster and ray tracing
changed |= ImGui::Checkbox("Ray Tracer mode", &useRaytracer);
if(changed)
helloVk.resetFrame();
~~~~
# Quality
After enough samples, the quality of the rendering will be sufficiently high that it might make sense to avoid accumulating further images.
Add a member variable to `HelloVulkan`
~~~~ C++
int m_maxFrames{100};
~~~~
and also add a way to control it in `renderUI()`, making sure that `m_maxFrames` cannot be set below 1:
~~~~ C++
changed |= ImGui::InputInt("Max Frames", &helloVk.m_maxFrames);
helloVk.m_maxFrames = std::max(helloVk.m_maxFrames, 1);
~~~~
Then in `raytrace()`, immediately after the call to `updateFrame()`, return if the current frame has exceeded the max frame.
~~~~ C++
if(m_rtPushConstants.frame >= m_maxFrames)
return;
~~~~
Since the output image won't be modified by the ray tracer, we will simply display the last good image, reducing GPU usage when the target quality has been reached.
# More Samples in RayGen
To improve efficiency, we can perform multiple samples directly in the ray generation shader. This will be faster than calling `raytrace()` the equivalent number of times.
To do this, add a constant to `raytrace.rgen` (this could alternatively be added to the push constant block and controlled by the application):
~~~~ C++
const int NBSAMPLES = 10;
~~~~
In `main()`, after initializing the random number seed, create a loop that encloses the lines from the generation of `r1` and `r2` to the `traceNV` call, and accumulates the colors returned by `traceNV`. At the end of the loop, divide by the number of samples that were taken.
~~~~ C++
vec3 hitValues = vec3(0);
for(int smpl = 0; smpl < NBSAMPLES; smpl++)
{
float r1 = rnd(seed);
float r2 = rnd(seed);
// ...
// traceNV( ... );
hitValues += prd.hitValue;
}
prd.hitValue = hitValues / NBSAMPLES;
~~~~
For given values of `m_maxFrames` and `NBSAMPLES`, the image this converges to will have `m_maxFrames * NBSAMPLES` antialiasing samples. For instance, if `m_maxFrames = 10` and `NBSAMPLES = 10`, this will be equivalent in quality to an image using `m_maxFrames = 100` and `NBSAMPLES = 1`. However, using `NBSAMPLES = 10` in the ray generation shader will be faster than calling `raytrace()` with `NBSAMPLES = 1` 10 times in a row.
# Final Code
You can find the final code in the folder [ray_tracing_jitter_cam](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing_jitter_cam)
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>


@ -0,0 +1,339 @@
<meta charset="utf-8" lang="en">
**NVIDIA Vulkan Ray Tracing Tutorial**
**Multiple Closest Hit Shaders**
![](Images/manyhits.png)
This is an extension of the Vulkan ray tracing [tutorial](vkrt_tutorial.md.htm).
The ray tracing tutorial only uses one closest hit shader, but it is also possible to have multiple closest hit shaders.
For example, this could be used to give different models different shaders, or to use a less complex shader when tracing
reflections.
(insert setup.md.html here)
# Setting up the Scene
For this example, we will load the `wuson` model and create another translated instance of it.
To do so, change the `helloVk.loadModel` calls to the following:
~~~~ C++
// Creation of the example
helloVk.loadModel(nvh::findFile("media/scenes/wuson.obj", defaultSearchPaths),
nvmath::translation_mat4(nvmath::vec3f(-1, 0, 0)));
HelloVulkan::ObjInstance inst;
inst.objIndex = 0;
inst.transform = nvmath::translation_mat4(nvmath::vec3f(1, 0, 0));
inst.transformIT = nvmath::transpose(nvmath::invert(inst.transform));
helloVk.m_objInstance.push_back(inst);
helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths));
~~~~
# Adding a new Closest Hit Shader
We will need to create a new closest hit shader (CHIT), to add it to the raytracing pipeline, and to indicate which instance will use this shader.
## `raytrace2.rchit`
We can make a very simple shader to differentiate this closest hit shader from the other one.
As an example, create a new file called `raytrace2.rchit`, and add it to Visual Studio's `shaders` filter with the other shaders.
~~~~ C++
#version 460
#extension GL_NV_ray_tracing : require
#extension GL_GOOGLE_include_directive : enable
#include "raycommon.glsl"
layout(location = 0) rayPayloadInNV hitPayload prd;
void main()
{
prd.hitValue = vec3(1,0,0);
}
~~~~
## `createRtPipeline`
This new shader needs to be added to the raytracing pipeline. So, in `createRtPipeline` in `hello_vulkan.cpp`, load the new closest hit shader immediately after loading the first one.
~~~~ C++
vk::ShaderModule chit2SM =
nvvkpp::util::createShaderModule(m_device, //
nvh::loadFile("shaders/raytrace2.rchit.spv", true, paths));
~~~~
Then add a new hit group immediately after adding the first hit group:
~~~~ C++
// Second group
stages.push_back({{}, vk::ShaderStageFlagBits::eClosestHitKHR, chit2SM, "main"});
hg.setClosestHitShader(static_cast<uint32_t>(stages.size() - 1));
m_rtShaderGroups.push_back(hg);
~~~~
## `raytrace.rgen`
As a test, you can try changing the `sbtRecordOffset` parameter of the `traceNV` call in `raytrace.rgen`.
If you set the offset to `1`, then all ray hits will use the new CHIT, and the raytraced output should look like the image below:
![](Images/manyhits2.png)
!!! Warning
After testing this out, make sure to revert this change in `raytrace.rgen` before continuing.
## `hello_vulkan.h`
In the `ObjInstance` structure, we will add a new member variable that specifies which hit shader the instance will use:
~~~~ C++
uint32_t hitgroup{0}; // Hit group of the instance
~~~~
This change also needs to be reflected in the `sceneDesc` structure in `wavefront.glsl`:
~~~~ C++
struct sceneDesc
{
int objId;
int txtOffset;
mat4 transfo;
mat4 transfoIT;
int hitGroup;
};
~~~~
!!! Warning:
The solution will not automatically recompile the shaders after this change to `wavefront.glsl`; instead, you will need to recompile all of the SPIR-V shaders.
## `hello_vulkan.cpp`
Finally, we need to tell the top-level acceleration structure which hit group to use for each instance. In `createTopLevelAS()` in `hello_vulkan.cpp`, change the line setting `rayInst.hitGroupId` to
~~~~ C++
rayInst.hitGroupId = m_objInstance[i].hitgroup;
~~~~
## Choosing the Hit shader
Back in `main.cpp`, after loading the scene's models, we can now have both `wuson` models use the new CHIT by adding the following:
~~~~ C++
helloVk.m_objInstance[0].hitgroup = 1;
helloVk.m_objInstance[1].hitgroup = 1;
~~~~
![](Images/manyhits3.png)
# Shader Record Data `shaderRecordKHR`
As seen previously when creating the [Shader Binding Table](https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/chap33.html#shader-binding-table), each entry in the table consists of a handle referring to the shader that it invokes. We have packed every entry to the size of `shaderGroupHandleSize`, but each entry could be made larger to store data that can later be referenced by a `shaderRecordKHR` block in the shader.
This information can be used to pass extra information to a shader, for each entry in the SBT.
!!! Note: Note
Since each entry in an SBT group must have the same size, each entry of the group has to have enough space to accommodate the largest element in the entire group.
The following diagram represents our current SBT, with the addition of some data to `HitGroup1`. As mentioned in the **note**, even if
`HitGroup0` doesn't have any shader record data, it still needs to have the same size as `HitGroup1`.
**************************
*+-----------+----------+*
*| RayGen | Handle 0 |*
*+-----------+----------+*
*| Miss | Handle 1 |*
*+-----------+----------+*
*| Miss | Handle 2 |*
*+-----------+----------+*
*| HitGroup0 | Handle 3 |*
*| | -Empty- |*
*+-----------+----------+*
*| HitGroup1 | Handle 4 |*
*| | Data 0 |*
*+-----------+----------+*
**************************
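The sizing rule from the note can be sketched as follows: the stride of a group is the handle size plus the largest data block of any entry in that group, rounded up to whatever alignment the implementation requires. The helper names and the `alignment` parameter are assumptions made for illustration; the tutorial code simply adds `sizeof(HitRecordBuffer)` to the handle size.

<script type="preformatted">
```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Round `value` up to the next multiple of `alignment` (alignment must be > 0).
inline uint32_t alignUp(uint32_t value, uint32_t alignment)
{
  return (value + alignment - 1) / alignment * alignment;
}

// Stride shared by every entry of one SBT group: the handle plus the largest
// per-entry data block, padded to `alignment`.
inline uint32_t groupStride(uint32_t handleSize, const std::vector<uint32_t>& dataSizes, uint32_t alignment)
{
  uint32_t largest = 0;
  for(uint32_t size : dataSizes)
    largest = std::max(largest, size);
  return alignUp(handleSize + largest, alignment);
}
```
</script>

Assuming 32-byte handles, a `HitGroup0` without data and a `HitGroup1` carrying a 16-byte `vec4`, `groupStride(32, {0, 16}, 16)` yields 48 bytes for both entries, matching the diagram above.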
## `hello_vulkan.h`
In the HelloVulkan class, we will add a structure to hold the hit group data.
<script type="preformatted">
~~~~ C++
struct HitRecordBuffer
{
nvmath::vec4f color;
};
std::vector<HitRecordBuffer> m_hitShaderRecord;
~~~~
</script>
## `raytrace2.rchit`
In the closest hit shader, we can retrieve the shader record using the `layout(shaderRecordNV)` qualifier:
~~~~ C++
layout(shaderRecordNV) buffer sr_ { vec4 c; } shaderRec;
~~~~
and use this information to return the color:
~~~~ C++
void main()
{
prd.hitValue = shaderRec.c.rgb;
}
~~~~
## `main.cpp`
In `main`, after we set which hit group an instance will use, we can add the data we want to set through the shader record.
~~~~ C++
helloVk.m_hitShaderRecord.resize(1);
helloVk.m_hitShaderRecord[0].color = nvmath::vec4f(1, 1, 0, 0); // Yellow
~~~~
## `HelloVulkan::createRtShaderBindingTable`
Since we are no longer packing all handles into one contiguous buffer, we need to fill the SBT as described above.
After retrieving the handles of all 5 groups (raygen, miss, miss shadow, hit0, and hit1)
using `getRayTracingShaderGroupHandlesNV`, store pointers to them so they are easy to retrieve.
~~~~ C++
// Retrieve the handle pointers
std::vector<uint8_t*> handles(groupCount);
for(uint32_t i = 0; i < groupCount; i++)
{
handles[i] = &shaderHandleStorage[i * groupHandleSize];
}
~~~~
The size of each group can be described as follows:
~~~~ C++
// Sizes
uint32_t rayGenSize = groupHandleSize;
uint32_t missSize = groupHandleSize;
uint32_t hitSize = groupHandleSize + sizeof(HitRecordBuffer);
uint32_t newSbtSize = rayGenSize + 2 * missSize + 2 * hitSize;
~~~~
Then write the new SBT like this, where only Hit 1 has extra data.
~~~~ C++
std::vector<uint8_t> sbtBuffer(newSbtSize);
{
uint8_t* pBuffer = sbtBuffer.data();
memcpy(pBuffer, handles[0], groupHandleSize); // Raygen
pBuffer += rayGenSize;
memcpy(pBuffer, handles[1], groupHandleSize); // Miss 0
pBuffer += missSize;
memcpy(pBuffer, handles[2], groupHandleSize); // Miss 1
pBuffer += missSize;
memcpy(pBuffer, handles[3], groupHandleSize); // Hit 0
pBuffer += groupHandleSize;
pBuffer += sizeof(HitRecordBuffer); // No data
memcpy(pBuffer, handles[4], groupHandleSize); // Hit 1
pBuffer += groupHandleSize;
memcpy(pBuffer, &m_hitShaderRecord[0], sizeof(HitRecordBuffer)); // Hit 1 data
pBuffer += sizeof(HitRecordBuffer);
}
~~~~
Then change the call to `m_alloc.createBuffer` to create the SBT buffer from `sbtBuffer`:
~~~~ C++
m_rtSBTBuffer = m_alloc.createBuffer(cmdBuf, sbtBuffer, vk::BufferUsageFlagBits::eRayTracingKHR);
~~~~
## `raytrace`
Finally, since the size of the hit group is now larger than just the handle, we need to set the new value of the hit group stride in `HelloVulkan::raytrace`.
~~~~ C++
vk::DeviceSize hitGroupStride = progSize + sizeof(HitRecordBuffer);
~~~~
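Once entries no longer all have the same size, the byte offset of each SBT region has to be derived from the per-group sizes. Below is a minimal sketch of that bookkeeping for the layout used here (one raygen entry, then the miss entries, then the hit groups). `SbtLayout` and `computeSbtLayout` are hypothetical names, and the sketch ignores any base-alignment requirement a device may impose on region offsets.

<script type="preformatted">
```cpp
#include <cstdint>

// Byte offsets and stride for an SBT packed as
// [raygen][miss 0 .. miss N-1][hit 0 .. hit M-1].
struct SbtLayout
{
  uint64_t rayGenOffset;
  uint64_t missOffset;
  uint64_t hitGroupOffset;
  uint64_t hitGroupStride;
  uint64_t totalSize;
};

inline SbtLayout computeSbtLayout(uint32_t handleSize, uint32_t missCount, uint32_t hitCount, uint32_t hitDataSize)
{
  SbtLayout layout{};
  layout.rayGenOffset   = 0;
  layout.missOffset     = handleSize;  // a single raygen entry precedes the miss entries
  layout.hitGroupOffset = layout.missOffset + uint64_t(missCount) * handleSize;
  layout.hitGroupStride = uint64_t(handleSize) + hitDataSize;  // handle + shader record data
  layout.totalSize      = layout.hitGroupOffset + hitCount * layout.hitGroupStride;
  return layout;
}
```
</script>

Assuming 32-byte handles and a 16-byte record, `computeSbtLayout(32, 2, 2, 16)` gives a hit group offset of 96, a stride of 48, and a total size of 192 bytes, which agrees with `newSbtSize` computed earlier.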
!!! Note:
The result should now show both `wuson` models with a yellow color.
![](Images/manyhits4.png)
# Extending Hit
The SBT can hold more entries than there are shading models, which makes it possible to give each instance its own entry with its own data. In the main tutorial, material information is retrieved from a storage buffer indexed by `gl_InstanceID`; for some applications, it is possible to store all of the material information in the SBT instead.
The following modification will add another entry to the SBT with a different color per instance. The new SBT hit group (2) will use the same CHIT handle (4) as hit group 1.
**************************
*+-----------+----------+*
*| RayGen | Handle 0 |*
*+-----------+----------+*
*| Miss | Handle 1 |*
*+-----------+----------+*
*| Miss | Handle 2 |*
*+-----------+----------+*
*| HitGroup0 | Handle 3 |*
*| | -Empty- |*
*+-----------+----------+*
*| HitGroup1 | Handle 4 |*
*| | Data 0 |*
*+-----------+----------+*
*| HitGroup2 | Handle 4 |*
*| | Data 1 |*
*+-----------+----------+*
**************************
## `main.cpp`
In the description of the scene in `main`, we will tell the `wuson` models to use hit groups 1 and 2 respectively, and to have different colors.
~~~~ C++
helloVk.m_objInstance[0].hitgroup = 1;
helloVk.m_objInstance[1].hitgroup = 2;
helloVk.m_hitShaderRecord.resize(2);
helloVk.m_hitShaderRecord[0].color = nvmath::vec4f(0, 1, 0, 0); // Green
helloVk.m_hitShaderRecord[1].color = nvmath::vec4f(0, 1, 1, 0); // Cyan
~~~~
## `createRtShaderBindingTable`
The size of the SBT will now account for its 3 hit groups:
~~~~ C++
uint32_t newSbtSize = rayGenSize + 2 * missSize + 3 * hitSize;
~~~~
Finally, we need to add the new entry as well at the end of the buffer, reusing the handle of the second Hit Group and setting a different color.
~~~~ C++
memcpy(pBuffer, handles[4], groupHandleSize); // Hit 2
pBuffer += groupHandleSize;
memcpy(pBuffer, &m_hitShaderRecord[1], sizeof(HitRecordBuffer)); // Hit 2 data
pBuffer += sizeof(HitRecordBuffer);
~~~~
!!! Warning
Adding entries by hand like this is error-prone and inconvenient for scenes of
any significant size. Instead, it is recommended to wrap the storage of handles, data,
and size per group in an SBT utility that handles this automatically.
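One possible shape for such a utility is sketched below. It is not the helper used by the samples: `SbtGroup` and its methods are hypothetical, it works on plain byte buffers only, and it assumes a non-zero handle size and no alignment padding.

<script type="preformatted">
```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Collects (handle, data) pairs for one SBT group and packs them with a common stride.
class SbtGroup
{
public:
  explicit SbtGroup(uint32_t handleSize)
      : m_handleSize(handleSize)
  {
  }

  // `handle` must point to handleSize bytes; `data` may be null when dataSize is 0.
  void addEntry(const uint8_t* handle, const void* data = nullptr, uint32_t dataSize = 0)
  {
    std::vector<uint8_t> bytes(m_handleSize + dataSize);
    std::memcpy(bytes.data(), handle, m_handleSize);
    if(dataSize > 0)
      std::memcpy(bytes.data() + m_handleSize, data, dataSize);
    m_entries.push_back(std::move(bytes));
  }

  // Every entry is padded to the size of the largest one.
  uint32_t stride() const
  {
    uint32_t s = m_handleSize;
    for(const auto& e : m_entries)
      s = std::max(s, static_cast<uint32_t>(e.size()));
    return s;
  }

  // Returns the packed bytes of the group; unused space is zero-filled.
  std::vector<uint8_t> pack() const
  {
    const uint32_t       s = stride();
    std::vector<uint8_t> out(static_cast<size_t>(s) * m_entries.size(), 0);
    for(size_t i = 0; i < m_entries.size(); ++i)
      std::memcpy(out.data() + i * s, m_entries[i].data(), m_entries[i].size());
    return out;
  }

private:
  uint32_t                          m_handleSize;
  std::vector<std::vector<uint8_t>> m_entries;
};
```
</script>

With this kind of wrapper, adding a hit group becomes a single `addEntry` call, and the stride passed to the trace command can be read back from `stride()` instead of being recomputed by hand.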
# Final Code
You can find the final code in the folder [ray_tracing_manyhits](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing_manyhits)
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>

146
docs/vkrt_tuto_rayquery.htm Normal file

@ -0,0 +1,146 @@
<meta charset="utf-8" lang="en">
**NVIDIA Vulkan Ray Tracing Tutorial**
**Ray Query**
![](Images/rayquery.png)
This is an extension of the Vulkan ray tracing [tutorial](vkrt_tutorial.md.htm).
(insert setup.md.html here)
# Ray Query
This extension allows ray intersection queries to be executed from any shader stage. In this example, we will add
ray queries to the fragment shader to cast shadow rays.
Contrary to all other examples, in this one we are removing code. There is no need for an SBT or a ray tracing pipeline; the only thing that
matters is the creation of the acceleration structure.
Starting from the end of the tutorial, [ray_tracing__simple](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing__simple), we will remove
all functions that were dedicated to ray tracing and keep only the construction of the BLAS and TLAS.
# Cleanup
First, let's remove all the extra code.
## hello_vulkan (header)
Remove most functions and members, keeping only what is needed to create the acceleration structure:
~~~~ C++
// #VKRay
void initRayTracing();
nvvkpp::RaytracingBuilderKHR::Blas objectToVkGeometryKHR(const ObjModel& model);
void createBottomLevelAS();
void createTopLevelAS();
vk::PhysicalDeviceRayTracingPropertiesKHR m_rtProperties;
nvvkpp::RaytracingBuilderKHR m_rtBuilder;
~~~~
## hello_vulkan (source)
In the source file, remove the definitions of all the functions that were removed from the header.
## Shaders
You can safely remove all `raytrace.*` shaders.
# Support for Fragment shader
In `HelloVulkan::createDescriptorSetLayout`, add the acceleration structure to the descriptor set layout.
~~~~ C++
// The top level acceleration structure
m_descSetLayoutBind.emplace_back( //
vkDS(7, vkDT::eAccelerationStructureKHR, 1, vkSS::eFragment));
~~~~
In `HelloVulkan::updateDescriptorSet`, write the value to the descriptor set.
~~~~ C++
vk::WriteDescriptorSetAccelerationStructureKHR descASInfo;
descASInfo.setAccelerationStructureCount(1);
descASInfo.setPAccelerationStructures(&m_rtBuilder.getAccelerationStructure());
writes.emplace_back(nvvkpp::util::createWrite(m_descSet, m_descSetLayoutBind[7], &descASInfo));
~~~~
## Shader
The last modification is in the fragment shader, where we will add the ray intersection query to trace shadow rays.
First, the GLSL version is bumped to 460:
~~~~ C++
#version 460
~~~~
Then we need to add new extensions
~~~~ C++
#extension GL_EXT_ray_tracing : enable
#extension GL_EXT_ray_query : enable
~~~~
We have to add the layout to access the top level acceleration structure.
~~~~ C++
layout(binding = 7, set = 0) uniform accelerationStructureEXT topLevelAS;
~~~~
At the end of the shader, add the following code to initiate the ray query. As we only need to know whether the ray
has hit something, we can keep the query minimal.
~~~~ C++
// Ray Query for shadow
vec3 origin = worldPos;
vec3 direction = L; // vector to light
float tMin = 0.01f;
float tMax = lightDistance;
// Initializes a ray query object but does not start traversal
rayQueryEXT rayQuery;
rayQueryInitializeEXT(rayQuery, topLevelAS, gl_RayFlagsTerminateOnFirstHitEXT, 0xFF, origin, tMin,
direction, tMax);
// Start traversal: returns false once traversal is complete
while(rayQueryProceedEXT(rayQuery))
{
}
// Returns type of committed (true) intersection
if(rayQueryGetIntersectionTypeEXT(rayQuery, true) != gl_RayQueryCommittedIntersectionNoneEXT)
{
// Got an intersection == Shadow
outColor *= 0.1;
}
~~~~
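The query above only asks whether anything blocks the segment between `tMin` and `tMax`; it never needs the closest hit. As a CPU-side analogy (not Vulkan code), the same "stop at the first hit" logic can be sketched against a list of spheres. `Vec3`, `Sphere` and `isOccluded` are names invented for this illustration, and `direction` is assumed to be normalized.

<script type="preformatted">
```cpp
#include <cmath>
#include <vector>

struct Vec3
{
  float x, y, z;
};

inline Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere
{
  Vec3  center;
  float radius;
};

// Returns true as soon as any sphere intersects the ray within [tMin, tMax].
inline bool isOccluded(Vec3 origin, Vec3 direction, float tMin, float tMax, const std::vector<Sphere>& scene)
{
  for(const Sphere& s : scene)
  {
    const Vec3  oc   = sub(origin, s.center);
    const float b    = dot(oc, direction);
    const float c    = dot(oc, oc) - s.radius * s.radius;
    const float disc = b * b - c;
    if(disc < 0.f)
      continue;  // the ray's supporting line misses this sphere
    const float root  = std::sqrt(disc);
    const float tNear = -b - root;
    const float tFar  = -b + root;
    if((tNear >= tMin && tNear <= tMax) || (tFar >= tMin && tFar <= tMax))
      return true;  // first hit found: no need to look for a closer one
  }
  return false;
}
```
</script>

Setting `tMax` to the distance to the light plays the same role as in the shader: geometry beyond the light does not cast a shadow on the shaded point.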
!!! Info Ray Query
Information about [Ray Query](https://github.com/KhronosGroup/GLSL/blob/master/extensions/ext/GLSL_EXT_ray_query.txt) extension.
# Final Code
You can find the final code in the folder [ray_tracing_rayquery](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing_rayquery)
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>


@ -0,0 +1,270 @@
<meta charset="utf-8" lang="en">
**NVIDIA Vulkan Ray Tracing Tutorial**
**Reflections**
![](Images/reflections.png)
This is an extension of the Vulkan ray tracing [tutorial](vkrt_tutorial.md.htm).
(insert setup.md.html here)
# Setting Up the scene
First, we will create a scene with two reflective planes and a multicolored cube in the center. Change the `helloVk.loadModel` calls in `main()` to
~~~~ C++
// Creation of the example
helloVk.loadModel(nvh::findFile("media/scenes/cube.obj", defaultSearchPaths),
nvmath::translation_mat4(nvmath::vec3f(-2, 0, 0))
* nvmath::scale_mat4(nvmath::vec3f(.1f, 5.f, 5.f)));
helloVk.loadModel(nvh::findFile("media/scenes/cube.obj", defaultSearchPaths),
nvmath::translation_mat4(nvmath::vec3f(2, 0, 0))
* nvmath::scale_mat4(nvmath::vec3f(.1f, 5.f, 5.f)));
helloVk.loadModel(nvh::findFile("media/scenes/cube_multi.obj", defaultSearchPaths));
helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths),
nvmath::translation_mat4(nvmath::vec3f(0, -1, 0)));
~~~~
Then find `cube.mtl` in `media/scenes` and modify the material to be 95% reflective, without any diffuse
contribution:
~~~~ C++
newmtl cube_instance_material
illum 3
d 1
Ns 32
Ni 0
Ka 0 0 0
Kd 0 0 0
Ks 0.95 0.95 0.95
~~~~
# Recursive Reflections
Vulkan ray tracing allows recursive calls to `traceNV`, up to a limit defined by `VkPhysicalDeviceRayTracingPropertiesKHR`.
In `createRtPipeline()` in `hello_vulkan.cpp`, bring the maximum recursion depth up to 10, making sure not to exceed the physical device's maximum recursion limit:
~~~~ C++
rayPipelineInfo.setMaxRecursionDepth(
std::min(10u, m_rtProperties.maxRecursionDepth)); // Ray depth, clamped to the device limit
~~~~
## `raycommon.glsl`
We will need to track the depth and the attenuation of the ray.
In the `hitPayload` struct in `raycommon.glsl`, add the following:
~~~~ C++
int depth;
vec3 attenuation;
~~~~
## `raytrace.rgen`
In the ray generation shader, we will initialize all payload values before calling `traceNV`.
~~~~ C++
prd.depth = 0;
prd.hitValue = vec3(0);
prd.attenuation = vec3(1.f, 1.f, 1.f);
~~~~
## `raytrace.rchit`
At the end of the closest hit shader, before setting `prd.hitValue`, we need to shoot a ray if the material is reflective.
~~~~ C++
// Reflection
if(mat.illum == 3 && prd.depth < 10)
{
vec3 origin = worldPos;
vec3 rayDir = reflect(gl_WorldRayDirectionNV, normal);
prd.attenuation *= mat.specular;
prd.depth++;
traceNV(topLevelAS, // acceleration structure
gl_RayFlagsNoneNV, // rayFlags
0xFF, // cullMask
0, // sbtRecordOffset
0, // sbtRecordStride
0, // missIndex
origin, // ray origin
0.1, // ray min range
rayDir, // ray direction
100000.0, // ray max range
0 // payload (location = 0)
);
prd.depth--;
}
~~~~
The calculated `hitValue` needs to be accumulated, since the payload is global for the
entire execution from raygen, so change the last line of `main()` to
~~~~ C++
prd.hitValue += vec3(attenuation * lightIntensity * (diffuse + specular)) * prd.attenuation;
~~~~
## `raytrace.rmiss`
Finally, the miss shader also needs to attenuate its contribution:
~~~~ C++
prd.hitValue = clearColor.xyz * 0.8 * prd.attenuation;
~~~~
## Working, but limited
This works, but the depth is limited by the number of recursions the GPU supports, and deep recursion can also hurt performance. Trying to go over the recursion limit would eventually generate a device lost error.
# Iterative Reflections
Instead of dispatching new rays from the closest hit shader, we will return the information in the payload to shoot new rays if needed.
## `raycommon.glsl`
Extend the payload structure with the information needed to start a new ray when required.
~~~~ C++
int done;
vec3 rayOrigin;
vec3 rayDir;
~~~~
## `raytrace.rgen`
Initialize the new members of the payload:
~~~~ C++
prd.done = 1;
prd.rayOrigin = origin.xyz;
prd.rayDir = direction.xyz;
~~~~
Instead of calling `traceNV` only once, we will call it in a loop until we are done.
Wrap the trace call in `raytrace.rgen` like this:
~~~~ C++
vec3 hitValue = vec3(0);
for(;;)
{
traceNV( /*.. */);
hitValue += prd.hitValue * prd.attenuation;
prd.depth++;
if(prd.done == 1 || prd.depth >= 10)
break;
origin.xyz = prd.rayOrigin;
direction.xyz = prd.rayDir;
prd.done = 1; // Will stop if a reflective material isn't hit
}
~~~~
Finally, make sure to write the accumulated value to the image:
~~~~ C++
imageStore(image, ivec2(gl_LaunchIDNV.xy), vec4(hitValue, 1.0));
~~~~
## `raytrace.rchit`
We no longer need to shoot rays from the closest hit shader, so we can replace the block at the end with
~~~~ C++
if(mat.illum == 3)
{
vec3 origin = worldPos;
vec3 rayDir = reflect(gl_WorldRayDirectionNV, normal);
prd.attenuation *= mat.specular;
prd.done = 0;
prd.rayOrigin = origin;
prd.rayDir = rayDir;
}
~~~~
The calculation of the hitValue also no longer needs to be additive, or take attenuation into account:
~~~~ C++
prd.hitValue = vec3(attenuation * lightIntensity * (diffuse + specular));
~~~~
## `raytrace.rmiss`
Since the ray generation shader now handles attenuation, we no longer need to attenuate the value returned in the miss shader:
~~~~ C++
prd.hitValue = clearColor.xyz * 0.8;
~~~~
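The control flow of the iterative version can be checked on the CPU with a toy model. The sketch below mirrors the loop in `raytrace.rgen` and the payload updates in `raytrace.rchit` and `raytrace.rmiss`, under loud simplifications: colors are single floats instead of `vec3`, and `traceStub` is a hypothetical stand-in for the trace call in which bounce `i` always hits `surfaces[i]` and running past the end of the list counts as a miss.

<script type="preformatted">
```cpp
#include <vector>

struct Surface
{
  float radiance;    // locally shaded value returned in prd.hitValue
  float specular;    // reflectivity, multiplied into the attenuation
  bool  reflective;  // corresponds to mat.illum == 3
};

struct Payload
{
  float hitValue;
  float attenuation;
  int   depth;
  int   done;
};

// Stand-in for the trace call: closest hit for surfaces[prd.depth], miss otherwise.
inline void traceStub(const std::vector<Surface>& surfaces, float skyRadiance, Payload& prd)
{
  if(prd.depth >= static_cast<int>(surfaces.size()))
  {
    prd.hitValue = skyRadiance;  // miss shader: no attenuation applied here
    return;
  }
  const Surface& s = surfaces[prd.depth];
  if(s.reflective)
  {
    prd.attenuation *= s.specular;
    prd.done = 0;  // ask the ray generation loop for another bounce
  }
  prd.hitValue = s.radiance;
}

// Mirrors the loop in the ray generation shader.
inline float shadePixel(const std::vector<Surface>& surfaces, float skyRadiance, int maxDepth)
{
  Payload prd{0.f, 1.f, 0, 1};
  float   hitValue = 0.f;
  for(;;)
  {
    traceStub(surfaces, skyRadiance, prd);
    hitValue += prd.hitValue * prd.attenuation;
    prd.depth++;
    if(prd.done == 1 || prd.depth >= maxDepth)
      break;
    prd.done = 1;  // will stop if a reflective material isn't hit
  }
  return hitValue;
}
```
</script>

For two mirrors with a specular value of 0.5 followed by a miss on a sky of radiance 1, the loop runs three times and accumulates 0.25, the sky attenuated by both bounces. No call ever nests inside another, which is why the pipeline recursion depth can stay small.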
## Max Recursion
Finally, we no longer need to have a deep recursion setting in `createRtPipeline` -- just a depth of 2, one for the initial ray generation segment and another for shadow rays.
~~~~ C++
rayPipelineInfo.setMaxRecursionDepth(2); // Ray depth
~~~~
In `raytrace.rgen`, we can now make the maximum ray depth significantly larger -- such as 100, for instance -- without causing a device lost error.
# Controlling Depth
As an extra, we can also add UI to control the maximum depth.
In the `RtPushConstant` structure, we can add a new `maxDepth` member to pass to the shader.
~~~~ C++
struct RtPushConstant
{
nvmath::vec4f clearColor;
nvmath::vec3f lightPosition;
float lightIntensity;
int lightType;
int maxDepth{10};
} m_rtPushConstants;
~~~~
In the `raytrace.rgen` shader, we declare the matching push constant block:
~~~~ C++
layout(push_constant) uniform Constants
{
vec4 clearColor;
vec3 lightPosition;
float lightIntensity;
int lightType;
int maxDepth;
}
pushC;
~~~~
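Because both sides read the same bytes, the C++ structure and the GLSL block must agree on member offsets. The following is a minimal compile-time check, assuming the push constant block follows std430 packing (a `vec3` is aligned to 16 bytes but occupies 12, so the following `float` packs right after it). `Vec4f` and `Vec3f` are plain-float stand-ins for the `nvmath` types, introduced only to keep the sketch self-contained.

<script type="preformatted">
```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins for nvmath::vec4f / nvmath::vec3f (assumed to be tightly packed floats).
struct Vec4f
{
  float x, y, z, w;
};
struct Vec3f
{
  float x, y, z;
};

struct RtPushConstant
{
  Vec4f   clearColor;
  Vec3f   lightPosition;
  float   lightIntensity;
  int32_t lightType;
  int32_t maxDepth;
};

// Offsets expected on the GLSL side under std430 rules.
static_assert(offsetof(RtPushConstant, clearColor) == 0, "vec4 at offset 0");
static_assert(offsetof(RtPushConstant, lightPosition) == 16, "vec3 aligned to 16 bytes");
static_assert(offsetof(RtPushConstant, lightIntensity) == 28, "float packs after the vec3");
static_assert(offsetof(RtPushConstant, lightType) == 32, "int follows the float");
static_assert(offsetof(RtPushConstant, maxDepth) == 36, "new member appended at the end");
static_assert(sizeof(RtPushConstant) == 40, "total push constant size");
```
</script>

Appending `maxDepth` at the end keeps all existing offsets unchanged, so shaders that only read the earlier members continue to see the same values.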
Then use that value to decide when to stop:
~~~~ C++
if(prd.done == 1 || prd.depth >= pushC.maxDepth)
break;
~~~~
Finally, in `main.cpp` in the `renderUI` function, we will add a slider to control the value.
~~~~ C++
ImGui::SliderInt("Max Depth", &helloVk.m_rtPushConstants.maxDepth, 1, 100);
~~~~
# Final Code
You can find the final code in the folder [ray_tracing_reflections](https://github.com/nvpro-samples/vk_raytracing_tutorial/tree/master/ray_tracing_reflections)
<!-- Markdeep: -->
<link rel="stylesheet" href="vkrt_tutorial.css?">
<script> window.markdeepOptions = { tocStyle: "medium" };</script>
<script src="markdeep.min.js" charset="utf-8"></script>
<script src="https://developer.nvidia.com/sites/default/files/akamai/gameworks/whitepapers/markdeep.min.js" charset="utf-8"></script>
<script>
window.alreadyProcessedMarkdeep || (document.body.style.visibility = "visible")
</script>

175
docs/vkrt_tutorial.css Normal file

@ -0,0 +1,175 @@
/* Custom stylesheet for API documentation by Aras Pranckevičius, http://aras-p.info/
and tweaked by Morgan McGuire.
Licensed as public domain or BSD 2-clause, whichever is more convenient for you.
Originally from https://github.com/aras-p/markdeep-docs-style */
body {
max-width: 50em;
font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;
text-align: left;
/*margin: 1.5em;*/
padding: 0 1em;
}
/* if screen is wide enough, put table of contents on the right side */
@media screen and (min-width: 64em) {
.md .longTOC, .md .mediumTOC, .md .shortTOC {
max-width: 20em;
left: 0em;
display:block;
position: fixed;
top:0;
bottom:0;
overflow-y:scroll;
margin-top:0;
margin-bottom:0;
padding-top:1em;
}
body {
margin-left: 25em;
}
}
/* for narrow screens or print, hide table of contents */
@media screen and (max-width: 64em) {
.md .longTOC, .md .mediumTOC, .md .shortTOC { display: none; }
}
@media print {
.md .longTOC, .md .mediumTOC, .md .shortTOC { display: none; }
body { max-width: 100%; }
}
/* reset heading/link fonts to that of body */
.md a,
.md div.title, contents, .md .tocHeader,
.md h1, .md h2, .md h3, .md h4, .md h5, .md h6,
.md .nonumberh1, .md .nonumberh2, .md .nonumberh3, .md .nonumberh4, .md .nonumberh5, .md .nonumberh6,
.md .shortTOC, .md .mediumTOC, .md .longTOC {
font-family: inherit;
}
.md div.title {
margin: 0.4em 0 0 0;
padding: 0;
text-align: inherit;
}
.md div.subtitle {
text-align: inherit;
}
/* faint border below headings */
.md h1, .md h2, .md h3, .md h4,
.md .nonumberh1, .md .nonumberh2, .md .nonumberh3, .md .nonumberh4 {
border-bottom: 1px solid rgba(0,0,0,.1);
}
/* heading font styles */
.md h1, .md .nonumberh1, .md div.title {
font-size: 150%;
font-weight: 600;
color: rgba(0,0,0,.7);
}
.md h2, .md .nonumberh2 {
font-size: 120%;
font-weight: 400;
color: rgba(0,0,0,.9);
}
.md h3, .md .nonumberh3 {
font-size: 110%;
font-weight: 400;
color: rgba(0,0,0,.7);
}
/* no numbering of headings */
/* .md h1:before, .md h2:before, .md h3:before, .md h4:before { content: none; } */
/* link styling */
.md a:link, .md a:visited {
color: #3f51b5;
}
/* inline and block code */
.md code, .md pre.listing {
background-color: rgba(0,0,0,.05);
padding: 0.1em 0.2em;
border-radius: 0.15em;
}
.md pre.listing code {
background-color: transparent;
padding: 0;
border: none;
}
/* table of contents styling; make all 3 forms of it look the same */
.md .longTOC, .md .mediumTOC, .md .shortTOC {
font-size: inherit;
line-height: 120%;
margin: 1em 0;
padding: .4rem;
border-left: .1rem solid #3f51b5;
}
.md .tocHeader {
margin: 0;
padding: 0;
border: none;
font-size: inherit;
}
/*
.md .tocNumber {
display: none;
}
*/
.md .longTOC .level1, .md .mediumTOC .level1, .md .shortTOC .level1 {
font-weight: inherit;
/* padding: 0; */
/* margin: 0; */
}
.md .longTOC p, .md .mediumTOC p, .md .shortTOC p {
overflow: hidden;
text-overflow: ellipsis;
}
.md .longTOC center, .md .mediumTOC center, .md .shortTOC center, .md .tocHeader {
text-align: left;
}
.md .longTOC b, .md .mediumTOC b, .md .shortTOC b {
font-weight: 400;
}
.md .longTOC center b, .md .mediumTOC center b, .md .shortTOC center b {
font-weight: bold;
}
/* .md .longTOC a, .md .mediumTOC a, .md .shortTOC a { */
/* color: black; */
/* } */
.md .longTOC .level1, .md .mediumTOC .level1, .md .shortTOC .level1,
.md .longTOC .level2, .md .mediumTOC .level2, .md .shortTOC .level2,
.md .longTOC .level3, .md .mediumTOC .level3, .md .shortTOC .level3 {
white-space: nowrap;
/* margin: 0; */
/* padding: 0; */
font-size: 90%;
}
/* tables; use fainter colors than regular markdeep style */
.md table.table {
font-size: 90%;
}
.md table.table th {
border: none;
background-color: #ccc;
color: rgba(0,0,0,.6);
}
.md table.table tr, .md table.table td {
border-color: #eee;
}
.md table.table tr:nth-child(even) {
background-color: #f4f4f4;
}

1862
docs/vkrt_tutorial.md.htm Normal file

File diff suppressed because it is too large