diff --git a/README.md b/README.md index 45194e4..adbc1a1 100644 --- a/README.md +++ b/README.md @@ -2,26 +2,42 @@ # NVIDIA Vulkan Ray Tracing Tutorials -The focus of this project and the provided code is to showcase a basic integration of -ray tracing within an existing Vulkan sample, using the -[`VK_KHR_ray_tracing`](https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#VK_KHR_ray_tracing) extension. -The following tutorials starts from a the end of the previous ray tracing tutorial and provides step-by-step instructions to modify and add methods and functions. -The sections are organized by components, with subsections identifying the modified functions. - -This project contains multiple tutorials all around Vulkan ray tracing. - -Instead of having examples fully functional, those tutorial starts from a program and guide the user to add what is necessary. - -## Ray Tracing Tutorial - -The first tutorial is starting from a Vulkan code example, which can load multiple OBJ and render them using the rasterizer, and adds step-by-step what is require to do ray tracing. - -### [**Start Ray Tracing Tutorial**](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/) - ![resultRaytraceShadowMedieval](docs/Images/resultRaytraceShadowMedieval.png) -# Going Further -From this point on, you can continue creating your own ray types and shaders, and experiment with more advanced ray tracing based algorithms. +The focus of this repository and the provided code is to showcase a basic integration of +ray tracing within an existing Vulkan sample, using the +[`VK_KHR_ray_tracing`](https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#VK_KHR_ray_tracing) extension. -### [**All Extra Tutorials**](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/vkrt_tuto_further.md.html) +## Setup + +To be able to compile and run those examples, please follow the [setup](docs/setup.md) instructions. 
Find more at [nvpro-samples](https://github.com/nvpro-samples/build_all). + +## Tutorials + +The [first tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/) starts from a very simple Vulkan application that loads an OBJ file and uses the rasterizer to render it. The tutorial then adds, **step-by-step**, everything needed to ray trace the scene. + +------- +### Ray Tracing Tutorial -> [**Start**](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/) <- + +------- + + +## Extra Tutorials + +All other tutorials start from the end of the _first_ ray tracing tutorial and also provide step-by-step instructions for modifying and adding the methods and functions needed for that +extra section. + + + +Tutorial | Details +---------|-------- +![small](ray_tracing_anyhit/images/anyhit.png) | [Any Hit Shader](ray_tracing_anyhit)
Implements transparent materials by adding a new shader to the Hit Group and using the material information to discard hits over time. Adding an any hit shader (.rahit) to the ray tracing pipeline. Randomly letting the ray hit or not, which produces a simple transparency effect. +![small](ray_tracing_jitter_cam/images/antialiasing.png) | [Jitter Camera](ray_tracing_jitter_cam)
Anti-aliases the image by accumulating small variations of rays over time. Random ray direction generation. Read/write/accumulate final image +![img](ray_tracing_instances/images/instances.png) | [Thousands of Objects](ray_tracing_instances)
The current example allocates memory for each object, each of which has several buffers. This shows how to get around Vulkan's limits on the total number of memory allocations by using a memory allocator. Extend the limit of 4096 memory allocations. Using memory allocators: DMA, VMA +![img](ray_tracing_reflections/images/reflections.png) | [Reflections](ray_tracing_reflections)
Reflections can be implemented by shooting new rays from the closest hit shader, or by iteratively shooting them from the raygen shader. This example shows the limitations and differences of these implementations. Calling traceRayEXT() from the closest hit shader (recursive). Adding more data to the ray payload to continue the ray from the raygen shader. +![img](ray_tracing_manyhits/images/manyhits.png) | [Multiple Closest Hits Shader and Shader Records](ray_tracing_manyhits)
Explains how to add more closest hit shaders, choose which instance uses which shader, add data per SBT record that can be retrieved in the shader, and more. One closest hit shader per object. Sharing closest hit shaders between some objects. Passing a shader record to the closest hit shader. +![img](ray_tracing_animation/images/animation2.gif) | [Animation](ray_tracing_animation)
This tutorial shows how to animate the transformation matrices of the instances (TLAS) and how to animate the vertices of an object (BLAS) in a compute shader. Refit of top level acceleration structure. Refit of bottom level acceleration structure. +![img](ray_tracing_intersection/images/intersection.png) | [Intersection Shader](ray_tracing_intersection)
Adding thousands of implicit primitives and using an intersection shader to render spheres and cubes. The tutorial explains what is needed to get procedural hit group working. Intersection Shader. Sphere intersection. Axis aligned bounding box intersection. +![img](ray_tracing_callable/images/callable.png) | [Callable Shader](ray_tracing_callable)
Replacing if/else with callable shaders. The lighting code runs in separate callable shaders instead of being part of the main shader code. Adding multiple callable shaders. Calling `executeCallableEXT` from the closest hit shader. +![img](ray_tracing_rayquery/images/rayquery.png) | [Ray Query](ray_tracing_rayquery)
Invoking ray intersection queries directly from the fragment shader to cast shadow rays. Ray tracing directly from the fragment shader. diff --git a/docs/setup.md b/docs/setup.md new file mode 100644 index 0000000..7e516ea --- /dev/null +++ b/docs/setup.md @@ -0,0 +1,62 @@ + +# Environment Setup + + +## Repositories + +Besides the current repository, you will also need to clone or download the following repositories: + +* [shared_sources](https://github.com/nvpro-samples/shared_sources): The primary framework that all samples depend on. +* [shared_external](https://github.com/nvpro-samples/shared_external): Third party libraries that are provided pre-compiled, mostly for Windows x64 / MSVC. + +The directory structure should look like this: + +~~~~ + \ + | + +-- :file_folder: shared_external + | + +-- :file_folder: shared_sources + | + +-- :open_file_folder: vk_raytracing_tutorial_KHR + | | + | +-- :file_folder: ray_tracing__simple + | | + | +-- :file_folder: ray_tracing_... + | | + | ⋮ + | + ⋮ +~~~~ + +## Latest Vulkan SDK + +This repository tries to always be up to date with the latest Vulkan SDK, so we suggest downloading and installing it. + +**Vulkan SDK**: https://vulkan.lunarg.com/sdk/home + + +## Beta Installation + +KHR ray tracing is still in Beta, so you will need the latest +Vulkan driver. + +**Latest driver**: https://developer.nvidia.com/vulkan-driver + + +## CMake + +The CMake file will use other makefiles from `shared_sources` and look for the Vulkan environment variables that point to the SDK installation. Therefore, it is important to have all of the above installed before running CMake in the +`vk_raytracing_tutorial_KHR` directory. + +**_Note_**: If you are using your own Vulkan header files, it is possible to override the default search path. + Set `VULKAN > VULKAN_HEADERS_OVERRIDE_INCLUDE_DIR` to the path of the beta Vulkan headers. 
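Before running CMake, it can help to confirm that the sibling repositories described under Repositories are in place. The following is only a minimal sketch, assuming a C++17 compiler with `std::filesystem`; the helper name `missingSiblings` is hypothetical and is not part of the samples framework.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns the names of the expected directories that are
// missing under `root` (the folder that should contain all three checkouts).
std::vector<std::string> missingSiblings(const fs::path& root)
{
  const std::vector<std::string> required = {"shared_sources", "shared_external",
                                             "vk_raytracing_tutorial_KHR"};
  std::vector<std::string> missing;
  for(const auto& name : required)
  {
    // A name counts as present only if it exists and is a directory
    if(!fs::is_directory(root / name))
      missing.push_back(name);
  }
  return missing;
}
```

An empty result means the layout matches the tree shown in the setup notes; any returned name is a repository that still has to be cloned or downloaded next to the tutorial.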
+ +## Starting From Extra Tutorial + +All _extra_ tutorials start from the end result of the _first tutorial_. Each _extra_ tutorial directory contains the end result of following that tutorial. + +To start a tutorial from the beginning: + +* Make a copy of `ray_tracing__simple` (backup) +* Follow the tutorial by modifying `ray_tracing__simple` \ No newline at end of file diff --git a/docs/vkrt_tuto_anyhit.md.htm b/docs/vkrt_tuto_anyhit.md.htm index 611e8c1..0bf8b3c 100644 --- a/docs/vkrt_tuto_anyhit.md.htm +++ b/docs/vkrt_tuto_anyhit.md.htm @@ -235,6 +235,174 @@ As mentioned earlier, for the effect to work, we need to accumulate frames over * [Storing or Updating](vkrt_tuto_jitter_cam.md.htm#toc1.4) * [Application Frame Update](vkrt_tuto_jitter_cam.md.htm#toc2) + +# Fixing Pipeline + +The above code works, but might not work in the future. The reason is that the shadow ray `traceRayEXT` call in the Closest Hit shader uses payload 1, +and when intersecting the object, the any hit shader will be executed using payload 0. At the time of writing, the driver adds +padding and there are no side effects, but this is not how things should be done. + +There should be as many Hit Groups as there are `traceRayEXT` calls with different payloads. For the other examples, it is still fine, +because we are using the `gl_RayFlagsSkipClosestHitShaderEXT` flag, so the closest hit shader (payload 0) will not be called, and there are no +any hit or intersection shaders in the Hit Group. But in this example, the closest hit will be skipped, but not the any hit. + +**To fix this**, we need to add another hit group. + +This is what the current SBT looks like. + +![](Images/anyhit_0.png) + +We need to add the following to the ray tracing pipeline: a copy of the previous Hit Group, with a new AnyHit using the proper payload. 
+ +![](Images/anyhit_01.png) + + +## New shaders + +Create two new files `raytrace_0.rahit` and `raytrace_1.rahit`, and rename `raytrace.rahit` to `raytrace_rahit.glsl`. + +!!! WARNING CMake + CMake needs to be re-run to add the new files to the project. + +In `raytrace_0.rahit` add the following code + +~~~~ C +#version 460 +#extension GL_GOOGLE_include_directive : enable + +#define PAYLOAD_0 +#include "raytrace_rahit.glsl" +~~~~ + +and in `raytrace_1.rahit`, use the same code with `PAYLOAD_0` replaced by `PAYLOAD_1`. + +Then in `raytrace_rahit.glsl` remove the `#version 460` and add the following code, so that we have the right layout. + +~~~~ C +#ifdef PAYLOAD_0 +layout(location = 0) rayPayloadInEXT hitPayload prd; +#elif defined(PAYLOAD_1) +layout(location = 1) rayPayloadInEXT shadowPayload prd; +#endif +~~~~ + +## New Payload + +We cannot simply have a bool for our shadow ray payload. We also need the `seed` for the random function. + +In the `raycommon.glsl` file, add the following structure + +~~~~ C +struct shadowPayload +{ + bool isHit; + uint seed; +}; +~~~~ + +The shadow payload is used in the closest hit and shadow miss shaders. 
First, let's modify `raytraceShadow.rmiss` to look like this + +~~~~ C +#version 460 +#extension GL_EXT_ray_tracing : require +#extension GL_GOOGLE_include_directive : enable + +#include "raycommon.glsl" + +layout(location = 1) rayPayloadInEXT shadowPayload prd; + +void main() +{ + prd.isHit = false; +} +~~~~ + +In the closest hit shader `raytrace.rchit`, we need to change both the usage of the payload and the call to `traceRayEXT`. + +Replace the payload with + +~~~~ C +layout(location = 1) rayPayloadEXT shadowPayload prdShadow; +~~~~ + +Then just before the call to `traceRayEXT`, initialize the values to + +~~~~ C +prdShadow.isHit = true; +prdShadow.seed = prd.seed; +~~~~ + +and after the trace, set the seed value back to the main payload + +~~~~ C +prd.seed = prdShadow.seed; +~~~~ + +And check whether the shadow ray hit an object or not + +~~~~ C +if(prdShadow.isHit) +~~~~ + +### traceRayEXT + +When we call `traceRayEXT`, since we are using payload 1 (last argument), we also +need the trace to hit the alternative hit group, the one using payload 1. +To do this, we need to set `sbtRecordOffset` to 1 + +~~~~ C +traceRayEXT(topLevelAS, // acceleration structure + flags, // rayFlags + 0xFF, // cullMask + 1, // sbtRecordOffset + 0, // sbtRecordStride + 1, // missIndex + origin, // ray origin + tMin, // ray min range + rayDir, // ray direction + tMax, // ray max range + 1 // payload (location = 1) + ); +~~~~ + + + + +## Ray tracing Pipeline + +The final step is to add the new Hit Group. This is a change in `HelloVulkan::createRtPipeline()`. +We need to load the new any hit shader and create a new Hit Group. + +Replace `"shaders/raytrace.rahit.spv"` with `"shaders/raytrace_0.rahit.spv"`. + +Then, after the creation of the first Hit Group, create a new one, where only the any hit shader using payload 1 +is added. We are skipping the closest hit shader in the trace call, so we can ignore it in the Hit Group. 
+ +~~~~ C +// Payload 1 +vk::ShaderModule ahit1SM = + nvvk::createShaderModule(m_device, // + nvh::loadFile("shaders/raytrace_1.rahit.spv", true, paths)); +hg.setClosestHitShader(VK_SHADER_UNUSED_NV); // Not used by shadow (skipped) +stages.push_back({{}, vk::ShaderStageFlagBits::eAnyHitNV, ahit1SM, "main"}); +hg.setAnyHitShader(static_cast<uint32_t>(stages.size() - 1)); +m_rtShaderGroups.push_back(hg); +~~~~ + +At the end of the function, delete the shader module `ahit1SM`. + + +!!! NOTE Re-Run + Everything should work as before, but now it does it right. + + + + + + + + + # Final Code You can find the final code in the folder [ray_tracing_anyhit](https://github.com/nvpro-samples/vk_raytracing_tutorial_KHR/tree/master/ray_tracing_anyhit) diff --git a/ray_tracing__advance/images/ray_tracing__advance.png b/ray_tracing__advance/images/ray_tracing__advance.png new file mode 100644 index 0000000..49b3236 Binary files /dev/null and b/ray_tracing__advance/images/ray_tracing__advance.png differ diff --git a/ray_tracing__advance/shaders/raytrace.rahit b/ray_tracing__advance/shaders/raytrace.rahit index 57d0d56..8ef6d33 100644 --- a/ray_tracing__advance/shaders/raytrace.rahit +++ b/ray_tracing__advance/shaders/raytrace.rahit @@ -28,8 +28,9 @@ void main() if(mat.illum != 4) return; + uint seed = prd.seed; // We don't want to modify the PRD if(mat.dissolve == 0.0) ignoreIntersectionEXT(); - else if(rnd(prd.seed) > mat.dissolve) + else if(rnd(seed) > mat.dissolve) ignoreIntersectionEXT(); } diff --git a/ray_tracing__advance/shaders/raytrace2.rahit b/ray_tracing__advance/shaders/raytrace2.rahit index c1f52a4..a71a877 100644 --- a/ray_tracing__advance/shaders/raytrace2.rahit +++ b/ray_tracing__advance/shaders/raytrace2.rahit @@ -24,8 +24,9 @@ void main() if(mat.illum != 4) return; + uint seed = prd.seed; // We don't want to modify the PRD if(mat.dissolve == 0.0) ignoreIntersectionEXT(); - else if(rnd(prd.seed) > mat.dissolve) + else if(rnd(seed) > mat.dissolve) ignoreIntersectionEXT(); } 
diff --git a/ray_tracing__before/main.cpp b/ray_tracing__before/main.cpp index dcf0c53..5d77bce 100644 --- a/ray_tracing__before/main.cpp +++ b/ray_tracing__before/main.cpp @@ -119,10 +119,10 @@ int main(int argc, char** argv) // Search path for shaders and other media defaultSearchPaths = { - PROJECT_ABSDIRECTORY, - PROJECT_ABSDIRECTORY "../", - NVPSystem::exePath() + std::string(PROJECT_RELDIRECTORY), - NVPSystem::exePath() + std::string(PROJECT_RELDIRECTORY) + std::string("../"), + PROJECT_ABSDIRECTORY, // shaders + PROJECT_ABSDIRECTORY "../", // media + PROJECT_NAME, // installed: shaders + media + NVPSystem::exePath() + std::string(PROJECT_NAME), }; // Enabling the extension feature diff --git a/ray_tracing_animation/README.md b/ray_tracing_animation/README.md index e1bdb09..d61fb9c 100644 --- a/ray_tracing_animation/README.md +++ b/ray_tracing_animation/README.md @@ -1,5 +1,595 @@ -# NVIDIA Vulkan Ray Tracing Tutorial +# Ray Tracing Animation - Tutorial -[Start the tutorial of this project](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/vkrt_tuto_animation.md.htm) +![](Images/animation2.gif) -![](../docs/Images/animation2.gif) \ No newline at end of file +## Tutorial ([Setup](../docs/setup.md)) + +This is an extension of the Vulkan ray tracing [tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR). + +We will implement two animation methods: animating only the transformation matrices, and animating the geometry itself. + + +## Animating the Matrices +This first example shows how we can update the matrices used for instances in the TLAS. + +### Creating a Scene + +In main.cpp we can create a new scene with a ground plane and 21 instances of the Wuson model, by replacing the +`helloVk.loadModel` calls in `main()`. The code below creates all of the instances +at the same position, but we will displace them later in the animation function. 
If you run the example, +you will find that the rendering is considerably slow, because the geometries are exactly at the same position +and the acceleration structure does not deal with this well. + +~~~~ C++ + helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths), + nvmath::scale_mat4(nvmath::vec3f(2.f, 1.f, 2.f))); + helloVk.loadModel(nvh::findFile("media/scenes/wuson.obj", defaultSearchPaths)); + HelloVulkan::ObjInstance inst = helloVk.m_objInstance.back(); + for(int i = 0; i < 20; i++) + helloVk.m_objInstance.push_back(inst); +~~~~ + +### Animation Function +We want to have all of the Wuson models running in a circle, and we will first modify the rasterizer to handle this. +Animating the transformation matrices will be done entirely on the CPU, and we will copy the computed transformation to the GPU. +In the next example, the animation will be done on the GPU using a compute shader. + +Add the declaration of the animation to the `HelloVulkan` class. +~~~~ C++ +void animationInstances(float time); +~~~~ + +The first part computes the transformations for all of the Wuson models, placing each one behind another. +~~~~ C++ +void HelloVulkan::animationInstances(float time) +{ + const int32_t nbWuson = static_cast<int32_t>(m_objInstance.size() - 1); + const float deltaAngle = 6.28318530718f / static_cast<float>(nbWuson); + const float wusonLength = 3.f; + const float radius = wusonLength / (2.f * sin(deltaAngle / 2.0f)); + const float offset = time * 0.5f; + + for(int i = 0; i < nbWuson; i++) + { + int wusonIdx = i + 1; + ObjInstance& inst = m_objInstance[wusonIdx]; + inst.transform = nvmath::rotation_mat4_y(i * deltaAngle + offset) + * nvmath::translation_mat4(radius, 0.f, 0.f); + inst.transformIT = nvmath::transpose(nvmath::invert(inst.transform)); + } +~~~~ + +Next, we update the buffer that describes the scene, which is used by the rasterizer to set each object's position, and also by the ray tracer to compute shading normals. 
+~~~~ C++ + // Update the buffer + vk::DeviceSize bufferSize = m_objInstance.size() * sizeof(ObjInstance); + nvvkBuffer stagingBuffer = m_alloc.createBuffer(bufferSize, vk::BufferUsageFlagBits::eTransferSrc, + vk::MemoryPropertyFlagBits::eHostVisible); + // Copy data to staging buffer + auto* gInst = m_alloc.map(stagingBuffer); + memcpy(gInst, m_objInstance.data(), bufferSize); + m_alloc.unmap(stagingBuffer); + // Copy staging buffer to the Scene Description buffer + nvvk::CommandPool genCmdBuf(m_device, m_graphicsQueueIndex); + vk::CommandBuffer cmdBuf = genCmdBuf.createCommandBuffer(); + cmdBuf.copyBuffer(stagingBuffer.buffer, m_sceneDesc.buffer, vk::BufferCopy(0, 0, bufferSize)); + m_debug.endLabel(cmdBuf); + genCmdBuf.submitAndWait(cmdBuf); + m_alloc.destroy(stagingBuffer); + + m_rtBuilder.updateTlasMatrices(m_tlas); + m_rtBuilder.updateBlas(2); +} +~~~~ + +**Note**: + We could have used `cmdBuf.updateBuffer<ObjInstance>(m_sceneDesc.buffer, 0, m_objInstance)` to + update the buffer, but this function only works for updates of at most 65,536 bytes. If we had 2000 Wuson models, this + call wouldn't work. + +### Loop Animation +In `main()`, just before the main loop, add a variable to hold the start time. +We will use this time in our animation function. + +~~~~ C++ + auto start = std::chrono::system_clock::now(); +~~~~ + +Inside the `while` loop, just before calling `appBase.prepareFrame()`, invoke the animation function. + +~~~~ C++ + std::chrono::duration<float> diff = std::chrono::system_clock::now() - start; + helloVk.animationInstances(diff.count()); +~~~~ + +If you run the application, the Wuson models will be running in a circle when using the rasterizer, but +they will still be at their original positions in the ray traced version. We will need to update the TLAS for this. + + +## Update TLAS + +Since we want to update the transformation matrices in the TLAS, we need to keep some of the objects used to create it. 
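Stripped of Vulkan, the idea is: keep the instance list as a member, build once with updates allowed, then change the transforms and refit in place. The sketch below only illustrates that pattern. `Instance`, `MockBuilder`, and `Scene` are hypothetical stand-ins and not the `nvvk` types, and the rule that an update fails without the allow-update flag is modelled here as a simple exception.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a TLAS instance (NOT nvvk::RaytracingBuilder::Instance)
struct Instance
{
  float transform[16] = {};  // 4x4 matrix, index 12 holds the x translation here
};

// Hypothetical builder that only records what a real one would be asked to do
class MockBuilder
{
public:
  // Build once and remember whether updates were requested at build time
  void buildTlas(const std::vector<Instance>& /*instances*/, bool allowUpdate)
  {
    m_allowUpdate = allowUpdate;
  }

  // Refit in place; rejected if the build did not allow updates
  void updateTlasMatrices(const std::vector<Instance>& /*instances*/)
  {
    if(!m_allowUpdate)
      throw std::logic_error("TLAS was built without the allow-update flag");
    ++m_updateCount;
  }

  int updateCount() const { return m_updateCount; }

private:
  bool m_allowUpdate = false;
  int  m_updateCount = 0;
};

class Scene
{
public:
  explicit Scene(std::size_t nbInstances)
      : m_tlas(nbInstances)
  {
  }

  void createTopLevelAS(bool allowUpdate) { m_builder.buildTlas(m_tlas, allowUpdate); }

  // Write new transforms into the kept instance list, then refit
  void animate(float x)
  {
    for(auto& inst : m_tlas)
      inst.transform[12] = x;
    m_builder.updateTlasMatrices(m_tlas);
  }

  const std::vector<Instance>& instances() const { return m_tlas; }
  int                          updateCount() const { return m_builder.updateCount(); }

private:
  std::vector<Instance> m_tlas;  // kept as a member so it outlives the build call
  MockBuilder           m_builder;
};
```

Because the instance list is a member rather than a local of the build function, every frame can modify it and hand the same vector back to the builder.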
+ +First, move the vector of `nvvk::RaytracingBuilder::Instance` objects from `HelloVulkan::createTopLevelAS()` to the +`HelloVulkan` class. +~~~~ C++ +std::vector<nvvk::RaytracingBuilder::Instance> m_tlas; +~~~~ + +Make sure to rename it from `tlas` to `m_tlas`. + +One important point is that we need to set the TLAS build flags to allow updates, by adding the `vk::BuildAccelerationStructureFlagBitsKHR::eAllowUpdate` flag. +Without this flag, the TLAS cannot be updated. + +~~~~ C++ +void HelloVulkan::createTopLevelAS() +{ + m_tlas.reserve(m_objInstance.size()); + for(int i = 0; i < static_cast<int>(m_objInstance.size()); i++) + { + nvvk::RaytracingBuilder::Instance rayInst; + rayInst.transform = m_objInstance[i].transform; // Position of the instance + rayInst.instanceId = i; // gl_InstanceID + rayInst.blasId = m_objInstance[i].objIndex; + rayInst.hitGroupId = m_objInstance[i].hitgroup; + rayInst.flags = VK_GEOMETRY_INSTANCE_TRIANGLE_CULL_DISABLE_BIT_NV; + m_tlas.emplace_back(rayInst); + } + m_rtBuilder.buildTlas(m_tlas, vk::BuildAccelerationStructureFlagBitsKHR::ePreferFastTrace + | vk::BuildAccelerationStructureFlagBitsKHR::eAllowUpdate); +} +~~~~ + +Back in `HelloVulkan::animationInstances()`, we need to copy the newly computed transformation +matrices to the vector of `nvvk::RaytracingBuilder::Instance` objects. + +At the end of the `for` loop body, add + +~~~~ C++ + nvvk::RaytracingBuilder::Instance& tinst = m_tlas[wusonIdx]; + tinst.transform = inst.transform; +~~~~ + +The last step is to call the update at the end of the function. + +~~~~ C++ +m_rtBuilder.updateTlasMatrices(m_tlas); +~~~~ + +![](Images/animation1.gif) + +### nvvk::RaytracingBuilder::updateTlasMatrices (Implementation) + +We currently use `nvvk::RaytracingBuilder` to update the matrices for convenience, but +this could be done more efficiently if one kept some of the buffer and memory references. 
Using a +memory allocator, such as the one described in the [Many Objects Tutorial](vkrt_tuto_instances.md.htm), +could also be an alternative for avoiding multiple reallocations. Here's the implementation of `nvvk::RaytracingBuilder::updateTlasMatrices`. + +#### Staging Buffer + +As in the rasterizer, the data needs to be staged before it can be copied to the buffer used for +building the TLAS. + +~~~~ C++ + void updateTlasMatrices(const std::vector<Instance>& instances) + { + VkDeviceSize bufferSize = instances.size() * sizeof(VkAccelerationStructureInstanceKHR); + // Create a staging buffer on the host to upload the new instance data + nvvkBuffer stagingBuffer = m_alloc.createBuffer(bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, +#if defined(ALLOC_VMA) + VmaMemoryUsage::VMA_MEMORY_USAGE_CPU_TO_GPU +#else + VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT +#endif + ); + + // Copy the instance data into the staging buffer + auto* gInst = reinterpret_cast<VkAccelerationStructureInstanceKHR*>(m_alloc.map(stagingBuffer)); + for(int i = 0; i < instances.size(); i++) + { + gInst[i] = instanceToVkGeometryInstanceKHR(instances[i]); + } + m_alloc.unmap(stagingBuffer); +~~~~ + +#### Scratch Memory + +Building the TLAS always needs scratch memory, and so we need to request it. If +we hadn't set the `eAllowUpdate` flag, the returned size would be zero and the rest of the code +would fail. 
+ +~~~~ C++ + // Compute the amount of scratch memory required by the AS builder to update + VkAccelerationStructureMemoryRequirementsInfoKHR memoryRequirementsInfo{ + VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_INFO_KHR}; + memoryRequirementsInfo.type = VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_UPDATE_SCRATCH_KHR; + memoryRequirementsInfo.accelerationStructure = m_tlas.as.accel; + memoryRequirementsInfo.buildType = VK_ACCELERATION_STRUCTURE_BUILD_TYPE_DEVICE_KHR; + + VkMemoryRequirements2 reqMem{VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2}; + vkGetAccelerationStructureMemoryRequirementsKHR(m_device, &memoryRequirementsInfo, &reqMem); + VkDeviceSize scratchSize = reqMem.memoryRequirements.size; + + // Allocate the scratch buffer + nvvkBuffer scratchBuffer = + m_alloc.createBuffer(scratchSize, VK_BUFFER_USAGE_RAY_TRACING_BIT_KHR | VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT); + VkBufferDeviceAddressInfo bufferInfo{VK_STRUCTURE_TYPE_BUFFER_DEVICE_ADDRESS_INFO}; + bufferInfo.buffer = scratchBuffer.buffer; + VkDeviceAddress scratchAddress = vkGetBufferDeviceAddress(m_device, &bufferInfo); +~~~~ + +#### Update the Buffer +In a new command buffer, we copy the staging buffer to the device buffer and +add a barrier to make sure the memory finishes copying before updating the TLAS. 
+ +~~~~ C++ + // Update the instance buffer on the device side and build the TLAS + nvvk::CommandPool genCmdBuf(m_device, m_queueIndex); + VkCommandBuffer cmdBuf = genCmdBuf.createCommandBuffer(); + + VkBufferCopy region{0, 0, bufferSize}; + vkCmdCopyBuffer(cmdBuf, stagingBuffer.buffer, m_instBuffer.buffer, 1, &region); + + // Reuse bufferInfo (declared for the scratch buffer) to query the instance buffer address + bufferInfo.buffer = m_instBuffer.buffer; + VkDeviceAddress instanceAddress = vkGetBufferDeviceAddress(m_device, &bufferInfo); + + + // Make sure the instance buffer copy has finished before triggering the + // acceleration structure build + VkMemoryBarrier barrier{VK_STRUCTURE_TYPE_MEMORY_BARRIER}; + barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT; + barrier.dstAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_KHR; + vkCmdPipelineBarrier(cmdBuf, VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR, + 0, 1, &barrier, 0, nullptr, 0, nullptr); +~~~~ + +#### Update Acceleration Structure + +We update the TLAS in place by using the same acceleration structure for source and +destination, and by setting the `update` member to `VK_TRUE` to trigger the update. 
+ +~~~~ C++ + VkAccelerationStructureGeometryDataKHR geometry{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_GEOMETRY_INSTANCES_DATA_KHR}; + geometry.instances.arrayOfPointers = VK_FALSE; + geometry.instances.data.deviceAddress = instanceAddress; + VkAccelerationStructureGeometryKHR topASGeometry{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_GEOMETRY_KHR}; + topASGeometry.geometryType = VK_GEOMETRY_TYPE_INSTANCES_KHR; + topASGeometry.geometry = geometry; + + const VkAccelerationStructureGeometryKHR* pGeometry = &topASGeometry; + + VkAccelerationStructureBuildGeometryInfoKHR topASInfo{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_BUILD_GEOMETRY_INFO_KHR}; + topASInfo.flags = m_tlas.flags; + topASInfo.update = VK_TRUE; + topASInfo.srcAccelerationStructure = m_tlas.as.accel; + topASInfo.dstAccelerationStructure = m_tlas.as.accel; + topASInfo.geometryArrayOfPointers = VK_FALSE; + topASInfo.geometryCount = 1; + topASInfo.ppGeometries = &pGeometry; + topASInfo.scratchData.deviceAddress = scratchAddress; + + uint32_t nbInstances = (uint32_t)instances.size(); + VkAccelerationStructureBuildOffsetInfoKHR buildOffsetInfo = {nbInstances, 0, 0, 0}; + const VkAccelerationStructureBuildOffsetInfoKHR* pBuildOffsetInfo = &buildOffsetInfo; + + // Build the TLAS + + // Update the acceleration structure. Note the VK_TRUE parameter to trigger the update, + // and the existing TLAS being passed and updated in place + vkCmdBuildAccelerationStructureKHR(cmdBuf, 1, &topASInfo, &pBuildOffsetInfo); + + genCmdBuf.submitAndWait(cmdBuf); +~~~~ + +#### Cleanup + +Finally, we release all temporary buffers. + +~~~~ C++ + m_alloc.destroy(scratchBuffer); + m_alloc.destroy(stagingBuffer); + } +~~~~ + +## BLAS Animation + +In the previous chapter, we updated the transformation matrices. In this one we will modify vertices in a compute shader. + +### Adding a Sphere + +In this chapter, we will animate a sphere. 
In `main.cpp`, set up the scene like this: + +~~~~ C++ + helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths), + nvmath::scale_mat4(nvmath::vec3f(2.f, 1.f, 2.f))); + helloVk.loadModel(nvh::findFile("media/scenes/wuson.obj", defaultSearchPaths)); + HelloVulkan::ObjInstance inst = helloVk.m_objInstance.back(); + for(int i = 0; i < 5; i++) + helloVk.m_objInstance.push_back(inst); + helloVk.loadModel(nvh::findFile("media/scenes/sphere.obj", defaultSearchPaths)); +~~~~ + +Because we now have a new instance, we have to adjust the calculation of the number of Wuson models in `HelloVulkan::animationInstances()`. + +~~~~ C++ + const int32_t nbWuson = static_cast<int32_t>(m_objInstance.size() - 2); +~~~~ + +### Compute Shader + +The compute shader will update the vertices in-place. + +Add all of the following members to the `HelloVulkan` class: + +~~~~ C++ + void createCompDescriptors(); + void updateCompDescriptors(nvvkBuffer& vertex); + void createCompPipelines(); + + nvvk::DescriptorSetBindings m_compDescSetLayoutBind; + vk::DescriptorPool m_compDescPool; + vk::DescriptorSetLayout m_compDescSetLayout; + vk::DescriptorSet m_compDescSet; + vk::Pipeline m_compPipeline; + vk::PipelineLayout m_compPipelineLayout; +~~~~ + +The compute shader will work on a single `VertexObj` buffer. + +~~~~ C++ +void HelloVulkan::createCompDescriptors() +{ + m_compDescSetLayoutBind.addBinding(vk::DescriptorSetLayoutBinding( + 0, vk::DescriptorType::eStorageBuffer, 1, vk::ShaderStageFlagBits::eCompute)); + + m_compDescSetLayout = m_compDescSetLayoutBind.createLayout(m_device); + m_compDescPool = m_compDescSetLayoutBind.createPool(m_device, 1); + m_compDescSet = nvvk::allocateDescriptorSet(m_device, m_compDescPool, m_compDescSetLayout); +} +~~~~ + +`updateCompDescriptors` will point the descriptor at the buffer of `VertexObj` objects to which the animation will be applied. 
+ +~~~~ C++ +void HelloVulkan::updateCompDescriptors(nvvkBuffer& vertex) +{ + std::vector<vk::WriteDescriptorSet> writes; + vk::DescriptorBufferInfo dbiUnif{vertex.buffer, 0, VK_WHOLE_SIZE}; + writes.emplace_back(m_compDescSetLayoutBind.makeWrite(m_compDescSet, 0, dbiUnif)); + m_device.updateDescriptorSets(static_cast<uint32_t>(writes.size()), writes.data(), 0, nullptr); +} +~~~~ + +The compute pipeline will consist of a simple shader and a push constant, which will be used +to set the animation time. + +~~~~ C++ +void HelloVulkan::createCompPipelines() +{ + // pushing time + vk::PushConstantRange push_constants = {vk::ShaderStageFlagBits::eCompute, 0, sizeof(float)}; + vk::PipelineLayoutCreateInfo layout_info{{}, 1, &m_compDescSetLayout, 1, &push_constants}; + m_compPipelineLayout = m_device.createPipelineLayout(layout_info); + vk::ComputePipelineCreateInfo computePipelineCreateInfo{{}, {}, m_compPipelineLayout}; + + computePipelineCreateInfo.stage = + nvvk::createShaderStageInfo(m_device, + nvh::loadFile("shaders/anim.comp.spv", true, defaultSearchPaths), + VK_SHADER_STAGE_COMPUTE_BIT); + m_compPipeline = m_device.createComputePipeline({}, computePipelineCreateInfo, nullptr); + m_device.destroy(computePipelineCreateInfo.stage.module); +} +~~~~ + +Finally, destroy the resources in `HelloVulkan::destroyResources()`: + +~~~~ C++ + m_device.destroy(m_compDescPool); + m_device.destroy(m_compDescSetLayout); + m_device.destroy(m_compPipeline); + m_device.destroy(m_compPipelineLayout); +~~~~ + +### `anim.comp` + +The compute shader will be simple. We need to add a new shader file, `anim.comp`, to the `shaders` filter in the solution. + +This will move each vertex up and down over time. 
+ +~~~~ C++ +#version 460 +#extension GL_ARB_separate_shader_objects : enable +#extension GL_EXT_scalar_block_layout : enable +#extension GL_GOOGLE_include_directive : enable +#include "wavefront.glsl" + +layout(binding = 0, scalar) buffer Vertices +{ + Vertex v[]; +} +vertices; + +layout(push_constant) uniform shaderInformation +{ + float iTime; +} +pushc; + +void main() +{ + Vertex v0 = vertices.v[gl_GlobalInvocationID.x]; + + // Compute vertex position + const float PI = 3.14159265; + const float signY = (v0.pos.y >= 0 ? 1 : -1); + const float radius = length(v0.pos.xz); + const float argument = pushc.iTime * 4 + radius * PI; + const float s = sin(argument); + v0.pos.y = signY * abs(s) * 0.5; + + // Compute normal + if(radius == 0.0f) + { + v0.nrm = vec3(0.0f, signY, 0.0f); + } + else + { + const float c = cos(argument); + const float xzFactor = -PI * s * c; + const float yFactor = 2.0f * signY * radius * abs(s); + v0.nrm = normalize(vec3(v0.pos.x * xzFactor, yFactor, v0.pos.z * xzFactor)); + } + + vertices.v[gl_GlobalInvocationID.x] = v0; +} +~~~~ + +### Animating the Object + +First add the declaration of the animation function in `HelloVulkan`: + +~~~~ C++ +void animationObject(float time); +~~~~ + +The implementation only pushes the current time and calls the compute shader (`dispatch`). 
+ +~~~~ C++ +void HelloVulkan::animationObject(float time) +{ + ObjModel& model = m_objModel[2]; + + updateCompDescriptors(model.vertexBuffer); + + nvvk::CommandPool genCmdBuf(m_device, m_graphicsQueueIndex); + vk::CommandBuffer cmdBuf = genCmdBuf.createCommandBuffer(); + + cmdBuf.bindPipeline(vk::PipelineBindPoint::eCompute, m_compPipeline); + cmdBuf.bindDescriptorSets(vk::PipelineBindPoint::eCompute, m_compPipelineLayout, 0, + {m_compDescSet}, {}); + cmdBuf.pushConstants(m_compPipelineLayout, vk::ShaderStageFlagBits::eCompute, 0, sizeof(float), + &time); + cmdBuf.dispatch(model.nbVertices, 1, 1); + genCmdBuf.submitAndWait(cmdBuf); +} +~~~~ + +### Invoking Animation + +In `main.cpp`, after the other resource creation functions, add the creation functions for the compute shader. + +~~~~ C++ + helloVk.createCompDescriptors(); + helloVk.createCompPipelines(); +~~~~ + +In the rendering loop, after the call to `animationInstances`, call the object animation function. + +~~~~ C++ + helloVk.animationObject(diff.count()); +~~~~ + +**Note**: At this point, the object should be animated when using the rasterizer, but should still be immobile when using the ray tracer. + + + +## Update BLAS + +In `nvvk::RaytracingBuilder` in `raytrace_vkpp.hpp`, we can add a function to update a BLAS whose vertex buffer was previously updated. This function is very similar to the one used for instances, but in this case, there is no buffer transfer to do. 
+ +~~~~ C++ + //-------------------------------------------------------------------------------------------------- + // Refit the BLAS from updated buffers + // + void updateBlas(uint32_t blasIdx) + { + Blas& blas = m_blas[blasIdx]; + + // Compute the amount of scratch memory required by the AS builder to update the BLAS + VkAccelerationStructureMemoryRequirementsInfoKHR memoryRequirementsInfo{ + VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_INFO_KHR}; + memoryRequirementsInfo.type = VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_UPDATE_SCRATCH_KHR; + memoryRequirementsInfo.accelerationStructure = blas.as.accel; + memoryRequirementsInfo.buildType = VK_ACCELERATION_STRUCTURE_BUILD_TYPE_DEVICE_KHR; + + VkMemoryRequirements2 reqMem{VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2}; + vkGetAccelerationStructureMemoryRequirementsKHR(m_device, &memoryRequirementsInfo, &reqMem); + VkDeviceSize scratchSize = reqMem.memoryRequirements.size; + + // Allocate the scratch buffer + nvvkBuffer scratchBuffer = + m_alloc.createBuffer(scratchSize, VK_BUFFER_USAGE_RAY_TRACING_BIT_KHR | VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT); + VkBufferDeviceAddressInfo bufferInfo{VK_STRUCTURE_TYPE_BUFFER_DEVICE_ADDRESS_INFO}; + bufferInfo.buffer = scratchBuffer.buffer; + VkDeviceAddress scratchAddress = vkGetBufferDeviceAddress(m_device, &bufferInfo); + + + const VkAccelerationStructureGeometryKHR* pGeometry = blas.asGeometry.data(); + VkAccelerationStructureBuildGeometryInfoKHR asInfo{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_BUILD_GEOMETRY_INFO_KHR}; + asInfo.type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR; + asInfo.flags = blas.flags; + asInfo.update = VK_TRUE; + asInfo.srcAccelerationStructure = blas.as.accel; + asInfo.dstAccelerationStructure = blas.as.accel; + asInfo.geometryArrayOfPointers = VK_FALSE; + asInfo.geometryCount = (uint32_t)blas.asGeometry.size(); + asInfo.ppGeometries = &pGeometry; + asInfo.scratchData.deviceAddress = scratchAddress; + + std::vector 
pBuildOffset(blas.asBuildOffsetInfo.size());
+    for(size_t i = 0; i < blas.asBuildOffsetInfo.size(); i++)
+      pBuildOffset[i] = &blas.asBuildOffsetInfo[i];
+
+    // Record the BLAS update in a one-time command buffer
+    nvvk::CommandPool genCmdBuf(m_device, m_queueIndex);
+    VkCommandBuffer cmdBuf = genCmdBuf.createCommandBuffer();
+
+
+    // Update the acceleration structure. Note that asInfo.update is VK_TRUE to trigger the update,
+    // and that the existing BLAS is passed as both source and destination, so it is updated in place
+    vkCmdBuildAccelerationStructureKHR(cmdBuf, 1, &asInfo, pBuildOffset.data());
+
+    genCmdBuf.submitAndWait(cmdBuf);
+    m_alloc.destroy(scratchBuffer);
+  }
+~~~~
+
+The previous function (`updateBlas`) uses geometry information stored in `m_blas`.
+To be able to re-use this information, we need to keep the `nvvk::RaytracingBuilderKHR::Blas` objects
+used for its creation.
+
+Move the `nvvk::RaytracingBuilderKHR::Blas` vector from `HelloVulkan::createBottomLevelAS()` to the `HelloVulkan` class, renaming it to `m_blas`.
+
+~~~~ C++
+  std::vector<nvvk::RaytracingBuilderKHR::Blas> m_blas;
+~~~~
+
+As with the TLAS, the BLAS needs to allow updates. We will also enable the
+`VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_BUILD_BIT_KHR` flag, which indicates that the given
+acceleration structure build should prioritize build time over trace performance.
+
+~~~~ C++
+void HelloVulkan::createBottomLevelAS()
+{
+  // BLAS - Storing each primitive in a geometry
+  m_blas.reserve(m_objModel.size());
+  for(const auto & obj : m_objModel)
+  {
+    auto blas = objectToVkGeometryKHR(obj);
+
+    // We could add more geometry in each BLAS, but we add only one for now
+    m_blas.push_back(blas);
+  }
+  m_rtBuilder.buildBlas(m_blas, vk::BuildAccelerationStructureFlagBitsKHR::eAllowUpdate
+                                    | vk::BuildAccelerationStructureFlagBitsKHR::ePreferFastBuild);
+}
+~~~~
+
+Finally, we can add a line at the end of `HelloVulkan::animationObject()` to update the BLAS.
+
+~~~~ C++
+m_rtBuilder.updateBlas(2);
+~~~~
+
+![](images/animation2.gif)
diff --git a/ray_tracing_animation/images/animation1.gif b/ray_tracing_animation/images/animation1.gif
new file mode 100644
index 0000000..e2fd7ee
Binary files /dev/null and b/ray_tracing_animation/images/animation1.gif differ
diff --git a/ray_tracing_animation/images/animation2.gif b/ray_tracing_animation/images/animation2.gif
new file mode 100644
index 0000000..85dee40
Binary files /dev/null and b/ray_tracing_animation/images/animation2.gif differ
diff --git a/ray_tracing_callable/README.md b/ray_tracing_callable/README.md
index 8802c67..84879f3 100644
--- a/ray_tracing_callable/README.md
+++ b/ray_tracing_callable/README.md
@@ -1,5 +1,172 @@
-# NVIDIA Vulkan Ray Tracing Tutorial
+# Callable Shaders - Tutorial

-[Start the tutorial of this project](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/vkrt_tuto_callable.md.html)
+![](images/callable.png)
+Author: [Martin-Karl Lefrançois](https://devblogs.nvidia.com/author/mlefrancois/)

-![](../docs/Images/callable.png)
+## Tutorial ([Setup](../docs/setup.md))
+
+This is an extension of the Vulkan ray tracing [tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR).
+
+
+Ray tracing allows the use of [callable shaders](https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/chap8.html#shaders-callable)
+from the ray-generation, closest-hit, miss, or another callable shader stage.
+Invoking one is similar to an indirect function call, without having to link those shaders with the executable program.
+
+(insert setup.md.html here)
+
+
+## Data Storage
+
+A callable shader can only access the data passed in by its parent stage. Only one structure is passed at a time, and it should be declared in the same way as a ray payload.
+
+In the parent stage, it can be declared with the `callableDataEXT` storage qualifier:
+
+~~~~ C++
+layout(location = 0) callableDataEXT rayLight cLight;
+~~~~
+
+where the `rayLight` struct is defined in a shared file.
+
+~~~~ C++
+struct rayLight
+{
+  vec3 inHitPosition;
+  float outLightDistance;
+  vec3 outLightDir;
+  float outIntensity;
+};
+~~~~
+
+In the callable shader that receives the data, you must use the `callableDataInEXT` storage qualifier.
+
+~~~~ C++
+layout(location = 0) callableDataInEXT rayLight cLight;
+~~~~
+
+## Execution
+
+To execute one of the callable shaders, the parent stage needs to call `executeCallableEXT`.
+
+The first parameter is the SBT record index; the second one corresponds to the `location` index.
+
+Here is an example of how it is called:
+
+~~~~ C++
+executeCallableEXT(pushC.lightType, 0);
+~~~~
+
+
+## Adding Callable Shaders to the SBT
+
+### Create Shader Modules
+
+In `HelloVulkan::createRtPipeline()`, immediately after adding the closest-hit shader, we will add
+three callable shaders, one for each type of light.
+
+~~~~ C++
+  // Callable shaders
+  vk::RayTracingShaderGroupCreateInfoKHR callGroup{vk::RayTracingShaderGroupTypeKHR::eGeneral,
+                                                   VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR,
+                                                   VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR};
+
+  vk::ShaderModule call0 =
+      nvvk::createShaderModule(m_device,
+                               nvh::loadFile("shaders/light_point.rcall.spv", true, paths));
+  vk::ShaderModule call1 =
+      nvvk::createShaderModule(m_device,
+                               nvh::loadFile("shaders/light_spot.rcall.spv", true, paths));
+  vk::ShaderModule call2 =
+      nvvk::createShaderModule(m_device, nvh::loadFile("shaders/light_inf.rcall.spv", true, paths));
+
+  stages.push_back({{}, vk::ShaderStageFlagBits::eCallableKHR, call0, "main"});
+  callGroup.setGeneralShader(static_cast<uint32_t>(stages.size() - 1));
+  m_rtShaderGroups.push_back(callGroup);
+  stages.push_back({{}, vk::ShaderStageFlagBits::eCallableKHR, call1, "main"});
+  callGroup.setGeneralShader(static_cast<uint32_t>(stages.size() - 1));
+  m_rtShaderGroups.push_back(callGroup);
+  stages.push_back({{}, vk::ShaderStageFlagBits::eCallableKHR, call2, "main"});
+  callGroup.setGeneralShader(static_cast<uint32_t>(stages.size() - 1));
+  m_rtShaderGroups.push_back(callGroup);
+~~~~
+
+And at the end
of the function, delete the shaders.
+
+~~~~ C++
+m_device.destroy(call0);
+m_device.destroy(call1);
+m_device.destroy(call2);
+~~~~
+
+#### Shaders
+
+Here is the source of all the shaders:
+
+* [light_point.rcall](https://github.com/nvpro-samples/vk_raytracing_tutorial_KHR/blob/master/ray_tracing_callable/shaders/light_point.rcall)
+* [light_spot.rcall](https://github.com/nvpro-samples/vk_raytracing_tutorial_KHR/blob/master/ray_tracing_callable/shaders/light_spot.rcall)
+* [light_inf.rcall](https://github.com/nvpro-samples/vk_raytracing_tutorial_KHR/blob/master/ray_tracing_callable/shaders/light_inf.rcall)
+
+
+### Passing Callable to traceRaysKHR
+
+In `HelloVulkan::raytrace()`, we have to tell where the callable shaders start. Since they were added after the hit shader, the SBT has the following layout.
+
+![SBT](images/sbt.png)
+
+
+Therefore, the callable shaders start at `4 * progSize`.
+
+~~~~ C++
+vk::DeviceSize callableGroupOffset = 4u * progSize;  // Jump over the previous shaders
+vk::DeviceSize callableGroupStride = progSize;
+~~~~
+
+Then we can call `traceRaysKHR`:
+
+~~~~ C++
+const vk::StridedBufferRegionKHR callableShaderBindingTable = {
+    m_rtSBTBuffer.buffer, callableGroupOffset, progSize, sbtSize};
+
+cmdBuf.traceRaysKHR(&raygenShaderBindingTable, &missShaderBindingTable, &hitShaderBindingTable,
+                    &callableShaderBindingTable,      //
+                    m_size.width, m_size.height, 1);  //
+~~~~
+
+## Calling the Callable Shaders
+
+In the closest-hit shader, instead of an if-else chain, we can now directly call the right shader based on the type of light.
+ +~~~~ C++ +cLight.inHitPosition = worldPos; +//#define DONT_USE_CALLABLE +#if defined(DONT_USE_CALLABLE) + // Point light + if(pushC.lightType == 0) + { + vec3 lDir = pushC.lightPosition - cLight.inHitPosition; + float lightDistance = length(lDir); + cLight.outIntensity = pushC.lightIntensity / (lightDistance * lightDistance); + cLight.outLightDir = normalize(lDir); + cLight.outLightDistance = lightDistance; + } + else if(pushC.lightType == 1) + { + vec3 lDir = pushC.lightPosition - cLight.inHitPosition; + cLight.outLightDistance = length(lDir); + cLight.outIntensity = + pushC.lightIntensity / (cLight.outLightDistance * cLight.outLightDistance); + cLight.outLightDir = normalize(lDir); + float theta = dot(cLight.outLightDir, normalize(-pushC.lightDirection)); + float epsilon = pushC.lightSpotCutoff - pushC.lightSpotOuterCutoff; + float spotIntensity = clamp((theta - pushC.lightSpotOuterCutoff) / epsilon, 0.0, 1.0); + cLight.outIntensity *= spotIntensity; + } + else // Directional light + { + cLight.outLightDir = normalize(-pushC.lightDirection); + cLight.outIntensity = 1.0; + cLight.outLightDistance = 10000000; + } +#else + executeCallableEXT(pushC.lightType, 0); +#endif +~~~~ diff --git a/ray_tracing_callable/images/callable.png b/ray_tracing_callable/images/callable.png new file mode 100644 index 0000000..64248cf Binary files /dev/null and b/ray_tracing_callable/images/callable.png differ diff --git a/ray_tracing_callable/images/sbt.png b/ray_tracing_callable/images/sbt.png new file mode 100644 index 0000000..139f752 Binary files /dev/null and b/ray_tracing_callable/images/sbt.png differ diff --git a/ray_tracing_instances/README.md b/ray_tracing_instances/README.md index 0f29ae7..f0f1269 100644 --- a/ray_tracing_instances/README.md +++ b/ray_tracing_instances/README.md @@ -1,5 +1,253 @@ # NVIDIA Vulkan Ray Tracing Tutorial -[Start the tutorial of this project](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/vkrt_tuto_instances.md.htm) 
+![img](images/instances.png)
+
+
+## Tutorial ([Setup](../docs/setup.md))
+
+This is an extension of the Vulkan ray tracing [tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR).
+
+
+Ray tracing can easily handle having many object instances at once. For instance, a top level acceleration structure can
+have many different instances of a bottom level acceleration structure. However, when we have many different objects, we
+can run into problems with memory allocation. Many Vulkan implementations support no more than 4096 allocations, while
+our current application creates 4 allocations per object (Vertex, Index, Material, and Material Index), then one for the BLAS. That
+means we are hitting the limit with just above 1000 objects.
+
+(insert setup.md.html here)
+
+## Many Instances
+
+First, let's see what the scene looks like when we have just a few objects with many instances.
+
+In `main.cpp`, add the following include:
+
+~~~~ C++
+#include <random>
+~~~~
+
+Then replace the calls to `helloVk.loadModel` in `main()` with:
+
+~~~~ C++
+  // Creation of the example
+  helloVk.loadModel(nvh::findFile("media/scenes/cube.obj", defaultSearchPaths));
+  helloVk.loadModel(nvh::findFile("media/scenes/cube_multi.obj", defaultSearchPaths));
+  helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths));
+
+  std::random_device rd;  // Will be used to obtain a seed for the random number engine
+  std::mt19937 gen(rd());  // Standard mersenne_twister_engine seeded with rd()
+  std::normal_distribution<float> dis(1.0f, 1.0f);
+  std::normal_distribution<float> disn(0.05f, 0.05f);
+
+  for(int n = 0; n < 2000; ++n)
+  {
+    HelloVulkan::ObjInstance inst;
+    inst.objIndex = n % 2;
+    inst.txtOffset = 0;
+    float scale = fabsf(disn(gen));
+    nvmath::mat4f mat =
+        nvmath::translation_mat4(nvmath::vec3f{dis(gen), 2.0f + dis(gen), dis(gen)});
+    mat = mat * nvmath::rotation_mat4_x(dis(gen));
+    mat = mat * nvmath::scale_mat4(nvmath::vec3f(scale));
+    inst.transform = mat;
+    inst.transformIT =
nvmath::transpose(nvmath::invert((inst.transform))); + helloVk.m_objInstance.push_back(inst); + } +~~~~ + +**Note:** + This will create 3 models (OBJ) and their instances, and then add 2000 instances + distributed between green cubes and cubes with one color per face. + +## Many Objects + +Instead of creating many instances, create many objects. + +Remove the previous code and replace it with the following + +~~~~ C++ + // Creation of the example + std::random_device rd; //Will be used to obtain a seed for the random number engine + std::mt19937 gen(rd()); //Standard mersenne_twister_engine seeded with rd() + std::normal_distribution dis(1.0f, 1.0f); + std::normal_distribution disn(0.05f, 0.05f); + for(int n = 0; n < 2000; ++n) + { + helloVk.loadModel(nvh::findFile("media/scenes/cube_multi.obj", defaultSearchPaths)); + HelloVulkan::ObjInstance& inst = helloVk.m_objInstance.back(); + + float scale = fabsf(disn(gen)); + nvmath::mat4f mat = + nvmath::translation_mat4(nvmath::vec3f{dis(gen), 2.0f + dis(gen), dis(gen)}); + mat = mat * nvmath::rotation_mat4_x(dis(gen)); + mat = mat * nvmath::scale_mat4(nvmath::vec3f(scale)); + inst.transform = mat; + inst.transformIT = nvmath::transpose(nvmath::invert((inst.transform))); + } + + helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths)); +~~~~ + +The example might still work, but the console will print the following error after loading 1363 objects. All other objects allocated after the 1363rd will fail to be displayed. + +Error | Error: VUID_Undefined
Number of currently valid memory objects is not less than the maximum allowed (4096). +-|- +Note | This is the best case; the application can run out of memory and crash if substantially more objects are created (e.g. 20,000) + +## Device Memory Allocator (DMA) + +It is possible to use a memory allocator to fix this issue. + +### `hello_vulkan.h` + +In `hello_vulkan.h`, add the following defines at the top of the file to indicate which allocator to use + +~~~~ C++ +// #VKRay +//#define ALLOC_DEDICATED +#define ALLOC_DMA +~~~~ + + +Replace the definition of buffers and textures and include the right allocator. + +~~~~ C++ +#if defined(ALLOC_DEDICATED) +#include "nvvk/allocator_dedicated_vk.hpp" +using nvvkBuffer = nvvk::BufferDedicated; +using nvvkTexture = nvvk::TextureDedicated; +#elif defined(ALLOC_DMA) +#include "nvvk/allocator_dma_vk.hpp" +using nvvkBuffer = nvvk::BufferDma; +using nvvkTexture = nvvk::TextureDma; +#endif +~~~~ + +And do the same for the allocator + +~~~~ C++ +#if defined(ALLOC_DEDICATED) + nvvk::AllocatorDedicated m_alloc; // Allocator for buffer, images, acceleration structures +#elif defined(ALLOC_DMA) + nvvk::AllocatorDma m_alloc; // Allocator for buffer, images, acceleration structures + nvvk::DeviceMemoryAllocator m_memAllocator; + nvvk::StagingMemoryManagerDma m_staging; +#endif +~~~~ + +### `hello_vulkan.cpp` + +In the source file there are also a few changes to make. + +DMA needs to be initialized, which will be done in the `setup()` function: + +~~~~ C++ +#if defined(ALLOC_DEDICATED) + m_alloc.init(device, physicalDevice); +#elif defined(ALLOC_DMA) + m_memAllocator.init(device, physicalDevice); + m_memAllocator.setAllocateFlags(VK_MEMORY_ALLOCATE_DEVICE_ADDRESS_BIT_KHR, true); + m_staging.init(m_memAllocator); + m_alloc.init(device, m_memAllocator, m_staging); +#endif +~~~~ + +When using DMA, memory buffer mapping is done through the DMA interface (instead of the VKDevice). 
+Therefore, change the lines at the end of `updateUniformBuffer()` to use the common allocator interface.
+
+~~~~ C++
+void* data = m_alloc.map(m_cameraMat);
+memcpy(data, &ubo, sizeof(ubo));
+m_alloc.unmap(m_cameraMat);
+~~~~
+
+The `RaytracingBuilder` was made to allow various allocators, so nothing needs to change in the call to `m_rtBuilder.setup()`.
+
+
+### Destruction
+
+The DMA allocator needs to be released in `HelloVulkan::destroyResources()` after the last `m_alloc.destroy`.
+
+~~~~ C++
+#if defined(ALLOC_DMA)
+  m_memAllocator.deinit();
+#endif
+~~~~
+
+## Result
+
+Instead of thousands of allocations, our example will have only 14 allocations. Note that some of these allocations are allocated by Dear ImGui, and not by DMA. These are the 14 objects with blue borders below:
+
+![Memory](images/VkInstanceNsight1.png)
+
+Finally, here is the Vulkan Device Memory view from Nsight Graphics:
+![VkMemory](images/VkInstanceNsight2.png)
+
+
+
+## VMA: Vulkan Memory Allocator
+
+We can also modify the code to use the [Vulkan Memory Allocator](https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator) from AMD.
+
+Download [vk_mem_alloc.h](https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator/blob/master/src/vk_mem_alloc.h) from GitHub and add this to the `shared_sources` folder.
+
+There is already a variation of the allocator for VMA, which is located under [nvpro-samples](https://github.com/nvpro-samples/shared_sources/tree/master/nvvk). This allocator has the same simple interface as the `AllocatorDedicated` class in `allocator_dedicated_vk.hpp`, but will use VMA for memory management.
+
+VMA might use dedicated memory, which we do, so you need to add the following extension to the
+creation of the context in `main.cpp`.
+
+~~~~ C++
+  contextInfo.addDeviceExtension(VK_KHR_BIND_MEMORY_2_EXTENSION_NAME);
+~~~~
+
+### `hello_vulkan.h`
+
+Follow the changes made before and add the following:
+
+~~~~ C++
+#define ALLOC_VMA
+~~~~
+
+~~~~ C++
+#elif defined(ALLOC_VMA)
+#include "nvvk/allocator_vma_vk.hpp"
+using nvvkBuffer = nvvk::BufferVma;
+using nvvkTexture = nvvk::TextureVma;
+~~~~
+
+~~~~ C++
+#elif defined(ALLOC_VMA)
+  nvvk::AllocatorVma m_alloc;  // Allocator for buffer, images, acceleration structures
+  nvvk::StagingMemoryManagerVma m_staging;
+  VmaAllocator m_memAllocator;
+~~~~
+
+
+### `hello_vulkan.cpp`
+First, the following should only be defined once in the entire program, and it should be defined before `#include "hello_vulkan.h"`:
+
+~~~~ C++
+#define VMA_IMPLEMENTATION
+~~~~
+
+In `setup()`:
+
+~~~~ C++
+#elif defined(ALLOC_VMA)
+  VmaAllocatorCreateInfo allocatorInfo = {};
+  allocatorInfo.instance = instance;
+  allocatorInfo.physicalDevice = physicalDevice;
+  allocatorInfo.device = device;
+  allocatorInfo.flags = VMA_ALLOCATOR_CREATE_BUFFER_DEVICE_ADDRESS_BIT;
+  vmaCreateAllocator(&allocatorInfo, &m_memAllocator);
+  m_staging.init(device, physicalDevice, m_memAllocator);
+  m_alloc.init(device, m_memAllocator, m_staging);
+~~~~
+
+In `destroyResources()`:
+
+~~~~ C++
+#elif defined(ALLOC_VMA)
+  vmaDestroyAllocator(m_memAllocator);
+~~~~

-![](../docs/Images/VkInstances.png)
\ No newline at end of file
diff --git a/ray_tracing_instances/images/VkInstanceNsight1.png b/ray_tracing_instances/images/VkInstanceNsight1.png
new file mode 100644
index 0000000..3ded966
Binary files /dev/null and b/ray_tracing_instances/images/VkInstanceNsight1.png differ
diff --git a/ray_tracing_instances/images/VkInstanceNsight2.png b/ray_tracing_instances/images/VkInstanceNsight2.png
new file mode 100644
index 0000000..ddf4166
Binary files /dev/null and b/ray_tracing_instances/images/VkInstanceNsight2.png differ
diff --git a/ray_tracing_instances/images/instances.png
b/ray_tracing_instances/images/instances.png
new file mode 100644
index 0000000..6ab028c
Binary files /dev/null and b/ray_tracing_instances/images/instances.png differ
diff --git a/ray_tracing_intersection/README.md b/ray_tracing_intersection/README.md
index 6df4c56..4a40056 100644
--- a/ray_tracing_intersection/README.md
+++ b/ray_tracing_intersection/README.md
@@ -1,5 +1,544 @@
-# NVIDIA Vulkan Ray Tracing Tutorial
+# Intersection Shader - Tutorial

-[Start the tutorial of this project](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/vkrt_tuto_intersection.md.htm)
+![](images/intersection.png)
+Author: [Martin-Karl Lefrançois](https://devblogs.nvidia.com/author/mlefrancois/)

-![](../docs/Images/ray_tracing_intersection.png)
\ No newline at end of file
+
+## Tutorial ([Setup](../docs/setup.md))
+
+This is an extension of the Vulkan ray tracing [tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR).
+
+This tutorial chapter shows how to use an intersection shader to render different primitives with different materials.
+
+
+## High Level Implementation
+
+At a high level, we will:
+
+* Add 2,000,000 axis-aligned bounding boxes to a BLAS
+* Add two materials
+* Alternate the intersected objects between spheres and cubes, each using one of the two materials
+
+To do this, we will need to:
+
+* Add an intersection shader (.rint)
+* Add a new closest hit shader (.chit)
+* Create `VkAccelerationStructureGeometryKHR` from `VkAccelerationStructureGeometryAabbsDataKHR`
+
+## Creating all spheres
+
+In the `HelloVulkan` class, we will add the structures we need. First, the structure that defines a sphere:
+
+~~~~ C++
+  struct Sphere
+  {
+    nvmath::vec3f center;
+    float radius;
+  };
+~~~~
+
+Then we need the `Aabb` structure, which bounds each sphere and is also used for the creation of the BLAS (`VK_GEOMETRY_TYPE_AABBS_KHR`).
+
+~~~~ C++
+  struct Aabb
+  {
+    nvmath::vec3f minimum;
+    nvmath::vec3f maximum;
+  };
+~~~~
+
+All the information needs to be held in buffers, which will be available to the shaders.
+
+~~~~ C++
+  std::vector<Sphere> m_spheres;  // All spheres
+  nvvkBuffer m_spheresBuffer;  // Buffer holding the spheres
+  nvvkBuffer m_spheresAabbBuffer;  // Buffer of all Aabb
+  nvvkBuffer m_spheresMatColorBuffer;  // Multiple materials
+  nvvkBuffer m_spheresMatIndexBuffer;  // Define which sphere uses which material
+~~~~
+
+Finally, there are two functions, one to create the spheres, and one that will create the intermediate structure for the BLAS.
+
+~~~~ C++
+  void createSpheres();
+  nvvk::RaytracingBuilderKHR::Blas sphereToVkGeometryKHR();
+~~~~
+
+The following implementation creates 2,000,000 spheres with random positions and radii. It creates the Aabb of each sphere from its definition, and two materials that are assigned alternately to the objects. All the created information is moved to Vulkan buffers so that it can be accessed by the intersection and closest-hit shaders.
+ +~~~~ C++ + +//-------------------------------------------------------------------------------------------------- +// Creating all spheres +// +void HelloVulkan::createSpheres() +{ + std::random_device rd{}; + std::mt19937 gen{rd()}; + std::normal_distribution xzd{0.f, 5.f}; + std::normal_distribution yd{3.f, 1.f}; + std::uniform_real_distribution radd{.05f, .2f}; + + // All spheres + Sphere s; + for(uint32_t i = 0; i < 2000000; i++) + { + s.center = nvmath::vec3f(xzd(gen), yd(gen), xzd(gen)); + s.radius = radd(gen); + m_spheres.emplace_back(s); + } + + // Axis aligned bounding box of each sphere + std::vector aabbs; + for(const auto& s : m_spheres) + { + Aabb aabb; + aabb.minimum = s.center - nvmath::vec3f(s.radius); + aabb.maximum = s.center + nvmath::vec3f(s.radius); + aabbs.emplace_back(aabb); + } + + // Creating two materials + MatrialObj mat; + mat.diffuse = vec3f(0, 1, 1); + std::vector materials; + std::vector matIdx; + materials.emplace_back(mat); + mat.diffuse = vec3f(1, 1, 0); + materials.emplace_back(mat); + + // Assign a material to each sphere + for(size_t i = 0; i < m_spheres.size(); i++) + { + matIdx.push_back(i % 2); + } + + // Creating all buffers + using vkBU = vk::BufferUsageFlagBits; + nvvk::CommandPool genCmdBuf(m_device, m_graphicsQueueIndex); + auto cmdBuf = genCmdBuf.createCommandBuffer(); + m_spheresBuffer = m_alloc.createBuffer(cmdBuf, m_spheres, vkBU::eStorageBuffer); + m_spheresAabbBuffer = m_alloc.createBuffer(cmdBuf, aabbs, vkBU::eShaderDeviceAddress); + m_spheresMatIndexBuffer = m_alloc.createBuffer(cmdBuf, matIdx, vkBU::eStorageBuffer); + m_spheresMatColorBuffer = m_alloc.createBuffer(cmdBuf, materials, vkBU::eStorageBuffer); + genCmdBuf.submitAndWait(cmdBuf); + + // Debug information + m_debug.setObjectName(m_spheresBuffer.buffer, "spheres"); + m_debug.setObjectName(m_spheresAabbBuffer.buffer, "spheresAabb"); + m_debug.setObjectName(m_spheresMatColorBuffer.buffer, "spheresMat"); + 
m_debug.setObjectName(m_spheresMatIndexBuffer.buffer, "spheresMatIdx"); +} +~~~~ + +Do not forget to destroy the buffers in `destroyResources()` + +~~~~ C++ + m_alloc.destroy(m_spheresBuffer); + m_alloc.destroy(m_spheresAabbBuffer); + m_alloc.destroy(m_spheresMatColorBuffer); + m_alloc.destroy(m_spheresMatIndexBuffer); +~~~~ + +We need a new bottom level acceleration structure (BLAS) to hold the implicit primitives. For efficiency and since all those primitives are static, they will all be added in a single BLAS. + +What is changing compare to triangle primitive is the Aabb data (see Aabb structure) and the geometry type (`VK_GEOMETRY_TYPE_AABBS_KHR`). + +~~~~ C++ +//-------------------------------------------------------------------------------------------------- +// Returning the ray tracing geometry used for the BLAS, containing all spheres +// +nvvk::RaytracingBuilderKHR::Blas HelloVulkan::sphereToVkGeometryKHR() +{ + vk::AccelerationStructureCreateGeometryTypeInfoKHR asCreate; + asCreate.setGeometryType(vk::GeometryTypeKHR::eAabbs); + asCreate.setMaxPrimitiveCount((uint32_t)m_spheres.size()); // Nb triangles + asCreate.setIndexType(vk::IndexType::eNoneKHR); + asCreate.setVertexFormat(vk::Format::eUndefined); + asCreate.setMaxVertexCount(0); + asCreate.setAllowsTransforms(VK_FALSE); // No adding transformation matrices + + + vk::DeviceAddress dataAddress = m_device.getBufferAddress({m_spheresAabbBuffer.buffer}); + vk::AccelerationStructureGeometryAabbsDataKHR aabbs; + aabbs.setData(dataAddress); + aabbs.setStride(sizeof(Aabb)); + + // Setting up the build info of the acceleration + vk::AccelerationStructureGeometryKHR asGeom; + asGeom.setGeometryType(asCreate.geometryType); + asGeom.setFlags(vk::GeometryFlagBitsKHR::eOpaque); + asGeom.geometry.setAabbs(aabbs); + + vk::AccelerationStructureBuildOffsetInfoKHR offset; + offset.setFirstVertex(0); + offset.setPrimitiveCount(asCreate.maxPrimitiveCount); + offset.setPrimitiveOffset(0); + offset.setTransformOffset(0); 
+
+  nvvk::RaytracingBuilderKHR::Blas blas;
+  blas.asGeometry.emplace_back(asGeom);
+  blas.asCreateGeometryInfo.emplace_back(asCreate);
+  blas.asBuildOffsetInfo.emplace_back(offset);
+  return blas;
+}
+~~~~
+
+## Setting Up the Scene
+
+In `main.cpp`, replace the loading of the OBJ model with:
+
+~~~~ C++
+  // Creation of the example
+  helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths));
+  helloVk.createSpheres();
+~~~~
+
+**Note**: it is possible to have more OBJ models, but the spheres will need to be added after all of them.
+
+The scene will be large, so it is better to move the camera out:
+
+~~~~ C++
+  CameraManip.setLookat(nvmath::vec3f(20, 20, 20), nvmath::vec3f(0, 1, 0), nvmath::vec3f(0, 1, 0));
+~~~~
+
+## Acceleration Structures
+
+### BLAS
+
+The function `createBottomLevelAS()` creates a BLAS per OBJ. The following modification adds a new BLAS containing the Aabbs of all spheres.
+
+~~~~ C++
+void HelloVulkan::createBottomLevelAS()
+{
+  // BLAS - Storing each primitive in a geometry
+  std::vector<nvvk::RaytracingBuilderKHR::Blas> allBlas;
+  allBlas.reserve(m_objModel.size());
+  for(const auto& obj : m_objModel)
+  {
+    auto blas = objectToVkGeometryKHR(obj);
+
+    // We could add more geometry in each BLAS, but we add only one for now
+    allBlas.emplace_back(blas);
+  }
+
+  // Spheres
+  {
+    auto blas = sphereToVkGeometryKHR();
+    allBlas.emplace_back(blas);
+  }
+
+  m_rtBuilder.buildBlas(allBlas, vk::BuildAccelerationStructureFlagBitsKHR::ePreferFastTrace);
+}
+~~~~
+
+### TLAS
+
+Similarly in `createTopLevelAS()`, the top level acceleration structure will need to add a reference to the BLAS of the spheres. We are setting the instanceID and blasID to the last element, which is why the sphere BLAS must be added after everything else.
+
+The `hitGroupId` will be set to 1 instead of 0. We need to add a new hit group for the implicit primitives because attributes such as the normal must be computed; they are not provided as they are for triangle primitives.
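To make the role of `hitGroupId` more concrete, the sketch below computes which hit-group record of the SBT is selected for an intersection, following the indexing rule of the Vulkan ray tracing specification: `instanceShaderBindingTableRecordOffset + geometryIndex * sbtRecordStride + sbtRecordOffset`. It assumes that the builder copies `hitGroupId` into the instance's SBT record offset and that rays are traced with `sbtRecordOffset = 0` and `sbtRecordStride = 0`; the function name `hitRecordIndex` is hypothetical and not part of the tutorial code.

```cpp
#include <cstdint>

// Hypothetical helper: index of the hit-group record selected in the SBT.
// instanceSbtOffset : per-instance offset (assumed here to come from hitGroupId)
// geometryIndex     : index of the intersected geometry inside its BLAS
// sbtRecordStride   : stride argument passed to traceRayEXT
// sbtRecordOffset   : offset argument passed to traceRayEXT
uint32_t hitRecordIndex(uint32_t instanceSbtOffset,
                        uint32_t geometryIndex,
                        uint32_t sbtRecordStride,
                        uint32_t sbtRecordOffset)
{
  return instanceSbtOffset + geometryIndex * sbtRecordStride + sbtRecordOffset;
}
```

Under these assumptions, the OBJ instances (offset 0) resolve to hit group 0, while the instance holding the spheres (offset 1) resolves to hit group 1.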
+
+Just before building the TLAS, we need to add the following:
+
+~~~~ C++
+  // Add the blas containing all spheres
+  {
+    nvvk::RaytracingBuilder::Instance rayInst;
+    rayInst.transform = m_objInstance[0].transform;  // Position of the instance
+    rayInst.instanceId = static_cast<uint32_t>(tlas.size());  // gl_InstanceID
+    rayInst.blasId = static_cast<uint32_t>(m_objModel.size());
+    rayInst.hitGroupId = 1;  // Hit group 1: the new hit group for the implicit primitives
+    rayInst.flags = vk::GeometryInstanceFlagBitsKHR::eTriangleCullDisable;
+    tlas.emplace_back(rayInst);
+  }
+~~~~
+
+## Descriptors
+
+To access the newly created buffers holding all the spheres and materials, some changes are required to the descriptors.
+
+In `createDescriptorSetLayout()`, the material and material index bindings need to account for the additional sphere buffers (`nbObj + 1`).
+
+~~~~ C++
+  // Materials (binding = 1)
+  m_descSetLayoutBind.emplace_back(vkDS(1, vkDT::eStorageBuffer, nbObj + 1,
+                                        vkSS::eVertex | vkSS::eFragment | vkSS::eClosestHitKHR));
+  // Materials Index (binding = 4)
+  m_descSetLayoutBind.emplace_back(
+      vkDS(4, vkDT::eStorageBuffer, nbObj + 1, vkSS::eFragment | vkSS::eClosestHitKHR));
+~~~~
+
+And the new buffer holding the spheres:
+
+~~~~ C++
+  // Storing spheres (binding = 7)
+  m_descSetLayoutBind.emplace_back(  //
+      vkDS(7, vkDT::eStorageBuffer, 1, vkSS::eClosestHitKHR | vkSS::eIntersectionKHR));
+~~~~
+
+The function `updateDescriptorSet()`, which writes the buffer values, also needs to be modified.
+
+At the end of the loop over all models, let's add the new material and material index.
+ +~~~~ C++ + for(auto& model : m_objModel) + { + dbiMat.emplace_back(model.matColorBuffer.buffer, 0, VK_WHOLE_SIZE); + dbiMatIdx.emplace_back(model.matIndexBuffer.buffer, 0, VK_WHOLE_SIZE); + dbiVert.emplace_back(model.vertexBuffer.buffer, 0, VK_WHOLE_SIZE); + dbiIdx.emplace_back(model.indexBuffer.buffer, 0, VK_WHOLE_SIZE); + } + dbiMat.emplace_back(m_spheresMatColorBuffer.buffer, 0, VK_WHOLE_SIZE); + dbiMatIdx.emplace_back(m_spheresMatIndexBuffer.buffer, 0, VK_WHOLE_SIZE); +~~~~ + +Then write the buffer for the spheres + +~~~~ C++ + vk::DescriptorBufferInfo dbiSpheres{m_spheresBuffer.buffer, 0, VK_WHOLE_SIZE}; + writes.emplace_back(m_descSetLayoutBind.makeWrite(m_descSet, 7, dbiSpheres)); +~~~~ + +## Intersection Shader + +The intersection shader is added to a hit group of type `VK_RAY_TRACING_SHADER_GROUP_TYPE_PROCEDURAL_HIT_GROUP_KHR`. In our example, we already have a hit group for triangles with an associated closest hit shader. We will add a new one, which becomes hit group 1 (see the TLAS section). 
+ +Here is what the two hit groups look like: + +~~~~ C++ + // Hit Group0 - Closest Hit + vk::ShaderModule chitSM = + nvvk::createShaderModule(m_device, // + nvh::loadFile("shaders/raytrace.rchit.spv", true, paths)); + + { + vk::RayTracingShaderGroupCreateInfoKHR hg{vk::RayTracingShaderGroupTypeKHR::eTrianglesHitGroup, + VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR, + VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR}; + stages.push_back({{}, vk::ShaderStageFlagBits::eClosestHitKHR, chitSM, "main"}); + hg.setClosestHitShader(static_cast<uint32_t>(stages.size() - 1)); + m_rtShaderGroups.push_back(hg); + } + + // Hit Group1 - Closest Hit + Intersection (procedural) + vk::ShaderModule chit2SM = + nvvk::createShaderModule(m_device, // + nvh::loadFile("shaders/raytrace2.rchit.spv", true, paths)); + vk::ShaderModule rintSM = + nvvk::createShaderModule(m_device, // + nvh::loadFile("shaders/raytrace.rint.spv", true, paths)); + { + vk::RayTracingShaderGroupCreateInfoKHR hg{vk::RayTracingShaderGroupTypeKHR::eProceduralHitGroup, + VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR, + VK_SHADER_UNUSED_KHR, VK_SHADER_UNUSED_KHR}; + stages.push_back({{}, vk::ShaderStageFlagBits::eClosestHitKHR, chit2SM, "main"}); + hg.setClosestHitShader(static_cast<uint32_t>(stages.size() - 1)); + stages.push_back({{}, vk::ShaderStageFlagBits::eIntersectionKHR, rintSM, "main"}); + hg.setIntersectionShader(static_cast<uint32_t>(stages.size() - 1)); + m_rtShaderGroups.push_back(hg); + } +~~~~ + +And destroy the two shaders at the end + +~~~~ C++ + m_device.destroy(chit2SM); + m_device.destroy(rintSM); +~~~~ + +### raycommon.glsl + +To share the structure of the data across the shaders, we can add the following to `raycommon.glsl` + +~~~~ C++ +struct Sphere +{ + vec3 center; + float radius; +}; + +struct Aabb +{ + vec3 minimum; + vec3 maximum; +}; + +#define KIND_SPHERE 0 +#define KIND_CUBE 1 +~~~~ + +### raytrace.rint + +The intersection shader `raytrace.rint` needs to be added to the shader directory, and CMake must be rerun so that it is added 
to the project. The shader is called every time a ray hits one of the Aabbs of the scene. Note that no Aabb information can be retrieved in the intersection shader, nor is it possible to get the hit point that the ray tracer might have calculated on the GPU. + +The only information we have is that one of the Aabbs was hit, and `gl_PrimitiveID` tells us which one. With the information stored in the buffer, we can then retrieve the geometry of the sphere. + +We first declare the extensions and include common files. + +~~~~ C++ +#version 460 +#extension GL_EXT_ray_tracing : require +#extension GL_EXT_nonuniform_qualifier : enable +#extension GL_EXT_scalar_block_layout : enable +#extension GL_GOOGLE_include_directive : enable +#include "raycommon.glsl" +#include "wavefront.glsl" +~~~~ + +Then we **must** add the following, otherwise the intersection shader will not report any hit. + +~~~~ C++ +hitAttributeEXT vec3 HitAttribute; +~~~~ + +The following buffer holds all spheres, which we index using `gl_PrimitiveID`. + +~~~~ C++ +layout(binding = 7, set = 1, scalar) buffer allSpheres_ +{ + Sphere i[]; +} +allSpheres; +~~~~ + +We will implement two intersection methods against the incoming ray. 
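Before the shader versions, here is a rough CPU-side illustration of the two tests with concrete numbers. It is a sketch, not the shader code: the sphere test takes the smaller root of |o + t·d - c|² = r², and the box test uses the slab method. `Vec3`, `hitSphereCpu` and `hitAabbCpu` are names made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Minimal stand-in for GLSL vec3 (hypothetical, for this sketch only).
struct Vec3
{
  float x, y, z;
};

static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Smaller root t of |o + t*d - c|^2 = r^2, or -1 if the ray line misses the sphere.
float hitSphereCpu(Vec3 center, float radius, Vec3 o, Vec3 d)
{
  assert(radius > 0.0f);
  Vec3  oc   = sub(o, center);
  float a    = dot(d, d);
  float b    = 2.0f * dot(oc, d);
  float c    = dot(oc, oc) - radius * radius;
  float disc = b * b - 4.0f * a * c;
  return disc < 0.0f ? -1.0f : (-b - std::sqrt(disc)) / (2.0f * a);
}

// Slab test: intersect the three per-axis [entry, exit] intervals.
// Returns the entry parameter, or -1 if the intervals do not overlap in front of the ray.
// This sketch assumes every direction component is non-zero.
float hitAabbCpu(Vec3 bmin, Vec3 bmax, Vec3 o, Vec3 d)
{
  const float lo[3]  = {bmin.x, bmin.y, bmin.z};
  const float hi[3]  = {bmax.x, bmax.y, bmax.z};
  const float org[3] = {o.x, o.y, o.z};
  const float dir[3] = {d.x, d.y, d.z};
  float       t0     = -std::numeric_limits<float>::infinity();
  float       t1     = std::numeric_limits<float>::infinity();
  for(int i = 0; i < 3; i++)
  {
    float inv  = 1.0f / dir[i];
    float tbot = inv * (lo[i] - org[i]);
    float ttop = inv * (hi[i] - org[i]);
    t0         = std::max(t0, std::min(tbot, ttop));
    t1         = std::min(t1, std::max(tbot, ttop));
  }
  return t1 > std::max(t0, 0.0f) ? t0 : -1.0f;
}
```

For a ray starting at the origin along +Z, a unit sphere centered at (0, 0, 5) gives a = 1, b = -10, c = 24, so the discriminant is 4 and t = (10 - 2) / 2 = 4. A box spanning (4, 4, 4) to (6, 6, 6) hit by a ray along (1, 1, 1) is entered at t = 4 as well.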
+ +~~~~ C++ +struct Ray +{ + vec3 origin; + vec3 direction; +}; +~~~~ + +The sphere intersection + +~~~~ C++ +// Ray-Sphere intersection +// http://viclw17.github.io/2018/07/16/raytracing-ray-sphere-intersection/ +float hitSphere(const Sphere s, const Ray r) +{ + vec3 oc = r.origin - s.center; + float a = dot(r.direction, r.direction); + float b = 2.0 * dot(oc, r.direction); + float c = dot(oc, oc) - s.radius * s.radius; + float discriminant = b * b - 4 * a * c; + if(discriminant < 0) + { + return -1.0; + } + else + { + return (-b - sqrt(discriminant)) / (2.0 * a); + } +} +~~~~ + +And the axis-aligned bounding box intersection + +~~~~ C++ +// Ray-AABB intersection +float hitAabb(const Aabb aabb, const Ray r) +{ + vec3 invDir = 1.0 / r.direction; + vec3 tbot = invDir * (aabb.minimum - r.origin); + vec3 ttop = invDir * (aabb.maximum - r.origin); + vec3 tmin = min(ttop, tbot); + vec3 tmax = max(ttop, tbot); + float t0 = max(tmin.x, max(tmin.y, tmin.z)); + float t1 = min(tmax.x, min(tmax.y, tmax.z)); + return t1 > max(t0, 0.0) ? t0 : -1.0; +} +~~~~ + +Both return -1 if there is no hit; otherwise, they return the distance along the ray from its origin. + +Retrieving the ray is straightforward + +~~~~ C++ +void main() +{ + Ray ray; + ray.origin = gl_WorldRayOriginEXT; + ray.direction = gl_WorldRayDirectionEXT; +~~~~ + +And getting the information about the geometry enclosed in the Aabb can be done like this. + +~~~~ C++ + // Sphere data + Sphere sphere = allSpheres.i[gl_PrimitiveID]; +~~~~ + +Now we just need to know if we will hit a sphere or a cube. + +~~~~ C++ + float tHit = -1; + int hitKind = gl_PrimitiveID % 2 == 0 ? 
KIND_SPHERE : KIND_CUBE; + if(hitKind == KIND_SPHERE) + { + // Sphere intersection + tHit = hitSphere(sphere, ray); + } + else + { + // AABB intersection + Aabb aabb; + aabb.minimum = sphere.center - vec3(sphere.radius); + aabb.maximum = sphere.center + vec3(sphere.radius); + tHit = hitAabb(aabb, ray); + } + ~~~~ + +Intersection information is reported using `reportIntersectionEXT`, with a distance from the origin and a second argument (`hitKind`) that can be used to differentiate the primitive type. + +~~~~ C++ + + // Report hit point + if(tHit > 0) + reportIntersectionEXT(tHit, hitKind); +} +~~~~ + +The shader can be found [here](shaders/raytrace.rint) + +### raytrace2.rchit + +The new closest hit can be found [here](shaders/raytrace2.rchit) + +This shader is almost identical to the original `raytrace.rchit`, but since the primitive is implicit, we only need to compute the normal for the primitive that was hit. + +We retrieve the world position from the ray and `gl_HitTEXT`, which was set in the intersection shader. + +~~~~ C++ + vec3 worldPos = gl_WorldRayOriginEXT + gl_WorldRayDirectionEXT * gl_HitTEXT; +~~~~ + +The sphere information is retrieved the same way as in the `raytrace.rint` shader. + +~~~~ C++ + Sphere instance = allSpheres.i[gl_PrimitiveID]; +~~~~ + +Then we compute the normal, as for a sphere. + +~~~~ C++ + // Computing the normal at hit position + vec3 normal = normalize(worldPos - instance.center); +~~~~ + +To know whether we have intersected a cube rather than a sphere, we use `gl_HitKindEXT`, which was set by the second argument of `reportIntersectionEXT`. + +When the primitive is a cube, we set the normal to the major axis. + +~~~~ C++ + // Computing the normal for a cube if the hit intersection was reported as 1 + if(gl_HitKindEXT == KIND_CUBE) // Aabb + { + vec3 absN = abs(normal); + float maxC = max(max(absN.x, absN.y), absN.z); + normal = (maxC == absN.x) ? + vec3(sign(normal.x), 0, 0) : + (maxC == absN.y) ? 
vec3(0, sign(normal.y), 0) : vec3(0, 0, sign(normal.z)); + } +~~~~ diff --git a/ray_tracing_intersection/images/intersection.png b/ray_tracing_intersection/images/intersection.png new file mode 100644 index 0000000..d146e25 Binary files /dev/null and b/ray_tracing_intersection/images/intersection.png differ diff --git a/ray_tracing_intersection/shaders/raytrace.rint b/ray_tracing_intersection/shaders/raytrace.rint index 50cb7af..ee12082 100644 --- a/ray_tracing_intersection/shaders/raytrace.rint +++ b/ray_tracing_intersection/shaders/raytrace.rint @@ -6,13 +6,11 @@ #include "raycommon.glsl" #include "wavefront.glsl" -hitAttributeEXT vec3 HitAttribute; layout(binding = 7, set = 1, scalar) buffer allSpheres_ { - Sphere i[]; -} -allSpheres; + Sphere allSpheres[]; +}; struct Ray @@ -60,7 +58,7 @@ void main() ray.direction = gl_WorldRayDirectionEXT; // Sphere data - Sphere sphere = allSpheres.i[gl_PrimitiveID]; + Sphere sphere = allSpheres[gl_PrimitiveID]; float tHit = -1; int hitKind = gl_PrimitiveID % 2 == 0 ? KIND_SPHERE : KIND_CUBE; diff --git a/ray_tracing_jitter_cam/README.md b/ray_tracing_jitter_cam/README.md index f0996aa..85a374f 100644 --- a/ray_tracing_jitter_cam/README.md +++ b/ray_tracing_jitter_cam/README.md @@ -1,5 +1,288 @@ -# NVIDIA Vulkan Ray Tracing Tutorial +# Jitter Camera - Tutorial -[Start the tutorial of this project](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/vkrt_tuto_antialiasing.md.htm) +![](Images/antialiasing.png) -![](../docs/Images/antialiasing.png) \ No newline at end of file +## Tutorial ([Setup](../docs/setup.md)) + +This is an extension of the Vulkan ray tracing [tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR). + + +In this extension, we will implement antialiasing by jittering the offset of each ray for each pixel over time, instead of always shooting each ray from the middle of its pixel. 
+ +(insert setup.md.html here) + + +## Random Functions + +We will use some simple functions for random number generation, which suffice for this example. + +Create a new shader file `random.glsl` with the following code. Add it to the `shaders` directory and rerun CMake, and include this new file in `raytrace.rgen`: + +~~~~ C++ +// Generate a random unsigned int from two unsigned int values, using 16 pairs +// of rounds of the Tiny Encryption Algorithm. See Zafar, Olano, and Curtis, +// "GPU Random Numbers via the Tiny Encryption Algorithm" +uint tea(uint val0, uint val1) +{ + uint v0 = val0; + uint v1 = val1; + uint s0 = 0; + + for(uint n = 0; n < 16; n++) + { + s0 += 0x9e3779b9; + v0 += ((v1 << 4) + 0xa341316c) ^ (v1 + s0) ^ ((v1 >> 5) + 0xc8013ea4); + v1 += ((v0 << 4) + 0xad90777d) ^ (v0 + s0) ^ ((v0 >> 5) + 0x7e95761e); + } + + return v0; +} + +// Generate a random unsigned int in [0, 2^24) given the previous RNG state +// using the Numerical Recipes linear congruential generator +uint lcg(inout uint prev) +{ + uint LCG_A = 1664525u; + uint LCG_C = 1013904223u; + prev = (LCG_A * prev + LCG_C); + return prev & 0x00FFFFFF; +} + +// Generate a random float in [0, 1) given the previous RNG state +float rnd(inout uint prev) +{ + return (float(lcg(prev)) / float(0x01000000)); +} +~~~~ + +## Frame Number + +Since our jittered samples will be accumulated across frames, we need to know which frame we are currently rendering. A frame number of 0 will indicate a new frame, and we will accumulate the data for larger frame numbers. + +Note that the uniform image is read/write, which makes it possible to accumulate previous frames. 
+ +In `raytrace.rgen`, add the push constant block from `raytrace.rchit`, adding a new `frame` member: + +~~~~ C++ +layout(push_constant) uniform Constants +{ + vec4 clearColor; + vec3 lightPosition; + float lightIntensity; + int lightType; + int frame; +} +pushC; +~~~~ + +Also add this frame member to the `RtPushConstant` struct in `hello_vulkan.h`: + +~~~~ C++ + struct RtPushConstant + { + nvmath::vec4f clearColor; + nvmath::vec3f lightPosition; + float lightIntensity; + int lightType; + int frame{0}; + } m_rtPushConstants; +~~~~ + +## Random and Jitter + +In `raytrace.rgen`, at the beginning of `main()`, initialize the random seed: + +~~~~ C++ + // Initialize the random number + uint seed = tea(gl_LaunchIDEXT.y * gl_LaunchSizeEXT.x + gl_LaunchIDEXT.x, pushC.frame); +~~~~ + +Then we need two random numbers to vary the X and Y inside the pixel, except for frame 0, where we always shoot +in the center. + +~~~~ C++ +float r1 = rnd(seed); +float r2 = rnd(seed); +// Subpixel jitter: send the ray through a different position inside the pixel +// each time, to provide antialiasing. +vec2 subpixel_jitter = pushC.frame == 0 ? vec2(0.5f, 0.5f) : vec2(r1, r2); +~~~~ + +Now we only need to change how we compute the pixel center: + +~~~~ C++ +const vec2 pixelCenter = vec2(gl_LaunchIDEXT.xy) + subpixel_jitter; +~~~~ + +## Storing or Updating + +At the end of `main()`, if the frame number is equal to 0, we write directly to the image. +Otherwise, we combine the new image with the previous `frame` frames. 
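The accumulation step is an incremental mean: if the stored value is the average of the first n samples, blending in the next sample with weight 1 / (n + 1) yields the average of n + 1 samples. A small CPU-side sketch with scalar values shows this. The helper `accumulate` is made up for the illustration, and `mix` is re-implemented here because it is a GLSL built-in.

```cpp
#include <cassert>
#include <vector>

// GLSL-style linear interpolation: mix(x, y, a) = x * (1 - a) + y * a.
float mix(float x, float y, float a)
{
  return x * (1.0f - a) + y * a;
}

// Hypothetical helper: feeds one sample per "frame" into a stored value.
// Frame 0 replaces the stored value; later frames are blended with weight 1 / (frame + 1).
float accumulate(const std::vector<float>& samples)
{
  assert(!samples.empty());
  float stored = 0.0f;
  for(int frame = 0; frame < static_cast<int>(samples.size()); frame++)
  {
    if(frame > 0)
      stored = mix(stored, samples[frame], 1.0f / float(frame + 1));
    else
      stored = samples[frame];
  }
  return stored;
}
```

With the samples 1, 2, 3 and 6, the stored value goes 1, 1.5, 2, 3, which is the arithmetic mean after each frame (up to floating-point rounding).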
+ +~~~~ C++ + // Do accumulation over time + if(pushC.frame > 0) + { + float a = 1.0f / float(pushC.frame + 1); + vec3 old_color = imageLoad(image, ivec2(gl_LaunchIDEXT.xy)).xyz; + imageStore(image, ivec2(gl_LaunchIDEXT.xy), vec4(mix(old_color, prd.hitValue, a), 1.f)); + } + else + { + // First frame, replace the value in the buffer + imageStore(image, ivec2(gl_LaunchIDEXT.xy), vec4(prd.hitValue, 1.f)); + } +~~~~ + +## Application Frame Update + +We need to increment the current rendering frame, but we also need to reset it when something in the +scene changes. + +Add two new functions to the `HelloVulkan` class: + +~~~~ C++ + void resetFrame(); + void updateFrame(); +~~~~ + +The implementation of `updateFrame` resets the frame counter if the camera has changed, and then increments it in either case. + +~~~~ C++ +//-------------------------------------------------------------------------------------------------- +// If the camera matrix has changed, resets the frame. +// The frame counter is then incremented in either case. +// +void HelloVulkan::updateFrame() +{ + static nvmath::mat4f refCamMatrix; + + auto& m = CameraManip.getMatrix(); + if(memcmp(&refCamMatrix.a00, &m.a00, sizeof(nvmath::mat4f)) != 0) + { + resetFrame(); + refCamMatrix = m; + } + m_rtPushConstants.frame++; +} +~~~~ + +Since `resetFrame` will be called before `updateFrame` increments the frame counter, `resetFrame` will set the frame counter to -1: + +~~~~ C++ +void HelloVulkan::resetFrame() +{ + m_rtPushConstants.frame = -1; +} +~~~~ + +At the beginning of `HelloVulkan::raytrace`, call + +~~~~ C++ + updateFrame(); +~~~~ + +The application will now antialias the image when ray tracing is enabled. + +Adding `resetFrame()` in `HelloVulkan::onResize()` will also take care of clearing the buffer while resizing the window. + + + +## Resetting Frame on UI Change + +The frame number should also be reset when any part of the scene changes, such as the light direction or the background color. 
In `renderUI()` in `main.cpp`, check for UI changes and reset the frame number when they happen: + +~~~~ C++ +void renderUI(HelloVulkan& helloVk) +{ + static int item = 1; + bool changed = false; + if(ImGui::Combo("Up Vector", &item, "X\0Y\0Z\0\0")) + { + nvmath::vec3f pos, eye, up; + CameraManip.getLookat(pos, eye, up); + up = nvmath::vec3f(item == 0, item == 1, item == 2); + CameraManip.setLookat(pos, eye, up); + changed = true; + } + changed |= + ImGui::SliderFloat3("Light Position", &helloVk.m_pushConstant.lightPosition.x, -20.f, 20.f); + changed |= + ImGui::SliderFloat("Light Intensity", &helloVk.m_pushConstant.lightIntensity, 0.f, 100.f); + changed |= ImGui::RadioButton("Point", &helloVk.m_pushConstant.lightType, 0); + ImGui::SameLine(); + changed |= ImGui::RadioButton("Infinite", &helloVk.m_pushConstant.lightType, 1); + if(changed) + helloVk.resetFrame(); +} +~~~~ + +We also need to check for UI changes inside the main loop in `main()`: + +~~~~ C++ + bool changed = false; + // Edit 3 floats representing a color + changed |= ImGui::ColorEdit3("Clear color", reinterpret_cast<float*>(&clearColor)); + // Switch between raster and ray tracing + changed |= ImGui::Checkbox("Ray Tracer mode", &useRaytracer); + if(changed) + helloVk.resetFrame(); +~~~~ + +## Quality + +After enough samples, the quality of the rendering will be sufficiently high that it might make sense to avoid accumulating further images. + +Add a member variable to `HelloVulkan` + +~~~~ C++ +int m_maxFrames{100}; +~~~~ + +and also add a way to control it in `renderUI()`, making sure that `m_maxFrames` cannot be set below 1: + +~~~~ C++ +changed |= ImGui::InputInt("Max Frames", &helloVk.m_maxFrames); +helloVk.m_maxFrames = std::max(helloVk.m_maxFrames, 1); +~~~~ + +Then in `raytrace()`, immediately after the call to `updateFrame()`, return if the current frame has reached the maximum frame count. 
+ +~~~~ C++ + if(m_rtPushConstants.frame >= m_maxFrames) + return; +~~~~ + +Since the output image won't be modified by the ray tracer, we will simply display the last good image, reducing GPU usage when the target quality has been reached. + +## More Samples in RayGen + +To improve efficiency, we can perform multiple samples directly in the ray generation shader. This will be faster than calling `raytrace()` the equivalent number of times. + +To do this, add a constant to `raytrace.rgen` (this could alternatively be added to the push constant block and controlled by the application): + +~~~~ C++ +const int NBSAMPLES = 10; +~~~~ + +In `main()`, after initializing the random number seed, create a loop that encloses the lines from the generation of `r1` and `r2` to the `traceRayEXT` call, and accumulates the colors returned by `traceRayEXT`. After the loop, divide by the number of samples that were taken. + +~~~~ C++ + vec3 hitValues = vec3(0); + + for(int smpl = 0; smpl < NBSAMPLES; smpl++) + { + float r1 = rnd(seed); + float r2 = rnd(seed); + // ... + // traceRayEXT( ... ); + hitValues += prd.hitValue; + } + prd.hitValue = hitValues / NBSAMPLES; +~~~~ + +For a given value of `m_maxFrames` and `NBSAMPLES`, the image will have `m_maxFrames * NBSAMPLES` antialiasing samples. + +For instance, if `m_maxFrames = 10` and `NBSAMPLES = 10`, this will be equivalent in quality to an image using `m_maxFrames = 100` and `NBSAMPLES = 1`. + +However, using `NBSAMPLES = 10` in the ray generation shader will be faster than calling `raytrace()` with `NBSAMPLES = 1` 10 times in a row. 
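The frame bookkeeping and the resulting sample count can be checked with a small CPU-side simulation. This sketch mirrors the counter logic described in this tutorial (reset to -1, increment, stop once the maximum is reached). The `FrameCounter` type and the `samplesTaken` function are invented for the illustration and are not part of the sample.

```cpp
#include <cassert>

// Hypothetical stand-in for the frame logic of the tutorial.
struct FrameCounter
{
  int frame{0};
  int maxFrames{100};

  // The next update() brings the counter back to 0.
  void reset() { frame = -1; }

  // Resets on a scene change, then increments in either case.
  void update(bool sceneChanged)
  {
    if(sceneChanged)
      reset();
    frame++;
  }

  // True when the ray tracer should still run for the current frame.
  bool shouldRender() const { return frame < maxFrames; }
};

// Counts the samples accumulated per pixel over `calls` render calls,
// where only the first call sees a scene change.
int samplesTaken(int maxFrames, int samplesPerFrame, int calls)
{
  assert(maxFrames >= 1 && samplesPerFrame >= 1);
  FrameCounter fc;
  fc.maxFrames = maxFrames;
  int total    = 0;
  for(int i = 0; i < calls; i++)
  {
    fc.update(i == 0);
    if(fc.shouldRender())
      total += samplesPerFrame;
  }
  return total;
}
```

Under these assumptions, 10 frames of 10 samples and 100 frames of 1 sample both end up with 100 samples per pixel, however many extra render calls follow.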
diff --git a/ray_tracing_jitter_cam/images/antialiasing.png b/ray_tracing_jitter_cam/images/antialiasing.png new file mode 100644 index 0000000..09b599b Binary files /dev/null and b/ray_tracing_jitter_cam/images/antialiasing.png differ diff --git a/ray_tracing_manyhits/README.md b/ray_tracing_manyhits/README.md index ace2ae1..8cb0985 100644 --- a/ray_tracing_manyhits/README.md +++ b/ray_tracing_manyhits/README.md @@ -1,5 +1,341 @@ -# NVIDIA Vulkan Ray Tracing Tutorial +# Multiple Closest Hit Shaders - Tutorial -[Start the tutorial of this project](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/vkrt_tuto_manyhits.md.htm) +![](Images/manyhits.png) -![](../docs/Images/manyhits.png) \ No newline at end of file +## Tutorial ([Setup](../docs/setup.md)) + +This is an extension of the Vulkan ray tracing [tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR). + +The ray tracing tutorial only uses one closest hit shader, but it is also possible to have multiple closest hit shaders. +For example, this could be used to give different models different shaders, or to use a less complex shader when tracing +reflections. + + +## Setting up the Scene + +For this example, we will load the `wuson` model and create another translated instance of it. 
+ +Then you can change the `helloVk.loadModel` calls to the following: + +~~~~ C++ + // Creation of the example + helloVk.loadModel(nvh::findFile("media/scenes/wuson.obj", defaultSearchPaths), + nvmath::translation_mat4(nvmath::vec3f(-1, 0, 0))); + HelloVulkan::ObjInstance inst; + inst.objIndex = 0; + inst.transform = nvmath::translation_mat4(nvmath::vec3f(1, 0, 0)); + inst.transformIT = nvmath::transpose(nvmath::invert(inst.transform)); + helloVk.m_objInstance.push_back(inst); + helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths)); +~~~~ + +## Adding a new Closest Hit Shader + +We will need to create a new closest hit shader (CHIT), to add it to the raytracing pipeline, and to indicate which instance will use this shader. + +### `raytrace2.rchit` + +We can make a very simple shader to differentiate this closest hit shader from the other one. +As an example, create a new file called `raytrace2.rchit`, and add it to Visual Studio's `shaders` filter with the other shaders. + +~~~~ C++ +#version 460 +#extension GL_EXT_ray_tracing : require +#extension GL_GOOGLE_include_directive : enable + +#include "raycommon.glsl" + +layout(location = 0) rayPayloadInEXT hitPayload prd; + +void main() +{ + prd.hitValue = vec3(1,0,0); +} +~~~~ + +### `createRtPipeline` + +This new shader needs to be added to the raytracing pipeline. So, in `createRtPipeline` in `hello_vulkan.cpp`, load the new closest hit shader immediately after loading the first one. 
+ +~~~~ C++ + vk::ShaderModule chit2SM = + nvvk::createShaderModule(m_device, // + nvh::loadFile("shaders/raytrace2.rchit.spv", true, paths)); +~~~~ + +Then add a new hit group immediately after adding the first hit group: + +~~~~ C++ + // Second group + stages.push_back({{}, vk::ShaderStageFlagBits::eClosestHitKHR, chit2SM, "main"}); + hg.setClosestHitShader(static_cast<uint32_t>(stages.size() - 1)); + m_rtShaderGroups.push_back(hg); +~~~~ + +### `raytrace.rgen` + +As a test, you can try changing the `sbtRecordOffset` parameter of the `traceRayEXT` call in `raytrace.rgen`. +If you set the offset to `1`, then all ray hits will use the new CHIT, and the raytraced output should look like the image below: + +![](Images/manyhits2.png) + +!!! Warning + After testing this out, make sure to revert this change in `raytrace.rgen` before continuing. + +### `hello_vulkan.h` + +In the `ObjInstance` structure, we will add a new member variable that specifies which hit shader the instance will use: + +~~~~ C++ +uint32_t hitgroup{0}; // Hit group of the instance +~~~~ + +This change also needs to be reflected in the `sceneDesc` structure in `wavefront.glsl`: + +~~~~ C++ +struct sceneDesc +{ + int objId; + int txtOffset; + mat4 transfo; + mat4 transfoIT; + int hitGroup; +}; +~~~~ + +**Warning:** + The solution will not automatically recompile the shaders after this change to `wavefront.glsl`; instead, you will need to recompile all of the SPIR-V shaders. + +### `hello_vulkan.cpp` + +Finally, we need to tell the top-level acceleration structure which hit group to use for each instance. 
In `createTopLevelAS()` in `hello_vulkan.cpp`, change the line setting `rayInst.hitGroupId` to + +~~~~ C++ +rayInst.hitGroupId = m_objInstance[i].hitgroup; +~~~~ + +### Choosing the Hit shader + +Back in `main.cpp`, after loading the scene's models, we can now have both `wuson` models use the new CHIT by adding the following: + +~~~~ C++ + helloVk.m_objInstance[0].hitgroup = 1; + helloVk.m_objInstance[1].hitgroup = 1; +~~~~ + +![](Images/manyhits3.png) + +## Shader Record Data `shaderRecordEXT` + +In the [Shader Binding Table](https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/chap33.html#shader-binding-table) created in the main tutorial, each entry consists of a handle referring to the shader that it invokes. We have packed all data to the size of `shaderGroupHandleSize`, but each entry could be made larger, to store data that can later be referenced by a `shaderRecordEXT` block in the shader. + +This information can be used to pass extra information to a shader, for each entry in the SBT. + +**Note:** + Since each entry in an SBT group must have the same size, each entry of the group has to have enough space to accommodate the largest element in the entire group. + +The following diagram represents our current SBT, with the addition of some data to `HitGroup1`. As mentioned in the **note**, even if +`HitGroup0` doesn't have any shader record data, it still needs to have the same size as `HitGroup1`. + +~~~~ Bash ++-----------+----------+ +| RayGen | Handle 0 | ++-----------+----------+ +| Miss | Handle 1 | ++-----------+----------+ +| Miss | Handle 2 | ++-----------+----------+ +| HitGroup0 | Handle 3 | +| | -Empty- | ++-----------+----------+ +| HitGroup1 | Handle 4 | +| | Data 0 | ++-----------+----------+ +~~~~ + +## `hello_vulkan.h` + +In the `HelloVulkan` class, we will add a structure to hold the hit group data. 
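The later snippets refer to a `HitRecordBuffer` type with a `color` member and to a resizable `m_hitShaderRecord` container, so the declaration presumably looks like the sketch below. It is reconstructed from that usage, not copied from the sample's header. `Vec4f` is a stand-in for `nvmath::vec4f` so that the snippet compiles on its own, and `HelloVulkanSketch` and `makeExample` are hypothetical placeholders.

```cpp
#include <cassert>
#include <vector>

// Stand-in for nvmath::vec4f (assumption: four packed floats).
struct Vec4f
{
  float x, y, z, w;
};

// Data stored after the shader group handle in a hit group record.
struct HitRecordBuffer
{
  Vec4f color;
};
static_assert(sizeof(HitRecordBuffer) == 4 * sizeof(float), "record data is one vec4");

// Placeholder for the HelloVulkan class; only the new member is shown.
struct HelloVulkanSketch
{
  std::vector<HitRecordBuffer> m_hitShaderRecord;
};

// Usage in the spirit of main.cpp: one record holding a yellow color.
inline HelloVulkanSketch makeExample()
{
  HelloVulkanSketch app;
  app.m_hitShaderRecord.resize(1);
  app.m_hitShaderRecord[0].color = Vec4f{1.0f, 1.0f, 0.0f, 0.0f};
  assert(app.m_hitShaderRecord.size() == 1);
  return app;
}
```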
+ + + +### `raytrace2.rchit` + +In the closest hit shader, we can retrieve the shader record using the `layout(shaderRecordEXT)` qualifier + +~~~~ C++ +layout(shaderRecordEXT) buffer sr_ { vec4 c; } shaderRec; +~~~~ + +and use this information to return the color: + +~~~~ C++ +void main() +{ + prd.hitValue = shaderRec.c.rgb; +} +~~~~ + +**Note** + Adding a new shader requires rerunning CMake so that it is added to the project build. + + +### `main.cpp` + +In `main`, after we set which hit group an instance will use, we can add the data we want to set through the shader record. + +~~~~ C++ + helloVk.m_hitShaderRecord.resize(1); + helloVk.m_hitShaderRecord[0].color = nvmath::vec4f(1, 1, 0, 0); // Yellow +~~~~ + +### `HelloVulkan::createRtShaderBindingTable` + +Since we are no longer compacting all handles in a contiguous buffer, we need to fill the SBT as described above. + +After retrieving the handles of all 5 groups (raygen, miss, miss shadow, hit0, and hit1) +using `getRayTracingShaderGroupHandlesKHR`, store the pointers to easily retrieve them. + +~~~~ C++ + // Retrieve the handle pointers + std::vector<uint8_t*> handles(groupCount); + for(uint32_t i = 0; i < groupCount; i++) + { + handles[i] = &shaderHandleStorage[i * groupHandleSize]; + } +~~~~ + +The size of each group can be described as follows: + +~~~~ C++ + // Sizes + uint32_t rayGenSize = baseAlignment; + uint32_t missSize = baseAlignment; + uint32_t hitSize = + ROUND_UP(groupHandleSize + static_cast<uint32_t>(sizeof(HitRecordBuffer)), baseAlignment); + uint32_t newSbtSize = rayGenSize + 2 * missSize + 2 * hitSize; +~~~~ + +Then write the new SBT like this, where only Hit 1 has extra data. 
+ +~~~~ C++ + std::vector<uint8_t> sbtBuffer(newSbtSize); + { + uint8_t* pBuffer = sbtBuffer.data(); + + memcpy(pBuffer, handles[0], groupHandleSize); // Raygen + pBuffer += rayGenSize; + memcpy(pBuffer, handles[1], groupHandleSize); // Miss 0 + pBuffer += missSize; + memcpy(pBuffer, handles[2], groupHandleSize); // Miss 1 + pBuffer += missSize; + + uint8_t* pHitBuffer = pBuffer; + memcpy(pHitBuffer, handles[3], groupHandleSize); // Hit 0 + // No data + pBuffer += hitSize; + + pHitBuffer = pBuffer; + memcpy(pHitBuffer, handles[4], groupHandleSize); // Hit 1 + pHitBuffer += groupHandleSize; + memcpy(pHitBuffer, &m_hitShaderRecord[0], sizeof(HitRecordBuffer)); // Hit 1 data + pBuffer += hitSize; + } +~~~~ + +Then change the call to `m_alloc.createBuffer` to create the SBT buffer from `sbtBuffer`: + +~~~~ C++ + m_rtSBTBuffer = m_alloc.createBuffer(cmdBuf, sbtBuffer, vk::BufferUsageFlagBits::eRayTracingKHR); +~~~~ + +Note: we use this macro to round up to the correct alignment +~~~~ C++ +#ifndef ROUND_UP +#define ROUND_UP(v, powerOf2Alignment) (((v) + (powerOf2Alignment)-1) & ~((powerOf2Alignment)-1)) +#endif +~~~~ + + +### `raytrace` + +Finally, since the size of the hit group is now larger than just the handle, we need to set the new value of the hit group stride in `HelloVulkan::raytrace`. + +~~~~ C++ +vk::DeviceSize hitGroupStride = +ROUND_UP(m_rtProperties.shaderGroupHandleSize + sizeof(HitRecordBuffer), progOffset); +~~~~ + +!!! Note: + The result should now show both `wuson` models with a yellow color. + +![](Images/manyhits4.png) + +## Extending Hit + +The SBT can have more entries than there are shading models, which makes it possible to give each instance its own entry with its own data. For some applications, instead of retrieving the material information as in the main tutorial using a storage buffer and indexing into it using the `gl_InstanceID`, it is possible to set all of the material information in the SBT. 
+ +The following modification will add another entry to the SBT with a different color per instance. The new SBT hit group (2) will use the same CHIT handle (4) as hit group 1. + +~~~~ Bash ++-----------+----------+ +| RayGen | Handle 0 | ++-----------+----------+ +| Miss | Handle 1 | ++-----------+----------+ +| Miss | Handle 2 | ++-----------+----------+ +| HitGroup0 | Handle 3 | +| | -Empty- | ++-----------+----------+ +| HitGroup1 | Handle 4 | +| | Data 0 | ++-----------+----------+ +| HitGroup2 | Handle 4 | +| | Data 1 | ++-----------+----------+ +~~~~ + + +### `main.cpp` + +In the description of the scene in `main`, we will tell the `wuson` models to use hit groups 1 and 2 respectively, and to have different colors. + +~~~~ C++ + helloVk.m_objInstance[0].hitgroup = 1; + helloVk.m_objInstance[1].hitgroup = 2; + helloVk.m_hitShaderRecord.resize(2); + helloVk.m_hitShaderRecord[0].color = nvmath::vec4f(0, 1, 0, 0); // Green + helloVk.m_hitShaderRecord[1].color = nvmath::vec4f(0, 1, 1, 0); // Cyan +~~~~ + +### `createRtShaderBindingTable` + +The size of the SBT will now account for its 3 hit groups: + +~~~~ C++ + uint32_t newSbtSize = rayGenSize + 2 * missSize + 3 * hitSize; +~~~~ + +Finally, we also need to add the new entry at the end of the buffer, reusing the handle of the second Hit Group and setting a different color. + +~~~~ C++ + pHitBuffer = pBuffer; + memcpy(pHitBuffer, handles[4], groupHandleSize); // Hit 2 + pHitBuffer += groupHandleSize; + memcpy(pHitBuffer, &m_hitShaderRecord[1], sizeof(HitRecordBuffer)); // Hit 2 data + pBuffer += hitSize; +~~~~ + +**Warning**: + Adding entries like this can be error-prone and inconvenient for scenes + of any significant size. Instead, it is recommended to wrap the storage of handles, data, + and size per group in a SBT utility to handle this automatically. 
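One possible shape for such a utility is sketched below. It only fills a CPU-side byte buffer for one region of the SBT (for example, all hit groups); the names `SbtRegion`, `addRecord`, `stride` and `write` are hypothetical and do not come from the sample or from any library. The handle size and alignment are passed in by the caller, since the real values must be queried from the device.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Rounds v up to a multiple of a power-of-two alignment.
inline uint32_t roundUp(uint32_t v, uint32_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (v + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: one SBT region whose records all share the same stride.
class SbtRegion
{
public:
  SbtRegion(uint32_t handleSize, uint32_t alignment)
      : m_handleSize(handleSize)
      , m_alignment(alignment)
  {
  }

  // Adds a record made of a shader group handle followed by optional data.
  void addRecord(const uint8_t* handle, const void* data = nullptr, uint32_t dataSize = 0)
  {
    std::vector<uint8_t> bytes(handle, handle + m_handleSize);
    if(dataSize > 0)
    {
      const uint8_t* p = static_cast<const uint8_t*>(data);
      bytes.insert(bytes.end(), p, p + dataSize);
    }
    m_records.push_back(bytes);
  }

  // Stride large enough for the biggest record of the region.
  uint32_t stride() const
  {
    uint32_t largest = m_handleSize;
    for(const auto& r : m_records)
      largest = std::max(largest, static_cast<uint32_t>(r.size()));
    return roundUp(largest, m_alignment);
  }

  // Appends all records to `out`, each one zero-padded to the stride.
  void write(std::vector<uint8_t>& out) const
  {
    const uint32_t s = stride();
    for(const auto& r : m_records)
    {
      const size_t start = out.size();
      out.resize(start + s, 0);
      std::memcpy(out.data() + start, r.data(), r.size());
    }
  }

private:
  uint32_t                          m_handleSize;
  uint32_t                          m_alignment;
  std::vector<std::vector<uint8_t>> m_records;
};
```

As an example with made-up numbers, a 32-byte handle, 16 bytes of color data and a 64-byte alignment give a stride of 64 bytes, so the three hit groups of the diagram occupy 192 bytes, and the stride is the value to pass as the hit group stride when tracing.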
diff --git a/ray_tracing_manyhits/images/manyhits.png b/ray_tracing_manyhits/images/manyhits.png new file mode 100644 index 0000000..eaa90ee Binary files /dev/null and b/ray_tracing_manyhits/images/manyhits.png differ diff --git a/ray_tracing_manyhits/images/manyhits2.png b/ray_tracing_manyhits/images/manyhits2.png new file mode 100644 index 0000000..aeb5624 Binary files /dev/null and b/ray_tracing_manyhits/images/manyhits2.png differ diff --git a/ray_tracing_manyhits/images/manyhits3.png b/ray_tracing_manyhits/images/manyhits3.png new file mode 100644 index 0000000..7faa476 Binary files /dev/null and b/ray_tracing_manyhits/images/manyhits3.png differ diff --git a/ray_tracing_manyhits/images/manyhits4.png b/ray_tracing_manyhits/images/manyhits4.png new file mode 100644 index 0000000..8465702 Binary files /dev/null and b/ray_tracing_manyhits/images/manyhits4.png differ diff --git a/ray_tracing_rayquery/README.md b/ray_tracing_rayquery/README.md index 4ebc46d..952ba2f 100644 --- a/ray_tracing_rayquery/README.md +++ b/ray_tracing_rayquery/README.md @@ -1,8 +1,120 @@ -# NVIDIA Vulkan Ray Tracing Tutorial +# Ray Query - Tutorial -This example is part of the the ray tracing tutorial. -If you haven't done it, [**Start Ray Tracing Tutorial**](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/). -[Start the tutorial of this project](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/vkrt_tuto_rayquery.md.htm) +![](images/rayquery.png) + +## Tutorial ([Setup](../docs/setup.md)) + +This is an extension of the Vulkan ray tracing [tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR). + + +This extension allows ray intersection queries to be executed in any shader stage. In this example, we will add +ray queries [(GLSL_EXT_ray_query)](https://github.com/KhronosGroup/GLSL/blob/master/extensions/ext/GLSL_EXT_ray_query.txt) to the fragment shader to cast shadow rays. + +Contrary to all other examples, in this one we are removing code. 
There is no need for an SBT or a ray tracing pipeline; the only thing that
+will matter is the creation of the acceleration structure.
+
+Starting from the end of the tutorial, [ray_tracing__simple](https://github.com/nvpro-samples/vk_raytracing_tutorial_KHR/tree/master/ray_tracing__simple), we will remove
+all functions that were dedicated to ray tracing and keep only the construction of the BLAS and TLAS.
+
+## Cleanup
+
+First, let's remove all the extra code.
+
+### hello_vulkan (header)
+
+Remove most functions and members, keeping only what is needed to create the acceleration structure:
+
+~~~~ C++
+// #VKRay
+void initRayTracing();
+nvvk::RaytracingBuilderKHR::Blas objectToVkGeometryKHR(const ObjModel& model);
+void createBottomLevelAS();
+void createTopLevelAS();
+
+vk::PhysicalDeviceRayTracingPropertiesKHR m_rtProperties;
+nvvk::RaytracingBuilderKHR m_rtBuilder;
+~~~~
+
+### hello_vulkan (source)
+
+From the source code, remove the definitions of all functions that were removed from the header.
+
+### Shaders
+
+You can safely remove all `raytrace.*` shaders.
+
+
+## Support for Fragment shader
+
+In `HelloVulkan::createDescriptorSetLayout`, add the acceleration structure to the descriptor set layout.
+
+~~~~ C++
+// The top level acceleration structure
+m_descSetLayoutBind.emplace_back( //
+    vkDS(7, vkDT::eAccelerationStructureKHR, 1, vkSS::eFragment));
+~~~~
+
+In `HelloVulkan::updateDescriptorSet`, write the value to the descriptor set.
+
+~~~~ C++
+  vk::AccelerationStructureKHR tlas = m_rtBuilder.getAccelerationStructure();
+  vk::WriteDescriptorSetAccelerationStructureKHR descASInfo;
+  descASInfo.setAccelerationStructureCount(1);
+  descASInfo.setPAccelerationStructures(&tlas);
+  writes.emplace_back(m_descSetLayoutBind.makeWrite(m_descSet, 7, descASInfo));
+~~~~
+
+
+### Shader
+
+The last modification is in the fragment shader, where we will add the ray intersection query to trace shadow rays.
+
+First, the GLSL version has to be bumped to 460:
+
+~~~~ C++
+#version 460
+~~~~
+
+Then we need to add new extensions:
+
+~~~~ C++
+#extension GL_EXT_ray_tracing : enable
+#extension GL_EXT_ray_query : enable
+~~~~
+
+We have to add the layout to access the top level acceleration structure.
+
+~~~~ C++
+layout(binding = 7, set = 0) uniform accelerationStructureEXT topLevelAS;
+~~~~
+
+
+At the end of the shader, add the following code to initiate the ray query. As we are only interested in knowing whether the ray
+has hit something, we can keep the query minimal.
+
+~~~~ C++
+// Ray Query for shadow
+vec3 origin = worldPos;
+vec3 direction = L; // vector to light
+float tMin = 0.01f;
+float tMax = lightDistance;
+
+// Initializes a ray query object but does not start traversal
+rayQueryEXT rayQuery;
+rayQueryInitializeEXT(rayQuery, topLevelAS, gl_RayFlagsTerminateOnFirstHitEXT, 0xFF, origin, tMin,
+                      direction, tMax);
+
+// Start traversal: return false if traversal is complete
+while(rayQueryProceedEXT(rayQuery))
+{
+}
+
+// Returns type of committed (true) intersection
+if(rayQueryGetIntersectionTypeEXT(rayQuery, true) != gl_RayQueryCommittedIntersectionNoneEXT)
+{
+  // Got an intersection == Shadow
+  outColor *= 0.1;
+}
+~~~~
-![rayquery](../docs/Images/rayquery.png)
diff --git a/ray_tracing_rayquery/images/rayquery.png b/ray_tracing_rayquery/images/rayquery.png
new file mode 100644
index 0000000..c9f160a
Binary files /dev/null and b/ray_tracing_rayquery/images/rayquery.png differ
diff --git a/ray_tracing_reflections/README.md b/ray_tracing_reflections/README.md
index 0d25407..f279fc7 100644
--- a/ray_tracing_reflections/README.md
+++ b/ray_tracing_reflections/README.md
@@ -1,5 +1,254 @@
-# NVIDIA Vulkan Ray Tracing Tutorial
+# Reflections - Tutorial
-[Start the tutorial of this project](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/vkrt_tuto_reflection.md.htm)
+![](Images/reflections.png)
+
+## Tutorial ([Setup](../docs/setup.md))
+
+This is an extension of the Vulkan
ray tracing [tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR).
+
+## Setting Up the scene
+
+First, we will create a scene with two reflective planes and a multicolored cube in the center. Change the `helloVk.loadModel` calls in `main()` to
+
+~~~~ C++
+  // Creation of the example
+  helloVk.loadModel(nvh::findFile("media/scenes/cube.obj", defaultSearchPaths),
+                    nvmath::translation_mat4(nvmath::vec3f(-2, 0, 0))
+                        * nvmath::scale_mat4(nvmath::vec3f(.1f, 5.f, 5.f)));
+  helloVk.loadModel(nvh::findFile("media/scenes/cube.obj", defaultSearchPaths),
+                    nvmath::translation_mat4(nvmath::vec3f(2, 0, 0))
+                        * nvmath::scale_mat4(nvmath::vec3f(.1f, 5.f, 5.f)));
+  helloVk.loadModel(nvh::findFile("media/scenes/cube_multi.obj", defaultSearchPaths));
+  helloVk.loadModel(nvh::findFile("media/scenes/plane.obj", defaultSearchPaths),
+                    nvmath::translation_mat4(nvmath::vec3f(0, -1, 0)));
+~~~~
+
+Then find `cube.mtl` in `media/scenes` and modify the material to be 95% reflective, without any diffuse
+contribution:
+
+~~~~ C++
+newmtl cube_instance_material
+illum 3
+d 1
+Ns 32
+Ni 0
+Ka 0 0 0
+Kd 0 0 0
+Ks 0.95 0.95 0.95
+~~~~
+
+## Recursive Reflections
+
+Vulkan ray tracing allows recursive calls to `traceRayEXT`, up to the `maxRecursionDepth` limit reported in `VkPhysicalDeviceRayTracingPropertiesKHR`.
+
+In `createRtPipeline()` in `hello_vulkan.cpp`, bring the maximum recursion depth up to 10, making sure not to exceed the physical device's maximum recursion limit:
+
+~~~~ C++
+  rayPipelineInfo.setMaxRecursionDepth(
+      std::min(10u, m_rtProperties.maxRecursionDepth));  // Ray depth, clamped to the device limit
+~~~~
+
+### `raycommon.glsl`
+
+We will need to track the depth and the attenuation of the ray.
+In the `hitPayload` struct in `raycommon.glsl`, add the following:
+
+~~~~ C++
+  int depth;
+  vec3 attenuation;
+~~~~
+
+### `raytrace.rgen`
+
+In the ray generation shader, we will initialize all payload values before calling `traceRayEXT`.
+
+~~~~ C++
+  prd.depth = 0;
+  prd.hitValue = vec3(0);
+  prd.attenuation = vec3(1.f, 1.f, 1.f);
+~~~~
+
+### `raytrace.rchit`
+
+At the end of the closest hit shader, before setting `prd.hitValue`, we need to shoot a ray if the material is reflective.
+
+~~~~ C++
+  // Reflection
+  if(mat.illum == 3 && prd.depth < 10)
+  {
+    vec3 origin = worldPos;
+    vec3 rayDir = reflect(gl_WorldRayDirectionEXT, normal);
+    prd.attenuation *= mat.specular;
+
+    prd.depth++;
+    traceRayEXT(topLevelAS,         // acceleration structure
+                gl_RayFlagsNoneEXT, // rayFlags
+                0xFF,               // cullMask
+                0,                  // sbtRecordOffset
+                0,                  // sbtRecordStride
+                0,                  // missIndex
+                origin,             // ray origin
+                0.1,                // ray min range
+                rayDir,             // ray direction
+                100000.0,           // ray max range
+                0                   // payload (location = 0)
+    );
+    prd.depth--;
+  }
+~~~~
+
+The calculated `hitValue` needs to be accumulated, since the payload is global for the
+entire execution from raygen, so change the last line of `main()` to
+
+~~~~ C++
+prd.hitValue += vec3(attenuation * lightIntensity * (diffuse + specular)) * prd.attenuation;
+~~~~
+
+### `raytrace.rmiss`
+
+Finally, the miss shader also needs to attenuate its contribution:
+
+~~~~ C++
+  prd.hitValue = clearColor.xyz * 0.8 * prd.attenuation;
+~~~~
+
+### Working, but limited
+
+This works, but it is limited by the number of recursions the GPU supports, and it can also impact performance. Exceeding the recursion limit would eventually generate a device lost error.
+
+## Iterative Reflections
+
+Instead of dispatching new rays from the closest hit shader, we will return the information in the payload to shoot new rays if needed.
+
+### `raycommon.glsl`
+
+Extend the `hitPayload` structure with the information needed to start a new ray.
+
+~~~~ C++
+  int done;
+  vec3 rayOrigin;
+  vec3 rayDir;
+~~~~
+
+### `raytrace.rgen`
+
+Initialize the new members of the payload:
+
+~~~~ C++
+  prd.done = 1;
+  prd.rayOrigin = origin.xyz;
+  prd.rayDir = direction.xyz;
+~~~~
+
+Instead of calling `traceRayEXT` only once, we will call it in a loop until we are done.
+
+Wrap the trace call in `raytrace.rgen` like this:
+
+~~~~ C++
+  vec3 hitValue = vec3(0);
+  for(;;)
+  {
+    traceRayEXT( /*.. */);
+
+    hitValue += prd.hitValue * prd.attenuation;
+
+    prd.depth++;
+    if(prd.done == 1 || prd.depth >= 10)
+      break;
+
+    origin.xyz = prd.rayOrigin;
+    direction.xyz = prd.rayDir;
+    prd.done = 1; // Will stop if a reflective material isn't hit
+  }
+~~~~
+
+And make sure to write the accumulated value to the image:
+
+~~~~ C++
+imageStore(image, ivec2(gl_LaunchIDEXT.xy), vec4(hitValue, 1.0));
+~~~~
+
+### `raytrace.rchit`
+
+We no longer need to shoot rays from the closest hit shader, so we can replace the block at the end with
+
+~~~~ C++
+  if(mat.illum == 3)
+  {
+    vec3 origin = worldPos;
+    vec3 rayDir = reflect(gl_WorldRayDirectionEXT, normal);
+    prd.attenuation *= mat.specular;
+    prd.done = 0;
+    prd.rayOrigin = origin;
+    prd.rayDir = rayDir;
+  }
+~~~~
+
+The calculation of `hitValue` also no longer needs to be additive or to take attenuation into account:
+
+~~~~ C++
+  prd.hitValue = vec3(attenuation * lightIntensity * (diffuse + specular));
+~~~~
+
+### `raytrace.rmiss`
+
+Since the ray generation shader now handles attenuation, we no longer need to attenuate the value returned in the miss shader:
+
+~~~~ C++
+  prd.hitValue = clearColor.xyz * 0.8;
+~~~~
+
+### Max Recursion
+
+Finally, we no longer need to have a deep recursion setting in `createRtPipeline` -- just a depth of 2, one for the initial ray generation segment and another for shadow rays.
+
+~~~~ C++
+  rayPipelineInfo.setMaxRecursionDepth(2); // Ray depth
+~~~~
+
+In `raytrace.rgen`, we can now make the maximum ray depth significantly larger -- such as 100, for instance -- without causing a device lost error.
+
+## Controlling Depth
+
+As an extra, we can also add UI to control the maximum depth.
+
+In the `RtPushConstant` structure, we can add a new `maxDepth` member to pass to the shader.
+
+~~~~ C++
+  struct RtPushConstant
+  {
+    nvmath::vec4f clearColor;
+    nvmath::vec3f lightPosition;
+    float lightIntensity;
+    int lightType;
+    int maxDepth{10};
+  } m_rtPushConstants;
+~~~~
+
+In the `raytrace.rgen` shader, we will receive the push constant data:
+
+~~~~ C++
+layout(push_constant) uniform Constants
+{
+  vec4 clearColor;
+  vec3 lightPosition;
+  float lightIntensity;
+  int lightType;
+  int maxDepth;
+}
+pushC;
+~~~~
+
+Then compare the ray depth against this value to decide when to stop:
+
+~~~~ C++
+    if(prd.done == 1 || prd.depth >= pushC.maxDepth)
+      break;
+~~~~
+
+Finally, in `main.cpp` in the `renderUI` function, we will add a slider to control the value.
+
+~~~~ C++
+  ImGui::SliderInt("Max Depth", &helloVk.m_rtPushConstants.maxDepth, 1, 100);
+~~~~
-![](../docs/Images/reflections.png)
\ No newline at end of file
diff --git a/ray_tracing_reflections/images/reflections.png b/ray_tracing_reflections/images/reflections.png
new file mode 100644
index 0000000..5453183
Binary files /dev/null and b/ray_tracing_reflections/images/reflections.png differ
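To see how the iterative loop of the reflections tutorial and its `maxDepth` control interact, the control flow can be mimicked on the CPU. This is only a toy sketch under simplifying assumptions: colors are reduced to a single `float`, and `traceToy` is a hypothetical stand-in for `traceRayEXT` plus the closest hit shader. It pretends the ray meets `mirrorCount` perfect mirrors (which contribute no light of their own) before landing on a diffuse surface of brightness 1.

```cpp
#include <cassert>

// Scalar stand-in for the tutorial's hitPayload struct.
struct Payload
{
  float hitValue;
  float attenuation;
  int   depth;
  int   done;
};

// Hypothetical replacement for traceRayEXT + closest hit shader.
inline void traceToy(Payload& prd, int mirrorCount, float specular)
{
  if(prd.depth < mirrorCount)
  {
    // Reflective hit: attenuate and request another ray, as in the iterative raytrace.rchit.
    prd.attenuation *= specular;
    prd.done     = 0;
    prd.hitValue = 0.f;
  }
  else
  {
    // Diffuse hit: prd.done keeps the value 1 set by the caller, so the loop stops.
    prd.hitValue = 1.f;
  }
}

// Same control flow as the loop in raytrace.rgen, with maxDepth playing the role of pushC.maxDepth.
inline float shadePixel(int maxDepth, int mirrorCount, float specular)
{
  assert(maxDepth >= 1);
  Payload prd{0.f, 1.f, 0, 1};
  float   hitValue = 0.f;
  for(;;)
  {
    traceToy(prd, mirrorCount, specular);
    hitValue += prd.hitValue * prd.attenuation;

    prd.depth++;
    if(prd.done == 1 || prd.depth >= maxDepth)
      break;

    prd.done = 1;  // Will stop if a reflective material isn't hit
  }
  return hitValue;
}
```

In this toy model, `shadePixel(10, 2, 0.5f)` reaches the diffuse surface after two bounces and yields the brightness attenuated twice, while `shadePixel(2, 2, 0.5f)` is cut off by the depth limit before reaching it. That is the effect the "Max Depth" slider has on deeply nested reflections.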