Adapting to buildTlas and buildBlas changes
parent
1c9be00cec
commit
3e399adf0a
3 changed files with 338 additions and 280 deletions
|
|
@ -271,13 +271,12 @@ Its implementation will fill three structures that will eventually be passed to
|
|||
are ultimately passed as separate arguments to the AS builder but work in concert to determine the actual memory to source
|
||||
vertices from. As a crude analogy, this is similar to how `glVertexAttribPointer` defines how to interpret a buffer as a vertex
|
||||
array while the actual numeric arguments to `glDrawArrays` determine what section of that array is actually drawn.
|
||||
<!-- I would have preferred a Vulkan analogy but vulkan vertex bindings have too many moving parts for a clean analogy. -->
|
||||
<!-- Even though this analogy is kinda goofy, I found the above structures horribly confusing when I first read this -->
|
||||
<!-- and I would have appreciated a crude analogy. -->
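To make the "which section of the array" idea concrete, here is a minimal sketch that does not use Vulkan at all. `RangeSketch` and `selectTriangles` are hypothetical names; the struct only mirrors three fields of `VkAccelerationStructureBuildRangeInfoKHR`, under the assumption of an indexed triangle geometry with 32-bit indices, where `primitiveOffset` is a byte offset into the index buffer and `firstVertex` is added to each fetched index.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for VkAccelerationStructureBuildRangeInfoKHR,
// reduced to the fields that matter for indexed triangles.
struct RangeSketch
{
  uint32_t primitiveCount;   // number of triangles to consume
  uint32_t primitiveOffset;  // byte offset into the index buffer
  uint32_t firstVertex;      // value added to every fetched index
};

// Returns the vertex indices of the triangles selected by `range`.
inline std::vector<std::array<uint32_t, 3>> selectTriangles(const std::vector<uint32_t>& indices,
                                                            const RangeSketch&           range)
{
  assert(range.primitiveOffset % sizeof(uint32_t) == 0);
  const size_t first = range.primitiveOffset / sizeof(uint32_t);
  assert(first + size_t(range.primitiveCount) * 3 <= indices.size());

  std::vector<std::array<uint32_t, 3>> triangles;
  triangles.reserve(range.primitiveCount);
  for(uint32_t p = 0; p < range.primitiveCount; p++)
  {
    const size_t            base = first + size_t(p) * 3;
    std::array<uint32_t, 3> tri{indices[base] + range.firstVertex, indices[base + 1] + range.firstVertex,
                                indices[base + 2] + range.firstVertex};
    triangles.push_back(tri);
  }
  return triangles;
}
```

The geometry description says how to read the buffers, while the range says which part of them is actually consumed, much like the arguments to `glDrawArrays` in the analogy.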
|
||||
|
||||
|
||||
Several of the above structures can be combined in an array and built into a single BLAS. In this example,
|
||||
this array will always have a length of one.
|
||||
this array will always have a length of one. There can be reasons to have multiple geometries per BLAS. The
|
||||
main reason is that the acceleration structure will be more efficient, as it will properly divide the volume with intersecting
|
||||
objects. This should be considered only for large or complex static groups of objects.
|
||||
|
||||
Note that we consider all objects opaque for now, and indicate this to the builder for
|
||||
potential optimization. (More specifically, this disables calls to the anyhit shader, described later).
|
||||
|
|
@ -337,7 +336,7 @@ auto HelloVulkan::objectToVkGeometryKHR(const ObjModel& model)
|
|||
!!! Warning Memory Safety
|
||||
`BlasInput` acts essentially as a fancy device pointer to vertex buffer data; no actual vertex data is copied or managed
|
||||
by the helper. For this simple example, we are relying on the fact that all models are loaded at
|
||||
startup and remain in memory unchanged until shutdown. If you are dynamically loading and unloading parts of a larger
|
||||
startup and remain in memory unchanged until the BLAS is created. If you are dynamically loading and unloading parts of a larger
|
||||
scene, or dynamically generating vertex data, it is your responsibility to avoid race conditions with the AS builder.
|
||||
|
||||
In the `HelloVulkan` class declaration, we can now add the `createBottomLevelAS()` method that will generate a
|
||||
|
|
@ -375,85 +374,207 @@ This helper function is already present in `raytraceKHR_vkpp.hpp`: it can be reu
|
|||
part of the set of helpers provided by the [nvpro-samples](https://github.com/nvpro-samples). The function
|
||||
will generate one BLAS for each `RaytracingBuilderKHR::BlasInput`:
|
||||
|
||||
```` C
|
||||
// Create all the BLAS from the vector of BlasInput
|
||||
// - There will be one BLAS per input-vector entry
|
||||
// - There will be as many BLAS as input.size()
|
||||
// - The resulting BLAS (along with the inputs used to build) are stored in m_blas,
|
||||
// and can be referenced by index.
|
||||
Creating a bottom-level acceleration structure (BLAS) requires the following elements:
|
||||
|
||||
void buildBlas(const std::vector<RaytracingBuilderKHR::BlasInput>& input,
|
||||
VkBuildAccelerationStructureFlagsKHR flags = VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_KHR)
|
||||
* `VkAccelerationStructureBuildGeometryInfoKHR` : to create and build the acceleration structure.
|
||||
It references the array of `VkAccelerationStructureGeometryKHR` created in `objectToVkGeometryKHR()`
|
||||
* `VkAccelerationStructureBuildRangeInfoKHR`: a reference to the range, also created in `objectToVkGeometryKHR()`
|
||||
* `VkAccelerationStructureBuildSizesInfoKHR`: the sizes required for the creation of the AS and the scratch buffer
|
||||
* `nvvk::AccelKHR`: the result
|
||||
|
||||
The above data will be stored in a structure `BuildAccelerationStructure` to ease the creation.
|
||||
|
||||
At the beginning of the function, we only initialize data that we will need later.
|
||||
|
||||
````C
|
||||
//--------------------------------------------------------------------------------------------------
|
||||
// Create all the BLAS from the vector of BlasInput
|
||||
// - There will be one BLAS per input-vector entry
|
||||
// - There will be as many BLAS as input.size()
|
||||
// - The resulting BLAS (along with the inputs used to build) are stored in m_blas,
|
||||
// and can be referenced by index.
|
||||
// - if flag has the 'Compact' flag, the BLAS will be compacted
|
||||
//
|
||||
void nvvk::RaytracingBuilderKHR::buildBlas(const std::vector<BlasInput>& input, VkBuildAccelerationStructureFlagsKHR flags)
|
||||
{
|
||||
m_cmdPool.init(m_device, m_queueIndex);
|
||||
uint32_t nbBlas = static_cast<uint32_t>(input.size());
|
||||
VkDeviceSize asTotalSize{0}; // Memory size of all allocated BLAS
|
||||
uint32_t nbCompactions{0}; // Nb of BLAS requesting compaction
|
||||
VkDeviceSize maxScratchSize{0}; // Largest scratch size
|
||||
````
|
||||
|
||||
The next part is to populate the `BuildAccelerationStructure` for each BLAS, setting the reference to the
|
||||
geometry, the build range, the size of the memory needed for the build, and the size of the scratch buffer.
|
||||
We will reuse the same scratch memory for each build, so we keep track of the maximum scratch memory ever needed.
|
||||
Later, we will allocate a scratch buffer of this size.
|
||||
|
||||
|
||||
|
||||
````C
|
||||
// Preparing the information for the acceleration build commands.
|
||||
std::vector<BuildAccelerationStructure> buildAs(nbBlas);
|
||||
for(uint32_t idx = 0; idx < nbBlas; idx++)
|
||||
{
|
||||
// Filling partially the VkAccelerationStructureBuildGeometryInfoKHR for querying the build sizes.
|
||||
// Other information will be filled in the createBlas (see #2)
|
||||
buildAs[idx].buildInfo.type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR;
|
||||
buildAs[idx].buildInfo.mode = VK_BUILD_ACCELERATION_STRUCTURE_MODE_BUILD_KHR;
|
||||
buildAs[idx].buildInfo.flags = input[idx].flags | flags;
|
||||
buildAs[idx].buildInfo.geometryCount = static_cast<uint32_t>(input[idx].asGeometry.size());
|
||||
buildAs[idx].buildInfo.pGeometries = input[idx].asGeometry.data();
|
||||
|
||||
// Build range information
|
||||
buildAs[idx].rangeInfo = input[idx].asBuildOffsetInfo.data();
|
||||
|
||||
// Finding sizes to create acceleration structures and scratch
|
||||
std::vector<uint32_t> maxPrimCount(input[idx].asBuildOffsetInfo.size());
|
||||
for(auto tt = 0; tt < input[idx].asBuildOffsetInfo.size(); tt++)
|
||||
maxPrimCount[tt] = input[idx].asBuildOffsetInfo[tt].primitiveCount; // Number of primitives/triangles
|
||||
vkGetAccelerationStructureBuildSizesKHR(m_device, VK_ACCELERATION_STRUCTURE_BUILD_TYPE_DEVICE_KHR,
|
||||
&buildAs[idx].buildInfo, maxPrimCount.data(), &buildAs[idx].sizeInfo);
|
||||
|
||||
// Extra info
|
||||
asTotalSize += buildAs[idx].sizeInfo.accelerationStructureSize;
|
||||
maxScratchSize = std::max(maxScratchSize, buildAs[idx].sizeInfo.buildScratchSize);
|
||||
nbCompactions += hasFlag(buildAs[idx].buildInfo.flags, VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_COMPACTION_BIT_KHR);
|
||||
}
|
||||
````
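Stripped of the Vulkan calls, the bookkeeping done in this loop amounts to a sum and a running maximum. The following is a minimal sketch under that reading; `SizesSketch`, `BuildTotals` and `accumulateSizes` are hypothetical names, and the struct only mirrors the two fields of `VkAccelerationStructureBuildSizesInfoKHR` that the loop reads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two VkAccelerationStructureBuildSizesInfoKHR fields used here.
struct SizesSketch
{
  uint64_t accelerationStructureSize;  // worst-case size of the finished BLAS
  uint64_t buildScratchSize;           // temporary memory needed while building
};

struct BuildTotals
{
  uint64_t asTotalSize{0};     // memory needed if every BLAS is kept at its worst-case size
  uint64_t maxScratchSize{0};  // one scratch buffer of this size can serve every build in turn
};

inline BuildTotals accumulateSizes(const std::vector<SizesSketch>& sizes)
{
  BuildTotals totals;
  for(const SizesSketch& s : sizes)
  {
    totals.asTotalSize += s.accelerationStructureSize;
    totals.maxScratchSize = std::max(totals.maxScratchSize, s.buildScratchSize);
  }
  return totals;
}
```

Because the builds are serialized, only the largest scratch requirement matters, while the acceleration structure sizes add up.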
|
||||
|
||||
After looping over all BLAS, we know the largest scratch size required, and we can allocate the scratch buffer.
|
||||
|
||||
```` C
|
||||
// Allocate the scratch buffers holding the temporary data of the acceleration structure builder
|
||||
nvvk::Buffer scratchBuffer =
|
||||
m_alloc->createBuffer(maxScratchSize, VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT);
|
||||
VkBufferDeviceAddressInfo bufferInfo{VK_STRUCTURE_TYPE_BUFFER_DEVICE_ADDRESS_INFO, nullptr, scratchBuffer.buffer};
|
||||
VkDeviceAddress scratchAddress = vkGetBufferDeviceAddress(m_device, &bufferInfo);
|
||||
````
|
||||
|
||||
The following section is for querying the real size of each BLAS.
|
||||
To know the size that the BLAS is really taking, we use queries of the type `VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_KHR`.
|
||||
This is needed if we want to compact the acceleration structure in a second step. By default, the
|
||||
size returned by `vkGetAccelerationStructureBuildSizesKHR` is the worst-case size. After creation,
|
||||
the real space can be smaller, and it is possible to copy the acceleration structure to one that is
|
||||
using exactly what is needed. This could save over 50% of the device memory usage.
|
||||
|
||||
```` C
|
||||
// Allocate a query pool for storing the needed size for every BLAS compaction.
|
||||
VkQueryPool queryPool{VK_NULL_HANDLE};
|
||||
if(nbCompactions > 0) // Is compaction requested?
|
||||
{
|
||||
assert(nbCompactions == nbBlas); // Don't allow mix of on/off compaction
|
||||
VkQueryPoolCreateInfo qpci{VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO};
|
||||
qpci.queryCount = nbBlas;
|
||||
qpci.queryType = VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_KHR;
|
||||
vkCreateQueryPool(m_device, &qpci, nullptr, &queryPool);
|
||||
}
|
||||
````
|
||||
|
||||
!!! Note Compaction
|
||||
To use compaction, the BLAS build flags must include `VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_COMPACTION_BIT_KHR`.
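The `hasFlag` helper used when counting compaction requests is not shown in this section. A minimal sketch of what it presumably does, assuming it tests that every bit of the given flag is set; `hasFlagSketch` and `countCompactions` are hypothetical names, and the constants are local mirrors of the Vulkan enum values so the example compiles without the Vulkan headers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local mirrors of two VkBuildAccelerationStructureFlagBitsKHR values (assumption: kept in sync by hand).
constexpr uint32_t kAllowCompactionBit = 0x00000002;  // VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_COMPACTION_BIT_KHR
constexpr uint32_t kPreferFastTraceBit = 0x00000004;  // VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_KHR

// Assumed behavior of hasFlag: true when every bit of `flag` is set in `item`.
constexpr bool hasFlagSketch(uint32_t item, uint32_t flag)
{
  return (item & flag) == flag;
}

// Counts how many BLAS request compaction, as the build loop does with nbCompactions.
inline uint32_t countCompactions(const std::vector<uint32_t>& perBlasFlags)
{
  uint32_t nbCompactions = 0;
  for(uint32_t flags : perBlasFlags)
    nbCompactions += hasFlagSketch(flags, kAllowCompactionBit) ? 1u : 0u;
  return nbCompactions;
}
```

The builder asserts that this count is either zero or equal to the number of BLAS, so the flag should be set on all inputs or on none.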
|
||||
|
||||
|
||||
Creating all BLAS in a single command buffer might work, but it could stall the pipeline and potentially create problems.
|
||||
To avoid this potential problem, we split the BLAS creation into chunks of ~256MB of required memory.
|
||||
And if we request compaction, we will do it immediately, thus limiting the memory allocation required.
|
||||
|
||||
See below for the split of BLAS creation. The functions `cmdCreateBlas` and `cmdCompactBlas` are detailed later.
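The batching rule itself is independent of Vulkan and can be sketched as a pure function. `splitIntoBatches` is a hypothetical name; it takes the worst-case size of each BLAS and returns the groups of indices that would be submitted together, following the same "close the batch once the limit is reached or the last element is seen" rule.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Groups BLAS indices into batches whose accumulated size reaches `batchLimit`.
// The limit is checked after adding an element, so a batch can exceed it by up to one BLAS.
inline std::vector<std::vector<uint32_t>> splitIntoBatches(const std::vector<uint64_t>& sizes, uint64_t batchLimit)
{
  assert(batchLimit > 0);
  std::vector<std::vector<uint32_t>> batches;
  std::vector<uint32_t>              indices;  // indices of the current batch
  uint64_t                           batchSize = 0;

  for(uint32_t idx = 0; idx < sizes.size(); idx++)
  {
    indices.push_back(idx);
    batchSize += sizes[idx];
    // Over the limit or last element: close the current batch
    if(batchSize >= batchLimit || idx + 1 == sizes.size())
    {
      batches.push_back(indices);
      batchSize = 0;
      indices.clear();
    }
  }
  return batches;
}
```

For example, sizes of 100, 100, 100 and 50 with a limit of 200 give two batches, `{0, 1}` and `{2, 3}`. A single BLAS larger than the limit simply ends up in a batch of its own.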
|
||||
|
||||
|
||||
```` C
|
||||
// Batching creation/compaction of BLAS to allow staying in restricted amount of memory
|
||||
std::vector<uint32_t> indices; // Indices of the BLAS to create
|
||||
VkDeviceSize batchSize{0};
|
||||
VkDeviceSize batchLimit{256'000'000}; // 256 MB
|
||||
for(uint32_t idx = 0; idx < nbBlas; idx++)
|
||||
{
|
||||
indices.push_back(idx);
|
||||
batchSize += buildAs[idx].sizeInfo.accelerationStructureSize;
|
||||
// Over the limit or last BLAS element
|
||||
if(batchSize >= batchLimit || idx == nbBlas - 1)
|
||||
{
|
||||
// Cannot call buildBlas twice.
|
||||
assert(m_blas.empty());
|
||||
VkCommandBuffer cmdBuf = m_cmdPool.createCommandBuffer();
|
||||
cmdCreateBlas(cmdBuf, indices, buildAs, scratchAddress, queryPool);
|
||||
m_cmdPool.submitAndWait(cmdBuf);
|
||||
|
||||
// Make our own copy of the user-provided inputs.
|
||||
m_blas = std::vector<BlasEntry>(input.begin(), input.end());
|
||||
uint32_t nbBlas = static_cast<uint32_t>(m_blas.size());
|
||||
````
|
||||
if(queryPool)
|
||||
{
|
||||
VkCommandBuffer cmdBuf = m_cmdPool.createCommandBuffer();
|
||||
cmdCompactBlas(cmdBuf, indices, buildAs, queryPool);
|
||||
m_cmdPool.submitAndWait(cmdBuf); // Submit command buffer and call vkQueueWaitIdle
|
||||
|
||||
We then need to package the user-provided geometry into `VkAccelerationStructureBuildGeometryInfoKHR`,
|
||||
with one build info per BLAS to build.
|
||||
// Destroy the non-compacted version
|
||||
destroyNonCompacted(indices, buildAs);
|
||||
}
|
||||
// Reset
|
||||
|
||||
batchSize = 0;
|
||||
indices.clear();
|
||||
}
|
||||
}
|
||||
````
|
||||
|
||||
The created acceleration structure is kept in this class, such that it can be retrieved with the index of creation.
|
||||
|
||||
```` C
|
||||
// Preparing the build information array for the acceleration build command.
|
||||
// This is mostly just a fancy pointer to the user-passed arrays of VkAccelerationStructureGeometryKHR.
|
||||
// dstAccelerationStructure will be filled later once we allocated the acceleration structures.
|
||||
std::vector<VkAccelerationStructureBuildGeometryInfoKHR> buildInfos(nbBlas);
|
||||
for(uint32_t idx = 0; idx < nbBlas; idx++)
|
||||
{
|
||||
buildInfos[idx].sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_BUILD_GEOMETRY_INFO_KHR;
|
||||
buildInfos[idx].flags = flags;
|
||||
buildInfos[idx].geometryCount = (uint32_t)m_blas[idx].input.asGeometry.size();
|
||||
buildInfos[idx].pGeometries = m_blas[idx].input.asGeometry.data();
|
||||
buildInfos[idx].mode = VK_BUILD_ACCELERATION_STRUCTURE_MODE_BUILD_KHR;
|
||||
buildInfos[idx].type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR;
|
||||
buildInfos[idx].srcAccelerationStructure = VK_NULL_HANDLE;
|
||||
}
|
||||
````
|
||||
// Keeping all the created acceleration structures
|
||||
for(auto& b : buildAs)
|
||||
{
|
||||
m_blas.emplace_back(b.as);
|
||||
}
|
||||
````
|
||||
|
||||
Next, we need to create the acceleration structure handles, query the memory requirements for each,
|
||||
and allocate a big enough buffer to bind each acceleration structure to. Along the way, we also
|
||||
query the amount of scratch memory needed. We will re-use the same scratch memory for each build,
|
||||
so we keep track of the maximum scratch memory ever needed. Later, we'll allocate a scratch buffer of this size.
|
||||
Finally, we clean up what we used.
|
||||
|
||||
```` C
|
||||
for(size_t idx = 0; idx < nbBlas; idx++)
|
||||
{
|
||||
// Query both the size of the finished acceleration structure and the amount of scratch memory
|
||||
// needed (both written to sizeInfo). The `vkGetAccelerationStructureBuildSizesKHR` function
|
||||
// computes the worst case memory requirements based on the user-reported max number of
|
||||
// primitives. Later, compaction can fix this potential inefficiency.
|
||||
std::vector<uint32_t> maxPrimCount(m_blas[idx].input.asBuildOffsetInfo.size());
|
||||
for(auto tt = 0; tt < m_blas[idx].input.asBuildOffsetInfo.size(); tt++)
|
||||
maxPrimCount[tt] = m_blas[idx].input.asBuildOffsetInfo[tt].primitiveCount; // Number of primitives/triangles
|
||||
VkAccelerationStructureBuildSizesInfoKHR sizeInfo{
|
||||
VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_BUILD_SIZES_INFO_KHR};
|
||||
vkGetAccelerationStructureBuildSizesKHR(m_device, VK_ACCELERATION_STRUCTURE_BUILD_TYPE_DEVICE_KHR,
|
||||
&buildInfos[idx], maxPrimCount.data(), &sizeInfo);
|
||||
// Clean up
|
||||
vkDestroyQueryPool(m_device, queryPool, nullptr);
|
||||
m_alloc->finalizeAndReleaseStaging();
|
||||
m_alloc->destroy(scratchBuffer);
|
||||
m_cmdPool.deinit();
|
||||
````
|
||||
|
||||
// Create acceleration structure object. Not yet bound to memory.
|
||||
VkAccelerationStructureCreateInfoKHR createInfo{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_CREATE_INFO_KHR};
|
||||
createInfo.type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR;
|
||||
createInfo.size = sizeInfo.accelerationStructureSize; // Will be used to allocate memory.
|
||||
#### cmdCreateBlas
|
||||
|
||||
// Actual allocation of buffer and acceleration structure. Note: This relies on createInfo.offset == 0
|
||||
// and fills in createInfo.buffer with the buffer allocated to store the BLAS. The underlying
|
||||
// vkCreateAccelerationStructureKHR call then consumes the buffer value.
|
||||
m_blas[idx].as = m_alloc->createAcceleration(createInfo);
|
||||
m_debug.setObjectName(m_blas[idx].as.accel, (std::string("Blas" + std::to_string(idx)).c_str()));
|
||||
buildInfos[idx].dstAccelerationStructure = m_blas[idx].as.accel; // Setting the where the build lands
|
||||
```` C
|
||||
//--------------------------------------------------------------------------------------------------
|
||||
// Creating the bottom level acceleration structure for all indices of `buildAs` vector.
|
||||
// The array of BuildAccelerationStructure was created in buildBlas and the vector of
|
||||
// indices limits the number of BLAS to create at once. This limits the amount of
|
||||
// memory needed when compacting the BLAS.
|
||||
void nvvk::RaytracingBuilderKHR::cmdCreateBlas(VkCommandBuffer cmdBuf,
|
||||
std::vector<uint32_t> indices,
|
||||
std::vector<BuildAccelerationStructure>& buildAs,
|
||||
VkDeviceAddress scratchAddress,
|
||||
VkQueryPool queryPool)
|
||||
{
|
||||
````
|
||||
|
||||
// Keeping info
|
||||
m_blas[idx].flags = flags;
|
||||
maxScratch = std::max(maxScratch, sizeInfo.buildScratchSize);
|
||||
First, we reset the queries that will report the real size of each BLAS.
|
||||
|
||||
// Stats - Original size
|
||||
originalSizes[idx] = sizeInfo.accelerationStructureSize;
|
||||
}
|
||||
````C
|
||||
if(queryPool) // For querying the compaction size
|
||||
vkResetQueryPool(m_device, queryPool, 0, static_cast<uint32_t>(indices.size()));
|
||||
uint32_t queryCnt{0};
|
||||
````
|
||||
|
||||
This function creates all the BLAS listed in the chunk of indices.
|
||||
|
||||
```` C
|
||||
for(const auto& idx : indices)
|
||||
{
|
||||
````
|
||||
|
||||
|
||||
Creating a BLAS consists of two steps:
|
||||
|
||||
* Creating the acceleration structure: we use `createAcceleration()` from our memory allocator abstraction and
|
||||
the size information we obtained earlier. This will create the buffer and the acceleration structure.
|
||||
* Building the acceleration structure: with the acceleration structure, the scratch buffer and information on the geometry,
|
||||
this performs the actual build of the BLAS.
|
||||
|
||||
|
||||
Behind the scenes, `m_alloc->createAcceleration` is creating a buffer of the size indicated by the acceleration structure
|
||||
size query, giving it the `VK_BUFFER_USAGE_ACCELERATION_STRUCTURE_STORAGE_BIT_KHR` and `VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT`
|
||||
usage bits (the latter is needed as the TLAS builder will need the raw address of the BLASes), and binding the acceleration structure
|
||||
|
|
@ -461,178 +582,120 @@ to its allocated memory by filling in the `buffer` field of `VkAccelerationStruc
|
|||
where `Vk*` handle allocation and memory binding is done in separate steps, an acceleration structure is both created and bound
|
||||
to memory with one `vkCreateAccelerationStructureKHR` call.
|
||||
|
||||
```` C
|
||||
AccelerationDedicatedKHR createAcceleration(VkAccelerationStructureCreateInfoKHR& accel_)
|
||||
{
|
||||
AccelerationDedicatedKHR resultAccel;
|
||||
// Allocating the buffer to hold the acceleration structure
|
||||
resultAccel.buffer = createBuffer(accel_.size, VK_BUFFER_USAGE_ACCELERATION_STRUCTURE_STORAGE_BIT_KHR
|
||||
| VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT);
|
||||
// Setting the buffer
|
||||
accel_.buffer = resultAccel.buffer.buffer;
|
||||
// Create the acceleration structure
|
||||
vkCreateAccelerationStructureKHR(m_device, &accel_, nullptr, &resultAccel.accel);
|
||||
|
||||
return resultAccel;
|
||||
}
|
||||
```` C
|
||||
// Actual allocation of buffer and acceleration structure.
|
||||
VkAccelerationStructureCreateInfoKHR createInfo{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_CREATE_INFO_KHR};
|
||||
createInfo.type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR;
|
||||
createInfo.size = buildAs[idx].sizeInfo.accelerationStructureSize; // Will be used to allocate memory.
|
||||
buildAs[idx].as = m_alloc->createAcceleration(createInfo);
|
||||
NAME_IDX_VK(buildAs[idx].as.accel, idx);
|
||||
NAME_IDX_VK(buildAs[idx].as.buffer.buffer, idx);
|
||||
|
||||
// BuildInfo #2 part
|
||||
buildAs[idx].buildInfo.dstAccelerationStructure = buildAs[idx].as.accel; // Setting where the build lands
|
||||
buildAs[idx].buildInfo.scratchData.deviceAddress = scratchAddress; // All build are using the same scratch buffer
|
||||
|
||||
// Building the bottom-level-acceleration-structure
|
||||
vkCmdBuildAccelerationStructuresKHR(cmdBuf, 1, &buildAs[idx].buildInfo, &buildAs[idx].rangeInfo);
|
||||
````
|
||||
|
||||
Now that we know the maximum scratch memory needed, we allocate a scratch buffer.
|
||||
|
||||
Note the barrier after each call to the build: this is necessary because we are reusing scratch space across builds,
|
||||
so we need to make sure the previous build is finished before starting the next one. We could have used multiple
|
||||
scratch buffers, but that would have been memory intensive, and the device can only build one BLAS at a time,
|
||||
so it wouldn't be any faster.
|
||||
|
||||
```` C
|
||||
// Allocate the scratch buffers holding the temporary data of the
|
||||
// acceleration structure builder
|
||||
nvvk::Buffer scratchBuffer =
|
||||
m_alloc->createBuffer(maxScratch,
|
||||
VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT);
|
||||
VkBufferDeviceAddressInfo bufferInfo{VK_STRUCTURE_TYPE_BUFFER_DEVICE_ADDRESS_INFO};
|
||||
bufferInfo.buffer = scratchBuffer.buffer;
|
||||
VkDeviceAddress scratchAddress = vkGetBufferDeviceAddress(m_device, &bufferInfo);
|
||||
// Since the scratch buffer is reused across builds, we need a barrier to ensure one build
|
||||
// is finished before starting the next one.
|
||||
VkMemoryBarrier barrier{VK_STRUCTURE_TYPE_MEMORY_BARRIER};
|
||||
barrier.srcAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_KHR;
|
||||
barrier.dstAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_KHR;
|
||||
vkCmdPipelineBarrier(cmdBuf, VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR,
|
||||
VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR, 0, 1, &barrier, 0, nullptr, 0, nullptr);
|
||||
````
|
||||
|
||||
To know the size that the BLAS is really taking, we use queries of the type `VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_KHR`.
|
||||
This is needed if we want to compact the acceleration structure in a second step. By default, the
|
||||
memory allocated by the creation of the acceleration structure has the size of the worst case. After creation,
|
||||
the real space can be smaller, and it is possible to copy the acceleration structure to one that is
|
||||
using exactly what is needed. This could save over 50% of the device memory usage.
|
||||
Then we add the size query only if needed
|
||||
|
||||
```` C
|
||||
// Is compaction requested?
|
||||
bool doCompaction = (flags & VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_COMPACTION_BIT_KHR)
|
||||
== VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_COMPACTION_BIT_KHR;
|
||||
|
||||
// Allocate a query pool for storing the needed size for every BLAS compaction.
|
||||
VkQueryPoolCreateInfo qpci{VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO};
|
||||
qpci.queryCount = nbBlas;
|
||||
qpci.queryType = VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_KHR;
|
||||
VkQueryPool queryPool;
|
||||
vkCreateQueryPool(m_device, &qpci, nullptr, &queryPool);
|
||||
if(queryPool)
|
||||
{
|
||||
// Add a query to find the 'real' amount of memory needed, use for compaction
|
||||
vkCmdWriteAccelerationStructuresPropertiesKHR(cmdBuf, 1, &buildAs[idx].buildInfo.dstAccelerationStructure,
|
||||
VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_KHR, queryPool, queryCnt++);
|
||||
}
|
||||
}
|
||||
}
|
||||
````
|
||||
|
||||
We then use multiple command buffers to launch all the BLAS builds. We are using multiple
|
||||
command buffers instead of one, to let the driver handle system interruptions and avoid a
|
||||
TDR if the job was too heavy.
|
||||
|
||||
Note the barrier after each
|
||||
build call: this is required as we reuse the scratch space across builds, and hence need to ensure
|
||||
the previous build has completed before starting the next. We could have used multiple scratch buffers,
|
||||
but it would have been expensive memory wise, and the device can only build one BLAS at a time, so it
|
||||
wouldn't be faster.
|
||||
Although this approach has the advantage of keeping all BLAS independent, building many BLAS efficiently would require allocating a larger scratch buffer and launching multiple builds simultaneously.
|
||||
This current tutorial does not use compaction, which could significantly reduce the memory footprint of the acceleration structures. These two aspects will be part of a future advanced tutorial.
|
||||
|
||||
```` C
|
||||
// Allocate a command pool for queue of given queue index.
|
||||
// To avoid timeout, record and submit one command buffer per AS build.
|
||||
nvvk::CommandPool genCmdBuf(m_device, m_queueIndex);
|
||||
std::vector<VkCommandBuffer> allCmdBufs(nbBlas);
|
||||
|
||||
// Building the acceleration structures
|
||||
for(uint32_t idx = 0; idx < nbBlas; idx++)
|
||||
{
|
||||
auto& blas = m_blas[idx];
|
||||
VkCommandBuffer cmdBuf = genCmdBuf.createCommandBuffer();
|
||||
allCmdBufs[idx] = cmdBuf;
|
||||
#### cmdCompactBlas
|
||||
|
||||
// All build are using the same scratch buffer
|
||||
buildInfos[idx].scratchData.deviceAddress = scratchAddress;
|
||||
What follows is when the compact flag is set. This part, which is optional, will compact the BLAS into the memory
|
||||
it actually uses. We have to wait until all BLAS are built, to make a copy in the more suitable memory space.
|
||||
This is the reason why we used `m_cmdPool.submitAndWait(cmdBuf)` before calling this function.
|
||||
|
||||
// Convert user vector of offsets to vector of pointer-to-offset (required by vk).
|
||||
// Recall that this defines which (sub)section of the vertex/index arrays
|
||||
// will be built into the BLAS.
|
||||
std::vector<const VkAccelerationStructureBuildRangeInfoKHR*> pBuildOffset(
|
||||
blas.input.asBuildOffsetInfo.size());
|
||||
for(size_t infoIdx = 0; infoIdx < blas.input.asBuildOffsetInfo.size(); infoIdx++)
|
||||
pBuildOffset[infoIdx] = &blas.input.asBuildOffsetInfo[infoIdx];
|
||||
|
||||
// Building the AS
|
||||
vkCmdBuildAccelerationStructuresKHR(cmdBuf, 1, &buildInfos[idx], pBuildOffset.data());
|
||||
|
||||
// Since the scratch buffer is reused across builds, we need a barrier to ensure one build
|
||||
// is finished before starting the next one
|
||||
VkMemoryBarrier barrier{VK_STRUCTURE_TYPE_MEMORY_BARRIER};
|
||||
barrier.srcAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_KHR;
|
||||
barrier.dstAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_KHR;
|
||||
vkCmdPipelineBarrier(cmdBuf,
|
||||
VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR,
|
||||
VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR,
|
||||
0, 1, &barrier, 0, nullptr, 0, nullptr);
|
||||
|
||||
// Write compacted size to query number idx.
|
||||
if(doCompaction)
|
||||
{
|
||||
vkCmdWriteAccelerationStructuresPropertiesKHR(
|
||||
cmdBuf, 1, &blas.as.accel,
|
||||
VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_KHR, queryPool, idx);
|
||||
}
|
||||
}
|
||||
genCmdBuf.submitAndWait(allCmdBufs); // vkQueueWaitIdle behind this call.
|
||||
allCmdBufs.clear();
|
||||
```` C
|
||||
//--------------------------------------------------------------------------------------------------
|
||||
// Create and replace a new acceleration structure and buffer based on the size retrieved by the
|
||||
// Query.
|
||||
void nvvk::RaytracingBuilderKHR::cmdCompactBlas(VkCommandBuffer cmdBuf,
|
||||
std::vector<uint32_t> indices,
|
||||
std::vector<BuildAccelerationStructure>& buildAs,
|
||||
VkQueryPool queryPool)
|
||||
{
|
||||
````
|
||||
|
||||
While this approach has the advantage of keeping all BLASes independent, building many BLASes efficiently would
|
||||
require allocating a larger scratch buffer, and launch several builds simultaneously. This current tutorial
|
||||
does not make use of compaction, which could reduce significantly the memory footprint of the acceleration structures. Both
|
||||
of those aspects will be part of a future advanced tutorial.
|
||||
In broad terms, compaction works as follows:
|
||||
|
||||
The following applies when the compaction flag is enabled. This part, which is optional, will compact the BLAS into the memory that it actually uses.
|
||||
It needs to wait until all BLASes are constructed before copying them into the better-fitting memory space.
|
||||
* Get the values from the query
|
||||
* Create a new acceleration structure with the smaller size
|
||||
* Copy the previous acceleration structure to the new allocated one
|
||||
* Destroy previous acceleration structure.
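The benefit of these steps can be estimated from the query results alone. A minimal sketch, assuming the worst-case and compacted sizes are available as two arrays of equal length; `CompactionReport` and `summarizeCompaction` are hypothetical names. The totals are kept in 64 bits so that scenes with more than 4 GB of acceleration structures do not overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct CompactionReport
{
  uint64_t totalOriginal{0};     // sum of the worst-case sizes
  uint64_t totalCompact{0};      // sum of the sizes returned by the compaction queries
  double   percentSmaller{0.0};  // relative saving, in percent
};

inline CompactionReport summarizeCompaction(const std::vector<uint64_t>& originalSizes,
                                            const std::vector<uint64_t>& compactSizes)
{
  assert(originalSizes.size() == compactSizes.size());
  CompactionReport report;
  for(size_t i = 0; i < originalSizes.size(); i++)
  {
    report.totalOriginal += originalSizes[i];
    report.totalCompact += compactSizes[i];
  }
  // Assumption: a compacted structure is never larger than its worst-case estimate.
  assert(report.totalCompact <= report.totalOriginal);
  if(report.totalOriginal > 0)
    report.percentSmaller = 100.0 * double(report.totalOriginal - report.totalCompact) / double(report.totalOriginal);
  return report;
}
```

With original sizes of 1000 and 3000 bytes and compacted sizes of 400 and 1600 bytes, the report shows 4000 bytes reduced to 2000 bytes, that is, 50% smaller.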
|
||||
|
||||
```` C
|
||||
// Compacting all BLAS
|
||||
if(doCompaction)
|
||||
{
|
||||
VkCommandBuffer cmdBuf = genCmdBuf.createCommandBuffer();
|
||||
```` C
|
||||
uint32_t queryCtn{0};
|
||||
std::vector<nvvk::AccelKHR> cleanupAS; // previous AS to destroy
|
||||
|
||||
// Get the size result back
|
||||
std::vector<VkDeviceSize> compactSizes(nbBlas);
|
||||
vkGetQueryPoolResults(m_device, queryPool, 0,
|
||||
(uint32_t)compactSizes.size(), compactSizes.size() * sizeof(VkDeviceSize),
|
||||
compactSizes.data(), sizeof(VkDeviceSize), VK_QUERY_RESULT_WAIT_BIT);
|
||||
// Get the compacted size result back
|
||||
std::vector<VkDeviceSize> compactSizes(static_cast<uint32_t>(indices.size()));
|
||||
vkGetQueryPoolResults(m_device, queryPool, 0, (uint32_t)compactSizes.size(), compactSizes.size() * sizeof(VkDeviceSize),
|
||||
compactSizes.data(), sizeof(VkDeviceSize), VK_QUERY_RESULT_WAIT_BIT);
|
||||
|
||||
for(auto idx : indices)
|
||||
{
|
||||
buildAs[idx].cleanupAS = buildAs[idx].as; // previous AS to destroy
|
||||
buildAs[idx].sizeInfo.accelerationStructureSize = compactSizes[queryCtn++]; // new reduced size
|
||||
|
||||
// Compacting
|
||||
std::vector<nvvk::AccelKHR> cleanupAS(nbBlas); // previous AS to destroy
|
||||
uint32_t statTotalOriSize{0}, statTotalCompactSize{0};
|
||||
for(uint32_t idx = 0; idx < nbBlas; idx++)
|
||||
{
|
||||
// LOGI("Reducing %i, from %d to %d \n", i, originalSizes[i], compactSizes[i]);
|
||||
statTotalOriSize += (uint32_t)originalSizes[idx];
|
||||
statTotalCompactSize += (uint32_t)compactSizes[idx];
|
||||
// Creating a compact version of the AS
|
||||
VkAccelerationStructureCreateInfoKHR asCreateInfo{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_CREATE_INFO_KHR};
|
||||
asCreateInfo.size = buildAs[idx].sizeInfo.accelerationStructureSize;
|
||||
asCreateInfo.type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR;
|
||||
buildAs[idx].as = m_alloc->createAcceleration(asCreateInfo);
|
||||
NAME_IDX_VK(buildAs[idx].as.accel, idx);
|
||||
NAME_IDX_VK(buildAs[idx].as.buffer.buffer, idx);
|
||||
|
||||
// Creating a compact version of the AS
|
||||
VkAccelerationStructureCreateInfoKHR asCreateInfo{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_CREATE_INFO_KHR};
|
||||
asCreateInfo.size = compactSizes[idx];
|
||||
asCreateInfo.type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR;
|
||||
auto as = m_alloc->createAcceleration(asCreateInfo);
|
||||
|
||||
// Copy the original BLAS to a compact version
|
||||
VkCopyAccelerationStructureInfoKHR copyInfo{VK_STRUCTURE_TYPE_COPY_ACCELERATION_STRUCTURE_INFO_KHR};
|
||||
copyInfo.src = m_blas[idx].as.accel;
|
||||
copyInfo.dst = as.accel;
|
||||
copyInfo.mode = VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_KHR;
|
||||
vkCmdCopyAccelerationStructureKHR(cmdBuf, &copyInfo);
|
||||
cleanupAS[idx] = m_blas[idx].as;
|
||||
m_blas[idx].as = as;
|
||||
}
|
||||
genCmdBuf.submitAndWait(cmdBuf); // vkQueueWaitIdle within.
|
||||
|
||||
// Destroying the previous version
|
||||
for(auto as : cleanupAS)
|
||||
m_alloc->destroy(as);
|
||||
|
||||
LOGI(" RT BLAS: reducing from: %u to: %u = %u (%2.2f%s smaller) \n", statTotalOriSize, statTotalCompactSize,
|
||||
statTotalOriSize - statTotalCompactSize,
|
||||
(statTotalOriSize - statTotalCompactSize) / float(statTotalOriSize) * 100.f, "%%");
|
||||
}
|
||||
````
|
||||
|
||||
Finally, destroy what was allocated.
|
||||
|
||||
```` C
|
||||
vkDestroyQueryPool(m_device, queryPool, nullptr);
|
||||
m_alloc.destroy(scratchBuffer);
|
||||
m_alloc.finalizeAndReleaseStaging();
|
||||
// Copy the original BLAS to a compact version
|
||||
VkCopyAccelerationStructureInfoKHR copyInfo{VK_STRUCTURE_TYPE_COPY_ACCELERATION_STRUCTURE_INFO_KHR};
|
||||
copyInfo.src = buildAs[idx].buildInfo.dstAccelerationStructure;
|
||||
copyInfo.dst = buildAs[idx].as.accel;
|
||||
copyInfo.mode = VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_KHR;
|
||||
vkCmdCopyAccelerationStructureKHR(cmdBuf, &copyInfo);
|
||||
}
|
||||
````
|
||||
}
|
||||
````
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
## Top-Level Acceleration Structure
|
||||
|
||||
|
|
@ -643,8 +706,8 @@ to the `HelloVulkan` class:
|
|||
void createTopLevelAS();
|
||||
````
|
||||
|
||||
We represent an instance with `nvvk::RaytracingBuilder::Instance`, which stores its transform matrix (`transform`)
|
||||
and the index of its corresponding BLAS (`blasId`) in the vector passed to `buildBlas`. It also contains an instance identifier that will
|
||||
We represent an instance with `VkAccelerationStructureInstanceKHR`, which stores its transform matrix (`transform`)
|
||||
and a reference to its corresponding BLAS (`blasId`) in the vector passed to `buildBlas`. It also contains an instance identifier that will
|
||||
be available during shading as `gl_InstanceCustomIndex`, as well as the index of the hit group that represents the shaders that will be
|
||||
invoked upon hitting the object (`VkAccelerationStructureInstanceKHR::instanceShaderBindingTableRecordOffset`, a.k.a. `hitGroupId` in the helper).
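To make the instance fields more concrete, here is a rough sketch of the instance memory layout using a stand-in struct. `InstanceSketch`, `fitsInCustomIndex` and `makeInstance` are hypothetical names used only for illustration. The real type is `VkAccelerationStructureInstanceKHR` from the Vulkan headers. The point of the sketch is that the custom index and the SBT record offset are 24-bit fields, each sharing a 32-bit word with an 8-bit field.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in mirroring the field layout of VkAccelerationStructureInstanceKHR.
struct InstanceSketch
{
  float    transform[3][4];                              // Row-major 3x4 affine transform
  uint32_t instanceCustomIndex : 24;                     // Exposed to shaders as gl_InstanceCustomIndexEXT
  uint32_t mask : 8;                                     // Visibility mask, ANDed with the ray's cull mask
  uint32_t instanceShaderBindingTableRecordOffset : 24;  // Hit group offset (hitGroupId in the old helper)
  uint32_t flags : 8;                                    // VkGeometryInstanceFlagsKHR bits
  uint64_t accelerationStructureReference;               // Device address of the BLAS
};
static_assert(sizeof(InstanceSketch) == 64, "An instance is expected to occupy 64 bytes");

// The custom index only has 24 bits; larger values would be silently truncated.
inline bool fitsInCustomIndex(uint32_t id)
{
  return id < (1u << 24);
}

// Builds an instance with an identity transform, visible to all rays.
inline InstanceSketch makeInstance(uint32_t customIndex, uint64_t blasAddress)
{
  assert(fitsInCustomIndex(customIndex));
  InstanceSketch inst{};  // Zero-initialize every field, including the transform
  inst.transform[0][0] = inst.transform[1][1] = inst.transform[2][2] = 1.f;
  inst.instanceCustomIndex                    = customIndex;
  inst.mask                                   = 0xFF;
  inst.instanceShaderBindingTableRecordOffset = 0;  // Same hit group for all objects
  inst.flags                                  = 0;
  inst.accelerationStructureReference         = blasAddress;
  return inst;
}
```

Zero-initializing the struct before filling it avoids leaving any field with an indeterminate value.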
|
||||
|
||||
|
|
@ -671,23 +734,23 @@ optimized for tracing performance (rather than AS size, for example).
|
|||
```` C
|
||||
void HelloVulkan::createTopLevelAS()
|
||||
{
|
||||
std::vector<nvvk::RaytracingBuilderKHR::Instance> tlas;
|
||||
std::vector<VkAccelerationStructureInstanceKHR> tlas;
|
||||
tlas.reserve(m_objInstance.size());
|
||||
for(uint32_t i = 0; i < static_cast<uint32_t>(m_objInstance.size()); i++)
|
||||
{
|
||||
nvvk::RaytracingBuilderKHR::Instance rayInst;
|
||||
rayInst.transform = m_objInstance[i].transform; // Position of the instance
|
||||
rayInst.instanceCustomId = i; // gl_InstanceCustomIndexEXT
|
||||
rayInst.blasId = m_objInstance[i].objIndex;
|
||||
rayInst.hitGroupId = 0; // We will use the same hit group for all objects
|
||||
rayInst.flags = VK_GEOMETRY_INSTANCE_TRIANGLE_FACING_CULL_DISABLE_BIT_KHR;
|
||||
VkAccelerationStructureInstanceKHR rayInst;
|
||||
rayInst.transform = nvvk::toTransformMatrixKHR(m_objInstance[i].transform); // Position of the instance
|
||||
rayInst.instanceCustomIndex = i; // gl_InstanceCustomIndexEXT
|
||||
rayInst.accelerationStructureReference = m_rtBuilder.getBlasDeviceAddress(m_objInstance[i].objIndex);
|
||||
rayInst.instanceShaderBindingTableRecordOffset = 0; // We will use the same hit group for all objects
|
||||
rayInst.flags = VK_GEOMETRY_INSTANCE_TRIANGLE_FACING_CULL_DISABLE_BIT_KHR;
|
||||
rayInst.mask = 0xFF;
|
||||
tlas.emplace_back(rayInst);
|
||||
}
|
||||
m_rtBuilder.buildTlas(tlas, VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_KHR);
|
||||
}
|
||||
````
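The call to `nvvk::toTransformMatrixKHR` converts the instance matrix to the 3x4 row-major layout that `VkTransformMatrixKHR` expects. As a rough sketch of what such a conversion involves, assuming the source matrix is a column-major 4x4 affine matrix stored as 16 floats, the code below uses a hypothetical stand-in type `Transform3x4` and a hypothetical function `toTransform3x4`. It is not the helper's actual implementation.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for VkTransformMatrixKHR: 3 rows x 4 columns, row-major.
struct Transform3x4
{
  float matrix[3][4];
};

// Converts a column-major 4x4 affine matrix into a row-major 3x4 matrix.
// The last row of an affine matrix is always (0, 0, 0, 1), so it can be dropped.
inline Transform3x4 toTransform3x4(const std::array<float, 16>& colMajor)
{
  // Assumption of this sketch: the input is affine (no projection terms)
  assert(colMajor[3] == 0.f && colMajor[7] == 0.f && colMajor[11] == 0.f && colMajor[15] == 1.f);

  Transform3x4 out{};
  for(int row = 0; row < 3; row++)
  {
    for(int col = 0; col < 4; col++)
    {
      // Element (row, col) of a column-major matrix lives at index col * 4 + row
      out.matrix[row][col] = colMajor[col * 4 + row];
    }
  }
  return out;
}
```

With this layout, the translation of an instance ends up in the last column of each row, that is `matrix[0][3]`, `matrix[1][3]` and `matrix[2][3]`.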
|
||||
|
||||
|
||||
As usual in Vulkan, we need to explicitly destroy the objects we created by adding a call at the end of
|
||||
`HelloVulkan::destroyResources`:
|
||||
|
||||
|
|
@ -696,9 +759,9 @@ As usual in Vulkan, we need to explicitly destroy the objects we created by addi
|
|||
m_rtBuilder.destroy();
|
||||
````
|
||||
|
||||
!!! Note blasId
|
||||
`blasId` is a concept introduced for convenience by the acceleration structure build helper. The `buildTlas` function,
|
||||
described next, converts these indices into the raw device address of BLASes, which are fed to the actual TLAS builder.
|
||||
!!! Note getBlasDeviceAddress()
|
||||
`getBlasDeviceAddress()` returns the device address of the acceleration structure identified by `blasId`. The id corresponds to
|
||||
the index of the BLAS created in `buildBlas`.
|
||||
|
||||
### Helper Details: RaytracingBuilder::buildTlas()
|
||||
|
||||
|
|
@ -1064,10 +1127,10 @@ information in the SBT using `shaderRecordEXT`, not covered here). The steps to
|
|||
|
||||
* Load and compile shaders into `VkShaderModule`s in the usual way.
|
||||
|
||||
* Package those `VkShaderModule`s into an array of `VkPipelineStageCreateInfo`.
|
||||
* Package those `VkShaderModule`s into an array of `VkPipelineShaderStageCreateInfo`.
|
||||
|
||||
* Create an array of `VkRayTracingShaderGroupCreateInfoKHR`; each will eventually become an SBT entry.
|
||||
At this point, the shader groups reference individual shaders by their index in the above `VkPipelineStageCreateInfo`
|
||||
At this point, the shader groups reference individual shaders by their index in the above `VkPipelineShaderStageCreateInfo`
|
||||
array as no device addresses have yet been allocated.
|
||||
|
||||
* Compile the above two arrays (plus a pipeline layout, as usual) into a raytracing pipeline using `vkCreateRayTracingPipelineKHR`.
|
||||
|
|
@ -1502,7 +1565,7 @@ As with other resources, we destroy the SBT in `destroyResources`:
|
|||
!!! Tip Shader order
|
||||
As with the pipeline, there is no requirement that raygen, miss, and hit groups come
|
||||
in this order. Since there's no reason to change the order, we constructed SBT entries
|
||||
0, 1, and 2 to correspond to entries 0, 1, and 2 of the `VkPipelineStageCreateInfo`
|
||||
0, 1, and 2 to correspond to entries 0, 1, and 2 of the `VkPipelineShaderStageCreateInfo`
|
||||
array used to build the pipeline. In general though, the order of the SBT need not match
|
||||
the pipeline shader stage order.
|
||||
|
||||
|
|
|
|||
|
|
@ -389,59 +389,55 @@ In `nvvk::RaytracingBuilder` in `raytrace_vkpp.hpp`, we can add a function to up
|
|||
|
||||
~~~~ C++
|
||||
//--------------------------------------------------------------------------------------------------
|
||||
// Refit the BLAS from updated buffers
|
||||
//
|
||||
void updateBlas(uint32_t blasIdx)
|
||||
{
|
||||
Blas& blas = m_blas[blasIdx];
|
||||
// Refit BLAS number blasIdx from updated buffer contents.
|
||||
//
|
||||
void nvvk::RaytracingBuilderKHR::updateBlas(uint32_t blasIdx, BlasInput& blas, VkBuildAccelerationStructureFlagsKHR flags)
|
||||
{
|
||||
assert(size_t(blasIdx) < m_blas.size());
|
||||
|
||||
// Compute the amount of scratch memory required by the AS builder to update the BLAS
|
||||
VkAccelerationStructureMemoryRequirementsInfoKHR memoryRequirementsInfo{
|
||||
VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_INFO_KHR};
|
||||
memoryRequirementsInfo.type = VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_UPDATE_SCRATCH_KHR;
|
||||
memoryRequirementsInfo.accelerationStructure = blas.as.accel;
|
||||
memoryRequirementsInfo.buildType = VK_ACCELERATION_STRUCTURE_BUILD_TYPE_DEVICE_KHR;
|
||||
// Preparing all build information, acceleration is filled later
|
||||
VkAccelerationStructureBuildGeometryInfoKHR buildInfos{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_BUILD_GEOMETRY_INFO_KHR};
|
||||
buildInfos.flags = flags;
|
||||
buildInfos.geometryCount = (uint32_t)blas.asGeometry.size();
|
||||
buildInfos.pGeometries = blas.asGeometry.data();
|
||||
buildInfos.mode = VK_BUILD_ACCELERATION_STRUCTURE_MODE_UPDATE_KHR; // UPDATE
|
||||
buildInfos.type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR;
|
||||
buildInfos.srcAccelerationStructure = m_blas[blasIdx].accel; // UPDATE
|
||||
buildInfos.dstAccelerationStructure = m_blas[blasIdx].accel;
|
||||
|
||||
VkMemoryRequirements2 reqMem{VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2};
|
||||
vkGetAccelerationStructureMemoryRequirementsKHR(m_device, &memoryRequirementsInfo, &reqMem);
|
||||
VkDeviceSize scratchSize = reqMem.memoryRequirements.size;
|
||||
// Find size to build on the device
|
||||
std::vector<uint32_t> maxPrimCount(blas.asBuildOffsetInfo.size());
|
||||
for(size_t tt = 0; tt < blas.asBuildOffsetInfo.size(); tt++)
|
||||
maxPrimCount[tt] = blas.asBuildOffsetInfo[tt].primitiveCount; // Number of primitives/triangles
|
||||
VkAccelerationStructureBuildSizesInfoKHR sizeInfo{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_BUILD_SIZES_INFO_KHR};
|
||||
vkGetAccelerationStructureBuildSizesKHR(m_device, VK_ACCELERATION_STRUCTURE_BUILD_TYPE_DEVICE_KHR, &buildInfos,
|
||||
maxPrimCount.data(), &sizeInfo);
|
||||
|
||||
// Allocate the scratch buffer
|
||||
nvvkBuffer scratchBuffer =
|
||||
m_alloc.createBuffer(scratchSize, VK_BUFFER_USAGE_RAY_TRACING_BIT_KHR | VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT);
|
||||
VkBufferDeviceAddressInfo bufferInfo{VK_STRUCTURE_TYPE_BUFFER_DEVICE_ADDRESS_INFO};
|
||||
bufferInfo.buffer = scratchBuffer.buffer;
|
||||
VkDeviceAddress scratchAddress = vkGetBufferDeviceAddress(m_device, &bufferInfo);
|
||||
// Allocate the scratch buffer and setting the scratch info
|
||||
nvvk::Buffer scratchBuffer =
|
||||
m_alloc->createBuffer(sizeInfo.buildScratchSize, VK_BUFFER_USAGE_ACCELERATION_STRUCTURE_STORAGE_BIT_KHR
|
||||
| VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT);
|
||||
VkBufferDeviceAddressInfo bufferInfo{VK_STRUCTURE_TYPE_BUFFER_DEVICE_ADDRESS_INFO};
|
||||
bufferInfo.buffer = scratchBuffer.buffer;
|
||||
buildInfos.scratchData.deviceAddress = vkGetBufferDeviceAddress(m_device, &bufferInfo);
|
||||
|
||||
|
||||
const VkAccelerationStructureGeometryKHR* pGeometry = blas.asGeometry.data();
|
||||
VkAccelerationStructureBuildGeometryInfoKHR asInfo{VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_BUILD_GEOMETRY_INFO_KHR};
|
||||
asInfo.type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR;
|
||||
asInfo.flags = blas.flags;
|
||||
asInfo.update = VK_TRUE;
|
||||
asInfo.srcAccelerationStructure = blas.as.accel;
|
||||
asInfo.dstAccelerationStructure = blas.as.accel;
|
||||
asInfo.geometryArrayOfPointers = VK_FALSE;
|
||||
asInfo.geometryCount = (uint32_t)blas.asGeometry.size();
|
||||
asInfo.ppGeometries = &pGeometry;
|
||||
asInfo.scratchData.deviceAddress = scratchAddress;
|
||||
std::vector<const VkAccelerationStructureBuildRangeInfoKHR*> pBuildOffset(blas.asBuildOffsetInfo.size());
|
||||
for(size_t i = 0; i < blas.asBuildOffsetInfo.size(); i++)
|
||||
pBuildOffset[i] = &blas.asBuildOffsetInfo[i];
|
||||
|
||||
std::vector<const VkAccelerationStructureBuildOffsetInfoKHR*> pBuildOffset(blas.asBuildOffsetInfo.size());
|
||||
for(size_t i = 0; i < blas.asBuildOffsetInfo.size(); i++)
|
||||
pBuildOffset[i] = &blas.asBuildOffsetInfo[i];
|
||||
|
||||
// Update the instance buffer on the device side and build the TLAS
|
||||
nvvk::CommandPool genCmdBuf(m_device, m_queueIndex);
|
||||
VkCommandBuffer cmdBuf = genCmdBuf.createCommandBuffer();
|
||||
// Update the instance buffer on the device side and build the TLAS
|
||||
nvvk::CommandPool genCmdBuf(m_device, m_queueIndex);
|
||||
VkCommandBuffer cmdBuf = genCmdBuf.createCommandBuffer();
|
||||
|
||||
|
||||
// Update the acceleration structure. Note the VK_TRUE parameter to trigger the update,
|
||||
// and the existing BLAS being passed and updated in place
|
||||
vkCmdBuildAccelerationStructureKHR(cmdBuf, 1, &asInfo, pBuildOffset.data());
|
||||
// Update the acceleration structure. The VK_BUILD_ACCELERATION_STRUCTURE_MODE_UPDATE_KHR mode set above triggers the update,
|
||||
// and the existing BLAS is passed as both source and destination, so it is updated in place
|
||||
vkCmdBuildAccelerationStructuresKHR(cmdBuf, 1, &buildInfos, pBuildOffset.data());
|
||||
|
||||
genCmdBuf.submitAndWait(cmdBuf);
|
||||
m_alloc.destroy(scratchBuffer);
|
||||
}
|
||||
genCmdBuf.submitAndWait(cmdBuf);
|
||||
m_alloc->destroy(scratchBuffer);
|
||||
}
|
||||
~~~~
|
||||
|
||||
The previous function (`updateBlas`) uses geometry information stored in `m_blas`.
|
||||
|
|
@ -478,7 +474,7 @@ void HelloVulkan::createBottomLevelAS()
|
|||
Finally, we can add a line at the end of `HelloVulkan::animationObject()` to update the BLAS.
|
||||
|
||||
~~~~ C++
|
||||
m_rtBuilder.updateBlas(2);
|
||||
m_rtBuilder.updateBlas(2, m_blas[2], VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_UPDATE_BIT_KHR | VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_BUILD_BIT_KHR);
|
||||
~~~~
|
||||
|
||||

|
||||
|
|
|
|||
|
|
@ -513,6 +513,7 @@ all code from `// Vector toward the light` to the end can be remove and be repla
|
|||
prd.hitValue = emittance + (BRDF * incoming * cos_theta / p);
|
||||
~~~~
|
||||
|
||||
:warning: **Note:** We do not implement the point light used by the rasterizer. Therefore, only emissive geometry contributes energy to illuminate the scene.
|
||||
|
||||
## Miss Shader
|
||||
|
||||
|
|
@ -581,7 +582,7 @@ First initialize the `payload` and variable to compute the accumulation.
|
|||
|
||||
The loop over the trace function will now look like the following.
|
||||
|
||||
**Note:** the depth is hardcoded, but could be a parameter of the `push constant`.
|
||||
:warning: **Note:** the depth is hardcoded, but could be a parameter of the `push constant`.
|
||||
|
||||
~~~~C
|
||||
for(; prd.depth < 10; prd.depth++)
|
||||
|
|
@ -604,6 +605,4 @@ Now the loop over the trace function, will be like the following.
|
|||
}
|
||||
~~~~
|
||||
|
||||
**Note:** do not forget to use `hitValue` in the `imageStore`.
|
||||
|
||||
|
||||
:warning: **Note:** do not forget to use `hitValue` in the `imageStore`.
|
||||
|
|
|
|||