cleanup and refactoring

This commit is contained in:
CDaut 2024-05-25 11:53:25 +02:00
parent 2302158928
commit 76f6bf62a4
Signed by: clara
GPG key ID: 223391B52FAD4463
1285 changed files with 757994 additions and 8 deletions

@ -0,0 +1,603 @@
## Table of Contents
- [appwindowcamerainertia.hpp](#appwindowcamerainertiahpp)
- [appwindowprofiler.hpp](#appwindowprofilerhpp)
- [bitarray.hpp](#bitarrayhpp)
- [boundingbox.hpp](#boundingboxhpp)
- [cameracontrol.hpp](#cameracontrolhpp)
- [camerainertia.hpp](#camerainertiahpp)
- [cameramanipulator.hpp](#cameramanipulatorhpp)
- [commandlineparser.hpp](#commandlineparserhpp)
- [fileoperations.hpp](#fileoperationshpp)
- [geometry.hpp](#geometryhpp)
- [gltfscene.hpp](#gltfscenehpp)
- [inputparser.h](#inputparserh)
- [misc.hpp](#mischpp)
- [nvml_monitor.hpp](#nvml_monitorhpp)
- [nvprint.hpp](#nvprinthpp)
- [parallel_work.hpp](#parallel_workhpp)
- [parametertools.hpp](#parametertoolshpp)
- [primitives.hpp](#primitiveshpp)
- [profiler.hpp](#profilerhpp)
- [radixsort.hpp](#radixsorthpp)
- [shaderfilemanager.hpp](#shaderfilemanagerhpp)
- [threading.hpp](#threadinghpp)
- [timesampler.hpp](#timesamplerhpp)
- [trangeallocator.hpp](#trangeallocatorhpp)
## appwindowcamerainertia.hpp
### class AppWindowCameraInertia
> AppWindowCameraInertia is a Window base for samples, adding a camera with inertia
It derives from NVPWindow.
## appwindowprofiler.hpp
### class nvh::AppWindowProfiler
nvh::AppWindowProfiler provides an alternative utility wrapper class around NVPWindow.
It is useful to derive single-window applications from and is used by some
but not all nvpro-samples.
Further functionality is provided:
- built-in profiler/timer reporting to console
- command-line argument parsing as well as config file parsing using the ParameterTools
  (see AppWindowProfiler::setupParameters() for built-in commands)
- benchmark/automation mode using ParameterTools
- screenshot creation
- logfile based on devicename (depends on context)
- optional context/swapchain interface
  (the derived classes nvvk/appwindowprofiler_vk and nvgl/appwindowprofiler_gl make use of this)
## bitarray.hpp
### class nvh::BitArray
> The nvh::BitArray class implements a tightly packed boolean array using single bits stored in uint64_t values.
Whenever you want large boolean arrays this representation is preferred for cache-efficiency.
The Visitor and OffsetVisitor traversal mechanisms make use of cpu intrinsics to speed up iteration over bits.
Example:
```cpp
BitArray modifiedObjects(1024);
// set some bits
modifiedObjects.setBit(24,true);
modifiedObjects.setBit(37,true);
// iterate over all set bits using the built-in traversal mechanism
struct MyVisitor {
void operator()( size_t index ){
// called with the index of a set bit
myObjects[index].update();
}
};
MyVisitor visitor;
modifiedObjects.traverseBits(visitor);
```
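The packed layout and set-bit traversal described above can be sketched in isolation. The following is a minimal, hypothetical `PackedBits` class, not the nvh::BitArray implementation: the class name, `getBit`, and the portable `lowestSetBit` loop are assumptions for illustration, and the real class would typically use a count-trailing-zeros intrinsic instead of the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a packed bit array; not the nvh::BitArray API.
class PackedBits
{
public:
  explicit PackedBits(size_t numBits)
      : m_numBits(numBits)
      , m_words((numBits + 63) / 64, 0)
  {
  }

  void setBit(size_t index, bool value)
  {
    assert(index < m_numBits);
    const uint64_t mask = uint64_t(1) << (index % 64);
    if(value)
      m_words[index / 64] |= mask;
    else
      m_words[index / 64] &= ~mask;
  }

  bool getBit(size_t index) const
  {
    assert(index < m_numBits);
    return ((m_words[index / 64] >> (index % 64)) & 1u) != 0;
  }

  // Calls visitor(index) for every set bit, in ascending order.
  template <class Visitor>
  void traverseBits(Visitor& visitor) const
  {
    for(size_t w = 0; w < m_words.size(); w++)
    {
      uint64_t word = m_words[w];
      while(word != 0)
      {
        visitor(w * 64 + lowestSetBit(word));
        word &= word - 1;  // clear the lowest set bit
      }
    }
  }

private:
  // Portable fallback; an intrinsic would normally replace this loop.
  static size_t lowestSetBit(uint64_t word)
  {
    size_t index = 0;
    while((word & 1u) == 0)
    {
      word >>= 1;
      index++;
    }
    return index;
  }

  size_t                m_numBits;
  std::vector<uint64_t> m_words;
};
```

Skipping whole zero words is what makes this traversal cheap for sparse arrays: the inner loop runs once per set bit rather than once per stored bit.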
## boundingbox.hpp
```nvh::Bbox``` is a class to create bounding boxes.
It grows by adding 3D vectors and can be combined with other bounding boxes.
It also returns information such as its volume, its center, its min and max, etc.
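As a rough illustration of how such a box grows and reports its properties, here is a minimal axis-aligned bounding box sketch. The `Vec3` and `Aabb` types and all member names are hypothetical and chosen for this example only; they are not the nvh::Bbox interface, which works on the library's own vector type.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical minimal types for illustration; not the nvh::Bbox interface.
struct Vec3
{
  float x, y, z;
};

constexpr float kFloatMax = std::numeric_limits<float>::max();

struct Aabb
{
  // Starts "inverted" so that the first inserted point defines both corners.
  Vec3 lo{kFloatMax, kFloatMax, kFloatMax};
  Vec3 hi{-kFloatMax, -kFloatMax, -kFloatMax};

  bool isEmpty() const { return lo.x > hi.x || lo.y > hi.y || lo.z > hi.z; }

  // Grow the box so that it contains point p.
  void insert(const Vec3& p)
  {
    lo = {std::min(lo.x, p.x), std::min(lo.y, p.y), std::min(lo.z, p.z)};
    hi = {std::max(hi.x, p.x), std::max(hi.y, p.y), std::max(hi.z, p.z)};
  }

  // Combine with another box; an empty box contributes nothing.
  void insert(const Aabb& other)
  {
    if(other.isEmpty())
      return;
    insert(other.lo);
    insert(other.hi);
  }

  Vec3 extents() const
  {
    assert(!isEmpty());
    return {hi.x - lo.x, hi.y - lo.y, hi.z - lo.z};
  }

  Vec3 center() const
  {
    assert(!isEmpty());
    return {(lo.x + hi.x) * 0.5f, (lo.y + hi.y) * 0.5f, (lo.z + hi.z) * 0.5f};
  }

  float volume() const
  {
    const Vec3 e = extents();
    return e.x * e.y * e.z;
  }
};
```

Starting from an inverted box avoids a separate "is initialized" flag: any first point is smaller than `lo` and larger than `hi`, so it sets both corners.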
## cameracontrol.hpp
### class nvh::CameraControl
> nvh::CameraControl is a utility class to create a viewmatrix based on mouse inputs.
It can operate in perspective or orthographic mode (`m_sceneOrtho==true`).
perspective:
- LMB: rotate
- RMB or WHEEL: zoom via dolly movement
- MMB: pan/move within camera plane
ortho:
- LMB: pan/move within camera plane
- RMB or WHEEL: zoom via dolly movement, application needs to use `m_sceneOrthoZoom` for projection matrix adjustment
- MMB: rotate
The camera can be orbiting (`m_useOrbit==true`) around `m_sceneOrbit` or
otherwise provide "first person/fly through"-like controls.
Speed of movement/rotation etc. is influenced by `m_sceneDimension` as well as the
sensitivity values.
## camerainertia.hpp
### struct InertiaCamera
> Struct that offers a camera moving with some inertia effect around a target point
InertiaCamera exposes a mix of pseudo polar rotation around a target point and
some other movements to translate the target point, zoom in and out.
Either the keyboard or mouse can be used for all of the moves.
## cameramanipulator.hpp
### class nvh::CameraManipulator
nvh::CameraManipulator is a camera manipulator helper class.
It allows you to simply do
- Orbit (LMB)
- Pan (LMB + CTRL | MMB)
- Dolly (LMB + SHIFT | RMB)
- Look Around (LMB + ALT | LMB + CTRL + SHIFT)
in various modes:
- examiner (orbit around object)
- walk (look up or down but stay on a plane)
- fly (go toward the interest point)
To use the camera manipulator, you need to do the following
- Call setWindowSize() at creation of the application and when the window size changes
- Call setLookat() at creation to initialize the camera look position
- Call setMousePosition() on application mouse down
- Call mouseMove() on application mouse move
Retrieve the camera matrix by calling getMatrix()
See: appbase_vkpp.hpp
Note: There is a singleton `CameraManip` which can be used across the entire application
```cpp
// Retrieve/set camera information
CameraManip.getLookat(eye, center, up);
CameraManip.setLookat(eye, center, glm::vec3(m_upVector == 0, m_upVector == 1, m_upVector == 2));
CameraManip.getFov();
CameraManip.setSpeed(navSpeed);
CameraManip.setMode(navMode == 0 ? nvh::CameraManipulator::Examine : nvh::CameraManipulator::Fly);
// On mouse down, keep mouse coordinates
CameraManip.setMousePosition(x, y);
// On mouse move and mouse button down
if(m_inputs.lmb || m_inputs.rmb || m_inputs.mmb)
{
CameraManip.mouseMove(x, y, m_inputs);
}
// Wheel changes the FOV
CameraManip.wheel(delta > 0 ? 1 : -1, m_inputs);
// Retrieve the matrix to push to the shader
m_ubo.view = CameraManip.getMatrix();
```
## commandlineparser.hpp
Command line parser.
```cpp
std::string inFilename = "";
bool printHelp = false;
CommandLineParser args("Test Parser");
args.addArgument({"-f", "--filename"}, &inFilename, "Input filename");
args.addArgument({"-h", "--help"}, &printHelp, "Print Help");
bool result = args.parse(argc, argv);
```
## fileoperations.hpp
### functions in nvh
- nvh::fileExists : check if file exists
- nvh::findFile : finds filename in provided search directories
- nvh::loadFile : (multiple overloads) loads file as std::string, binary or text, can also search in provided directories
- nvh::getFileName : splits filename from filename with path
- nvh::getFilePath : splits filepath from filename with path
## geometry.hpp
### namespace nvh::geometry
The geometry namespace provides a few procedural mesh primitives
that are subdivided.
nvh::geometry::Mesh template uses the provided TVertex which must have a
constructor from nvh::geometry::Vertex. You can also use nvh::geometry::Vertex
directly.
It provides triangle indices, as well as outline line indices. The outline indices
are typical feature lines (rectangle for plane, some circles for sphere/torus).
All basic primitives are within -1,1 ranges along the axis they use
- nvh::geometry::Plane (x,y subdivision)
- nvh::geometry::Box (x,y,z subdivision, made of 6 planes)
- nvh::geometry::Sphere (lat,long subdivision)
- nvh::geometry::Torus (inner, outer circle subdivision)
- nvh::geometry::RandomMengerSponge (subdivision, tree depth, probability)
Example:
```cpp
// single primitive
nvh::geometry::Box<nvh::geometry::Vertex> box(4,4,4);
// construct from primitives
```
## gltfscene.hpp
### `nvh::GltfScene`
These utilities are for loading glTF models in a
canonical scene representation. From this representation
you would create the appropriate 3D API resources (buffers
and textures).
```cpp
// Typical Usage
// Load the GLTF Scene using TinyGLTF
tinygltf::Model gltfModel;
tinygltf::TinyGLTF gltfContext;
std::string warn, error;
bool fileLoaded = gltfContext.LoadASCIIFromFile(&gltfModel, &error, &warn, m_filename);
// Fill the data in the gltfScene
gltfScene.getMaterials(gltfModel);
gltfScene.getDrawableNodes(gltfModel, GltfAttributes::Normal | GltfAttributes::Texcoord_0);
// Todo in App:
// create buffers for vertices and indices, from gltfScene.m_position, gltfScene.m_index
// create textures from images: using tinygltf directly
// create descriptorSet for material using directly gltfScene.m_materials
```
## inputparser.h
### class InputParser
> InputParser is a simple command line parser
Example of usage for: test.exe -f name.txt -size 200 100
Parsing the command line: mandatory '-f' for the filename of the scene
```cpp
nvh::InputParser parser(argc, argv);
std::string filename = parser.getString("-f");
if(filename.empty()) filename = "default.txt";
if(parser.exist("-size")) {
auto values = parser.getInt2("-size");
}
```
## misc.hpp
### functions in nvh
- mipMapLevels : compute number of mip maps
- stringFormat : sprintf for std::string
- frand : random float using rand()
- permutation : fills uint vector with random permutation of values [0... vec.size-1]
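To make the mip map count concrete: a full mip chain halves the largest dimension (rounding down) until it reaches one texel, which gives floor(log2(size)) + 1 levels. The sketch below uses a hypothetical name and signature, `mipLevelCount`, which may differ from the actual nvh helper.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the nvh signature: number of mip levels for a
// texture whose largest dimension is `size` texels.
inline uint32_t mipLevelCount(uint32_t size)
{
  assert(size > 0);
  uint32_t levels = 1;  // level 0 is the full-resolution image
  while(size > 1)
  {
    size >>= 1;  // each level halves the dimension, rounding down
    levels++;
  }
  return levels;
}
```

For a non-square texture, pass the larger of width and height.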
## nvml_monitor.hpp
Capture the GPU load and memory for all GPUs on the system.
Usage:
- There should be only one instance of NvmlMonitor
- call refresh() in each frame. It will not pull measurements more often than the interval (ms)
- isValid() : return if it can be used
- nbGpu() : return the number of GPUs in the computer
- getGpuInfo() : static info about the GPU
- getDeviceMemory() : memory consumption info
- getDeviceUtilization() : GPU and memory utilization
- getDevicePerformanceState() : clock speeds and throttle reasons
- getDevicePowerState() : power, temperature and fan speed
Measurements:
- Uses a cyclic buffer.
- The offset is the position of the last measurement
## nvprint.hpp
Multiple functions and macros that should be used for logging purposes,
rather than printf. These can print to multiple places at once.
### Function nvprintf etc
Configuration:
- nvprintSetLevel : sets default loglevel
- nvprintGetLevel : gets default loglevel
- nvprintSetLogFileName : sets log filename
- nvprintSetLogging : sets file logging state
- nvprintSetCallback : sets custom callback
Printf-style functions and macros.
These take printf-style specifiers.
- nvprintf : prints at default loglevel
- nvprintfLevel : prints at a certain loglevel
- LOGI : macro that does nvprintfLevel(LOGLEVEL_INFO)
- LOGW : macro that does nvprintfLevel(LOGLEVEL_WARNING)
- LOGE : macro that does nvprintfLevel(LOGLEVEL_ERROR)
- LOGE_FILELINE : macro that does nvprintfLevel(LOGLEVEL_ERROR) combined with filename/line
- LOGD : macro that does nvprintfLevel(LOGLEVEL_DEBUG) (only in debug builds)
- LOGOK : macro that does nvprintfLevel(LOGLEVEL_OK)
- LOGSTATS : macro that does nvprintfLevel(LOGLEVEL_STATS)
std::print-style functions and macros.
These take std::format-style specifiers
(https://en.cppreference.com/w/cpp/utility/format/formatter#Standard_format_specification).
- nvprintLevel : print at a certain loglevel
- PRINTI : macro that does nvprintLevel(LOGLEVEL_INFO)
- PRINTW : macro that does nvprintLevel(LOGLEVEL_WARNING)
- PRINTE : macro that does nvprintLevel(LOGLEVEL_ERROR)
- PRINTE_FILELINE : macro that does nvprintLevel(LOGLEVEL_ERROR) combined with filename/line
- PRINTD : macro that does nvprintLevel(LOGLEVEL_DEBUG) (only in debug builds)
- PRINTOK : macro that does nvprintLevel(LOGLEVEL_OK)
- PRINTSTATS : macro that does nvprintLevel(LOGLEVEL_STATS)
Safety:
On error, all functions print an error message.
All functions are thread-safe.
Printf-style functions have annotations that should produce warnings at
compile-time or when performing static analysis. Their format strings may be
dynamic - but this can be bad if an adversary can choose the content of the
format string.
std::print-style functions are safer: they produce compile-time errors, and
their format strings must be compile-time constants. Dynamic formatting
should be performed outside of printing, like this:
```cpp
ImGui::InputText("Enter a format string: ", userFormat, sizeof(userFormat));
// Declared outside the try block so it is still in scope when printing
std::string formatted;
try
{
formatted = fmt::vformat(userFormat, ...);
}
catch (const std::exception& e)
{
// (error handling...)
}
PRINTI("{}", formatted);
```
Text encoding:
Printing to the Windows debug console is the only operation that assumes a
text encoding, which is ANSI. In all other cases, strings are copied into
the output.
## parallel_work.hpp
Distributes batches of loops over BATCHSIZE items across multiple threads. numItems reflects the total number
of items to process.
batches: fn (uint64_t itemIndex, uint32_t threadIndex)
callback does single item
ranges: fn (uint64_t itemBegin, uint64_t itemEnd, uint32_t threadIndex)
callback does loop `for (uint64_t itemIndex = itemBegin; itemIndex < itemEnd; itemIndex++)`
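The "ranges" variant described above can be sketched with standard threads. This is an assumption-laden illustration, not the nvh API: the function name `parallelRanges`, its parameter order, and the use of a shared atomic counter for handing out batches are all choices made for this example.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical sketch, not the nvh API: workers grab batches of `batchSize`
// items from a shared atomic counter and call
// fn(itemBegin, itemEnd, threadIndex) for each batch.
template <class Fn>
void parallelRanges(uint64_t numItems, uint64_t batchSize, uint32_t numThreads, Fn fn)
{
  assert(batchSize > 0 && numThreads > 0);
  std::atomic<uint64_t> nextItem{0};

  auto worker = [&](uint32_t threadIndex) {
    for(;;)
    {
      const uint64_t itemBegin = nextItem.fetch_add(batchSize);
      if(itemBegin >= numItems)
        break;
      fn(itemBegin, std::min(itemBegin + batchSize, numItems), threadIndex);
    }
  };

  std::vector<std::thread> threads;
  for(uint32_t t = 1; t < numThreads; t++)
    threads.emplace_back(worker, t);
  worker(0);  // the calling thread participates as thread 0
  for(std::thread& th : threads)
    th.join();
}
```

Handing out batches dynamically keeps threads busy when items have uneven cost; the last batch is clamped to `numItems`, so the callback never sees an out-of-range index.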
## parametertools.hpp
### class nvh::ParameterList
The nvh::ParameterList helps parsing commandline arguments
or commandline arguments stored within ascii config files.
Parameters always update the values they point to, and optionally
can trigger a callback that can be provided per-parameter.
```cpp
ParameterList list;
std::string modelFilename;
float modelScale;
list.addFilename(".gltf|model filename", &modelFilename);
list.add("scale|model scale", &modelScale);
list.applyTokens(3, {"blah.gltf","-scale","4"}, "-", "/assets/");
```
Use in combination with the ParameterSequence class to iterate
sequences of parameter changes for benchmarking/automation.
### class nvh::ParameterSequence
The nvh::ParameterSequence processes provided tokens in sequences.
The sequences are terminated by a special "separator" token.
All tokens between the last iteration and the separator are applied
to the provided ParameterList.
Useful to process commands in sequences (automation, benchmarking etc.).
Example:
```cpp
ParameterSequence sequence;
ParameterList list;
int mode;
list.add("mode", &mode);
std::vector<const char*> tokens;
ParameterList::tokenizeString("benchmark simple -mode 10 benchmark complex -mode 20", tokens);
sequence.init(&list, tokens);
// 1 means our separator is followed by one argument (simple/complex)
// "-" as parameters in the string are prefixed with -
while(!sequence.advanceIteration("benchmark", 1, "-")) {
printf("%d %s mode %d\n", sequence.getIteration(), sequence.getSeparatorArg(0), mode);
}
// would print:
// 0 simple mode 10
// 1 complex mode 20
```
## primitives.hpp
### struct `nvh::PrimitiveMesh`
- Common primitive type, made of vertices: position, normal and texture coordinates.
- All primitives are triangles, and every three indices form a triangle.
### struct `nvh::Node`
- Structure to hold a reference to a mesh, with a material and transformation.
Primitives that can be created:
* Tetrahedron
* Icosahedron
* Octahedron
* Plane
* Cube
* SphereUv
* Cone
* SphereMesh
* Torus
Node creator: returns the instance and the position
* MengerSponge
* SunFlower
Other utilities
* mergeNodes
* removeDuplicateVertices
* wobblePrimitive
## profiler.hpp
### class nvh::Profiler
> The nvh::Profiler class is designed to measure timed sections.
Each section has a cpu and gpu time. Gpu times are typically provided
by derived classes for each individual api (e.g. OpenGL, Vulkan etc.).
There is functionality to pretty print the sections with their nesting level.
Multiple profilers can reference the same database, so one profiler
can serve as master that the others contribute to. Typically the
base class measuring only CPU time could be the master, and the api
derived classes reference it to share the same database.
Profiler::Clock can be used standalone for time measuring.
## radixsort.hpp
### function nvh::radixsort
The radixsort function sorts the provided keys based on
BYTES many bytes stored inside TKey starting at BYTEOFFSET.
The sorting result is returned as indices into the keys array.
For example:
```cpp
struct MyData {
uint32_t objectIdentifier;
uint16_t objectSortKey;
};
// 4-byte offset of objectSortKey within MyData
// 2-byte size of sorting key
result = radixsort<4,2>(keys, indicesIn, indicesTemp);
// after sorting the following is true
keys[result[i]].objectSortKey <= keys[result[i + 1]].objectSortKey
// result can point either to indicesIn or indicesTemp (we swap the arrays
// after each byte iteration)
```
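The idea of sorting indices by a byte range inside each key can be illustrated with a small least-significant-byte-first radix sort. This is a hypothetical sketch, not the nvh implementation: the name `radixSortIndices`, the `std::vector` interface, and the little-endian assumption are all introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct MyData
{
  uint32_t objectIdentifier;
  uint16_t objectSortKey;
};

// Hypothetical sketch, not the nvh implementation. Sorts indices by the BYTES
// bytes found at BYTEOFFSET inside each key. Assumes a little-endian machine,
// so the byte at the lowest address is the least significant one.
template <size_t BYTEOFFSET, size_t BYTES, class TKey>
std::vector<uint32_t> radixSortIndices(const std::vector<TKey>& keys)
{
  static_assert(BYTEOFFSET + BYTES <= sizeof(TKey), "sort key must lie inside TKey");
  assert(keys.size() <= 0xFFFFFFFFu);

  std::vector<uint32_t> indices(keys.size());
  std::vector<uint32_t> temp(keys.size());
  for(size_t i = 0; i < indices.size(); i++)
    indices[i] = uint32_t(i);

  for(size_t pass = 0; pass < BYTES; pass++)
  {
    auto byteOf = [&](uint32_t index) -> size_t {
      const unsigned char* raw = reinterpret_cast<const unsigned char*>(&keys[index]);
      return raw[BYTEOFFSET + pass];
    };

    // Count occurrences of each byte value, shifted by one slot...
    size_t offsets[257] = {};
    for(uint32_t index : indices)
      offsets[byteOf(index) + 1]++;
    // ...so that the prefix sum yields the first output slot per byte value.
    for(size_t b = 0; b < 256; b++)
      offsets[b + 1] += offsets[b];
    // Stable scatter into the temporary array, then swap the two arrays.
    for(uint32_t index : indices)
      temp[offsets[byteOf(index)]++] = index;
    indices.swap(temp);
  }
  return indices;
}
```

Because each pass is stable, keys with equal sort values keep their original relative order in the result.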
## shaderfilemanager.hpp
### class nvh::ShaderFileManager
The nvh::ShaderFileManager class is meant to be derived from to create the actual api-specific
shader/program managers.
The ShaderFileManager provides a system to find/load shader files.
It also allows resolving #include instructions in HLSL/GLSL source files.
Such includes can be registered beforehand and may point to strings in memory.
If m_handleIncludePasting is true, then `#include`s are replaced by
the include file contents (recursively) before presenting the
loaded shader source code to the caller. Otherwise, the include file
loader is still available but `#include`s are left unchanged.
Furthermore it handles injecting prepended strings (typically used
for #defines) after the #version statement of GLSL files,
regardless of m_handleIncludePasting's value.
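Recursive include pasting can be sketched as a line-by-line pass over the source. The code below is a simplified, hypothetical illustration and not the nvh::ShaderFileManager API: `IncludeRegistry` and `pasteIncludes` are names invented here, includes are resolved only from a map of in-memory strings, and only the quoted `#include "name"` form is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical sketch, not the nvh::ShaderFileManager API: includes are looked
// up in a map of registered in-memory strings instead of on disk.
using IncludeRegistry = std::map<std::string, std::string>;

inline std::string pasteIncludes(const std::string& source, const IncludeRegistry& registry, int depth = 0)
{
  assert(depth < 32 && "include recursion too deep (cycle?)");

  std::istringstream input(source);
  std::string        output;
  std::string        line;
  while(std::getline(input, line))
  {
    // Locate: #include "name"
    const size_t directive = line.find("#include");
    const size_t open  = (directive == std::string::npos) ? std::string::npos : line.find('"', directive);
    const size_t close = (open == std::string::npos) ? std::string::npos : line.find('"', open + 1);
    if(close != std::string::npos)
    {
      auto it = registry.find(line.substr(open + 1, close - open - 1));
      if(it != registry.end())
      {
        // Replace the directive by the (recursively expanded) include content.
        output += pasteIncludes(it->second, registry, depth + 1);
        continue;
      }
    }
    output += line + "\n";  // ordinary lines and unknown includes are kept as-is
  }
  return output;
}
```

Leaving unknown includes untouched mirrors the case where the shader compiler itself is expected to resolve them.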
## threading.hpp
### class nvh::delayed_call
Class returned by delay_noreturn_for to track the thread created and possibly reset the
delay timer.
Delay a call to a void function for sleep_duration.
`return`: A delayed_call object that holds the running thread.
Example:
```cpp
// Create or update a delayed call to callback. Useful to consolidate multiple events into one call.
if(!m_delayedCall.delay_for(delay))
m_delayedCall = nvh::delay_noreturn_for(delay, callback);
```
## timesampler.hpp
### struct TimeSampler
TimeSampler does time sampling work
### struct nvh::Stopwatch
> Timer in milliseconds.
Starts the timer at creation and the elapsed time is retrieved by calling `elapsed()`.
The timer can be reset if it needs to start timing later in the code execution.
Usage:
````cpp
{
nvh::Stopwatch sw;
... work ...
LOGI("Elapsed: %f ms\n", sw.elapsed()); // --> Elapsed: 128.157 ms
}
````
## trangeallocator.hpp
### class nvh::TRangeAllocator
The nvh::TRangeAllocator<GRANULARITY> template allows sub-allocating ranges from a fixed
maximum size. Ranges are allocated at GRANULARITY and are merged back on freeing.
Its primary use is within allocators that sub-allocate from fixed-size blocks.
The implementation is based on [MakeID by Emil Persson](http://www.humus.name/3D/MakeID.h).
Example :
```cpp
TRangeAllocator<256> range;
// initialize to a certain range
range.init(range.alignedSize(128 * 1024 * 1024));
...
// allocate a sub range
// example
uint32_t size = vertexBufferSize;
uint32_t alignment = vertexAlignment;
uint32_t allocOffset;
uint32_t allocSize;
uint32_t alignedOffset;
if (range.subAllocate(size, alignment, allocOffset, alignedOffset, allocSize)) {
... use the allocation space
// [alignedOffset + size] is guaranteed to be within [allocOffset + allocSize]
}
// give back the memory range for re-use
range.subFree(allocOffset, allocSize);
...
// at the end cleanup
range.deinit();
```
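The relation between `allocOffset`, `alignedOffset` and `allocSize` may be easier to see in a small stand-alone allocator. The sketch below is hypothetical and deliberately simple: `SimpleRangeAllocator` keeps free ranges in a `std::map` and uses first-fit search, whereas the real class is based on MakeID. Only the method names mirror the example above; `isFullyFree` is an extra helper added for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical sketch, not the nvh::TRangeAllocator implementation.
// Free ranges live in a map: offset -> size, both multiples of GRANULARITY.
template <uint32_t GRANULARITY>
class SimpleRangeAllocator
{
  static_assert(GRANULARITY != 0 && (GRANULARITY & (GRANULARITY - 1)) == 0, "power of two expected");

public:
  static uint32_t alignedSize(uint32_t size) { return (size + GRANULARITY - 1) & ~(GRANULARITY - 1); }

  void init(uint32_t size)
  {
    assert(size % GRANULARITY == 0);
    m_free.clear();
    m_free[0] = size;
  }

  // First-fit search. `align` must be a power of two.
  bool subAllocate(uint32_t size, uint32_t align, uint32_t& allocOffset, uint32_t& alignedOffset, uint32_t& allocSize)
  {
    assert(size != 0 && align != 0 && (align & (align - 1)) == 0);
    for(auto it = m_free.begin(); it != m_free.end(); ++it)
    {
      const uint32_t start   = it->first;
      const uint32_t aligned = (start + align - 1) & ~(align - 1);
      // Alignment padding plus payload, rounded up to the granularity.
      const uint32_t needed = alignedSize((aligned - start) + size);
      if(needed > it->second)
        continue;
      const uint32_t remaining = it->second - needed;
      m_free.erase(it);
      if(remaining != 0)
        m_free[start + needed] = remaining;
      allocOffset   = start;
      alignedOffset = aligned;
      allocSize     = needed;
      return true;
    }
    return false;
  }

  // Returns a range to the free list and merges it with adjacent free ranges.
  void subFree(uint32_t allocOffset, uint32_t allocSize)
  {
    auto it   = m_free.emplace(allocOffset, allocSize).first;
    auto next = std::next(it);
    if(next != m_free.end() && it->first + it->second == next->first)
    {
      it->second += next->second;
      m_free.erase(next);
    }
    if(it != m_free.begin())
    {
      auto prev = std::prev(it);
      if(prev->first + prev->second == it->first)
      {
        prev->second += it->second;
        m_free.erase(it);
      }
    }
  }

  bool isFullyFree(uint32_t size) const { return m_free.size() == 1 && m_free.begin()->second == size; }

private:
  std::map<uint32_t, uint32_t> m_free;
};
```

In this sketch the caller frees with `allocOffset` and `allocSize` rather than `alignedOffset` and `size`, because the allocation also owns the alignment padding in front of the payload and the granularity padding behind it.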

@ -0,0 +1,49 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
/// @DOC_SKIP (keyword to exclude this file from automatic README.md generation)
#pragma once
#ifndef NVH_ALIGNEMENT_HPP
#define NVH_ALIGNEMENT_HPP 1
#include <stddef.h> // for size_t
namespace nvh {
template <class integral>
constexpr bool is_aligned(integral x, size_t a) noexcept
{
return (x & (integral(a) - 1)) == 0;
}
template <class integral>
constexpr integral align_up(integral x, size_t a) noexcept
{
return integral((x + (integral(a) - 1)) & ~integral(a - 1));
}
template <class integral>
constexpr integral align_down(integral x, size_t a) noexcept
{
return integral(x & ~integral(a - 1));
}
} // namespace nvh
#endif // !NVH_ALIGNEMENT_HPP

@ -0,0 +1,429 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
//--------------------------------------------------------------------
#include <nvpwindow.hpp>
#ifdef WIN32
#include <windows.h>
#endif
#include "nvh/camerainertia.hpp"
#include "nvh/timesampler.hpp"
#include <imgui/imgui_helper.h>
#ifdef NVP_SUPPORTS_NVTOOLSEXT
#include "nvh/nsightevents.h"
#else
// Note: they are defined inside "nsightevents.h"
// but let's define them again here as empty defines for the case when NSIGHT is not needed at all
#define NX_RANGE int
#define NX_MARK(name)
#define NX_RANGESTART(name) 0
#define NX_RANGEEND(id)
#define NX_RANGEPUSH(name)
#define NX_RANGEPUSHCOL(name, c)
#define NX_RANGEPOP()
#define NXPROFILEFUNC(name)
#define NXPROFILEFUNCCOL(name, c)
#define NXPROFILEFUNCCOL2(name, c, a)
#endif
#include <map>
using std::map;
#define KEYTAU 0.10f
//-----------------------------------------------------------------------------
// GLOBALS
//-----------------------------------------------------------------------------
#ifndef WIN32
struct POINT
{
int x;
int y;
};
#endif
struct ToggleInfo
{
bool* p;
bool addToUI;
std::string desc;
};
#ifdef WINDOWINERTIACAMERA_EXTERN
extern std::map<char, ToggleInfo> g_toggleMap;
#else
std::map<char, ToggleInfo> g_toggleMap;
#endif
inline void addToggleKey(char c, bool* target, const char* desc, bool addToUI = true)
{
LOGI("%s", desc);
g_toggleMap[c].desc = desc;
g_toggleMap[c].p = target;
g_toggleMap[c].addToUI = addToUI;
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
inline void DrawToggles()
{
for(auto& it : g_toggleMap)
{
if(!it.second.addToUI)
continue;
bool* pB = it.second.p;
bool prevValue = *pB;
ImGui::Checkbox(it.second.desc.c_str(), pB);
}
}
/* @DOC_START
# class AppWindowCameraInertia
> AppWindowCameraInertia is a Window base for samples, adding a camera with inertia
It derives from NVPWindow.
@DOC_END */
class AppWindowCameraInertia : public NVPWindow
{
public:
AppWindowCameraInertia(const glm::vec3 eye = glm::vec3(0.0f, 1.0f, -3.0f),
const glm::vec3 focus = glm::vec3(0, 0, 0),
const glm::vec3 object = glm::vec3(0, 0, 0),
float fov_ = 50.0,
float near_ = 0.01f,
float far_ = 10.0)
: m_camera(eye, focus, object)
{
m_renderCnt = 1;
m_bCameraMode = true;
m_bContinue = true;
m_moveStep = 0.2f;
m_ptLastMousePosit.x = m_ptLastMousePosit.y = 0;
m_ptCurrentMousePosit.x = m_ptCurrentMousePosit.y = 0;
m_ptOriginalMousePosit.x = m_ptOriginalMousePosit.y = 0;
m_bMousing = false;
m_bRMousing = false;
m_bMMousing = false;
m_bNewTiming = false;
m_bAdjustTimeScale = true;
m_fov = fov_;
m_near = near_;
m_far = far_;
}
bool m_bCameraMode;
bool m_bContinue;
float m_moveStep;
POINT m_ptLastMousePosit;
POINT m_ptCurrentMousePosit;
POINT m_ptOriginalMousePosit;
bool m_bMousing;
bool m_bRMousing;
bool m_bMMousing;
bool m_bNewTiming;
bool m_bAdjustTimeScale;
int m_renderCnt;
TimeSampler m_realtime;
bool m_timingGlitch;
InertiaCamera m_camera;
glm::mat4 m_projection;
float m_fov, m_near, m_far;
public:
inline glm::mat4& projMat() { return m_projection; }
inline glm::mat4& viewMat() { return m_camera.m4_view; }
inline bool& nonStopRendering() { return m_realtime.bNonStopRendering; }
bool open(int posX, int posY, int width, int height, const char* title, bool requireGLContext) override;
virtual void onWindowClose() override;
virtual void onWindowResize(int w, int h) override;
virtual void onWindowRefresh() override;
virtual void onMouseMotion(int x, int y) override;
virtual void onMouseWheel(int delta) override;
virtual void onMouseButton(NVPWindow::MouseButton button, ButtonAction action, int mods, int x, int y) override;
virtual void onKeyboard(AppWindowCameraInertia::KeyCode key, ButtonAction action, int mods, int x, int y) override;
virtual void onKeyboardChar(unsigned char key, int mods, int x, int y) override;
virtual int idle();
const char* getHelpText(int* lines = NULL)
{
if(lines)
*lines = 7;
return "Left mouse button: rotate around the target\n"
"Right mouse button: translate target forward backward (+ Y axis rotate)\n"
"Middle mouse button: Pan target along view plane\n"
"Mouse wheel or PgUp/PgDn: zoom in/out\n"
"Arrow keys: rotate around the target\n"
"Ctrl+Arrow keys: Pan target\n"
"Ctrl+PgUp/PgDn: translate target forward/backward\n";
}
};
#ifndef WINDOWINERTIACAMERA_EXTERN
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
bool AppWindowCameraInertia::open(int posX, int posY, int width, int height, const char* title, bool requireGLContext)
{
m_realtime.bNonStopRendering = true;
float r = (float)width / (float)height;
m_projection = glm::perspective(glm::radians(m_fov), r, m_near, m_far);
ImGuiH::Init(width, height, this);
return NVPWindow::open(posX, posY, width, height, title, requireGLContext);
}
void AppWindowCameraInertia::onWindowClose()
{
ImGuiH::Deinit();
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
#define CAMERATAU 0.03f
void AppWindowCameraInertia::onMouseMotion(int x, int y)
{
m_ptCurrentMousePosit.x = x;
m_ptCurrentMousePosit.y = y;
if(ImGuiH::mouse_pos(x, y))
return;
//---------------------------- LEFT
if(m_bMousing)
{
float hval = 2.0f * (float)(m_ptCurrentMousePosit.x - m_ptLastMousePosit.x) / (float)getWidth();
float vval = 2.0f * (float)(m_ptCurrentMousePosit.y - m_ptLastMousePosit.y) / (float)getHeight();
m_camera.tau = CAMERATAU;
m_camera.rotateH(hval);
m_camera.rotateV(vval);
m_renderCnt++;
}
//---------------------------- MIDDLE
if(m_bMMousing)
{
float hval = 2.0f * (float)(m_ptCurrentMousePosit.x - m_ptLastMousePosit.x) / (float)getWidth();
float vval = 2.0f * (float)(m_ptCurrentMousePosit.y - m_ptLastMousePosit.y) / (float)getHeight();
m_camera.tau = CAMERATAU;
m_camera.rotateH(hval, true);
m_camera.rotateV(vval, true);
m_renderCnt++;
}
//---------------------------- RIGHT
if(m_bRMousing)
{
float hval = 2.0f * (float)(m_ptCurrentMousePosit.x - m_ptLastMousePosit.x) / (float)getWidth();
float vval = -2.0f * (float)(m_ptCurrentMousePosit.y - m_ptLastMousePosit.y) / (float)getHeight();
m_camera.tau = CAMERATAU;
m_camera.rotateH(hval, !!(getKeyModifiers() & KMOD_CONTROL));
m_camera.move(vval, !!(getKeyModifiers() & KMOD_CONTROL));
m_renderCnt++;
}
m_ptLastMousePosit.x = m_ptCurrentMousePosit.x;
m_ptLastMousePosit.y = m_ptCurrentMousePosit.y;
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void AppWindowCameraInertia::onMouseWheel(int delta)
{
if(ImGuiH::mouse_wheel(delta))
return;
m_camera.tau = KEYTAU;
m_camera.move(delta > 0 ? m_moveStep : -m_moveStep, !!(getKeyModifiers() & KMOD_CONTROL));
m_renderCnt++;
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void AppWindowCameraInertia::onMouseButton(NVPWindow::MouseButton button, NVPWindow::ButtonAction state, int mods, int x, int y)
{
if(ImGuiH::mouse_button(button, state))
return;
switch(button)
{
case NVPWindow::MOUSE_BUTTON_LEFT:
if(state == NVPWindow::BUTTON_PRESS)
{
m_renderCnt++;
// TODO: equivalent of glfwSetInputMode(window, GLFW_CURSOR, GLFW_CURSOR_DISABLED/NORMAL);
m_bMousing = true;
m_renderCnt++;
if(getKeyModifiers() & KMOD_CONTROL)
{
}
else if(getKeyModifiers() & KMOD_SHIFT)
{
}
}
else
{
m_bMousing = false;
m_renderCnt++;
}
break;
case NVPWindow::MOUSE_BUTTON_RIGHT:
if(state == NVPWindow::BUTTON_PRESS)
{
m_ptLastMousePosit.x = m_ptCurrentMousePosit.x = x;
m_ptLastMousePosit.y = m_ptCurrentMousePosit.y = y;
m_bRMousing = true;
m_renderCnt++;
if(getKeyModifiers() & KMOD_CONTROL)
{
}
}
else
{
m_bRMousing = false;
m_renderCnt++;
}
break;
case NVPWindow::MOUSE_BUTTON_MIDDLE:
if(state == NVPWindow::BUTTON_PRESS)
{
m_ptLastMousePosit.x = m_ptCurrentMousePosit.x = x;
m_ptLastMousePosit.y = m_ptCurrentMousePosit.y = y;
m_bMMousing = true;
m_renderCnt++;
}
else
{
m_bMMousing = false;
m_renderCnt++;
}
break;
}
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void AppWindowCameraInertia::onKeyboard(NVPWindow::KeyCode key, NVPWindow::ButtonAction action, int mods, int x, int y)
{
m_renderCnt++;
if(ImGuiH::key_button(key, action, mods))
return;
if(action == NVPWindow::BUTTON_RELEASE)
return;
switch(key)
{
case NVPWindow::KEY_F1:
break;
case NVPWindow::KEY_F2:
break;
case NVPWindow::KEY_F3:
case NVPWindow::KEY_F4:
case NVPWindow::KEY_F5:
case NVPWindow::KEY_F6:
case NVPWindow::KEY_F7:
case NVPWindow::KEY_F8:
case NVPWindow::KEY_F9:
case NVPWindow::KEY_F10:
case NVPWindow::KEY_F11:
break;
case NVPWindow::KEY_F12:
break;
case NVPWindow::KEY_LEFT:
m_camera.tau = KEYTAU;
m_camera.rotateH(m_moveStep, !!(getKeyModifiers() & KMOD_CONTROL));
break;
case NVPWindow::KEY_UP:
m_camera.tau = KEYTAU;
m_camera.rotateV(m_moveStep, !!(getKeyModifiers() & KMOD_CONTROL));
break;
case NVPWindow::KEY_RIGHT:
m_camera.tau = KEYTAU;
m_camera.rotateH(-m_moveStep, !!(getKeyModifiers() & KMOD_CONTROL));
break;
case NVPWindow::KEY_DOWN:
m_camera.tau = KEYTAU;
m_camera.rotateV(-m_moveStep, !!(getKeyModifiers() & KMOD_CONTROL));
break;
case NVPWindow::KEY_PAGE_UP:
m_camera.tau = KEYTAU;
m_camera.move(m_moveStep, !!(getKeyModifiers() & KMOD_CONTROL));
break;
case NVPWindow::KEY_PAGE_DOWN:
m_camera.tau = KEYTAU;
m_camera.move(-m_moveStep, !!(getKeyModifiers() & KMOD_CONTROL));
break;
case NVPWindow::KEY_ESCAPE:
close();
break;
}
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void AppWindowCameraInertia::onKeyboardChar(unsigned char key, int mods, int x, int y)
{
m_renderCnt++;
if(ImGuiH::key_char(key))
return;
// check registered toggles
auto it = g_toggleMap.find(key);
if(it != g_toggleMap.end())
{
it->second.p[0] = it->second.p[0] ? false : true;
}
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
int AppWindowCameraInertia::idle()
{
//
// Camera motion
//
m_bContinue = m_camera.update((float)m_realtime.getFrameDT());
//
// time sampling
//
m_realtime.update(m_bContinue, &m_timingGlitch);
//
// if requested: trigger again the next frame for rendering
//
if(m_bContinue || m_realtime.bNonStopRendering)
m_renderCnt++;
return m_renderCnt;
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void AppWindowCameraInertia::onWindowRefresh() {}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void AppWindowCameraInertia::onWindowResize(int w, int h)
{
NVPWindow::onWindowResize(w, h);
auto& imgui_io = ImGui::GetIO();
imgui_io.DisplaySize = ImVec2(float(w), float(h));
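// skip the projection update for a zero-sized (e.g. minimized) window, which has no valid aspect ratio
if(w > 0 && h > 0)
{
float r = (float)w / (float)h;
m_projection = glm::perspective(glm::radians(m_fov), r, m_near, m_far);
}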
m_renderCnt++;
}
#endif //WINDOWINERTIACAMERA_EXTERN


@ -0,0 +1,593 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifdef _WIN32
#ifndef NOMINMAX
#define NOMINMAX
#endif
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif
#include <windows.h>
#endif
#include "appwindowprofiler.hpp"
#include <algorithm>
#include <assert.h>
#include <fstream>
#include <iostream>
#include <sstream>
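#include <stdarg.h>
#include <stdio.h>
#include <float.h>   // FLT_MAX
#include <stdlib.h>  // exit, EXIT_SUCCESS, EXIT_FAILURE
#include <string.h>  // strstr, memset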
#include "fileoperations.hpp"
#include "misc.hpp"
#include <fileformats/bmp.hpp>
namespace nvh {
static void replace(std::string& str, const std::string& from, const std::string& to)
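{
// nothing to search for; this also avoids an endless loop when both strings are empty
if(from.empty())
return;
size_t start_pos = 0;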
while((start_pos = str.find(from, start_pos)) != std::string::npos)
{
str.replace(start_pos, from.length(), to);
start_pos += to.length();
}
}
static void fixDeviceName(std::string& deviceName)
{
replace(deviceName, "INTEL(R) ", "");
replace(deviceName, "AMD ", "");
replace(deviceName, "DRI ", "");
replace(deviceName, "(TM) ", "");
replace(deviceName, " Series", "");
replace(deviceName, " Graphics", "");
replace(deviceName, "/PCIe/SSE2", "");
std::replace(deviceName.begin(), deviceName.end(), ' ', '_');
// strip characters that are not allowed in file names
static const std::string invalidChars("/\\:?*<>|\",");
deviceName.erase(std::remove_if(deviceName.begin(), deviceName.end(),
[](char c) { return invalidChars.find(c) != std::string::npos; }),
deviceName.end());
}
void AppWindowProfiler::onMouseMotion(int x, int y)
{
AppWindowProfiler::WindowState& window = m_windowState;
if(!window.m_mouseButtonFlags && mouse_pos(x, y))
return;
window.m_mouseCurrent[0] = x;
window.m_mouseCurrent[1] = y;
}
void AppWindowProfiler::onMouseButton(MouseButton Button, ButtonAction Action, int mods, int x, int y)
{
AppWindowProfiler::WindowState& window = m_windowState;
m_profiler.reset();
if(mouse_button(Button, Action))
return;
switch(Action)
{
case BUTTON_PRESS: {
switch(Button)
{
case MOUSE_BUTTON_LEFT: {
window.m_mouseButtonFlags |= MOUSE_BUTTONFLAG_LEFT;
}
break;
case MOUSE_BUTTON_MIDDLE: {
window.m_mouseButtonFlags |= MOUSE_BUTTONFLAG_MIDDLE;
}
break;
case MOUSE_BUTTON_RIGHT: {
window.m_mouseButtonFlags |= MOUSE_BUTTONFLAG_RIGHT;
}
break;
}
}
break;
case BUTTON_RELEASE: {
if(!window.m_mouseButtonFlags)
break;
switch(Button)
{
case MOUSE_BUTTON_LEFT: {
window.m_mouseButtonFlags &= ~MOUSE_BUTTONFLAG_LEFT;
}
break;
case MOUSE_BUTTON_MIDDLE: {
window.m_mouseButtonFlags &= ~MOUSE_BUTTONFLAG_MIDDLE;
}
break;
case MOUSE_BUTTON_RIGHT: {
window.m_mouseButtonFlags &= ~MOUSE_BUTTONFLAG_RIGHT;
}
break;
}
}
break;
}
}
void AppWindowProfiler::onMouseWheel(int y)
{
AppWindowProfiler::WindowState& window = m_windowState;
m_profiler.reset();
if(mouse_wheel(y))
return;
window.m_mouseWheel += y;
}
void AppWindowProfiler::onKeyboard(KeyCode key, ButtonAction action, int mods, int x, int y)
{
AppWindowProfiler::WindowState& window = m_windowState;
m_profiler.reset();
if(key_button(key, action, mods))
return;
bool newState = false;
switch(action)
{
case BUTTON_PRESS:
case BUTTON_REPEAT: {
newState = true;
break;
}
case BUTTON_RELEASE: {
newState = false;
break;
}
}
window.m_keyToggled[key] = window.m_keyPressed[key] != newState;
window.m_keyPressed[key] = newState;
}
void AppWindowProfiler::onKeyboardChar(unsigned char key, int mods, int x, int y)
{
m_profiler.reset();
if(key_char(key))
return;
}
void AppWindowProfiler::parseConfigFile(const char* filename)
{
std::string result = loadFile(filename, false);
if(result.empty())
{
LOGW("file not found or empty: %s\n", filename);
return;
}
std::vector<const char*> args;
ParameterList::tokenizeString(result, args);
std::string path = getFilePath(filename);
parseConfig(uint32_t(args.size()), args.data(), path);
}
void AppWindowProfiler::onWindowClose()
{
exitScreenshot();
}
void AppWindowProfiler::onWindowResize(int width, int height)
{
m_profiler.reset();
if(width == 0 || height == 0)
{
return;
}
m_windowState.m_winSize[0] = width;
m_windowState.m_winSize[1] = height;
if(m_activeContext)
{
swapResize(width, height);
}
if(m_active)
{
resize(m_windowState.m_swapSize[0], m_windowState.m_swapSize[1]);
}
}
void AppWindowProfiler::setVsync(bool state)
{
if(m_internal)
{
swapVsync(state);
LOGI("vsync: %s\n", state ? "on" : "off");
}
m_config.vsyncstate = state;
m_vsync = state;
}
int AppWindowProfiler::run(const std::string& title, int argc, const char** argv, int width, int height, bool requireGLContext)
{
m_config.winsize[0] = m_config.winsize[0] ? m_config.winsize[0] : width;
m_config.winsize[1] = m_config.winsize[1] ? m_config.winsize[1] : height;
// skip first argument here (exe file)
parseConfig(argc - 1, argv + 1, ".");
if(!validateConfig())
{
return EXIT_FAILURE;
}
if(!NVPWindow::open(m_config.winpos[0], m_config.winpos[1], m_config.winsize[0], m_config.winsize[1], title.c_str(), requireGLContext))
{
LOGE("Could not create window\n");
return EXIT_FAILURE;
}
m_windowState.m_winSize[0] = m_config.winsize[0];
m_windowState.m_winSize[1] = m_config.winsize[1];
postConfigPreContext();
contextInit();
m_activeContext = true;
// hack to react on $DEVICE$ filename
if(!m_config.logFilename.empty())
{
parameterCallback(m_paramLog);
}
if(contextGetDeviceName())
{
std::string deviceName = contextGetDeviceName();
fixDeviceName(deviceName);
LOGOK("DEVICE: %s\n", deviceName.c_str());
}
initBenchmark();
setVsync(m_config.vsyncstate);
bool Run = begin();
m_active = true;
bool quickExit = m_config.quickexit;
if(m_config.frameLimit)
{
m_profilerPrint = false;
quickExit = true;
}
double timeStart = getTime();
double timeBegin = getTime();
double frames = 0;
bool lastVsync = m_vsync;
m_hadProfilerPrint = false;
double lastProfilerPrintTime = 0;
if(Run)
{
while(pollEvents())
{
bool wasClosed = false;
while(!isOpen())
{
NVPSystem::waitEvents();
wasClosed = true;
}
if(wasClosed)
{
continue;
}
if(m_windowState.onPress(KEY_V))
{
setVsync(!m_vsync);
}
std::string stats;
{
bool benchmarkActive = m_benchmark.sequence.isActive();
double curTime = getTime();
double printInterval = m_profilerPrint && !benchmarkActive ? float(m_config.intervalSeconds) : float(FLT_MAX);
bool printStats = ((curTime - lastProfilerPrintTime) > printInterval);
if(printStats)
{
lastProfilerPrintTime = curTime;
}
m_profiler.beginFrame();
swapPrepare();
{
//const nvh::Profiler::Section profile(m_profiler, "App");
think(getTime() - timeStart);
}
memset(m_windowState.m_keyToggled, 0, sizeof(m_windowState.m_keyToggled));
swapBuffers();
m_profiler.endFrame();
if(printStats)
{
m_profiler.print(stats);
}
}
m_hadProfilerPrint = false;
if(m_profilerPrint && !stats.empty())
{
if(!m_config.timerLimit || m_config.timerLimit == 1)
{
LOGI("%s\n", stats.c_str());
m_hadProfilerPrint = true;
}
if(m_config.timerLimit == 1)
{
m_config.frameLimit = 1;
}
if(m_config.timerLimit)
{
m_config.timerLimit--;
}
}
advanceBenchmark();
postProfiling();
frames++;
double timeCurrent = getTime();
double timeDelta = timeCurrent - timeBegin;
if(timeDelta > double(m_config.intervalSeconds) || lastVsync != m_vsync || m_config.frameLimit == 1)
{
std::ostringstream combined;
if(lastVsync != m_vsync)
{
timeDelta = 0;
}
if(m_timeInTitle)
{
combined << title << ": " << (timeDelta * 1000.0 / (frames)) << " [ms]"
<< (m_vsync ? " (vsync on - V for toggle)" : "");
setTitle(combined.str().c_str());
}
if(m_config.frameLimit == 1)
{
LOGI("frametime: %f ms\n", (timeDelta * 1000.0 / (frames)));
}
frames = 0;
timeBegin = timeCurrent;
lastVsync = m_vsync;
}
if(m_windowState.m_keyPressed[KEY_ESCAPE] || m_config.frameLimit == 1)
break;
if(m_config.frameLimit)
m_config.frameLimit--;
}
}
contextSync();
exitScreenshot();
if(quickExit)
{
exit(EXIT_SUCCESS);
return EXIT_SUCCESS;
}
end();
m_active = false;
contextDeinit();
postEnd();
return Run ? EXIT_SUCCESS : EXIT_FAILURE;
}
void AppWindowProfiler::leave()
{
m_config.frameLimit = 1;
}
std::string AppWindowProfiler::specialStrings(const char* original)
{
std::string str(original);
if(strstr(original, "$DEVICE$"))
{
if(contextGetDeviceName())
{
std::string deviceName = contextGetDeviceName();
fixDeviceName(deviceName);
if(deviceName.empty())
{
// no proper device name available
return std::string();
}
// replace $DEVICE$
replace(str, "$DEVICE$", deviceName);
}
else
{
// no proper device name available
return std::string();
}
}
return str;
}
void AppWindowProfiler::parameterCallback(uint32_t param)
{
if(param == m_paramLog)
{
std::string logfileName = specialStrings(m_config.logFilename.c_str());
if(!logfileName.empty())
{
nvprintSetLogFileName(logfileName.c_str());
}
}
else if(param == m_paramCfg || param == m_paramBat)
{
parseConfigFile(m_config.configFilename.c_str());
}
else if(param == m_paramWinsize)
{
if(m_internal)
{
setWindowSize(m_config.winsize[0], m_config.winsize[1]);
}
}
if(!m_active)
return;
if(param == m_paramVsync)
{
setVsync(m_config.vsyncstate);
}
else if(param == m_paramScreenshot)
{
std::string filename = specialStrings(m_config.screenshotFilename.c_str());
if(!filename.empty())
{
screenshot(filename.c_str());
}
}
else if(param == m_paramClear)
{
clear(m_config.clearColor[0], m_config.clearColor[1], m_config.clearColor[2]);
}
}
void AppWindowProfiler::setupParameters()
{
nvh::ParameterList::Callback callback = [&](uint32_t param) { parameterCallback(param); };
m_paramWinsize = m_parameterList.add("winsize|Set window size (width and height)", m_config.winsize, callback, 2);
m_paramVsync = m_parameterList.add("vsync|Enable or disable vsync", &m_config.vsyncstate, callback);
m_paramLog = m_parameterList.addFilename("logfile|Set logfile", &m_config.logFilename, callback);
m_paramCfg = m_parameterList.addFilename(".cfg|load parameters from this config file", &m_config.configFilename, callback);
m_paramBat = m_parameterList.addFilename(".bat|load parameters from this batch file", &m_config.configFilename, callback);
m_parameterList.add("winpos|Set window position (x and y)", m_config.winpos, nullptr, 2);
m_parameterList.add("frames|Set number of frames to render before exit", &m_config.frameLimit);
m_parameterList.add("timerprints|Set number of timerprints to do, before exit", &m_config.timerLimit);
m_parameterList.add("timerinterval|Set interval of timer prints in seconds", &m_config.intervalSeconds);
m_parameterList.add("bmpatexit|Set file to store a bitmap image of the last frame at exit", &m_config.dumpatexitFilename);
m_parameterList.addFilename("benchmark|Set benchmark filename", &m_benchmark.filename);
m_parameterList.add("benchmarkframes|Set number of benchmarkframes", &m_benchmark.frameLength);
m_parameterList.add("quickexit|skips tear down", &m_config.quickexit);
m_paramScreenshot = m_parameterList.add("screenshot|makes a screenshot into this file", &m_config.screenshotFilename, callback);
m_paramClear = m_parameterList.add("clear|clears window color (r,g,b in 0-255) using OS", m_config.clearColor, callback, 3);
}
void AppWindowProfiler::exitScreenshot()
{
if(!m_config.dumpatexitFilename.empty() && !m_hadScreenshot)
{
screenshot(m_config.dumpatexitFilename.c_str());
m_hadScreenshot = true;
}
}
void AppWindowProfiler::initBenchmark()
{
if(m_benchmark.filename.empty())
return;
m_benchmark.content = loadFile(m_benchmark.filename.c_str(), false);
if(!m_benchmark.content.empty())
{
std::vector<const char*> tokens;
ParameterList::tokenizeString(m_benchmark.content, tokens);
std::string path = getFilePath(m_benchmark.filename.c_str());
m_benchmark.sequence.init(&m_parameterList, tokens);
// do first iteration manually, due to custom arg parsing
uint32_t argBegin;
uint32_t argCount;
if(!m_benchmark.sequence.advanceIteration("benchmark", 1, argBegin, argCount))
{
parseConfig(argCount, &tokens[argBegin], path);
}
m_profiler.reset(nvh::Profiler::CONFIG_DELAY);
m_benchmark.frame = 0;
m_profilerPrint = false;
}
}
void AppWindowProfiler::advanceBenchmark()
{
if(!m_benchmark.sequence.isActive())
return;
m_benchmark.frame++;
if(m_benchmark.frame > m_benchmark.frameLength + nvh::Profiler::CONFIG_DELAY + nvh::Profiler::FRAME_DELAY)
{
m_benchmark.frame = 0;
std::string stats;
m_profiler.print(stats);
LOGI("BENCHMARK %d \"%s\" {\n", m_benchmark.sequence.getIteration(), m_benchmark.sequence.getSeparatorArg(0));
LOGI("%s}\n\n", stats.c_str());
bool done = m_benchmark.sequence.applyIteration("benchmark", 1, "-");
m_profiler.reset(nvh::Profiler::CONFIG_DELAY);
postBenchmarkAdvance();
if(done)
{
leave();
}
}
}
} // namespace nvh


@ -0,0 +1,252 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef NV_PROJECTBASE_INCLUDED
#define NV_PROJECTBASE_INCLUDED
#include <nvpwindow.hpp>
#include <string.h> // for memset
#include "parametertools.hpp"
#include "profiler.hpp"
namespace nvh {
/** @DOC_START
# class nvh::AppWindowProfiler
nvh::AppWindowProfiler provides an alternative utility wrapper class around NVPWindow.
It is a convenient base class for single-window applications and is used by some,
but not all, nvpro-samples.
It provides the following additional functionality:
- built-in profiler/timer reporting to console
- command-line argument parsing as well as config file parsing using the ParameterTools
(see AppWindowProfiler::setupParameters() for built-in commands)
- benchmark/automation mode using ParameterTools
- screenshot creation
- logfile based on devicename (depends on context)
- optional context/swapchain interface

The derived classes nvvk/appwindowprofiler_vk and nvgl/appwindowprofiler_gl make use of this.
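
As a rough illustration of the sample-related callbacks, the sketch below mimics the order
in which run() invokes begin(), think() and end(). MiniApp and its fixed frame-count loop are
hypothetical stand-ins written for this example only; the real class drives the loop from
window events and also handles configuration, profiling and context setup.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in that only mimics the begin/think/end hooks.
class MiniApp
{
public:
  virtual ~MiniApp() {}
  virtual bool begin() { return false; }  // setup, return true on success
  virtual void think(double time) {}      // per-frame logic and drawing
  virtual void end() {}                   // tear down

  // Simplified loop: renders a fixed number of frames instead of polling events.
  int run(int frames)
  {
    assert(frames >= 0);
    bool ok = begin();
    for(int i = 0; ok && i < frames; i++)
    {
      think(double(i));
    }
    end();
    return ok ? 0 : 1;
  }
};

// A sample overrides the hooks; here it only records the call order.
class Sample : public MiniApp
{
public:
  std::vector<std::string> calls;

  bool begin() override
  {
    calls.push_back("begin");
    return true;
  }
  void think(double) override { calls.push_back("think"); }
  void end() override { calls.push_back("end"); }
};
```

With the real class, a sample would derive from nvh::AppWindowProfiler (or one of the
derived context classes) and call run() from main.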
@DOC_END */
#define NV_PROFILE_BASE_SECTION(name) nvh::Profiler::Section _tempTimer(m_profiler, name)
#define NV_PROFILE_BASE_SPLIT() m_profiler.accumulationSplit()
class AppWindowProfiler : public NVPWindow
{
public:
class WindowState
{
public:
WindowState()
: m_mouseButtonFlags(0)
, m_mouseWheel(0)
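{
// zero all state arrays so that nothing is read uninitialized before the first event arrives
memset(m_winSize, 0, sizeof(m_winSize));
memset(m_swapSize, 0, sizeof(m_swapSize));
memset(m_mouseCurrent, 0, sizeof(m_mouseCurrent));
memset(m_keyPressed, 0, sizeof(m_keyPressed));
memset(m_keyToggled, 0, sizeof(m_keyToggled));
}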
int m_winSize[2];
int m_swapSize[2];
int m_mouseCurrent[2];
int m_mouseButtonFlags;
int m_mouseWheel;
bool m_keyPressed[KEY_LAST + 1];
bool m_keyToggled[KEY_LAST + 1];
bool onPress(int key) { return m_keyPressed[key] && m_keyToggled[key]; }
};
//////////////////////////////////////////////////////////////////////////
WindowState m_windowState;
nvh::Profiler m_profiler;
bool m_profilerPrint;
bool m_hadProfilerPrint;
bool m_timeInTitle;
ParameterList m_parameterList;
AppWindowProfiler(bool deprecated = true)
: m_profilerPrint(true)
, m_hadProfilerPrint(false)
, m_timeInTitle(true)
, m_active(false)
, m_vsync(false)
, m_hadScreenshot(false)
{
setupParameters();
}
// Sample Related
//////////////////////////////////////////////////////////////////////////
// setup sample (this is executed after window/context creation)
virtual bool begin() { return false; }
// tear down sample (triggered by ESC/window close)
virtual void end() {}
// do primary logic/drawing etc. here
virtual void think(double time) {}
// react on swapchain resizes here
// may be different to winWidth/winHeight!
virtual void resize(int swapWidth, int swapHeight) {}
// return true to prevent m_windowState updates
virtual bool mouse_pos(int x, int y) { return false; }
virtual bool mouse_button(int button, int action) { return false; }
virtual bool mouse_wheel(int wheel) { return false; }
virtual bool key_button(int button, int action, int modifier) { return false; }
virtual bool key_char(int button) { return false; }
virtual void parseConfig(int argc, const char** argv, const std::string& path)
{
// if you want to handle parameters not represented in
// m_parameterList then override this function accordingly.
m_parameterList.applyTokens(argc, argv, "-", path.c_str());
// This function is called before "begin" and provided with the commandline used in "run".
// It can also be called by the benchmarking system, and parseConfigFile.
}
virtual bool validateConfig()
{
// override if you want to test the state of app after parsing configs
// returning false terminates app
return true;
}
// additional special-purpose callbacks
virtual void postProfiling() {}
virtual void postEnd() {}
virtual void postBenchmarkAdvance() {}
virtual void postConfigPreContext() {}
//////////////////////////////////////////////////////////////////////////
// initial kickoff (typically called from main)
int run(const std::string& name, int argc, const char** argv, int width, int height, bool requireGLContext);
void leave();
void parseConfigFile(const char* filename);
// handles special strings (returns empty string if
// could not do the replacement properly)
// known specials:
// $DEVICE$
std::string specialStrings(const char* original);
void setVsync(bool state);
bool getVsync() const { return m_vsync; }
//////////////////////////////////////////////////////////////////////////
// Context Window (if desired, not mandatory )
//
// Used when deriving from this class for the purpose of providing 3D Api contexts
// nvvk/appwindowprofiler_vk or nvgl/appwindowprofiler_gl make use of this.
virtual void contextInit() {}
virtual void contextDeinit() {}
virtual void contextSync() {}
virtual const char* contextGetDeviceName() { return NULL; }
virtual void swapResize(int winWidth, int winHeight)
{
m_windowState.m_swapSize[0] = winWidth;
m_windowState.m_swapSize[1] = winHeight;
}
virtual void swapPrepare() {}
virtual void swapBuffers() {}
virtual void swapVsync(bool state) {}
//////////////////////////////////////////////////////////////////////////
// inherited from NVPWindow, don't use them directly, use the "Sample-related" ones
void onWindowClose() override;
void onWindowResize(int w, int h) override;
void onWindowRefresh() override {} // leave empty, we call redraw ourselves in think
void onMouseMotion(int x, int y) override;
void onMouseWheel(int delta) override;
void onMouseButton(MouseButton button, ButtonAction action, int mods, int x, int y) override;
void onKeyboard(KeyCode key, ButtonAction action, int mods, int x, int y) override;
void onKeyboardChar(unsigned char key, int mods, int x, int y) override;
private:
struct Benchmark
{
std::string filename;
std::string content;
nvh::ParameterSequence sequence;
uint32_t frameLength = 256;
uint32_t frame = 0;
};
struct Config
{
int32_t winpos[2];
int32_t winsize[2];
bool vsyncstate = true;
bool quickexit = false;
uint32_t intervalSeconds = 2;
uint32_t frameLimit = 0;
uint32_t timerLimit = 0;
std::string dumpatexitFilename;
std::string screenshotFilename;
std::string logFilename;
std::string configFilename;
uint32_t clearColor[3] = {127, 0, 0};
Config()
{
winpos[0] = 50;
winpos[1] = 50;
winsize[0] = 0;
winsize[1] = 0;
}
};
void parameterCallback(uint32_t param);
void setupParameters();
void exitScreenshot();
void initBenchmark();
void advanceBenchmark();
bool m_activeContext = false;
bool m_active = false;
bool m_vsync;
bool m_hadScreenshot;
Config m_config;
Benchmark m_benchmark;
uint32_t m_paramWinsize;
uint32_t m_paramVsync;
uint32_t m_paramScreenshot;
uint32_t m_paramLog;
uint32_t m_paramCfg;
uint32_t m_paramBat;
uint32_t m_paramClear;
};
} // namespace nvh
#endif


@ -0,0 +1,218 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#include "bitarray.hpp"
namespace nvh {
/** \brief Create a new, empty BitArray.
**/
BitArray::BitArray()
: m_size(0)
, m_bits(NULL)
{
}
/** \brief Create a new BitArray with all bits set to false
\param size Number of bits in the array
**/
BitArray::BitArray(size_t size)
: m_size(size)
, m_bits(new BitStorageType[determineNumberOfElements()])
{
clear();
}
BitArray::BitArray(const BitArray& rhs)
: m_size(rhs.m_size)
, m_bits(new BitStorageType[determineNumberOfElements()])
{
std::copy(rhs.m_bits, rhs.m_bits + determineNumberOfElements(), m_bits);
}
BitArray::~BitArray()
{
delete[] m_bits;
}
void BitArray::resize(size_t newSize, bool defaultValue)
{
// if the default value for the new bits is true, first enable the unused bits in the last element
if(defaultValue)
{
setUnusedBits();
}
size_t oldNumberOfElements = determineNumberOfElements();
m_size = newSize;
size_t newNumberOfElements = determineNumberOfElements();
// the number of elements has changed, reallocate array
if(oldNumberOfElements != newNumberOfElements)
{
BitStorageType* NV_RESTRICT newBits = new BitStorageType[newNumberOfElements];
if(newNumberOfElements < oldNumberOfElements)
{
std::copy(m_bits, m_bits + newNumberOfElements, newBits);
}
else
{
std::copy(m_bits, m_bits + oldNumberOfElements, newBits);
std::fill(newBits + oldNumberOfElements, newBits + newNumberOfElements,
defaultValue ? ~BitStorageType(0) : BitStorageType(0));
}
delete[] m_bits;
m_bits = newBits;
}
clearUnusedBits();
}
BitArray& BitArray::operator=(const BitArray& rhs)
{
if(m_size != rhs.m_size)
{
m_size = rhs.m_size;
delete[] m_bits;
m_bits = new BitStorageType[determineNumberOfElements()];
}
std::copy(rhs.m_bits, rhs.m_bits + determineNumberOfElements(), m_bits);
return *this;
}
bool BitArray::operator==(const BitArray& rhs)
{
return (m_size == rhs.m_size) ? std::equal(m_bits, m_bits + determineNumberOfElements(), rhs.m_bits) : false;
}
BitArray BitArray::operator^(BitArray const& rhs)
{
NV_ASSERT(getSize() == rhs.getSize());
BitArray result(getSize());
for(size_t index = 0; index < determineNumberOfElements(); ++index)
{
result.m_bits[index] = m_bits[index] ^ rhs.m_bits[index];
}
result.clearUnusedBits();
return result;
}
BitArray BitArray::operator|(BitArray const& rhs)
{
NV_ASSERT(getSize() == rhs.getSize());
BitArray result(getSize());
for(size_t index = 0; index < determineNumberOfElements(); ++index)
{
result.m_bits[index] = m_bits[index] | rhs.m_bits[index];
}
result.clearUnusedBits();
return result;
}
BitArray BitArray::operator&(BitArray const& rhs)
{
NV_ASSERT(getSize() == rhs.getSize());
BitArray result(getSize());
for(size_t index = 0; index < determineNumberOfElements(); ++index)
{
result.m_bits[index] = m_bits[index] & rhs.m_bits[index];
}
result.clearUnusedBits();
return result;
}
BitArray& BitArray::operator^=(BitArray const& rhs)
{
NV_ASSERT(getSize() == rhs.getSize());
for(size_t index = 0; index < determineNumberOfElements(); ++index)
{
m_bits[index] ^= rhs.m_bits[index];
}
clearUnusedBits();
return *this;
}
BitArray& BitArray::operator|=(BitArray const& rhs)
{
NV_ASSERT(getSize() == rhs.getSize());
for(size_t index = 0; index < determineNumberOfElements(); ++index)
{
m_bits[index] |= rhs.m_bits[index];
}
return *this;
}
BitArray& BitArray::operator&=(BitArray const& rhs)
{
NV_ASSERT(getSize() == rhs.getSize());
for(size_t index = 0; index < determineNumberOfElements(); ++index)
{
m_bits[index] &= rhs.m_bits[index];
}
return *this;
}
void BitArray::clear()
{
std::fill(m_bits, m_bits + determineNumberOfElements(), 0);
}
void BitArray::fill()
{
if(determineNumberOfElements())
{
std::fill(m_bits, m_bits + determineNumberOfElements(), ~0);
clearUnusedBits();
}
}
size_t BitArray::countLeadingZeroes() const
{
size_t index = 0;
// first count the elements that are entirely zero
while(index < determineNumberOfElements() && !m_bits[index])
{
++index;
}
size_t leadingZeroes = index * StorageBitsPerElement;
if(index < determineNumberOfElements())
{
leadingZeroes += ctz(m_bits[index]);
}
return std::min(leadingZeroes, getSize());
}
} // namespace nvh


@ -0,0 +1,324 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef NV_BITARRAY_H__
#define NV_BITARRAY_H__
#include <algorithm>
#include <stddef.h>  // size_t
#include <stdint.h>  // uint32_t, uint64_t
#include <platform.h>
#if(defined(NV_X86) || defined(NV_X64)) && defined(_MSC_VER)
#include <intrin.h>
#endif
namespace nvh {
//////////////////////////////////////////////////////////////////////////
/** @DOC_START
# class nvh::BitArray
> The nvh::BitArray class implements a tightly packed boolean array using single bits stored in uint64_t values.
For large boolean arrays this representation is preferred for its cache efficiency.
The Visitor and OffsetVisitor traversal mechanisms make use of CPU intrinsics to speed up iteration over bits.
Example:
```cpp
BitArray modifiedObjects(1024);
// set some bits
modifiedObjects.setBit(24,true);
modifiedObjects.setBit(37,true);
// iterate over all set bits using the built-in traversal mechanism
struct MyVisitor {
void operator()( size_t index ){
// called with the index of a set bit
myObjects[index].update();
}
};
MyVisitor visitor;
modifiedObjects.traverseBits(visitor);
```
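
The traversal idea can also be sketched without intrinsics. The following stand-alone example
is an illustration under assumptions, not the library code: lowestSetBit, visitSetBits and
Collector are hypothetical names, and a plain loop takes the place of the bit-scan intrinsic.
It visits each set bit of one 64-bit element and adds a fixed offset, which is the role
OffsetVisitor plays for the individual elements of a BitArray.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Index of the lowest set bit; a portable replacement for a bit-scan intrinsic.
inline size_t lowestSetBit(uint64_t bits)
{
  assert(bits != 0);
  size_t index = 0;
  while((bits & 1) == 0)
  {
    bits >>= 1;
    ++index;
  }
  return index;
}

// Call visitor(offset + index) for every set bit, lowest index first.
template <typename Visitor>
void visitSetBits(uint64_t bits, size_t offset, Visitor& visitor)
{
  while(bits)
  {
    visitor(offset + lowestSetBit(bits));
    bits &= bits - 1;  // clear the lowest set bit so the next one is found
  }
}

// Example visitor that records the visited indices.
struct Collector
{
  std::vector<size_t> indices;
  void operator()(size_t index) { indices.push_back(index); }
};
```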
@DOC_END */
/** > Visitor which forwards the visitor operator with a fixed offset **/
template <typename Visitor>
struct OffsetVisitor
{
inline OffsetVisitor(Visitor& visitor, size_t offset)
: m_visitor(visitor)
, m_offset(offset)
{
}
inline void operator()(size_t index) { m_visitor(index + m_offset); }
private:
Visitor& m_visitor;
size_t m_offset;
};
#if(defined(NV_X86) || defined(NV_X64)) && defined(_MSC_VER)
template <typename Visitor>
inline void bitTraverse(uint32_t bits, Visitor& visitor)
{
unsigned long localIndex;
while(_BitScanForward(&localIndex, bits))
{
visitor(localIndex);
bits ^= uint32_t(1) << localIndex; // clear the current bit so that the next one is being found by the bitscan
}
}
template <typename Visitor>
inline void bitTraverse(uint64_t bits, Visitor& visitor)
{
unsigned long localIndex;
while(_BitScanForward64(&localIndex, bits))
{
visitor(localIndex);
bits ^= uint64_t(1) << localIndex; // clear the current bit so that the next one is being found by the bitscan
}
}
inline size_t ctz(uint64_t bits)
{
unsigned long localIndex;
return _BitScanForward64(&localIndex, bits) ? localIndex : 64;
}
inline size_t ctz(uint32_t bits)
{
unsigned long localIndex;
return _BitScanForward(&localIndex, bits) ? localIndex : 32;
}
#else
inline size_t ctz(uint64_t bits)
{
// use the long long variant: long is only 32 bits wide on some platforms
return (bits != 0) ? __builtin_ctzll(bits) : 64;
}
inline size_t ctz(uint32_t bits)
{
return (bits != 0) ? __builtin_ctz(bits) : 32;
}
// TODO implement GCC version!
template <typename BitType, typename Visitor>
inline void bitTraverse(BitType bits, Visitor visitor)
{
size_t index = 0;
while(bits)
{
if(bits & 0xff) // skip ifs if the byte is 0
{
if(bits & 0x01)
visitor(index + 0);
if(bits & 0x02)
visitor(index + 1);
if(bits & 0x04)
visitor(index + 2);
if(bits & 0x08)
visitor(index + 3);
if(bits & 0x10)
visitor(index + 4);
if(bits & 0x20)
visitor(index + 5);
if(bits & 0x40)
visitor(index + 6);
if(bits & 0x80)
visitor(index + 7);
}
bits >>= 8;
index += 8;
}
}
#endif
/** > Call visitor(index) for each bit set **/
template <typename BitType, typename Visitor>
inline void bitTraverse(BitType* elements, size_t numberOfElements, Visitor& visitor)
{
size_t baseIndex = 0;
for(size_t elementIndex = 0; elementIndex < numberOfElements; ++elementIndex)
{
OffsetVisitor<Visitor> offsetVisitor(visitor, baseIndex);
bitTraverse(elements[elementIndex], offsetVisitor);
baseIndex += sizeof(*elements) * 8;
}
}
class BitArray
{
public:
typedef uint64_t BitStorageType;
enum
{
StorageBitsPerElement = sizeof(BitStorageType) * 8
};
BitArray();
BitArray(size_t size);
BitArray(const BitArray& rhs);
~BitArray();
BitArray& operator=(const BitArray& rhs);
bool operator==(const BitArray& rhs);
BitArray operator^(BitArray const& rhs);
BitArray operator&(BitArray const& rhs);
BitArray operator|(BitArray const& rhs);
BitArray& operator^=(BitArray const& rhs);
BitArray& operator&=(BitArray const& rhs);
BitArray& operator|=(BitArray const& rhs);
void clear();
void fill();
/** > Change the number of bits in this array. The state of the remaining bits is preserved.
New bits are initialized to defaultValue.
\param size New number of bits in this array
\param defaultValue The value assigned to the newly added bits (false if omitted)
**/
void resize(size_t size, bool defaultValue = false);
size_t getSize() const { return m_size; }
// inline functions
void enableBit(size_t index);
void disableBit(size_t index);
void setBit(size_t index, bool value);
bool getBit(size_t index) const;
BitStorageType const* getBits() const;
template <typename Visitor>
void traverseBits(Visitor visitor);
size_t countLeadingZeroes() const;
private:
size_t m_size;
BitStorageType* NV_RESTRICT m_bits;
void determineBitPosition(size_t index, size_t& element, size_t& bit) const;
size_t determineNumberOfElements() const;
/** > Clear the last unused bits in the last element.
\remarks Clears bits whose index is >= m_size. These bits are traversed unconditionally and would otherwise produce invalid results.
The shift amount is restricted to the range 0 to StorageBitsPerElement - 1 to handle the case usedBitsInLastElement == 0,
which would otherwise result in a shift by StorageBitsPerElement; that is undefined by the standard and not the desired operation.
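
The mask arithmetic described in the remarks can be sketched in isolation. This is a minimal
example assuming 64-bit storage elements, as BitStorageType uses; lastElementMask is a
hypothetical helper name that does not exist in this class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mask that keeps only the used bits of the last 64-bit element.
// Masking the shift amount with 63 turns the case "size is a multiple of 64"
// into a shift by 0 (keep all bits) instead of an undefined shift by 64.
inline uint64_t lastElementMask(size_t size)
{
  assert(size > 0);
  const size_t bitsPerElement = 64;
  size_t usedBits = size % bitsPerElement;
  return ~uint64_t(0) >> ((bitsPerElement - usedBits) & (bitsPerElement - 1));
}
```

For a size of 3 the mask is 0x7, and for a size of 64 or 128 all bits of the last element are kept.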
**/
void clearUnusedBits();
/** > Set the last unused bits in the last element.
\remarks Sets bits whose index is >= m_size. This is required when expanding the array with the new bits set to true.
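
A stand-alone sketch of the complementary mask, again assuming 64-bit storage elements;
unusedBitsMask is a hypothetical helper name. The sketch treats a completely used last element
as having no unused bits, so its mask is empty in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mask that selects the bits of the last 64-bit element lying beyond `size`.
inline uint64_t unusedBitsMask(size_t size)
{
  assert(size > 0);
  size_t usedBits = size % 64;
  // usedBits == 0 means the last element is completely used
  return usedBits ? (~uint64_t(0) << usedBits) : uint64_t(0);
}
```

OR-ing this mask into the last element before growing the array makes the formerly unused
bits read as true once they become part of the valid range.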
**/
void setUnusedBits();
};
/** > Determine the element / bit for the given index **/
inline void BitArray::determineBitPosition(size_t index, size_t& element, size_t& bit) const
{
element = index / StorageBitsPerElement;
bit = index % StorageBitsPerElement;
}
inline size_t BitArray::determineNumberOfElements() const
{
return (m_size + StorageBitsPerElement - 1) / StorageBitsPerElement;
}
inline void BitArray::enableBit(size_t index)
{
NV_ASSERT(index < m_size);
size_t element;
size_t bit;
determineBitPosition(index, element, bit);
m_bits[element] |= BitStorageType(1) << bit;
}
inline void BitArray::disableBit(size_t index)
{
NV_ASSERT(index < m_size);
size_t element;
size_t bit;
determineBitPosition(index, element, bit);
m_bits[element] &= ~(BitStorageType(1) << bit);
}
inline void BitArray::setBit(size_t index, bool value)
{
NV_ASSERT(index < m_size);
if(value)
{
enableBit(index);
}
else
{
disableBit(index);
}
}
inline BitArray::BitStorageType const* BitArray::getBits() const
{
return m_bits;
}
inline bool BitArray::getBit(size_t index) const
{
NV_ASSERT(index < m_size);
size_t element;
size_t bit;
determineBitPosition(index, element, bit);
return !!(m_bits[element] & (BitStorageType(1) << bit));
}
/** > Call visitor(size_t index) for every bit that is set. **/
template <typename Visitor>
inline void BitArray::traverseBits(Visitor visitor)
{
bitTraverse(m_bits, determineNumberOfElements(), visitor);
}
inline void BitArray::clearUnusedBits()
{
if(m_size)
{
size_t usedBitsInLastElement = m_size % StorageBitsPerElement;
m_bits[determineNumberOfElements() - 1] &=
~BitStorageType(0) >> ((StorageBitsPerElement - usedBitsInLastElement) & (StorageBitsPerElement - 1));
}
}
inline void BitArray::setUnusedBits()
{
if(m_size)
{
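size_t usedBitsInLastElement = m_size % StorageBitsPerElement;
// A remainder of 0 means the last element is fully used, so there are no unused bits to set.
if(usedBitsInLastElement)
{
m_bits[determineNumberOfElements() - 1] |= ~BitStorageType(0) << usedBitsInLastElement;
}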
}
}
} // namespace nvh
#endif
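
The masking in `clearUnusedBits()` can be tried in isolation. The following is a minimal sketch, assuming 32-bit storage elements; `kBitsPerElement`, `numberOfElements` and `lastElementMask` are hypothetical names used for illustration and are not part of `nvh::BitArray`. It shows why the shift amount is wrapped with `& (kBitsPerElement - 1)`: when the size is an exact multiple of the element width, the unwrapped shift would equal the element width, which is undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 32-bit storage elements. The real BitStorageType is defined elsewhere in bitarray.hpp.
const std::size_t kBitsPerElement = 32;

// Number of storage elements needed to hold `size` bits (rounds up).
inline std::size_t numberOfElements(std::size_t size)
{
  return (size + kBitsPerElement - 1) / kBitsPerElement;
}

// Mask that keeps only the bits of the last element whose index is < size.
// When size is a multiple of kBitsPerElement, the shift wraps to 0 and the whole element is kept.
inline std::uint32_t lastElementMask(std::size_t size)
{
  const std::size_t used = size % kBitsPerElement;
  return ~std::uint32_t(0) >> ((kBitsPerElement - used) & (kBitsPerElement - 1));
}
```

For example, a 5-bit array keeps the low five bits of its single element, while a 32-bit array keeps the full element.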

@ -0,0 +1,121 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
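#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_access.hpp>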
namespace nvh {
/* @DOC_START
```nvh::Bbox``` is a class to create bounding boxes.
It grows by inserting 3D points and can be merged with other bounding boxes.
It also reports information such as its minimum, maximum, extents, center and radius.
@DOC_END */
struct Bbox
{
Bbox() = default;
Bbox(glm::vec3 _min, glm::vec3 _max)
: m_min(_min)
, m_max(_max)
{
}
Bbox(const std::vector<glm::vec3>& corners)
{
for(auto& c : corners)
{
insert(c);
}
}
void insert(const glm::vec3& v)
{
m_min = {std::min(m_min.x, v.x), std::min(m_min.y, v.y), std::min(m_min.z, v.z)};
m_max = {std::max(m_max.x, v.x), std::max(m_max.y, v.y), std::max(m_max.z, v.z)};
}
void insert(const Bbox& b)
{
insert(b.m_min);
insert(b.m_max);
}
inline Bbox& operator+=(float v)
{
m_min -= v;
m_max += v;
return *this;
}
inline bool isEmpty() const
{
return m_min == glm::vec3{std::numeric_limits<float>::max()} || m_max == glm::vec3{std::numeric_limits<float>::lowest()};
}
inline uint32_t rank() const
{
uint32_t result{0};
result += m_min.x < m_max.x;
result += m_min.y < m_max.y;
result += m_min.z < m_max.z;
return result;
}
inline bool isPoint() const { return m_min == m_max; }
inline bool isLine() const { return rank() == 1u; }
inline bool isPlane() const { return rank() == 2u; }
inline bool isVolume() const { return rank() == 3u; }
inline glm::vec3 min() const { return m_min; }
inline glm::vec3 max() const { return m_max; }
inline glm::vec3 extents() const { return m_max - m_min; }
inline glm::vec3 center() const { return (m_min + m_max) * 0.5f; }
inline float radius() const { return glm::length(m_max - m_min) * 0.5f; }
Bbox transform(glm::mat4 mat) const
{
// Make sure this is a 3D transformation + translation:
auto r = glm::row(mat, 3);
const float epsilon = 1e-6f;
assert(fabs(r.x) < epsilon && fabs(r.y) < epsilon && fabs(r.z) < epsilon && fabs(r.w - 1.0f) < epsilon);
std::vector<glm::vec3> corners(8);
corners[0] = glm::vec3(mat * glm::vec4(m_min.x, m_min.y, m_min.z, 1.f));
corners[1] = glm::vec3(mat * glm::vec4(m_min.x, m_min.y, m_max.z, 1.f));
corners[2] = glm::vec3(mat * glm::vec4(m_min.x, m_max.y, m_min.z, 1.f));
corners[3] = glm::vec3(mat * glm::vec4(m_min.x, m_max.y, m_max.z, 1.f));
corners[4] = glm::vec3(mat * glm::vec4(m_max.x, m_min.y, m_min.z, 1.f));
corners[5] = glm::vec3(mat * glm::vec4(m_max.x, m_min.y, m_max.z, 1.f));
corners[6] = glm::vec3(mat * glm::vec4(m_max.x, m_max.y, m_min.z, 1.f));
corners[7] = glm::vec3(mat * glm::vec4(m_max.x, m_max.y, m_max.z, 1.f));
Bbox result(corners);
return result;
}
private:
glm::vec3 m_min{std::numeric_limits<float>::max()};
glm::vec3 m_max{-std::numeric_limits<float>::max()};
};
template <typename T, typename TFlag>
inline bool hasFlag(T a, TFlag flag)
{
return (a & flag) == flag;
}
} // namespace nvh
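
The way `Bbox` grows from its "empty" start state can be illustrated without glm. This is a sketch under stated assumptions: `Vec3` and `MiniBbox` are hypothetical stand-ins for `glm::vec3` and `nvh::Bbox`, reduced to `insert()` and `rank()`.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical stand-in for glm::vec3, kept dependency-free for illustration.
struct Vec3
{
  float x, y, z;
};

// Hypothetical reduced version of nvh::Bbox.
struct MiniBbox
{
  // "Empty" start state: min at +max and max at lowest, so the first insert sets both corners.
  Vec3 lo{std::numeric_limits<float>::max(), std::numeric_limits<float>::max(), std::numeric_limits<float>::max()};
  Vec3 hi{std::numeric_limits<float>::lowest(), std::numeric_limits<float>::lowest(), std::numeric_limits<float>::lowest()};

  // Grow the box so that it contains v.
  void insert(const Vec3& v)
  {
    lo = {std::min(lo.x, v.x), std::min(lo.y, v.y), std::min(lo.z, v.z)};
    hi = {std::max(hi.x, v.x), std::max(hi.y, v.y), std::max(hi.z, v.z)};
  }

  // Number of axes along which the box has a positive extent (1 = line, 2 = plane, 3 = volume).
  int rank() const { return int(lo.x < hi.x) + int(lo.y < hi.y) + int(lo.z < hi.z); }
};
```

Inserting a single point yields rank 0, a second point that differs along one axis yields rank 1, and so on up to a full volume.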

@ -0,0 +1,236 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef NV_CAMCONTROL_INCLUDED
#define NV_CAMCONTROL_INCLUDED
#include <algorithm>
#include <cmath>
#include <glm/ext/matrix_transform.hpp>
#include <glm/gtx/euler_angles.hpp>
namespace nvh {
//////////////////////////////////////////////////////////////////////////
/** @DOC_START
# class nvh::CameraControl
> nvh::CameraControl is a utility class to create a viewmatrix based on mouse inputs.
It can operate in perspective or orthographic mode (`m_sceneOrtho==true`).
perspective:
- LMB: rotate
- RMB or WHEEL: zoom via dolly movement
- MMB: pan/move within camera plane
ortho:
- LMB: pan/move within camera plane
- RMB or WHEEL: zoom via dolly movement, application needs to use `m_sceneOrthoZoom` for projection matrix adjustment
- MMB: rotate
The camera can be orbiting (`m_useOrbit==true`) around `m_sceneOrbit` or
otherwise provide "first person/fly through"-like controls.
Speed of movement/rotation etc. is influenced by `m_sceneDimension` as well as the
sensitivity values.
@DOC_END */
class CameraControl
{
public:
CameraControl()
: m_lastButtonFlags(0)
, m_lastWheel(0)
, m_senseWheelZoom(0.05f / 120.0f)
, m_senseZoom(0.001f)
, m_senseRotate((glm::pi<float>() * 0.5f) / 256.0f)
, m_sensePan(1.0f)
, m_sceneOrbit(0.0f)
, m_sceneDimension(1.0f)
, m_sceneOrtho(false)
, m_sceneOrthoZoom(1.0f)
, m_useOrbit(true)
, m_sceneUp(0, 1, 0)
{
}
inline void processActions(const glm::ivec2& window, const glm::vec2& mouse, int mouseButtonFlags, int wheel)
{
int changed = m_lastButtonFlags ^ mouseButtonFlags;
m_lastButtonFlags = mouseButtonFlags;
int panFlag = m_sceneOrtho ? 1 << 0 : 1 << 2;
int zoomFlag = 1 << 1;
int rotFlag = m_sceneOrtho ? 1 << 2 : 1 << 0;
m_panning = !!(mouseButtonFlags & panFlag);
m_zooming = !!(mouseButtonFlags & zoomFlag);
m_rotating = !!(mouseButtonFlags & rotFlag);
m_zoomingWheel = wheel != m_lastWheel;
m_startZoomWheel = m_lastWheel;
m_lastWheel = wheel;
if(m_rotating)
{
m_panning = false;
m_zooming = false;
}
if(m_panning && (changed & panFlag))
{
// pan
m_startPan = mouse;
m_startMatrix = m_viewMatrix;
}
if(m_zooming && (changed & zoomFlag))
{
// zoom
m_startMatrix = m_viewMatrix;
m_startZoom = mouse;
m_startZoomOrtho = m_sceneOrthoZoom;
}
if(m_rotating && (changed & rotFlag))
{
// rotate
m_startRotate = mouse;
m_startMatrix = m_viewMatrix;
}
if(m_zooming || m_zoomingWheel)
{
float dist = m_zooming ? -(glm::dot(mouse - m_startZoom, glm::vec2(-1, 1)) * m_sceneDimension * m_senseZoom) :
(float(wheel - m_startZoomWheel) * m_sceneDimension * m_senseWheelZoom);
if(m_zoomingWheel)
{
m_startZoomOrtho = m_sceneOrthoZoom;
m_startMatrix = m_viewMatrix;
}
if(m_sceneOrtho)
{
float newzoom = m_startZoomOrtho - (dist);
if(m_zoomingWheel)
{
if(newzoom < 0)
{
m_sceneOrthoZoom *= 0.5;
}
else if(m_sceneOrthoZoom < std::abs(dist))
{
m_sceneOrthoZoom *= 2.0;
}
else
{
m_sceneOrthoZoom = newzoom;
}
}
else
{
m_sceneOrthoZoom = newzoom;
}
m_sceneOrthoZoom = std::max(0.0001f, m_sceneOrthoZoom);
}
else
{
glm::mat4 delta = glm::translate(glm::mat4(1), glm::vec3(0, 0, dist * 2.0f));
m_viewMatrix = delta * m_startMatrix;
}
}
if(m_panning)
{
float aspect = float(window.x) / float(window.y);
glm::vec3 winsize(window.x, window.y, 1.0f);
glm::vec3 ortho(m_sceneOrthoZoom * aspect, m_sceneOrthoZoom, 1.0f);
glm::vec3 sub(mouse - m_startPan, 0.0f);
sub /= winsize;
sub *= ortho;
sub.y *= -1.0;
if(!m_sceneOrtho)
{
sub *= m_sensePan * m_sceneDimension;
}
glm::mat4 delta = glm::translate(glm::mat4(1), sub);
m_viewMatrix = delta * m_startMatrix;
}
if(m_rotating)
{
float aspect = float(window.x) / float(window.y);
glm::vec2 angles = (mouse - m_startRotate) * m_senseRotate;
if(m_useOrbit)
{
glm::mat4 rot = glm::yawPitchRoll(angles.x, angles.y, 0.0f);
glm::vec3 center = glm::vec3(m_startMatrix * glm::vec4(m_sceneOrbit, 1.0f));
glm::mat4 delta = glm::translate(glm::mat4(1), center) * rot * glm::translate(glm::mat4(1), -center);
m_viewMatrix = delta * m_startMatrix;
}
else
{
// FIXME use sceneUP
glm::mat4 rot = glm::yawPitchRoll(angles.x, angles.y, 0.0f);
m_viewMatrix = rot * m_startMatrix;
}
}
}
bool m_useOrbit;
bool m_sceneOrtho;
float m_sceneOrthoZoom;
float m_sceneDimension;
glm::vec3 m_sceneUp;
glm::vec3 m_sceneOrbit;
glm::mat4 m_viewMatrix;
float m_senseWheelZoom;
float m_senseZoom;
float m_senseRotate;
float m_sensePan;
private:
bool m_zooming;
bool m_zoomingWheel;
bool m_panning;
bool m_rotating;
glm::vec2 m_startPan;
glm::vec2 m_startZoom;
glm::vec2 m_startRotate;
glm::mat4 m_startMatrix;
int m_startZoomWheel;
float m_startZoomOrtho;
int m_lastButtonFlags;
int m_lastWheel;
};
} // namespace nvh
#endif
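
The button handling at the top of `processActions()` may be easier to follow as a pure function. This sketch assumes the bit layout implied by the class documentation (bit 0 = LMB, bit 1 = RMB, bit 2 = MMB); `MouseButtonBits`, `ActiveActions` and `decodeButtons` are hypothetical names and not part of `nvh::CameraControl`.

```cpp
#include <cassert>

// Assumed bit layout, inferred from the class documentation: bit 0 = LMB, bit 1 = RMB, bit 2 = MMB.
enum MouseButtonBits
{
  kLmb = 1 << 0,
  kRmb = 1 << 1,
  kMmb = 1 << 2
};

struct ActiveActions
{
  bool panning;
  bool zooming;
  bool rotating;
};

// Maps the pressed buttons to actions. Pan and rotate swap buttons in orthographic mode.
inline ActiveActions decodeButtons(int mouseButtonFlags, bool sceneOrtho)
{
  const int panFlag  = sceneOrtho ? kLmb : kMmb;
  const int zoomFlag = kRmb;
  const int rotFlag  = sceneOrtho ? kMmb : kLmb;

  ActiveActions a;
  a.panning  = (mouseButtonFlags & panFlag) != 0;
  a.zooming  = (mouseButtonFlags & zoomFlag) != 0;
  a.rotating = (mouseButtonFlags & rotFlag) != 0;

  // Rotation takes priority over the other two actions.
  if(a.rotating)
  {
    a.panning = false;
    a.zooming = false;
  }
  return a;
}
```

Keeping the decoding free of state makes the priority rule (rotation suppresses pan and zoom) easy to check on its own.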

@ -0,0 +1,237 @@
/*
* Copyright (c) 2013-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2013-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
//--------------------------------------------------------------------
#pragma once
#include <nvh/nvprint.hpp>
#include <glm/glm.hpp>
#include <cmath>
#include "glm/gtc/matrix_transform.hpp"
/* @DOC_START
# struct InertiaCamera
> Struct that offers a camera moving with some inertia effect around a target point
InertiaCamera exposes a mix of pseudo polar rotation around a target point and
some other movements to translate the target point, zoom in and out.
Either the keyboard or mouse can be used for all of the moves.
@DOC_END */
struct InertiaCamera
{
glm::vec3 curEyePos, curFocusPos, curObjectPos; ///< Current position of the motion
glm::vec3 eyePos, focusPos, objectPos; ///< target positions to reach
float tau; ///< acceleration factor in the motion function
float epsilon;
float eyeD;
float focusD;
float objectD;
glm::mat4 m4_view; ///< transformation matrix resulting from the computation
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
InertiaCamera(const glm::vec3 eye = glm::vec3(0.0f, 1.0f, -3.0f),
const glm::vec3 focus = glm::vec3(0, 0, 0),
const glm::vec3 object = glm::vec3(0, 0, 0))
{
epsilon = 0.001f;
tau = 0.2f;
curEyePos = eye;
eyePos = eye;
curFocusPos = focus;
focusPos = focus;
curObjectPos = object;
objectPos = object;
eyeD = 0.0f;
focusD = 0.0f;
objectD = 0.0f;
m4_view = glm::mat4(1);
glm::mat4 Lookat = glm::lookAt(curEyePos, curFocusPos, glm::vec3(0, 1, 0));
m4_view *= Lookat;
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void rotateH(float s, bool bPan = false)
{
glm::vec3 p = eyePos;
glm::vec3 o = focusPos;
glm::vec3 po = p - o;
float l = glm::length(po);
glm::vec3 dv = glm::cross(po, glm::vec3(0, 1, 0));
dv *= s;
p += dv;
po = p - o;
float l2 = glm::length(po);
l = l2 - l;
p -= (l / l2) * (po);
eyePos = p;
if(bPan)
focusPos += dv;
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void rotateV(float s, bool bPan = false)
{
glm::vec3 p = eyePos;
glm::vec3 o = focusPos;
glm::vec3 po = p - o;
float l = glm::length(po);
glm::vec3 dv = glm::cross(po, glm::vec3(0, -1, 0));
dv = glm::normalize(dv);
glm::vec3 dv2 = glm::cross(po, dv);
dv2 *= s;
p += dv2;
po = p - o;
float l2 = glm::length(po);
if(bPan)
focusPos += dv2;
// protect against gimbal lock
if(std::fabs(dot(po / l2, glm::vec3(0, 1, 0))) > 0.99)
return;
l = l2 - l;
p -= (l / l2) * (po);
eyePos = p;
}
//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void move(float s, bool bPan)
{
glm::vec3 p = eyePos;
glm::vec3 o = focusPos;
glm::vec3 po = p - o;
po *= s;
p -= po;
if(bPan)
focusPos -= po;
eyePos = p;
}
//------------------------------------------------------------------------------------
/// > simulation step to call with a proper time interval to update the animation
//------------------------------------------------------------------------------------
bool update(float dt)
{
if(dt > (1.0f / 60.0f))
dt = (1.0f / 60.0f);
bool bContinue = false;
static glm::vec3 eyeVel = glm::vec3(0, 0, 0);
static glm::vec3 eyeAcc = glm::vec3(0, 0, 0);
eyeD = glm::length(curEyePos - eyePos);
if(eyeD > epsilon)
{
bContinue = true;
glm::vec3 dV = curEyePos - eyePos;
eyeAcc = (-2.0f / tau) * eyeVel - dV / (tau * tau);
// integrate
eyeVel += eyeAcc * glm::vec3(dt, dt, dt);
curEyePos += eyeVel * glm::vec3(dt, dt, dt);
}
else
{
eyeVel = glm::vec3(0, 0, 0);
eyeAcc = glm::vec3(0, 0, 0);
}
static glm::vec3 focusVel = glm::vec3(0, 0, 0);
static glm::vec3 focusAcc = glm::vec3(0, 0, 0);
focusD = glm::length(curFocusPos - focusPos);
if(focusD > epsilon)
{
bContinue = true;
glm::vec3 dV = curFocusPos - focusPos;
focusAcc = (-2.0f / tau) * focusVel - dV / (tau * tau);
// integrate
focusVel += focusAcc * glm::vec3(dt, dt, dt);
curFocusPos += focusVel * glm::vec3(dt, dt, dt);
}
else
{
focusVel = glm::vec3(0, 0, 0);
focusAcc = glm::vec3(0, 0, 0);
}
static glm::vec3 objectVel = glm::vec3(0, 0, 0);
static glm::vec3 objectAcc = glm::vec3(0, 0, 0);
objectD = glm::length(curObjectPos - objectPos);
if(objectD > epsilon)
{
bContinue = true;
glm::vec3 dV = curObjectPos - objectPos;
objectAcc = (-2.0f / tau) * objectVel - dV / (tau * tau);
// integrate
objectVel += objectAcc * glm::vec3(dt, dt, dt);
curObjectPos += objectVel * glm::vec3(dt, dt, dt);
}
else
{
objectVel = glm::vec3(0, 0, 0);
objectAcc = glm::vec3(0, 0, 0);
}
//
// Camera View matrix
//
glm::vec3 up(0, 1, 0);
m4_view = glm::mat4(1);
glm::mat4 Lookat = glm::lookAt(curEyePos, curFocusPos, up);
m4_view *= Lookat;
return bContinue;
}
//------------------------------------------------------------------------------
/// > Call this function to update the camera position and the target position
/// \arg *reset* set to true will directly update the actual positions without
/// performing the animation for transitioning.
//------------------------------------------------------------------------------
void look_at(const glm::vec3& eye, const glm::vec3& center /*, const glm::vec3& up*/, bool reset = false)
{
eyePos = eye;
focusPos = center;
if(reset)
{
curEyePos = eye;
curFocusPos = center;
glm::vec3 up(0, 1, 0);
m4_view = glm::mat4(1);
glm::mat4 Lookat = glm::lookAt(curEyePos, curFocusPos, up);
m4_view *= Lookat;
}
}
//------------------------------------------------------------------------------
/// > debug information of camera position and target position
/// Particularly useful to record a bunch of positions that can later be
/// reused as "recorded" presets
//------------------------------------------------------------------------------
void print_look_at(bool cppLike = false)
{
if(cppLike)
{
LOGI("{glm::vec3(%.2f, %.2f, %.2f), glm::vec3(%.2f, %.2f, %.2f)},\n", eyePos.x, eyePos.y, eyePos.z, focusPos.x,
focusPos.y, focusPos.z);
}
else
{
LOGI("%.2f %.2f %.2f %.2f %.2f %.2f 0.0\n", eyePos.x, eyePos.y, eyePos.z, focusPos.x, focusPos.y, focusPos.z);
}
}
};
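
The inertia in `update()` comes from integrating a critically damped spring, `acc = (-2 / tau) * vel - d / (tau * tau)`, once per step. Below is a minimal one-dimensional sketch of that update; `Spring1D` and `stepsToSettle` are hypothetical names, not part of `InertiaCamera`. One deliberate difference from the original: the velocity is stored as a member rather than a function-local `static`, so that several instances can animate independently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical 1D version of the per-vector update in InertiaCamera::update().
struct Spring1D
{
  float cur     = 0.0f;    // current value (plays the role of curEyePos)
  float goal    = 0.0f;    // value to reach (plays the role of eyePos)
  float vel     = 0.0f;    // current velocity
  float tau     = 0.2f;    // time constant of the motion
  float epsilon = 0.001f;  // distance below which the motion is considered finished

  // Advances one step; returns true while the value is still moving toward the goal.
  bool step(float dt)
  {
    dt = std::min(dt, 1.0f / 60.0f);  // same clamp as the original
    const float d = cur - goal;
    if(std::fabs(d) <= epsilon)
    {
      vel = 0.0f;
      return false;
    }
    const float acc = (-2.0f / tau) * vel - d / (tau * tau);
    vel += acc * dt;  // integrate velocity first, then position
    cur += vel * dt;
    return true;
  }
};

// Runs step() until it reports completion or maxSteps is reached; returns the number of moving steps.
inline int stepsToSettle(Spring1D& s, float dt, int maxSteps)
{
  int steps = 0;
  while(steps < maxSteps && s.step(dt))
    ++steps;
  return steps;
}
```

Setting `goal` and calling `step()` once per frame moves `cur` smoothly toward the goal until it is within `epsilon`.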

@ -0,0 +1,564 @@
/*
* Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2018-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
//--------------------------------------------------------------------
#include "cameramanipulator.hpp"
#include <algorithm>
#include <chrono>
#include <cmath>
#include <iostream>
#include <nvpwindow.hpp>
namespace nvh {
//--------------------------------------------------------------------------------------------------
//
//
CameraManipulator::CameraManipulator()
{
update();
}
//--------------------------------------------------------------------------------------------------
// Set the new camera as a goal
//
void CameraManipulator::setCamera(Camera camera, bool instantSet /*=true*/)
{
m_anim_done = true;
if(instantSet)
{
m_current = camera;
update();
}
else if(camera != m_current)
{
m_goal = camera;
m_snapshot = m_current;
m_anim_done = false;
m_start_time = getSystemTime();
findBezierPoints();
}
}
//--------------------------------------------------------------------------------------------------
// Creates a viewing matrix derived from an eye point, a reference point indicating the center of
// the scene, and an up vector
//
void CameraManipulator::setLookat(const glm::vec3& eye, const glm::vec3& center, const glm::vec3& up, bool instantSet)
{
Camera camera{eye, center, up, m_current.fov};
setCamera(camera, instantSet);
}
//-----------------------------------------------------------------------------
// Get the current camera's look-at parameters.
void CameraManipulator::getLookat(glm::vec3& eye, glm::vec3& center, glm::vec3& up) const
{
eye = m_current.eye;
center = m_current.ctr;
up = m_current.up;
}
//--------------------------------------------------------------------------------------------------
// Pan the camera perpendicularly to the line of sight.
//
void CameraManipulator::pan(float dx, float dy)
{
if(m_mode == Fly)
{
dx *= -1;
dy *= -1;
}
glm::vec3 z(m_current.eye - m_current.ctr);
float length = static_cast<float>(glm::length(z)) / 0.785f; // 45 degrees
z = glm::normalize(z);
glm::vec3 x = glm::cross(m_current.up, z);
glm::vec3 y = glm::cross(z, x);
x = glm::normalize(x);
y = glm::normalize(y);
glm::vec3 panVector = (-dx * x + dy * y) * length;
m_current.eye += panVector;
m_current.ctr += panVector;
}
//--------------------------------------------------------------------------------------------------
// Orbit the camera around the center of interest. If 'invert' is true,
// then the camera stays in place and the interest point orbits around the camera.
//
void CameraManipulator::orbit(float dx, float dy, bool invert)
{
if(dx == 0 && dy == 0)
return;
// Full width will do a full turn
dx *= glm::two_pi<float>();
dy *= glm::two_pi<float>();
// Get the camera
glm::vec3 origin(invert ? m_current.eye : m_current.ctr);
glm::vec3 position(invert ? m_current.ctr : m_current.eye);
// Get the length of sight
glm::vec3 centerToEye(position - origin);
float radius = glm::length(centerToEye);
centerToEye = glm::normalize(centerToEye);
glm::vec3 axe_z = centerToEye;
// Find the rotation around the UP axis (Y)
glm::mat4 rot_y = glm::rotate(glm::mat4(1), -dx, m_current.up);
// Apply the (Y) rotation to the eye-center vector
centerToEye = rot_y * glm::vec4(centerToEye, 0);
// Find the rotation around the X vector: cross between eye-center and up (X)
glm::vec3 axe_x = glm::normalize(glm::cross(m_current.up, axe_z));
glm::mat4 rot_x = glm::rotate(glm::mat4(1), -dy, axe_x);
// Apply the (X) rotation to the eye-center vector
glm::vec3 vect_rot = rot_x * glm::vec4(centerToEye, 0);
if(glm::sign(vect_rot.x) == glm::sign(centerToEye.x))
centerToEye = vect_rot;
// Make the vector as long as it was originally
centerToEye *= radius;
// Finding the new position
glm::vec3 newPosition = centerToEye + origin;
if(!invert)
{
m_current.eye = newPosition; // Normal: change the position of the camera
}
else
{
m_current.ctr = newPosition; // Inverted: change the interest point
}
}
//--------------------------------------------------------------------------------------------------
// Move the camera toward the interest point, but don't cross it
//
void CameraManipulator::dolly(float dx, float dy)
{
glm::vec3 z = m_current.ctr - m_current.eye;
float length = static_cast<float>(glm::length(z));
// We are at the point of interest, and don't know any direction, so do nothing!
if(length < 0.000001f)
return;
// Use the larger movement.
float dd;
if(m_mode != Examine)
dd = -dy;
else
dd = fabs(dx) > fabs(dy) ? dx : -dy;
float factor = m_speed * dd;
// Adjust speed based on distance.
if(m_mode == Examine)
{
// Don't move over the point of interest.
if(factor >= 1.0f)
return;
z *= factor;
}
else
{
// Normalize the Z vector and make it faster
z *= factor / length * 10.0f;
}
// Not going up
if(m_mode == Walk)
{
if(m_current.up.y > m_current.up.z)
z.y = 0;
else
z.z = 0;
}
m_current.eye += z;
// In fly and walk modes, the interest point moves with the camera.
if(m_mode != Examine)
m_current.ctr += z;
}
//--------------------------------------------------------------------------------------------------
// Modify the position of the camera over time
// - The camera can be updated through keys. A key sets a direction which is added to both
// eye and center, until the key is released
// - A new position of the camera is defined and the camera will reach that position
// over time.
void CameraManipulator::updateAnim()
{
auto elapse = static_cast<float>(getSystemTime() - m_start_time) / 1000.f;
// Key animation
if(m_key_vec != glm::vec3(0, 0, 0))
{
m_current.eye += m_key_vec * elapse;
m_current.ctr += m_key_vec * elapse;
update();
m_start_time = getSystemTime();
return;
}
// Camera moving to new position
if(m_anim_done)
return;
float t = std::min(elapse / float(m_duration), 1.0f);
// Evaluate polynomial (smoother step from Perlin)
t = t * t * t * (t * (t * 6.0f - 15.0f) + 10.0f);
if(t >= 1.0f)
{
m_current = m_goal;
m_anim_done = true;
return;
}
// Interpolate camera position and interest
// The distance between the camera and the interest point is preserved to
// create a nicer interpolation
m_current.ctr = glm::mix(m_snapshot.ctr, m_goal.ctr, t);
m_current.up = glm::mix(m_snapshot.up, m_goal.up, t);
m_current.eye = computeBezier(t, m_bezier[0], m_bezier[1], m_bezier[2]);
m_current.fov = glm::mix(m_snapshot.fov, m_goal.fov, t);
update();
}
//--------------------------------------------------------------------------------------------------
//
void CameraManipulator::setMatrix(const glm::mat4& matrix, bool instantSet, float centerDistance)
{
Camera camera;
camera.eye = matrix[3];
auto rotMat = glm::mat3(matrix);
camera.ctr = {0, 0, -centerDistance};
camera.ctr = camera.eye + (rotMat * camera.ctr);
camera.up = {0, 1, 0};
camera.fov = m_current.fov;
m_anim_done = instantSet;
if(instantSet)
{
m_current = camera;
}
else
{
m_goal = camera;
m_snapshot = m_current;
m_start_time = getSystemTime();
findBezierPoints();
}
update();
}
//--------------------------------------------------------------------------------------------------
//
//
void CameraManipulator::setMousePosition(int x, int y)
{
m_mouse = glm::vec2(x, y);
}
//--------------------------------------------------------------------------------------------------
//
//
void CameraManipulator::getMousePosition(int& x, int& y)
{
x = static_cast<int>(m_mouse.x);
y = static_cast<int>(m_mouse.y);
}
//--------------------------------------------------------------------------------------------------
//
//
void CameraManipulator::setWindowSize(int w, int h)
{
m_width = w;
m_height = h;
}
//--------------------------------------------------------------------------------------------------
//
// Low-level function called when the camera moves.
//
void CameraManipulator::motion(int x, int y, int action)
{
float dx = float(x - m_mouse[0]) / float(m_width);
float dy = float(y - m_mouse[1]) / float(m_height);
switch(action)
{
case Orbit:
orbit(dx, dy, false);
break;
case CameraManipulator::Dolly:
dolly(dx, dy);
break;
case CameraManipulator::Pan:
pan(dx, dy);
break;
case CameraManipulator::LookAround:
orbit(dx, -dy, true);
break;
}
// Resetting animation
m_anim_done = true;
update();
m_mouse[0] = static_cast<float>(x);
m_mouse[1] = static_cast<float>(y);
}
//
// Function called when the camera moves with keys (e.g. WASD).
//
void CameraManipulator::keyMotion(float dx, float dy, int action)
{
if(action == NoAction)
{
m_key_vec = {0, 0, 0};
return;
}
auto d = glm::normalize(m_current.ctr - m_current.eye);
dx *= m_speed * 2.f;
dy *= m_speed * 2.f;
glm::vec3 key_vec(0.0f);  // stays zero for actions other than Dolly and Pan
if(action == Dolly)
{
key_vec = d * dx;
if(m_mode == Walk)
{
if(m_current.up.y > m_current.up.z)
key_vec.y = 0;
else
key_vec.z = 0;
}
}
else if(action == Pan)
{
auto r = glm::cross(d, m_current.up);
key_vec = r * dx + m_current.up * dy;
}
m_key_vec += key_vec;
// Resetting animation
m_start_time = getSystemTime();
}
//--------------------------------------------------------------------------------------------------
// To be called when the mouse is moving
// It finds the appropriate camera operation, based on the mouse buttons pressed and the
// keyboard modifiers (shift, ctrl, alt)
//
// Returns the action that was activated
//
CameraManipulator::Actions CameraManipulator::mouseMove(int x, int y, const Inputs& inputs)
{
if(!inputs.lmb && !inputs.rmb && !inputs.mmb)
{
setMousePosition(x, y);
return NoAction; // no mouse button pressed
}
Actions curAction = NoAction;
if(inputs.lmb)
{
if(((inputs.ctrl) && (inputs.shift)) || inputs.alt)
curAction = m_mode == Examine ? LookAround : Orbit;
else if(inputs.shift)
curAction = Dolly;
else if(inputs.ctrl)
curAction = Pan;
else
curAction = m_mode == Examine ? Orbit : LookAround;
}
else if(inputs.mmb)
curAction = Pan;
else if(inputs.rmb)
curAction = Dolly;
if(curAction != NoAction)
motion(x, y, curAction);
return curAction;
}
//--------------------------------------------------------------------------------------------------
// Trigger a dolly when the wheel changes, or change the FOV if the shift key is pressed
//
void CameraManipulator::wheel(int value, const Inputs& inputs)
{
float fval(static_cast<float>(value));
float dx = (fval * fabsf(fval)) / static_cast<float>(m_width);
if(inputs.shift)
{
setFov(m_current.fov + fval);
}
else
{
dolly(dx * m_speed, dx * m_speed);
update();
}
}
// Set and clamp FOV between 0.01 and 179 degrees
void CameraManipulator::setFov(float _fov)
{
m_current.fov = std::min(std::max(_fov, 0.01f), 179.0f);
}
glm::vec3 CameraManipulator::computeBezier(float t, glm::vec3& p0, glm::vec3& p1, glm::vec3& p2)
{
float u = 1.f - t;
float tt = t * t;
float uu = u * u;
glm::vec3 p = uu * p0; // first term
p += 2 * u * t * p1; // second term
p += tt * p2; // third term
return p;
}
void CameraManipulator::findBezierPoints()
{
glm::vec3 p0 = m_current.eye;
glm::vec3 p2 = m_goal.eye;
glm::vec3 p1, pc;
// point of interest
glm::vec3 pi = (m_goal.ctr + m_current.ctr) * 0.5f;
glm::vec3 p02 = (p0 + p2) * 0.5f; // mid p0-p2
float radius = (length(p0 - pi) + length(p2 - pi)) * 0.5f; // Radius for p1
glm::vec3 p02pi(p02 - pi); // Vector from interest to mid point
p02pi = glm::normalize(p02pi);
p02pi *= radius;
pc = pi + p02pi; // Calculated point to go through
p1 = 2.f * pc - p0 * 0.5f - p2 * 0.5f; // Computing p1 for t=0.5
p1.y = p02.y; // Clamping the P1 to be in the same height as p0-p2
m_bezier[0] = p0;
m_bezier[1] = p1;
m_bezier[2] = p2;
}
//--------------------------------------------------------------------------------------------------
// Return the system time in milliseconds, with a fractional part at microsecond resolution
//
double CameraManipulator::getSystemTime()
{
auto now(std::chrono::system_clock::now());
auto duration = now.time_since_epoch();
return std::chrono::duration_cast<std::chrono::microseconds>(duration).count() / 1000.0;
}
//--------------------------------------------------------------------------------------------------
// Return a string which can be included in help dialogs
//
const std::string& CameraManipulator::getHelp()
{
static std::string helpText =
"LMB: rotate around the target\n"
"RMB: Dolly in/out\n"
"MMB: Pan along view plane\n"
"LMB + Shift: Dolly in/out\n"
"LMB + Ctrl: Pan\n"
"LMB + Alt: Look around\n"
"Mouse wheel: Dolly in/out\n"
"Mouse wheel + Shift: Zoom in/out\n";
return helpText;
}
//--------------------------------------------------------------------------------------------------
// Move the camera closer to or further from the center of the bounding box, to see it completely
//
// boxMin - lower corner of the bounding box
// boxMax - upper corner of the bounding box
// instantFit - true: set the new position, false: will animate to new position.
// tightFit - true: fit exactly the corners, false: fit to the bounding sphere radius (larger view, will not get closer or further away)
// aspect - aspect ratio of the window.
//
void CameraManipulator::fit(const glm::vec3& boxMin, const glm::vec3& boxMax, bool instantFit /*= true*/, bool tightFit /*=false*/, float aspect /*=1.0f*/)
{
// Calculate the half extents of the bounding box
const glm::vec3 boxHalfSize = 0.5f * (boxMax - boxMin);
// Calculate the center of the bounding box
const glm::vec3 boxCenter = 0.5f * (boxMin + boxMax);
const float yfov = tan(glm::radians(m_current.fov * 0.5f));
const float xfov = yfov * aspect;
// Calculate the ideal distance for a tight fit or fit to radius
float idealDistance = 0;
if(tightFit)
{
// Get only the rotation matrix
glm::mat3 mView = glm::lookAt(m_current.eye, boxCenter, m_current.up);
// Check each of the 8 corners of the box
for(int i = 0; i < 8; i++)
{
// Rotate the bounding box in the camera view
glm::vec3 vct(i & 1 ? boxHalfSize.x : -boxHalfSize.x, //
i & 2 ? boxHalfSize.y : -boxHalfSize.y, //
i & 4 ? boxHalfSize.z : -boxHalfSize.z); //
vct = mView * vct;
if(vct.z < 0) // Take only points in front of the center
{
// Keep the largest offset to see that vertex
idealDistance = std::max(fabs(vct.y) / yfov + fabs(vct.z), idealDistance);
idealDistance = std::max(fabs(vct.x) / xfov + fabs(vct.z), idealDistance);
}
}
}
else // Using the bounding sphere
{
const float radius = glm::length(boxHalfSize);
idealDistance = std::max(radius / xfov, radius / yfov);
}
// Calculate the new camera position based on the ideal distance
const glm::vec3 newEye = boxCenter - idealDistance * glm::normalize(boxCenter - m_current.eye);
// Set the new camera position and interest point
setLookat(newEye, boxCenter, m_current.up, instantFit);
}
} // namespace nvh

@ -0,0 +1,252 @@
/*
* Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2018-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
//--------------------------------------------------------------------
#pragma once
#include <array>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>
#include <string>
namespace nvh {
/** @DOC_START
# class nvh::CameraManipulator
nvh::CameraManipulator is a camera manipulator helper class.
It supports the following actions:
- Orbit (LMB)
- Pan (LMB + CTRL | MMB)
- Dolly (LMB + SHIFT | RMB)
- Look Around (LMB + ALT | LMB + CTRL + SHIFT)
In various modes:
- examine (orbit around the object)
- walk (look up or down but stay on a plane)
- fly (go toward the interest point)
To use the camera manipulator, you need to do the following:
- Call setWindowSize() at creation of the application and whenever the window size changes
- Call setLookat() at creation to initialize the camera look position
- Call setMousePosition() on application mouse down
- Call mouseMove() on application mouse move
Retrieve the camera matrix by calling getMatrix()
See: appbase_vkpp.hpp
Note: There is a singleton `CameraManip` which can be used across the entire application
```cpp
// Retrieve/set camera information
CameraManip.getLookat(eye, center, up);
CameraManip.setLookat(eye, center, glm::vec3(m_upVector == 0, m_upVector == 1, m_upVector == 2));
CameraManip.getFov();
CameraManip.setSpeed(navSpeed);
CameraManip.setMode(navMode == 0 ? nvh::CameraManipulator::Examine : nvh::CameraManipulator::Fly);
// On mouse down, keep mouse coordinates
CameraManip.setMousePosition(x, y);
// On mouse move and mouse button down
if(m_inputs.lmb || m_inputs.rmb || m_inputs.mmb)
{
CameraManip.mouseMove(x, y, m_inputs);
}
// Wheel dollies in/out; with Shift held it changes the FOV (zoom)
CameraManip.wheel(delta > 0 ? 1 : -1, m_inputs);
// Retrieve the matrix to push to the shader
m_ubo.view = CameraManip.getMatrix();
```
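The non-tight branch of `fit()` places the eye so that the bounding sphere of the box fits in the narrower of the two view angles. The following is a minimal standalone sketch of that distance computation, assuming plain floats instead of glm types; `fitDistance` is a hypothetical helper name used only for illustration and is not part of nvh.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of nvh): distance from the box center at which
// a bounding sphere of the given radius fits in the view.
// fovDegrees is the vertical field of view, aspect is width / height.
inline float fitDistance(float radius, float fovDegrees, float aspect)
{
  assert(fovDegrees > 0.0f && fovDegrees < 180.0f && aspect > 0.0f);
  const float kPi  = 3.14159265358979f;
  const float yfov = std::tan(fovDegrees * 0.5f * kPi / 180.0f);  // tangent of half the vertical FOV
  const float xfov = yfov * aspect;                                // same for the horizontal direction
  // The narrower direction needs the larger distance
  return std::max(radius / xfov, radius / yfov);
}
```

With a 90 degree field of view and a square window the tangent term is 1, so `fitDistance(1.0f, 90.0f, 1.0f)` is approximately 1; a narrower window (aspect below 1) pushes the camera further back.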
@DOC_END */
class CameraManipulator
{
public:
// clang-format off
enum Modes { Examine, Fly, Walk};
enum Actions { NoAction, Orbit, Dolly, Pan, LookAround };
struct Inputs {bool lmb=false; bool mmb=false; bool rmb=false;
bool shift=false; bool ctrl=false; bool alt=false;};
// clang-format on
struct Camera
{
glm::vec3 eye = glm::vec3(10, 10, 10);
glm::vec3 ctr = glm::vec3(0, 0, 0);
glm::vec3 up = glm::vec3(0, 1, 0);
float fov = 60.0f;
bool operator!=(const Camera& rhr) const
{
return (eye != rhr.eye) || (ctr != rhr.ctr) || (up != rhr.up) || (fov != rhr.fov);
}
bool operator==(const Camera& rhr) const
{
return (eye == rhr.eye) && (ctr == rhr.ctr) && (up == rhr.up) && (fov == rhr.fov);
}
};
public:
// Main function to call from the application
// On application mouse move, call this function with the current mouse position, mouse
// button presses and keyboard modifier. The camera matrix will be updated and
// can be retrieved calling getMatrix
Actions mouseMove(int x, int y, const Inputs& inputs);
// Set the camera to look at the interest point
// instantSet = true will not interpolate to the new position
void setLookat(const glm::vec3& eye, const glm::vec3& center, const glm::vec3& up, bool instantSet = true);
// This should be called in an application loop to update the camera matrix if this one is animated: new position, key movement
void updateAnim();
// Call when the size of the window changes. This allows for nicer movement relative to the window size.
void setWindowSize(int w, int h);
// Set the current mouse position; call on mouse button down so that the deltas are computed properly
void setMousePosition(int x, int y);
Camera getCamera() const { return m_current; }
void setCamera(Camera camera, bool instantSet = true);
// Retrieve the position, interest and up vector of the camera
void getLookat(glm::vec3& eye, glm::vec3& center, glm::vec3& up) const;
glm::vec3 getEye() const { return m_current.eye; }
glm::vec3 getCenter() const { return m_current.ctr; }
glm::vec3 getUp() const { return m_current.up; }
// Set the manipulator mode, from Examiner, to walk, to fly, ...
void setMode(Modes mode) { m_mode = mode; }
// Retrieve the current manipulator mode
Modes getMode() const { return m_mode; }
// Retrieving the transformation matrix of the camera
const glm::mat4& getMatrix() const { return m_matrix; }
// Set the position, interest from the matrix.
// instantSet = true will not interpolate to the new position
// centerDistance is the distance of the center from the eye
void setMatrix(const glm::mat4& mat_, bool instantSet = true, float centerDistance = 1.f);
// Changing the default speed movement
void setSpeed(float speed) { m_speed = speed; }
// Retrieving the current speed
float getSpeed() { return m_speed; }
// Retrieving the last mouse position
void getMousePosition(int& x, int& y);
// Main function which is called to apply a camera motion.
// It is preferable to call mouseMove() from the application.
void motion(int x, int y, int action = 0);
void keyMotion(float dx, float dy, int action);
// Call when the mouse wheel changes
void wheel(int value, const Inputs& inputs);
// Retrieve the screen dimension
int getWidth() const { return m_width; }
int getHeight() const { return m_height; }
float getAspectRatio() const { return static_cast<float>(m_width) / static_cast<float>(m_height); }
// Field of view in degrees
void setFov(float _fov);
float getFov() { return m_current.fov; }
// Clip planes
void setClipPlanes(glm::vec2 clip) { m_clipPlanes = clip; }
const glm::vec2& getClipPlanes() const { return m_clipPlanes; }
// Animation duration
double getAnimationDuration() const { return m_duration; }
void setAnimationDuration(double val) { m_duration = val; }
bool isAnimated() { return m_anim_done == false; }
// Returning a default help string
const std::string& getHelp();
// Fitting the camera position and interest to see the bounding box
void fit(const glm::vec3& boxMin, const glm::vec3& boxMax, bool instantFit = true, bool tight = false, float aspect = 1.0f);
protected:
CameraManipulator();
private:
// Update the internal matrix.
void update() { m_matrix = glm::lookAt(m_current.eye, m_current.ctr, m_current.up); }
// Do panning: movement parallel to the screen
void pan(float dx, float dy);
// Do orbiting: rotation around the center of interest. If invert, the interest orbits around the camera position
void orbit(float dx, float dy, bool invert = false);
// Do dolly: movement toward the interest.
void dolly(float dx, float dy);
double getSystemTime();
glm::vec3 computeBezier(float t, glm::vec3& p0, glm::vec3& p1, glm::vec3& p2);
void findBezierPoints();
protected:
glm::mat4 m_matrix = glm::mat4(1);
Camera m_current; // Current camera position
Camera m_goal; // Desired camera position
Camera m_snapshot; // Current camera the moment a set look-at is done
// Animation
std::array<glm::vec3, 3> m_bezier;
double m_start_time = 0;
double m_duration = 0.5;
bool m_anim_done{true};
glm::vec3 m_key_vec{0, 0, 0};
// Screen
int m_width = 1;
int m_height = 1;
// Other
float m_speed = 3.f;
glm::vec2 m_mouse = glm::vec2(0.f, 0.f);
glm::vec2 m_clipPlanes = glm::vec2(0.001f, 100000000.f);
bool m_button = false; // Button pressed
bool m_moving = false; // Mouse is moving
float m_tbsize = 0.8f; // Trackball size;
Modes m_mode = Examine;
public:
// Factory.
static CameraManipulator& Singleton()
{
static CameraManipulator manipulator;
return manipulator;
}
};
// Global Manipulator
} // namespace nvh
#define CameraManip nvh::CameraManipulator::Singleton()

@ -0,0 +1,232 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2022 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#pragma once
#include <cstdint>  // int32_t, uint32_t
#include <iostream>
#include <string>
#include <variant>
#include <vector>
#include <algorithm>
#include <iomanip>
#include <sstream>
#include "nvprint.hpp"
static constexpr int MAX_LINE_WIDTH = 60;
namespace nvh {
/* @DOC_START
Command line parser.
```cpp
std::string inFilename = "";
bool printHelp = false;
CommandLineParser args("Test Parser");
args.addArgument({"-f", "--filename"}, &inFilename, "Input filename");
args.addArgument({"-h", "--help"}, &printHelp, "Print Help");
bool result = args.parse(argc, argv);
```
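`parse()` accepts a value either as the next argument (`-f scene.gltf`) or attached with an equals sign (`--filename=scene.gltf`). The splitting step can be sketched on its own as follows; `splitFlag` is a hypothetical helper name for illustration, not a member of the class.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper (not part of nvh): splits "--flag=value" into {flag, value}.
// Without an '=', the value is left empty and the caller would take it
// from the following argument instead.
inline std::pair<std::string, std::string> splitFlag(const std::string& arg)
{
  assert(!arg.empty());
  const std::size_t equalPos = arg.find('=');
  if(equalPos == std::string::npos)
  {
    return {arg, std::string()};
  }
  return {arg.substr(0, equalPos), arg.substr(equalPos + 1)};
}
```

For example, `splitFlag("--filename=scene.gltf")` yields the flag `--filename` and the value `scene.gltf`, while `splitFlag("-h")` yields an empty value.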
@DOC_END */
class CommandLineParser
{
public:
// These are the possible variables the options may point to. Bool and
// std::string are handled in a special way, all other values are parsed
// with a std::stringstream. This std::variant can be easily extended if
// the stream operator>> is overloaded. If not, you have to add a special
// case to the parse() method.
using Value = std::variant<int32_t*, uint32_t*, double*, float*, bool*, std::string*>;
// The description is printed as part of the help message.
CommandLineParser(const std::string& description)
: m_description(description)
{
}
void addArgument(std::vector<std::string> const& flags, Value const& value, std::string const& help)
{
m_arguments.emplace_back(Argument{flags, value, help});
}
// Prints the description given to the constructor and the help for each option.
void printHelp(std::ostream& os = std::cout) const
{
// Print the general description.
os << m_description << std::endl;
// Find the argument with the longest combined flag length (in order to align the help messages).
uint32_t maxFlagLength = 0;
for(auto const& argument : m_arguments)
{
uint32_t flagLength = 0;
for(auto const& flag : argument.m_flags)
{
// Plus comma and space.
flagLength += static_cast<uint32_t>(flag.size()) + 2;
}
maxFlagLength = std::max(maxFlagLength, flagLength);
}
// Now print each argument.
for(auto const& argument : m_arguments)
{
std::string flags;
for(auto const& flag : argument.m_flags)
{
flags += flag + ", ";
}
// Remove last comma and space and add padding according to the longest flags in order to align the help messages.
std::stringstream sstr;
sstr << std::left << std::setw(maxFlagLength) << flags.substr(0, flags.size() - 2);
// Print the help for each argument. This is a bit more involved since we do line wrapping for long descriptions.
size_t spacePos = 0;
size_t lineWidth = 0;
while(spacePos != std::string::npos)
{
size_t nextspacePos = argument.m_help.find_first_of(' ', spacePos + 1);
sstr << argument.m_help.substr(spacePos, nextspacePos - spacePos);
lineWidth += nextspacePos - spacePos;
spacePos = nextspacePos;
if(lineWidth > MAX_LINE_WIDTH)
{
os << sstr.str() << std::endl;
sstr = std::stringstream();
sstr << std::left << std::setw(maxFlagLength - 1) << " ";
lineWidth = 0;
}
}
}
}
// The command line arguments are traversed from start to end. That means,
// if an option is set multiple times, the last will be the one which is
// finally used. If a value is missing for a given option, an error is
// logged and parsing stops, returning false. Unknown flags cause a warning
// on std::cerr and also make the result false.
bool parse(int argc, char* argv[])
{
bool result = true;
// Skip the first argument (name of the program).
int i = 1;
while(i < argc)
{
// First we have to identify whether the value is separated by a space or a '='.
std::string flag(argv[i]);
std::string value;
bool valueIsSeparate = false;
// If there is an '=' in the flag, the part after the '=' is actually
// the value.
size_t equalPos = flag.find('=');
if(equalPos != std::string::npos)
{
value = flag.substr(equalPos + 1);
flag = flag.substr(0, equalPos);
}
// Else the following argument is the value.
else if(i + 1 < argc)
{
value = argv[i + 1];
valueIsSeparate = true;
}
// Search for an argument with the provided flag.
bool foundArgument = false;
for(auto const& argument : m_arguments)
{
if(std::find(argument.m_flags.begin(), argument.m_flags.end(), flag) != std::end(argument.m_flags))
{
foundArgument = true;
// In the case of booleans, the value is not needed.
if(std::holds_alternative<bool*>(argument.m_value))
{
if(!value.empty() && value != "true" && value != "false")
{
valueIsSeparate = false; // No value
}
*std::get<bool*>(argument.m_value) = (value != "false");
}
// In all other cases there must be a value.
else if(value.empty())
{
LOGE("Failed to parse command line arguments. Missing value for argument %s\n", flag.c_str());
return false;
}
// For a std::string, we take the entire value.
else if(std::holds_alternative<std::string*>(argument.m_value))
{
*std::get<std::string*>(argument.m_value) = value;
}
// In all other cases we use a std::stringstream to convert the value.
else
{
std::visit(
[&value](auto&& arg) {
std::stringstream sstr(value);
sstr >> *arg;
},
argument.m_value);
}
break;
}
}
// Print a warning if there was an unknown argument.
if(!foundArgument)
{
std::cerr << "Ignoring unknown command line argument \"" << flag << "\"." << std::endl;
result = false;
}
// Advance to the next flag.
++i;
// If the value was separated, we have to advance our index once more.
if(foundArgument && valueIsSeparate)
{
++i;
}
}
return result;
}
private:
struct Argument
{
std::vector<std::string> m_flags;
Value m_value;
std::string m_help;
};
std::string m_description;
std::vector<Argument> m_arguments;
};
} // namespace nvh

@ -0,0 +1,105 @@
#ifndef NVPRO_CORE_NVH_CONTAINER_UTILS_HPP_
#define NVPRO_CORE_NVH_CONTAINER_UTILS_HPP_
#include <array>
#include <cassert>
#include <cstring>     // memcmp
#include <functional>  // std::hash
#include <stddef.h>
#include <stdint.h>
#include <vector>
/// @DOC_SKIP (keyword to exclude this file from automatic README.md generation)
// constexpr array size functions for C and C++ style arrays.
// Truncated to 32-bits (with error checking) to support the common case in Vulkan.
template <typename T, size_t size>
constexpr uint32_t arraySize(const T (&)[size])
{
constexpr uint32_t u32_size = static_cast<uint32_t>(size);
static_assert(size == u32_size, "32-bit overflow");
return u32_size;
}
template <typename T, size_t size>
constexpr uint32_t arraySize(const std::array<T, size>&)
{
constexpr uint32_t u32_size = static_cast<uint32_t>(size);
static_assert(size == u32_size, "32-bit overflow");
return u32_size;
}
// Checked 32-bit array size function for vectors.
template <typename T, typename Allocator>
constexpr uint32_t arraySize(const std::vector<T, Allocator>& vector)
{
auto size = vector.size();
uint32_t u32_size = static_cast<uint32_t>(size);
if(u32_size != size)
{
assert(!"32-bit overflow");
}
return u32_size;
}
namespace nvh {
//---- Hash Combination ----
// http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3876.pdf
template <typename T>
void hashCombine(std::size_t& seed, const T& val)
{
seed ^= std::hash<T>()(val) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}
// Auxiliary generic functions to create a hash value using a seed
template <typename T, typename... Types>
void hashCombine(std::size_t& seed, const T& val, const Types&... args)
{
hashCombine(seed, val);
hashCombine(seed, args...);
}
// Optional auxiliary function to support hashVal() without arguments
inline void hashCombine(std::size_t& seed) {}
// Generic function to create a hash value out of a heterogeneous list of arguments
template <typename... Types>
std::size_t hashVal(const Types&... args)
{
std::size_t seed = 0;
hashCombine(seed, args...);
return seed;
}
//--------------
template <typename T>
std::size_t hashAligned32(const T& v)
{
const uint32_t size = sizeof(T) / sizeof(uint32_t);
const uint32_t* vBits = reinterpret_cast<const uint32_t*>(&v);
std::size_t seed = 0;
for(uint32_t i = 0u; i < size; i++)
{
hashCombine(seed, vBits[i]);
}
return seed;
}
// Generic hash function to use when a struct aligned to 32-bit serves as the key of a std::unordered_map-like container
// Important: this only works if the struct contains integral types, as it will not
// do any pointer chasing
template <typename T>
struct HashAligned32
{
std::size_t operator()(const T& s) const { return hashAligned32(s); }
};
// Generic equality function to use when a struct serves as the key of a std::unordered_map-like container
// Important: this only works if the struct contains integral types, as it will not
// do any pointer chasing
template <typename T>
struct EqualMem
{
bool operator()(const T& l, const T& r) const { return memcmp(&l, &r, sizeof(T)) == 0; }
};
} // namespace nvh
#endif

@ -0,0 +1,219 @@
/*
* Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2020-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#include "filemapping.hpp"
#include <assert.h>
#if defined(LINUX)
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <unistd.h>
#endif
#if defined(_WIN32)
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif
#include <windows.h>
inline DWORD HIDWORD(size_t x)
{
return (DWORD)(x >> 32);
}
inline DWORD LODWORD(size_t x)
{
return (DWORD)x;
}
#endif
namespace nvh {
bool FileMapping::open(const char* fileName, MappingType mappingType, size_t fileSize)
{
if(!g_pageSize)
{
#if defined(_WIN32)
SYSTEM_INFO si;
GetSystemInfo(&si);
g_pageSize = (size_t)si.dwAllocationGranularity;
#elif defined(LINUX)
g_pageSize = (size_t)getpagesize();
#endif
}
m_mappingType = mappingType;
if(mappingType == MAPPING_READOVERWRITE)
{
assert(fileSize);
m_fileSize = fileSize;
m_mappingSize = ((fileSize + g_pageSize - 1) / g_pageSize) * g_pageSize;
// check if the current process is allowed to save a file of that size
#if defined(_WIN32)
TCHAR dir[MAX_PATH + 1];
BOOL success = FALSE;
ULARGE_INTEGER numFreeBytes;
DWORD length = GetVolumePathName(fileName, dir, MAX_PATH + 1);
if(length > 0)
{
success = GetDiskFreeSpaceEx(dir, NULL, NULL, &numFreeBytes);
}
m_isValid = (!!success) && (m_mappingSize <= numFreeBytes.QuadPart);
#elif defined(LINUX)
struct rlimit rlim;
getrlimit(RLIMIT_FSIZE, &rlim);
m_isValid = (m_mappingSize <= rlim.rlim_cur);
#endif
if(!m_isValid)
{
return false;
}
}
#if defined(_WIN32)
m_win32.file = mappingType == MAPPING_READONLY ?
CreateFile(fileName, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_READONLY, NULL) :
CreateFile(fileName, GENERIC_READ | GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
m_isValid = (m_win32.file != INVALID_HANDLE_VALUE);
if(m_isValid)
{
if(mappingType == MAPPING_READONLY)
{
DWORD sizeHi = 0;
DWORD sizeLo = GetFileSize(m_win32.file, &sizeHi);
m_mappingSize = (static_cast<size_t>(sizeHi) << 32) | sizeLo;
m_fileSize = m_mappingSize;
}
m_win32.fileMapping = CreateFileMapping(m_win32.file, NULL, mappingType == MAPPING_READONLY ? PAGE_READONLY : PAGE_READWRITE,
HIDWORD(m_mappingSize), LODWORD(m_mappingSize), NULL);
m_isValid = (m_win32.fileMapping != NULL);
if(m_isValid)
{
m_mappingPtr = MapViewOfFile(m_win32.fileMapping, mappingType == MAPPING_READONLY ? FILE_MAP_READ : FILE_MAP_ALL_ACCESS,
HIDWORD(0), LODWORD(0), (SIZE_T)0);
if(!m_mappingPtr)
{
#if 0
DWORD err = GetLastError();
#endif
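// Release the mapping object as well, otherwise its handle leaks
CloseHandle(m_win32.fileMapping);
CloseHandle(m_win32.file);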
m_isValid = false;
}
}
else
{
CloseHandle(m_win32.file);
}
}
#elif defined(LINUX)
m_unix.file = mappingType == MAPPING_READONLY ? ::open(fileName, O_RDONLY) : ::open(fileName, O_RDWR | O_CREAT | O_TRUNC, 0666);
m_isValid = (m_unix.file != -1);
if(m_isValid)
{
if(mappingType == MAPPING_READONLY)
{
struct stat s;
m_isValid &= (fstat(m_unix.file, &s) >= 0);
m_mappingSize = s.st_size;
}
else
{
// make file large enough to hold the complete scene
m_isValid &= (lseek(m_unix.file, m_mappingSize - 1, SEEK_SET) >= 0);
m_isValid &= (write(m_unix.file, "", 1) >= 0);
m_isValid &= (lseek(m_unix.file, 0, SEEK_SET) >= 0);
}
m_fileSize = m_mappingSize;
if(m_isValid)
{
m_mappingPtr = mmap(0, m_mappingSize, mappingType == MAPPING_READONLY ? PROT_READ : (PROT_READ | PROT_WRITE),
MAP_SHARED, m_unix.file, 0);
m_isValid = (m_mappingPtr != MAP_FAILED);
}
if(!m_isValid)
{
::close(m_unix.file);
m_unix.file = -1;
}
}
#endif
return m_isValid;
}
void FileMapping::close()
{
if(m_isValid)
{
#if defined(_WIN32)
assert((m_win32.file != INVALID_HANDLE_VALUE) && (m_win32.fileMapping != NULL));
UnmapViewOfFile(m_mappingPtr);
CloseHandle(m_win32.fileMapping);
if(m_mappingType == MAPPING_READOVERWRITE)
{
// truncate file to minimum size
// To work with 64-bit file pointers, you can declare a LONG, treat it as the upper half
// of the 64-bit file pointer, and pass its address in lpDistanceToMoveHigh. This means
// you have to treat two different variables as a logical unit, which is error-prone.
// The problems can be ameliorated by using the LARGE_INTEGER structure to create a 64-bit
// value and passing the two 32-bit values by means of the appropriate elements of the union.
// (see msdn documentation on SetFilePointer)
LARGE_INTEGER li;
li.QuadPart = (__int64)m_fileSize;
SetFilePointer(m_win32.file, li.LowPart, &li.HighPart, FILE_BEGIN);
SetEndOfFile(m_win32.file);
}
CloseHandle(m_win32.file);
m_mappingPtr = nullptr;
m_win32.fileMapping = nullptr;
m_win32.file = nullptr;
#elif defined(LINUX)
assert(m_unix.file != -1);
munmap(m_mappingPtr, m_mappingSize);
::close(m_unix.file);
m_mappingPtr = nullptr;
m_unix.file = -1;
#endif
m_isValid = false;
}
}
size_t FileMapping::g_pageSize = 0;
} // namespace nvh

@ -0,0 +1,123 @@
/*
* Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2020-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
/// @DOC_SKIP (keyword to exclude this file from automatic README.md generation)
#pragma once
#include <cstddef>
#include <utility>
namespace nvh {
class FileMapping
{
public:
FileMapping(FileMapping&& other) noexcept { this->operator=(std::move(other)); };
FileMapping& operator=(FileMapping&& other) noexcept
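{
// Guard against self-move, and release any mapping this object still holds
if(this == &other)
{
return *this;
}
close();
m_isValid = other.m_isValid;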
m_fileSize = other.m_fileSize;
m_mappingType = other.m_mappingType;
m_mappingPtr = other.m_mappingPtr;
m_mappingSize = other.m_mappingSize;
#ifdef _WIN32
m_win32.file = other.m_win32.file;
m_win32.fileMapping = other.m_win32.fileMapping;
other.m_win32.file = nullptr;
other.m_win32.fileMapping = nullptr;
#else
m_unix.file = other.m_unix.file;
other.m_unix.file = -1;
#endif
other.m_isValid = false;
other.m_mappingPtr = nullptr;
return *this;
}
FileMapping(const FileMapping&) = delete;
FileMapping& operator=(const FileMapping& other) = delete;
FileMapping() {}
~FileMapping() { close(); }
enum MappingType
{
MAPPING_READONLY, // opens existing file for read-only access
MAPPING_READOVERWRITE, // creates new file with read/write access, overwriting existing files
};
// fileSize only for write access
bool open(const char* filename, MappingType mappingType, size_t fileSize = 0);
void close();
const void* data() const { return m_mappingPtr; }
void* data() { return m_mappingPtr; }
size_t size() const { return m_mappingSize; }
bool valid() const { return m_isValid; }
protected:
static size_t g_pageSize;
#ifdef _WIN32
struct
{
void* file = nullptr;
void* fileMapping = nullptr;
} m_win32;
#else
struct
{
int file = -1;
} m_unix;
#endif
bool m_isValid = false;
size_t m_fileSize = 0;
MappingType m_mappingType = MappingType::MAPPING_READONLY;
void* m_mappingPtr = nullptr;
size_t m_mappingSize = 0;
};
// convenience types
class FileReadMapping : private FileMapping
{
public:
bool open(const char* filename) { return FileMapping::open(filename, MAPPING_READONLY, 0); }
void close() { FileMapping::close(); }
const void* data() const { return m_mappingPtr; }
size_t size() const { return m_fileSize; }
bool valid() const { return m_isValid; }
};
class FileReadOverWriteMapping : private FileMapping
{
public:
bool open(const char* filename, size_t fileSize)
{
return FileMapping::open(filename, MAPPING_READOVERWRITE, fileSize);
}
void close() { FileMapping::close(); }
void* data() { return m_mappingPtr; }
size_t size() const { return m_fileSize; }
bool valid() const { return m_isValid; }
};
} // namespace nvh

@ -0,0 +1,181 @@
/*
* Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2019-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#pragma once
#include <algorithm> // std::max
#include <fstream>
#include <iterator>  // std::istreambuf_iterator
#include <sstream>
#include <string>
#include <vector>
#include "nvprint.hpp"
/** @DOC_START
# functions in nvh
- nvh::fileExists : checks if a file exists
- nvh::findFile : finds a filename in the provided search directories
- nvh::loadFile : (multiple overloads) loads a file as std::string, binary or text, can also search in provided directories
- nvh::getFileName : returns the filename part of a filename with path
- nvh::getFilePath : returns the path part of a filename with path
- nvh::endsWith : checks if a string ends with the given ending
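
The two string helpers are small enough to sketch standalone. The functions below mirror the logic of nvh::getFileName and nvh::endsWith without depending on the rest of the header; the names `fileNameOf` and `hasEnding` are hypothetical and chosen only to avoid clashing with the real functions.

```cpp
#include <algorithm>
#include <string>

// Sketch mirroring nvh::getFileName: everything after the last '/' or '\\'
inline std::string fileNameOf(const std::string& fullPath)
{
  const std::size_t lastSeparator = fullPath.find_last_of("/\\");
  if(lastSeparator == std::string::npos)
  {
    return fullPath;  // no separator: the whole string is the filename
  }
  return fullPath.substr(lastSeparator + 1);
}

// Sketch mirroring nvh::endsWith: compares both strings from the back
inline bool hasEnding(const std::string& value, const std::string& ending)
{
  if(ending.size() > value.size())
  {
    return false;
  }
  return std::equal(ending.rbegin(), ending.rend(), value.rbegin());
}
```

With these, `fileNameOf("media/scene.gltf")` returns `scene.gltf`, and `hasEnding("texture.png", ".png")` is true.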
@DOC_END */
namespace nvh {
inline bool fileExists(const char* filename)
{
std::ifstream stream;
stream.open(filename);
return stream.is_open();
}
// returns first found filename (searches within directories provided)
inline std::string findFile(const std::string& infilename, const std::vector<std::string>& directories, bool warn = false)
{
std::ifstream stream;
{
stream.open(infilename.c_str());
if(stream.is_open())
{
// nvprintfLevel(LOGLEVEL_INFO, "Found: %s\n", infilename.c_str());
return infilename;
}
}
for(const auto& directory : directories)
{
std::string filename = directory + "/" + infilename;
stream.open(filename.c_str());
if(stream.is_open())
{
// nvprintfLevel(LOGLEVEL_INFO, "Found: %s\n", filename.c_str());
return filename;
}
}
if(warn)
{
nvprintfLevel(LOGLEVEL_WARNING, "File not found: %s\n", infilename.c_str());
nvprintfLevel(LOGLEVEL_WARNING, "In directories: \n");
for(const auto& directory : directories)
{
nvprintfLevel(LOGLEVEL_WARNING, " - %s\n", directory.c_str());
}
nvprintfLevel(LOGLEVEL_WARNING, "\n");
}
return {};
}
inline std::string loadFile(const std::string& filename, bool binary)
{
std::string result;
std::ifstream stream(filename, std::ios::ate | (binary ? std::ios::binary : std::ios_base::openmode(0)));
if(!stream.is_open())
{
return result;
}
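// tellg() reports -1 on failure, so only reserve for a valid, non-empty size
const std::streamoff fileSize = stream.tellg();
if(fileSize > 0)
{
result.reserve(static_cast<size_t>(fileSize));
}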
stream.seekg(0, std::ios::beg);
result.assign((std::istreambuf_iterator<char>(stream)), std::istreambuf_iterator<char>());
return result;
}
inline std::string loadFile(const char* filename, bool binary)
{
std::string name(filename);
return loadFile(name, binary);
}
inline std::string loadFile(const std::string& filename,
bool binary,
const std::vector<std::string>& directories,
std::string& filenameFound,
bool warn = false)
{
filenameFound = findFile(filename, directories, warn);
if(filenameFound.empty())
{
return {};
}
else
{
return loadFile(filenameFound, binary);
}
}
inline std::string loadFile(const std::string& filename, bool binary, const std::vector<std::string>& directories, bool warn = false)
{
std::string filenameFound;
return loadFile(filename, binary, directories, filenameFound, warn);
}
// returns the filename without its path
inline std::string getFileName(std::string const& fullPath)
{
// Determine the last occurrence of path separator
std::size_t lastSeparator = fullPath.find_last_of("/\\");
if(lastSeparator == std::string::npos)
{
// If no separator found, return fullPath as it is (considered as filename)
return fullPath;
}
// Extract the filename from fullPath
return fullPath.substr(lastSeparator + 1);
}
// returns the path without the filename ("." if there is no path)
inline std::string getFilePath(const char* filename)
{
std::string path;
// find path in filename
{
std::string filepath(filename);
size_t pos0 = filepath.rfind('\\');
size_t pos1 = filepath.rfind('/');
pos0 = pos0 == std::string::npos ? 0 : pos0;
pos1 = pos1 == std::string::npos ? 0 : pos1;
path = filepath.substr(0, std::max(pos0, pos1));
}
if(path.empty())
{
path = ".";
}
return path;
}
// Returns true if value ends with ending, e.g. ".png"
inline bool endsWith(std::string const& value, std::string const& ending)
{
if(ending.size() > value.size())
return false;
return std::equal(ending.rbegin(), ending.rend(), value.rbegin());
}
} // namespace nvh

@ -0,0 +1,547 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef NV_GEOMETRY_INCLUDED
#define NV_GEOMETRY_INCLUDED
#include <glm/glm.hpp>
#include <glm/gtc/constants.hpp>
#include <glm/gtc/matrix_transform.hpp>
#include <stdint.h>
#include <cmath>
#include <utility>  // std::swap
#include <vector>
namespace nvh {
//////////////////////////////////////////////////////////////////////////
/** @DOC_START
# namespace nvh::geometry
The geometry namespace provides a few procedural mesh primitives
that are subdivided.
nvh::geometry::Mesh template uses the provided TVertex which must have a
constructor from nvh::geometry::Vertex. You can also use nvh::geometry::Vertex
directly.
It provides triangle indices, as well as outline line indices. The outline indices
are typical feature lines (rectangle for plane, some circles for sphere/torus).
All basic primitives lie within the [-1,1] range along the axes they use:
- nvh::geometry::Plane (x,y subdivision)
- nvh::geometry::Box (x,y,z subdivision, made of 6 planes)
- nvh::geometry::Sphere (lat,long subdivision)
- nvh::geometry::Torus (inner, outer circle subdivision)
- nvh::geometry::RandomMengerSponge (subdivision, tree depth, probability)
Example:
```cpp
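// single primitive (the constructor takes one segment count, used for all three axes)
nvh::geometry::Box<nvh::geometry::Vertex> box(4);
// construct from primitives by appending them to a shared mesh
nvh::geometry::Mesh<nvh::geometry::Vertex> mesh;
nvh::geometry::Box<nvh::geometry::Vertex>::add(mesh, glm::mat4(1), 4, 4, 4);
nvh::geometry::Sphere<nvh::geometry::Vertex>::add(mesh, glm::mat4(1), 16, 8);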
```
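When sizing GPU buffers up front it can help to know how many elements a primitive will produce. The sketch below derives the counts from the loops in this header; the function names are hypothetical and not part of nvh::geometry, and the sponge count assumes the non-random split (negative probability).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of nvh::geometry.
// A plane with w x h segments has (w + 1) x (h + 1) vertices and two triangles per cell.
constexpr uint32_t planeVertexCount(uint32_t w, uint32_t h) { return (w + 1) * (h + 1); }
constexpr uint32_t planeTriangleCount(uint32_t w, uint32_t h) { return 2 * w * h; }
// The outline is the rectangle border: two vertical and two horizontal edges.
constexpr uint32_t planeOutlineCount(uint32_t w, uint32_t h) { return 2 * (w + h); }

// A box is six planes: two each of (w,h), (d,h) and (w,d).
constexpr uint32_t boxVertexCount(uint32_t w, uint32_t h, uint32_t d)
{
  return 2 * (planeVertexCount(w, h) + planeVertexCount(d, h) + planeVertexCount(w, d));
}
constexpr uint32_t boxTriangleCount(uint32_t w, uint32_t h, uint32_t d)
{
  return 2 * (planeTriangleCount(w, h) + planeTriangleCount(d, h) + planeTriangleCount(w, d));
}

// The regular sponge split keeps 20 of the 27 sub-cubes at every level.
constexpr uint32_t spongeCubeCount(uint32_t level)
{
  return level == 0 ? 1 : 20 * spongeCubeCount(level - 1);
}
```

For instance, a box with 4 segments per axis gives 150 vertices and 192 triangles, and a level 3 sponge emits 8000 unit boxes.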
@DOC_END */
namespace geometry {
struct Vertex
{
Vertex(glm::vec3 const& position, glm::vec3 const& normal, glm::vec2 const& texcoord)
: position(glm::vec4(position, 1.0f))
, normal(glm::vec4(normal, 0.0f))
, texcoord(glm::vec4(texcoord, 0.0f, 0.0f))
{
}
glm::vec4 position;
glm::vec4 normal;
glm::vec4 texcoord;
};
// The provided TVertex must have a constructor from Vertex
template <class TVertex = Vertex>
class Mesh
{
public:
std::vector<TVertex> m_vertices;
std::vector<glm::uvec3> m_indicesTriangles;
std::vector<glm::uvec2> m_indicesOutline;
void append(const Mesh<TVertex>& geo)
{
m_vertices.reserve(geo.m_vertices.size() + m_vertices.size());
m_indicesTriangles.reserve(geo.m_indicesTriangles.size() + m_indicesTriangles.size());
m_indicesOutline.reserve(geo.m_indicesOutline.size() + m_indicesOutline.size());
uint32_t offset = uint32_t(m_vertices.size());
for(size_t i = 0; i < geo.m_vertices.size(); i++)
{
m_vertices.push_back(geo.m_vertices[i]);
}
for(size_t i = 0; i < geo.m_indicesTriangles.size(); i++)
{
m_indicesTriangles.push_back(geo.m_indicesTriangles[i] + glm::uvec3(offset));
}
for(size_t i = 0; i < geo.m_indicesOutline.size(); i++)
{
m_indicesOutline.push_back(geo.m_indicesOutline[i] + glm::uvec2(offset));
}
}
void flipWinding()
{
for(size_t i = 0; i < m_indicesTriangles.size(); i++)
{
std::swap(m_indicesTriangles[i].x, m_indicesTriangles[i].z);
}
}
size_t getTriangleIndicesSize() const { return m_indicesTriangles.size() * sizeof(glm::uvec3); }
uint32_t getTriangleIndicesCount() const { return (uint32_t)m_indicesTriangles.size() * 3; }
size_t getOutlineIndicesSize() const { return m_indicesOutline.size() * sizeof(glm::uvec2); }
uint32_t getOutlineIndicesCount() const { return (uint32_t)m_indicesOutline.size() * 2; }
size_t getVerticesSize() const { return m_vertices.size() * sizeof(TVertex); }
uint32_t getVerticesCount() const { return (uint32_t)m_vertices.size(); }
};
template <class TVertex = Vertex>
class Plane : public Mesh<TVertex>
{
public:
static void add(Mesh<TVertex>& geo, const glm::mat4& mat, int w, int h)
{
int xdim = w;
int ydim = h;
float xmove = 1.0f / (float)xdim;
float ymove = 1.0f / (float)ydim;
int width = (xdim + 1);
uint32_t vertOffset = (uint32_t)geo.m_vertices.size();
int x, y;
for(y = 0; y < ydim + 1; y++)
{
for(x = 0; x < xdim + 1; x++)
{
float xpos = ((float)x * xmove);
float ypos = ((float)y * ymove);
glm::vec3 pos;
glm::vec2 uv;
glm::vec3 normal;
pos[0] = (xpos - 0.5f) * 2.0f;
pos[1] = (ypos - 0.5f) * 2.0f;
pos[2] = 0;
uv[0] = xpos;
uv[1] = ypos;
normal[0] = 0.0f;
normal[1] = 0.0f;
normal[2] = 1.0f;
Vertex vert = Vertex(pos, normal, uv);
vert.position = mat * vert.position;
vert.normal = mat * vert.normal;
geo.m_vertices.push_back(TVertex(vert));
}
}
for(y = 0; y < ydim; y++)
{
for(x = 0; x < xdim; x++)
{
// upper tris
geo.m_indicesTriangles.push_back(glm::uvec3((x) + (y + 1) * width + vertOffset, (x) + (y)*width + vertOffset,
(x + 1) + (y + 1) * width + vertOffset));
// lower tris
geo.m_indicesTriangles.push_back(glm::uvec3((x + 1) + (y + 1) * width + vertOffset,
(x) + (y)*width + vertOffset, (x + 1) + (y)*width + vertOffset));
}
}
for(y = 0; y < ydim; y++)
{
geo.m_indicesOutline.push_back(glm::uvec2((y)*width + vertOffset, (y + 1) * width + vertOffset));
}
for(y = 0; y < ydim; y++)
{
geo.m_indicesOutline.push_back(glm::uvec2((y)*width + xdim + vertOffset, (y + 1) * width + xdim + vertOffset));
}
for(x = 0; x < xdim; x++)
{
geo.m_indicesOutline.push_back(glm::uvec2((x) + vertOffset, (x + 1) + vertOffset));
}
for(x = 0; x < xdim; x++)
{
geo.m_indicesOutline.push_back(glm::uvec2((x) + ydim * width + vertOffset, (x + 1) + ydim * width + vertOffset));
}
}
Plane(int segments = 1) { add(*this, glm::mat4(1), segments, segments); }
};
template <class TVertex = Vertex>
class Box : public Mesh<TVertex>
{
public:
static void add(Mesh<TVertex>& geo, const glm::mat4& mat, int w, int h, int d)
{
int configs[6][2] = {
{w, h}, {w, h},
{d, h}, {d, h},
{w, d}, {w, d},
};
for(int side = 0; side < 6; side++)
{
glm::mat4 matrixRot(1);
switch(side)
{
case 0:
break;
case 1:
matrixRot = glm::rotate(glm::mat4(1), glm::pi<float>(), glm::vec3(0, 1, 0));
break;
case 2:
matrixRot = glm::rotate(glm::mat4(1), glm::pi<float>() * 0.5f, glm::vec3(0, 1, 0));
break;
case 3:
matrixRot = glm::rotate(glm::mat4(1), glm::pi<float>() * 1.5f, glm::vec3(0, 1, 0));
break;
case 4:
matrixRot = glm::rotate(glm::mat4(1), glm::pi<float>() * 0.5f, glm::vec3(1, 0, 0));
break;
case 5:
matrixRot = glm::rotate(glm::mat4(1), glm::pi<float>() * 1.5f, glm::vec3(1, 0, 0));
break;
}
glm::mat4 matrixMove = glm::translate(glm::mat4(1.f), {0.0f, 0.0f, 1.0f});
Plane<TVertex>::add(geo, mat * matrixRot * matrixMove, configs[side][0], configs[side][1]);
}
}
Box(int segments = 1) { add(*this, glm::mat4(1), segments, segments, segments); }
};
template <class TVertex = Vertex>
class Sphere : public Mesh<TVertex>
{
public:
static void add(Mesh<TVertex>& geo, const glm::mat4& mat, int w, int h)
{
int xydim = w;
int zdim = h;
uint32_t vertOffset = (uint32_t)geo.m_vertices.size();
float xyshift = 1.0f / (float)xydim;
float zshift = 1.0f / (float)zdim;
int width = xydim + 1;
int xy, z;
for(z = 0; z < zdim + 1; z++)
{
for(xy = 0; xy < xydim + 1; xy++)
{
glm::vec3 pos;
glm::vec3 normal;
glm::vec2 uv;
float curxy = xyshift * (float)xy;
float curz = zshift * (float)z;
float anglexy = curxy * glm::pi<float>() * 2.0f;
float anglez = (1.0f - curz) * glm::pi<float>();
pos[0] = cosf(anglexy) * sinf(anglez);
pos[1] = sinf(anglexy) * sinf(anglez);
pos[2] = cosf(anglez);
normal = pos;
uv[0] = curxy;
uv[1] = curz;
Vertex vert = Vertex(pos, normal, uv);
vert.position = mat * vert.position;
vert.normal = mat * vert.normal;
geo.m_vertices.push_back(TVertex(vert));
}
}
int vertex = 0;
for(z = 0; z < zdim; z++)
{
for(xy = 0; xy < xydim; xy++, vertex++)
{
glm::uvec3 indices;
if(z != zdim - 1)
{
indices[2] = vertex + vertOffset;
indices[1] = vertex + width + vertOffset;
indices[0] = vertex + width + 1 + vertOffset;
geo.m_indicesTriangles.push_back(indices);
}
if(z != 0)
{
indices[2] = vertex + width + 1 + vertOffset;
indices[1] = vertex + 1 + vertOffset;
indices[0] = vertex + vertOffset;
geo.m_indicesTriangles.push_back(indices);
}
}
vertex++;
}
int middlez = zdim / 2;
for(xy = 0; xy < xydim; xy++)
{
glm::uvec2 indices;
indices[0] = middlez * width + xy + vertOffset;
indices[1] = middlez * width + xy + 1 + vertOffset;
geo.m_indicesOutline.push_back(indices);
}
for(int i = 0; i < 4; i++)
{
int x = (xydim * i) / 4;
for(z = 0; z < zdim; z++)
{
glm::uvec2 indices;
indices[0] = x + width * (z) + vertOffset;
indices[1] = x + width * (z + 1) + vertOffset;
geo.m_indicesOutline.push_back(indices);
}
}
}
Sphere(int w = 16, int h = 8) { add(*this, glm::mat4(1), w, h); }
};
template <class TVertex = Vertex>
class Torus : public Mesh<TVertex>
{
public:
static void add(Mesh<TVertex>& geo, const glm::mat4& mat, int w, int h)
{
// Major radius (torus centre to tube centre) and minor radius (the tube itself)
float innerRadius = 0.8f;
float outerRadius = 0.2f;
// Indices are relative to the vertices already stored in geo
uint32_t vertOffset = (uint32_t)geo.m_vertices.size();
float wf = (float)w;
float hf = (float)h;
float phi_step = 2.0f * glm::pi<float>() / wf;
float theta_step = 2.0f * glm::pi<float>() / hf;
// Setup vertices and normals
// Generate the Torus exactly like the sphere with rings around the origin along the latitudes.
// w subdivides the main ring (phi, longitude); h subdivides the tube (theta, latitude).
for(unsigned int latitude = 0; latitude <= (unsigned int)h; latitude++) // theta angle
{
float theta = (float)latitude * theta_step;
float sinTheta = sinf(theta);
float cosTheta = cosf(theta);
float radius = innerRadius + outerRadius * cosTheta;
for(unsigned int longitude = 0; longitude <= (unsigned int)w; longitude++) // phi angle
{
float phi = (float)longitude * phi_step;
float sinPhi = sinf(phi);
float cosPhi = cosf(phi);
glm::vec3 position = glm::vec3(radius * cosPhi, outerRadius * sinTheta, radius * -sinPhi);
glm::vec3 normal = glm::vec3(cosPhi * cosTheta, sinTheta, -sinPhi * cosTheta);
glm::vec2 uv = glm::vec2((float)longitude / wf, (float)latitude / hf);
Vertex vertex(position, normal, uv);
// Apply the caller's transform, as the other primitives do
vertex.position = mat * vertex.position;
vertex.normal = mat * vertex.normal;
geo.m_vertices.push_back(TVertex(vertex));
}
}
// Each latitude ring stores w + 1 vertices
const unsigned int columns = w + 1;
// Setup indices
for(unsigned int latitude = 0; latitude < (unsigned int)h; latitude++)
{
for(unsigned int longitude = 0; longitude < (unsigned int)w; longitude++)
{
// Indices for triangles
const unsigned int v00 = latitude * columns + longitude + vertOffset;
const unsigned int v10 = (latitude + 1) * columns + longitude + vertOffset;
glm::uvec3 triangle1(v00, v00 + 1, v10);
glm::uvec3 triangle2(v10, v00 + 1, v10 + 1);
geo.m_indicesTriangles.push_back(triangle1);
geo.m_indicesTriangles.push_back(triangle2);
}
}
// Setup outline indices
// Outline for outer ring
for(unsigned int longitude = 0; longitude < (unsigned int)w; longitude++)
{
for(unsigned int y = 0; y < 4; y++)
{
unsigned int latitude = y * (unsigned int)h / 4;
unsigned int first = latitude * columns + longitude + vertOffset;
glm::uvec2 line(first, first + 1);
geo.m_indicesOutline.push_back(line);
}
}
// Outline for inner rings
for(unsigned int x = 0; x < 4; x++)
{
for(unsigned int latitude = 0; latitude < (unsigned int)h; latitude++)
{
unsigned int longitude = x * (unsigned int)w / 4;
glm::uvec2 line(latitude * columns + longitude + vertOffset, (latitude + 1) * columns + longitude + vertOffset);
geo.m_indicesOutline.push_back(line);
}
}
}
Torus(int w = 16, int h = 16) { add(*this, glm::mat4(1), w, h); }
};
template <class TVertex = Vertex>
class RandomMengerSponge : public Mesh<TVertex>
{
public:
static void add(Mesh<TVertex>& geo, const glm::mat4& mat, int w, int h, int d, int level = 3, float probability = -1.f)
{
struct Cube
{
glm::vec3 m_topLeftFront;
float m_size;
void split(std::vector<Cube>& cubes)
{
float size = m_size / 3.f;
glm::vec3 topLeftFront = m_topLeftFront;
for(int x = 0; x < 3; x++)
{
topLeftFront[0] = m_topLeftFront[0] + static_cast<float>(x) * size;
for(int y = 0; y < 3; y++)
{
if(x == 1 && y == 1)
continue;
topLeftFront[1] = m_topLeftFront[1] + static_cast<float>(y) * size;
for(int z = 0; z < 3; z++)
{
if(x == 1 && z == 1)
continue;
if(y == 1 && z == 1)
continue;
topLeftFront[2] = m_topLeftFront[2] + static_cast<float>(z) * size;
cubes.push_back({topLeftFront, size});
}
}
}
}
void splitProb(std::vector<Cube>& cubes, float prob)
{
float size = m_size / 3.f;
glm::vec3 topLeftFront = m_topLeftFront;
for(int x = 0; x < 3; x++)
{
topLeftFront[0] = m_topLeftFront[0] + static_cast<float>(x) * size;
for(int y = 0; y < 3; y++)
{
topLeftFront[1] = m_topLeftFront[1] + static_cast<float>(y) * size;
for(int z = 0; z < 3; z++)
{
float sample = rand() / static_cast<float>(RAND_MAX);
if(sample > prob)
continue;
topLeftFront[2] = m_topLeftFront[2] + static_cast<float>(z) * size;
cubes.push_back({topLeftFront, size});
}
}
}
}
};
Cube cube = {glm::vec3(-0.25, -0.25, -0.25), 0.5f};
//Cube cube = { glm::vec3(-25, -25, -25), 50.f };
//Cube cube = { glm::vec3(-40, -40, -40), 10.f };
std::vector<Cube> cubes1 = {cube};
std::vector<Cube> cubes2 = {};
auto previous = &cubes1;
auto next = &cubes2;
for(int i = 0; i < level; i++)
{
for(Cube& c : *previous)
{
if(probability < 0.f)
c.split(*next);
else
c.splitProb(*next, probability);
}
// Ping-pong the two cube lists
auto temp = previous;
previous = next;
next = temp;
next->clear();
}
for(Cube& c : *previous)
{
glm::mat4 matrixMove = glm::translate(glm::mat4(1.f), c.m_topLeftFront);
glm::mat4 matrixScale = glm::scale(glm::mat4(1.f), glm::vec3(c.m_size));
// Apply the caller's transform as well, as the other primitives do
Box<TVertex>::add(geo, mat * matrixMove * matrixScale, 1, 1, 1);
}
}
};
} // namespace geometry
} // namespace nvh
#endif
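
The grid triangulation used by `Plane::add` (and reused by `Box`) can be illustrated without glm. This is a minimal sketch under the assumption that vertices are stored row by row with `w + 1` vertices per row; `gridTriangles` and `Tri` are hypothetical names, and `vertOffset` plays the same role as in the header, shifting indices when the grid is appended to a mesh that already holds vertices.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Tri = std::array<uint32_t, 3>;

// Returns the triangle indices of a w x h segment grid.
// Each cell is split into an "upper" and a "lower" triangle, in the same
// vertex order as Plane::add.
inline std::vector<Tri> gridTriangles(int w, int h, uint32_t vertOffset)
{
  std::vector<Tri> tris;
  const uint32_t width = uint32_t(w) + 1;  // vertices per row
  tris.reserve(std::size_t(w) * std::size_t(h) * 2);
  for(uint32_t y = 0; y < uint32_t(h); y++)
  {
    for(uint32_t x = 0; x < uint32_t(w); x++)
    {
      const uint32_t v00 = x + y * width + vertOffset;  // this row, this column
      const uint32_t v10 = v00 + 1;                     // this row, next column
      const uint32_t v01 = v00 + width;                 // next row, this column
      const uint32_t v11 = v01 + 1;                     // next row, next column
      tris.push_back(Tri{v01, v00, v11});               // upper triangle
      tris.push_back(Tri{v11, v00, v10});               // lower triangle
    }
  }
  return tris;
}
```

A single cell with no offset produces the triangles (2, 0, 3) and (3, 0, 1); with an offset of 4 the first one becomes (6, 4, 7), which is the same shift `Mesh::append` applies when merging meshes.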

File diff suppressed because it is too large

View file

@ -0,0 +1,711 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
/** @DOC_START
# `nvh::GltfScene`
These utilities are for loading glTF models in a
canonical scene representation. From this representation
you would create the appropriate 3D API resources (buffers
and textures).
```cpp
// Typical Usage
// Load the GLTF Scene using TinyGLTF
tinygltf::Model gltfModel;
tinygltf::TinyGLTF gltfContext;
std::string warn, error;
bool fileLoaded = gltfContext.LoadASCIIFromFile(&gltfModel, &error, &warn, m_filename);
// Fill the data in the gltfScene
nvh::GltfScene gltfScene;
gltfScene.importMaterials(gltfModel);
gltfScene.importDrawableNodes(gltfModel, nvh::GltfAttributes::Normal | nvh::GltfAttributes::Texcoord_0);
// Todo in App:
// create buffers for vertices and indices, from gltfScene.m_positions, gltfScene.m_indices
// create textures from images: using tinygltf directly
// create descriptorSet for material using directly gltfScene.m_materials
```
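The attribute argument is a bit mask built from a scoped enum. As a rough standalone sketch of that pattern, assuming a reduced enum named `Attr` and a helper `hasFlag` (both hypothetical, not part of this header), combining and testing flags might look like this:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical reduced copy of the GltfAttributes idea, for illustration only.
enum class Attr : uint8_t
{
  None      = 0,
  Position  = 1,
  Normal    = 2,
  Texcoord0 = 4,
  All       = 0xFF
};
using AttrBits = std::underlying_type_t<Attr>;

constexpr Attr operator|(Attr lhs, Attr rhs)
{
  return static_cast<Attr>(static_cast<AttrBits>(lhs) | static_cast<AttrBits>(rhs));
}
constexpr Attr operator&(Attr lhs, Attr rhs)
{
  return static_cast<Attr>(static_cast<AttrBits>(lhs) & static_cast<AttrBits>(rhs));
}

// True when every bit of flag is set in mask.
constexpr bool hasFlag(Attr mask, Attr flag)
{
  return (mask & flag) == flag;
}
```

A scoped enum keeps the flag names out of the enclosing namespace, at the cost of having to define the bitwise operators explicitly.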
@DOC_END */
#pragma once
#include <glm/glm.hpp>
#include "tiny_gltf.h"
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <map>
#include <set>
#include <string>
#include <string.h>
#include <type_traits>
#include <unordered_map>
#include <vector>
#define KHR_LIGHTS_PUNCTUAL_EXTENSION_NAME "KHR_lights_punctual"
namespace nvh {
// https://github.com/KhronosGroup/glTF/blob/main/extensions/2.0/Khronos/KHR_materials_specular/README.md
#define KHR_MATERIALS_SPECULAR_EXTENSION_NAME "KHR_materials_specular"
struct KHR_materials_specular
{
float specularFactor{1.f};
int specularTexture{-1};
glm::vec3 specularColorFactor{1.f, 1.f, 1.f};
int specularColorTexture{-1};
};
// https://github.com/KhronosGroup/glTF/tree/master/extensions/2.0/Khronos/KHR_texture_transform
#define KHR_TEXTURE_TRANSFORM_EXTENSION_NAME "KHR_texture_transform"
struct KHR_texture_transform
{
glm::vec2 offset{0.f, 0.f};
float rotation{0.f};
glm::vec2 scale{1.f};
int texCoord{0};
glm::mat3 uvTransform{1}; // Computed transform of offset, rotation, scale
};
// https://github.com/KhronosGroup/glTF/blob/master/extensions/2.0/Khronos/KHR_materials_clearcoat/README.md
#define KHR_MATERIALS_CLEARCOAT_EXTENSION_NAME "KHR_materials_clearcoat"
struct KHR_materials_clearcoat
{
float factor{0.f};
int texture{-1};
float roughnessFactor{0.f};
int roughnessTexture{-1};
int normalTexture{-1};
};
// https://github.com/KhronosGroup/glTF/blob/master/extensions/2.0/Khronos/KHR_materials_sheen/README.md
#define KHR_MATERIALS_SHEEN_EXTENSION_NAME "KHR_materials_sheen"
struct KHR_materials_sheen
{
glm::vec3 colorFactor{0.f, 0.f, 0.f};
int colorTexture{-1};
float roughnessFactor{0.f};
int roughnessTexture{-1};
};
// https://github.com/DassaultSystemes-Technology/glTF/tree/KHR_materials_volume/extensions/2.0/Khronos/KHR_materials_transmission
#define KHR_MATERIALS_TRANSMISSION_EXTENSION_NAME "KHR_materials_transmission"
struct KHR_materials_transmission
{
float factor{0.f};
int texture{-1};
};
// https://github.com/KhronosGroup/glTF/tree/master/extensions/2.0/Khronos/KHR_materials_unlit
#define KHR_MATERIALS_UNLIT_EXTENSION_NAME "KHR_materials_unlit"
struct KHR_materials_unlit
{
int active{0};
};
// PBR Next : KHR_materials_anisotropy
#define KHR_MATERIALS_ANISOTROPY_EXTENSION_NAME "KHR_materials_anisotropy"
struct KHR_materials_anisotropy
{
float factor{0.f};
glm::vec3 direction{1.f, 0.f, 0.f};
int texture{-1};
};
// https://github.com/DassaultSystemes-Technology/glTF/tree/KHR_materials_ior/extensions/2.0/Khronos/KHR_materials_ior
#define KHR_MATERIALS_IOR_EXTENSION_NAME "KHR_materials_ior"
struct KHR_materials_ior
{
float ior{1.5f};
};
// https://github.com/DassaultSystemes-Technology/glTF/tree/KHR_materials_volume/extensions/2.0/Khronos/KHR_materials_volume
#define KHR_MATERIALS_VOLUME_EXTENSION_NAME "KHR_materials_volume"
struct KHR_materials_volume
{
float thicknessFactor{0};
int thicknessTexture{-1};
float attenuationDistance{std::numeric_limits<float>::max()};
glm::vec3 attenuationColor{1.f, 1.f, 1.f};
};
// https://github.com/KhronosGroup/glTF/blob/main/extensions/2.0/Khronos/KHR_texture_basisu/README.md
#define KHR_TEXTURE_BASISU_NAME "KHR_texture_basisu"
struct KHR_texture_basisu
{
int source{-1};
};
// https://github.com/KhronosGroup/glTF/issues/948
#define KHR_MATERIALS_DISPLACEMENT_NAME "KHR_materials_displacement"
struct KHR_materials_displacement
{
float displacementGeometryFactor{1.0f};
float displacementGeometryOffset{0.0f};
int displacementGeometryTexture{-1};
};
// https://github.com/KhronosGroup/glTF/blob/main/extensions/2.0/Khronos/KHR_materials_emissive_strength/README.md
#define KHR_MATERIALS_EMISSIVE_STRENGTH_NAME "KHR_materials_emissive_strength"
struct KHR_materials_emissive_strength
{
float emissiveStrength{1.0};
};
// https://github.com/KhronosGroup/glTF/blob/master/specification/2.0/README.md#reference-material
struct GltfMaterial
{
// pbrMetallicRoughness
glm::vec4 baseColorFactor{1.f, 1.f, 1.f, 1.f};
int baseColorTexture{-1};
float metallicFactor{1.f};
float roughnessFactor{1.f};
int metallicRoughnessTexture{-1};
int emissiveTexture{-1};
glm::vec3 emissiveFactor{0, 0, 0};
int alphaMode{0};
float alphaCutoff{0.5f};
int doubleSided{0};
int normalTexture{-1};
float normalTextureScale{1.f};
int occlusionTexture{-1};
float occlusionTextureStrength{1};
// Extensions
KHR_materials_specular specular;
KHR_texture_transform textureTransform;
KHR_materials_clearcoat clearcoat;
KHR_materials_sheen sheen;
KHR_materials_transmission transmission;
KHR_materials_unlit unlit;
KHR_materials_anisotropy anisotropy;
KHR_materials_ior ior;
KHR_materials_volume volume;
KHR_materials_displacement displacement;
KHR_materials_emissive_strength emissiveStrength;
// Tiny Reference
const tinygltf::Material* tmaterial{nullptr};
};
struct GltfNode
{
glm::mat4 worldMatrix{1};
int primMesh{0};
const tinygltf::Node* tnode{nullptr};
};
struct GltfPrimMesh
{
uint32_t firstIndex{0};
uint32_t indexCount{0};
uint32_t vertexOffset{0};
uint32_t vertexCount{0};
int materialIndex{0};
glm::vec3 posMin{0, 0, 0};
glm::vec3 posMax{0, 0, 0};
std::string name;
// Tiny Reference
const tinygltf::Mesh* tmesh{nullptr};
const tinygltf::Primitive* tprim{nullptr};
};
struct GltfStats
{
uint32_t nbCameras{0};
uint32_t nbImages{0};
uint32_t nbTextures{0};
uint32_t nbMaterials{0};
uint32_t nbSamplers{0};
uint32_t nbNodes{0};
uint32_t nbMeshes{0};
uint32_t nbLights{0};
uint32_t imageMem{0};
uint32_t nbUniqueTriangles{0};
uint32_t nbTriangles{0};
};
struct GltfCamera
{
glm::mat4 worldMatrix{1};
glm::vec3 eye{0, 0, 0};
glm::vec3 center{0, 0, 0};
glm::vec3 up{0, 1, 0};
tinygltf::Camera cam;
};
// See: https://github.com/KhronosGroup/glTF/blob/master/extensions/2.0/Khronos/KHR_lights_punctual/README.md
struct GltfLight
{
glm::mat4 worldMatrix{1};
tinygltf::Light light;
};
enum class GltfAttributes : uint8_t
{
NoAttribs = 0,
Position = 1,
Normal = 2,
Texcoord_0 = 4,
Texcoord_1 = 8,
Tangent = 16,
Color_0 = 32,
All = 0xFF
};
using GltfAttributes_t = std::underlying_type_t<GltfAttributes>;
inline GltfAttributes operator|(GltfAttributes lhs, GltfAttributes rhs)
{
return static_cast<GltfAttributes>(static_cast<GltfAttributes_t>(lhs) | static_cast<GltfAttributes_t>(rhs));
}
inline GltfAttributes operator&(GltfAttributes lhs, GltfAttributes rhs)
{
return static_cast<GltfAttributes>(static_cast<GltfAttributes_t>(lhs) & static_cast<GltfAttributes_t>(rhs));
}
//--------------------------------------------------------------------------------------------------
// Converts a tinygltf::Model into a simple drawable format
//
struct GltfScene
{
// Imports all materials into a vector of GltfMaterial
void importMaterials(const tinygltf::Model& tmodel);
// Imports all meshes and primitives into a vector of GltfPrimMesh:
// - Reads every requested GltfAttributes flag; an attribute missing from the file is generated if `forceRequested` contains it.
// - Creates the vectors of GltfNode, GltfLight and GltfCamera
void importDrawableNodes(const tinygltf::Model& tmodel,
GltfAttributes requestedAttributes,
GltfAttributes forceRequested = GltfAttributes::All);
void exportDrawableNodes(tinygltf::Model& tmodel, GltfAttributes requestedAttributes);
// Compute the scene bounding box
void computeSceneDimensions();
// Removes everything
void destroy();
static GltfStats getStatistics(const tinygltf::Model& tinyModel);
// Scene data
std::vector<GltfMaterial> m_materials; // Material for shading
std::vector<GltfNode> m_nodes; // Drawable nodes, flat hierarchy
std::vector<GltfPrimMesh> m_primMeshes; // Primitive promoted to meshes
std::vector<GltfCamera> m_cameras;
std::vector<GltfLight> m_lights;
// Attributes, all same length if valid
std::vector<glm::vec3> m_positions;
std::vector<uint32_t> m_indices;
std::vector<glm::vec3> m_normals;
std::vector<glm::vec4> m_tangents;
std::vector<glm::vec2> m_texcoords0;
std::vector<glm::vec2> m_texcoords1;
std::vector<glm::vec4> m_colors0;
// #TODO - Adding support for Skinning
//using vec4us = vector4<unsigned short>;
//std::vector<vec4us> m_joints0;
//std::vector<glm::vec4> m_weights0;
// Size of the scene
struct Dimensions
{
glm::vec3 min = glm::vec3(std::numeric_limits<float>::max());
glm::vec3 max = glm::vec3(std::numeric_limits<float>::lowest());  // min() is the smallest positive float, not the most negative
glm::vec3 size{0.f};
glm::vec3 center{0.f};
float radius{0};
} m_dimensions;
private:
void processNode(const tinygltf::Model& tmodel, int& nodeIdx, const glm::mat4& parentMatrix);
void processMesh(const tinygltf::Model& tmodel,
const tinygltf::Primitive& tmesh,
GltfAttributes requestedAttributes,
GltfAttributes forceRequested,
const std::string& name);
void createNormals(GltfPrimMesh& resultMesh);
void createTexcoords(GltfPrimMesh& resultMesh);
void createTangents(GltfPrimMesh& resultMesh);
void createColors(GltfPrimMesh& resultMesh);
// Temporary data
std::unordered_map<int, std::vector<uint32_t>> m_meshToPrimMeshes;
std::vector<uint32_t> primitiveIndices32u;
std::vector<uint16_t> primitiveIndices16u;
std::vector<uint8_t> primitiveIndices8u;
std::unordered_map<std::string, GltfPrimMesh> m_cachePrimMesh;
void computeCamera();
void checkRequiredExtensions(const tinygltf::Model& tmodel);
void findUsedMeshes(const tinygltf::Model& tmodel, std::set<uint32_t>& usedMeshes, int nodeIdx);
};
glm::mat4 getLocalMatrix(const tinygltf::Node& tnode);
// Return a vector of data for a tinygltf::Value
template <typename T>
static inline std::vector<T> getVector(const tinygltf::Value& value)
{
std::vector<T> result;  // empty when value is not an array
if(!value.IsArray())
return result;
result.resize(value.ArrayLen());
for(int i = 0; i < value.ArrayLen(); i++)
{
result[i] = static_cast<T>(value.Get(i).IsNumber() ? value.Get(i).Get<double>() : value.Get(i).Get<int>());
}
return result;
}
static inline void getFloat(const tinygltf::Value& value, const std::string& name, float& val)
{
if(value.Has(name))
{
val = static_cast<float>(value.Get(name).Get<double>());
}
}
static inline void getInt(const tinygltf::Value& value, const std::string& name, int& val)
{
if(value.Has(name))
{
val = value.Get(name).Get<int>();
}
}
static inline void getVec2(const tinygltf::Value& value, const std::string& name, glm::vec2& val)
{
if(value.Has(name))
{
auto s = getVector<float>(value.Get(name));
val = glm::vec2{s[0], s[1]};
}
}
static inline void getVec3(const tinygltf::Value& value, const std::string& name, glm::vec3& val)
{
if(value.Has(name))
{
auto s = getVector<float>(value.Get(name));
val = glm::vec3{s[0], s[1], s[2]};
}
}
static inline void getVec4(const tinygltf::Value& value, const std::string& name, glm::vec4& val)
{
if(value.Has(name))
{
auto s = getVector<float>(value.Get(name));
val = glm::vec4{s[0], s[1], s[2], s[3]};
}
}
static inline void getTexId(const tinygltf::Value& value, const std::string& name, int& val)
{
if(value.Has(name))
{
val = value.Get(name).Get("index").Get<int>();
}
}
// Calls a function (such as a lambda function) for each (index, value) pair in
// a sparse accessor. It's only potentially called for indices from
// accessorFirstElement through accessorFirstElement + numElementsToProcess - 1.
template <class T>
void forEachSparseValue(const tinygltf::Model& tmodel,
const tinygltf::Accessor& accessor,
size_t accessorFirstElement,
size_t numElementsToProcess,
std::function<void(size_t index, const T* value)> fn)
{
if(!accessor.sparse.isSparse)
{
return; // Nothing to do
}
const auto& idxs = accessor.sparse.indices;
if(!(idxs.componentType == TINYGLTF_COMPONENT_TYPE_UNSIGNED_BYTE //
|| idxs.componentType == TINYGLTF_COMPONENT_TYPE_UNSIGNED_SHORT //
|| idxs.componentType == TINYGLTF_COMPONENT_TYPE_UNSIGNED_INT))
{
assert(!"Unsupported sparse accessor index type.");
return;
}
const tinygltf::BufferView& idxBufferView = tmodel.bufferViews[idxs.bufferView];
const unsigned char* idxBuffer = &tmodel.buffers[idxBufferView.buffer].data[idxBufferView.byteOffset];
const size_t idxBufferByteStride =
idxBufferView.byteStride ? idxBufferView.byteStride : tinygltf::GetComponentSizeInBytes(idxs.componentType);
if(idxBufferByteStride == size_t(-1))
return; // Invalid
const auto& vals = accessor.sparse.values;
const tinygltf::BufferView& valBufferView = tmodel.bufferViews[vals.bufferView];
const unsigned char* valBuffer = &tmodel.buffers[valBufferView.buffer].data[valBufferView.byteOffset];
const size_t valBufferByteStride = accessor.ByteStride(valBufferView);
if(valBufferByteStride == size_t(-1))
return; // Invalid
// Note that this could be faster for lots of small copies, since we could
// binary search for the first sparse accessor index to use (since the
// glTF specification requires the indices be sorted)!
for(size_t pairIdx = 0; pairIdx < accessor.sparse.count; pairIdx++)
{
// Read the index from the index buffer, converting its type
size_t index = 0;
const unsigned char* pIdx = idxBuffer + idxBufferByteStride * pairIdx;
switch(idxs.componentType)
{
case TINYGLTF_COMPONENT_TYPE_UNSIGNED_BYTE:
index = *reinterpret_cast<const uint8_t*>(pIdx);
break;
case TINYGLTF_COMPONENT_TYPE_UNSIGNED_SHORT:
index = *reinterpret_cast<const uint16_t*>(pIdx);
break;
case TINYGLTF_COMPONENT_TYPE_UNSIGNED_INT:
index = *reinterpret_cast<const uint32_t*>(pIdx);
break;
}
// If it's not in range, skip it
if(index < accessorFirstElement || (index - accessorFirstElement) >= numElementsToProcess)
{
continue;
}
fn(index, reinterpret_cast<const T*>(valBuffer + valBufferByteStride * pairIdx));
}
}
// Copies accessor elements accessorFirstElement through
// accessorFirstElement + numElementsToCopy - 1 to outData elements
// outFirstElement through outFirstElement + numElementsToCopy - 1.
// This handles sparse accessors correctly! It's intended as a replacement for
// what would be memcpy(..., &buffer.data[...], ...) calls.
//
// However, it performs no conversion: it assumes (but does not check) that
// accessor's elements are of type T. For instance, T should be a struct of two
// floats for a VEC2 float accessor.
//
// This is range-checked, so elements that would be out-of-bounds are not
// copied. We assume size_t overflow does not occur.
// Note that outDataSizeInElements is the number of elements in outData,
// while numElementsToCopy is the number of elements to copy, not the number
// of elements in accessor.
template <class T>
void copyAccessorData(T* outData,
size_t outDataSizeInElements,
size_t outFirstElement,
const tinygltf::Model& tmodel,
const tinygltf::Accessor& accessor,
size_t accessorFirstElement,
size_t numElementsToCopy)
{
if(outFirstElement >= outDataSizeInElements)
{
assert(!"Invalid outFirstElement!");
return;
}
if(accessorFirstElement >= accessor.count)
{
assert(!"Invalid accessorFirstElement!");
return;
}
const tinygltf::BufferView& bufferView = tmodel.bufferViews[accessor.bufferView];
const unsigned char* buffer = &tmodel.buffers[bufferView.buffer].data[accessor.byteOffset + bufferView.byteOffset];
const size_t maxSafeCopySize = std::min(accessor.count - accessorFirstElement, outDataSizeInElements - outFirstElement);
numElementsToCopy = std::min(numElementsToCopy, maxSafeCopySize);
if(bufferView.byteStride == 0)
{
memcpy(outData + outFirstElement, reinterpret_cast<const T*>(buffer) + accessorFirstElement, numElementsToCopy * sizeof(T));
}
else
{
// Must copy one-by-one
for(size_t i = 0; i < numElementsToCopy; i++)
{
outData[outFirstElement + i] = *reinterpret_cast<const T*>(buffer + bufferView.byteStride * (accessorFirstElement + i));
}
}
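// Handle sparse accessors by overwriting already copied elements.
// Sparse indices address accessor elements, so remap them into the output range.
forEachSparseValue<T>(tmodel, accessor, accessorFirstElement, numElementsToCopy,
[&](size_t index, const T* value) { outData[outFirstElement + (index - accessorFirstElement)] = *value; });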
}
// Same as copyAccessorData(T*, ...), but taking a vector.
template <class T>
void copyAccessorData(std::vector<T>& outData,
size_t outFirstElement,
const tinygltf::Model& tmodel,
const tinygltf::Accessor& accessor,
size_t accessorFirstElement,
size_t numElementsToCopy)
{
copyAccessorData<T>(outData.data(), outData.size(), outFirstElement, tmodel, accessor, accessorFirstElement, numElementsToCopy);
}
// Appends all the values of \p accessor to \p attribVec.
// Returns false if the accessor is invalid.
// T must be glm::vec2, glm::vec3, or glm::vec4.
template <typename T>
static bool getAccessorData(const tinygltf::Model& tmodel, const tinygltf::Accessor& accessor, std::vector<T>& attribVec)
{
// Retrieving the data of the accessor
const auto nbElems = accessor.count;
const size_t oldNumElements = attribVec.size();
attribVec.resize(oldNumElements + nbElems);
// Copying the attributes
if(accessor.componentType == TINYGLTF_COMPONENT_TYPE_FLOAT)
{
copyAccessorData<T>(attribVec, oldNumElements, tmodel, accessor, 0, accessor.count);
}
else
{
// The component is smaller than a float and needs to be converted
const auto& bufView = tmodel.bufferViews[accessor.bufferView];
const auto& buffer = tmodel.buffers[bufView.buffer];
const unsigned char* bufferByte = &buffer.data[accessor.byteOffset + bufView.byteOffset];
// 2, 3, 4 for VEC2, VEC3, VEC4
const int nbComponents = tinygltf::GetNumComponentsInType(accessor.type);
if(nbComponents == -1)
return false; // Invalid
// Stride per element
const size_t byteStride = accessor.ByteStride(bufView);
if(byteStride == size_t(-1))
return false; // Invalid
if(!(accessor.componentType == TINYGLTF_COMPONENT_TYPE_BYTE || accessor.componentType == TINYGLTF_COMPONENT_TYPE_UNSIGNED_BYTE
|| accessor.componentType == TINYGLTF_COMPONENT_TYPE_SHORT || accessor.componentType == TINYGLTF_COMPONENT_TYPE_UNSIGNED_SHORT))
{
assert(!"Unhandled tinygltf component type!");
return false;
}
const auto& copyElementFn = [&](size_t elementIdx, const unsigned char* pElement) {
T vecValue;
for(int c = 0; c < nbComponents; c++)
{
switch(accessor.componentType)
{
case TINYGLTF_COMPONENT_TYPE_BYTE:
vecValue[c] = float(*(reinterpret_cast<const signed char*>(pElement) + c));  // plain char may be unsigned
if(accessor.normalized)
{
vecValue[c] = std::max(vecValue[c] / 127.f, -1.f);
}
break;
case TINYGLTF_COMPONENT_TYPE_UNSIGNED_BYTE:
vecValue[c] = float(*(reinterpret_cast<const unsigned char*>(pElement) + c));
if(accessor.normalized)
{
vecValue[c] = vecValue[c] / 255.f;
}
break;
case TINYGLTF_COMPONENT_TYPE_SHORT:
vecValue[c] = float(*(reinterpret_cast<const short*>(pElement) + c));
if(accessor.normalized)
{
vecValue[c] = std::max(vecValue[c] / 32767.f, -1.f);
}
break;
case TINYGLTF_COMPONENT_TYPE_UNSIGNED_SHORT:
vecValue[c] = float(*(reinterpret_cast<const unsigned short*>(pElement) + c));
if(accessor.normalized)
{
vecValue[c] = vecValue[c] / 65535.f;
}
break;
}
}
attribVec[oldNumElements + elementIdx] = vecValue;
};
for(size_t i = 0; i < nbElems; i++)
{
copyElementFn(i, bufferByte + byteStride * i);
}
forEachSparseValue<unsigned char>(tmodel, accessor, 0, nbElems, copyElementFn);
}
return true;
}
// Appends all the values of \p attribName to \p attribVec.
// Returns false if the attribute is missing or invalid.
// T must be glm::vec2, glm::vec3, or glm::vec4.
template <typename T>
static bool getAttribute(const tinygltf::Model& tmodel, const tinygltf::Primitive& primitive, std::vector<T>& attribVec, const std::string& attribName)
{
const auto& it = primitive.attributes.find(attribName);
if(it == primitive.attributes.end())
return false;
const auto& accessor = tmodel.accessors[it->second];
return getAccessorData(tmodel, accessor, attribVec);
}
inline bool hasExtension(const tinygltf::ExtensionMap& extensions, const std::string& name)
{
return extensions.find(name) != extensions.end();
}
// Appends the incoming data to the (single) binary buffer
// and returns the number of bytes that were added.
template <class T>
uint32_t appendData(tinygltf::Buffer& buffer, const T& inData)
{
auto* pData = reinterpret_cast<const char*>(inData.data());
uint32_t len = static_cast<uint32_t>(sizeof(inData[0]) * inData.size());
buffer.data.insert(buffer.data.end(), pData, pData + len);
return len;
}
} // namespace nvh

@ -0,0 +1,118 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
//--------------------------------------------------------------------------------------------------
/** @DOC_START
# class InputParser
> InputParser is a simple command-line parser
Example usage for: test.exe -f name.txt -size 200 100
Parsing the command line: '-f' for the filename of the scene, '-size' for two integers
```cpp
InputParser parser(argc, argv);
std::string filename = parser.getString("-f");
if(filename.empty())
  filename = "default.txt";
if(parser.exist("-size"))
{
  auto values = parser.getInt2("-size");
}
```
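The lookup rule behind getString can be sketched on its own: find the option token, then return the token that follows it. The sketch below is only an illustration under assumptions; `valueAfter` is a hypothetical free function, not part of InputParser, and it works on a plain token vector instead of argc/argv.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of InputParser): returns the token that follows
// `option`, or `fallback` when the option is absent or is the last token.
inline std::string valueAfter(const std::vector<std::string>& tokens,
                              const std::string&              option,
                              const std::string&              fallback = "")
{
  assert(!option.empty());  // An empty option name is never meaningful
  auto itr = std::find(tokens.begin(), tokens.end(), option);
  if(itr != tokens.end() && ++itr != tokens.end())
    return *itr;
  return fallback;
}
```

With the tokens of the command line above, looking up "-f" would yield "name.txt". Note that the numeric getters convert this string with std::stoi or std::stof, which throw when the text is not a number.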
@DOC_END */
#pragma once
#include <algorithm>
#include <array>
#include <cstdint>
#include <string>
#include <vector>
class InputParser
{
public:
InputParser(int& argc, char** argv)
{
for(int i = 1; i < argc; ++i)
{
if(argv[i])
{
m_tokens.emplace_back(argv[i]);
}
}
}
auto findOption(const std::string& option) const { return std::find(m_tokens.begin(), m_tokens.end(), option); }
const std::string getString(const std::string& option, std::string defaultString = "") const
{
if(exist(option))
{
auto itr = findOption(option);
if(itr != m_tokens.end() && ++itr != m_tokens.end())
{
return *itr;
}
}
return defaultString;
}
std::vector<std::string> getString(const std::string& option, uint32_t nbElem) const
{
auto itr = findOption(option);
std::vector<std::string> items;
while(itr != m_tokens.end() && ++itr != m_tokens.end() && nbElem-- > 0)
{
items.push_back((*itr));
}
return items;
}
int getInt(const std::string& option, int defaultValue = 0) const
{
if(exist(option))
return std::stoi(getString(option));
return defaultValue;
}
auto getInt2(const std::string& option, std::array<int, 2> defaultValues = {0, 0}) const
{
if(exist(option))
{
auto items = getString(option, 2);
if(items.size() == 2)
{
defaultValues[0] = std::stoi(items[0]);
defaultValues[1] = std::stoi(items[1]);
}
}
return defaultValues;
}
float getFloat(const std::string& option, float defaultValue = 0.0f) const
{
if(exist(option))
return std::stof(getString(option));
return defaultValue;
}
bool exist(const std::string& option) const { return findOption(option) != m_tokens.end(); }
private:
std::vector<std::string> m_tokens;
};

@ -0,0 +1,120 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef NV_MISC_INCLUDED
#define NV_MISC_INCLUDED
#include <algorithm>
#include <assert.h>
#include <math.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <vector>
#include "nvprint.hpp"
/** @DOC_START
# functions in nvh
- mipMapLevels : compute number of mip maps
- stringFormat : sprintf for std::string
- frand : random float using rand()
- permutation : fills uint vector with random permutation of values [0... vec.size-1]
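
As a worked example of the mip count: a size of 256 gives the chain 256, 128, 64, 32, 16, 8, 4, 2, 1, that is 9 levels. The sketch below mirrors the halving loop of mipMapLevels in a self-contained form; the name `mipLevelsSketch` is hypothetical and only used for illustration.

```cpp
#include <cassert>

// Sketch mirroring the loop in mipMapLevels: counts how many times `size`
// can be halved (integer division) before it reaches zero.
inline int mipLevelsSketch(int size)
{
  assert(size >= 0);  // Negative sizes are not meaningful for textures
  int num = 0;
  while(size)
  {
    num++;
    size /= 2;
  }
  return num;
}
```

For size >= 1 this equals floor(log2(size)) + 1, and a size of 0 yields 0 levels.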
@DOC_END */
namespace nvh {
inline std::string stringFormat(const char* msg, ...)
{
va_list list;
if(msg == 0)
return std::string();
// Speculate needed string size and vsnprintf to std::string.
// If it was too small, we resize and try for the second (and final) time.
std::string str;
str.resize(64);
for(int i = 0; i < 2; ++i)
{
va_start(list, msg);
int charsNeeded = vsnprintf(&str[0], str.size(), msg, list); // charsNeeded doesn't count \0
va_end(list);
if(charsNeeded < 0)
{
assert(!"encoding error");
return std::string();
}
if(static_cast<size_t>(charsNeeded) < str.size())
{ // Not <= due to \0 terminator (which we trim out of std::string)
str.resize(charsNeeded);
return str;
}
else
{
str.resize(charsNeeded + 1); // Leave room for \0
}
}
assert(!"String should have been resized perfectly second try");
return std::string();
}
inline float frand()
{
return float(rand() % RAND_MAX) / float(RAND_MAX);
}
inline int mipMapLevels(int size)
{
int num = 0;
while(size)
{
num++;
size /= 2;
}
return num;
}
// permutation creates a random permutation of all integer values
// 0..data.size-1 occurring once within data.
inline void permutation(std::vector<unsigned int>& data)
{
size_t size = data.size();
assert(size < RAND_MAX);
if(size == 0)
return;  // Nothing to shuffle; also keeps size - 1 below from wrapping around
for(size_t i = 0; i < size; i++)
{
data[i] = (unsigned int)(i);
}
for(size_t i = size - 1; i > 0; i--)
{
size_t other = rand() % (i + 1);
std::swap(data[i], data[other]);
}
}
} // namespace nvh
#endif

@ -0,0 +1,79 @@
#ifndef __NSIGHTEVENTS__
#define __NSIGHTEVENTS__
/// @DOC_SKIP (keyword to exclude this file from automatic README.md generation)
//-----------------------------------------------------------------------------
// NSIGHT
//-----------------------------------------------------------------------------
#ifdef NVP_SUPPORTS_NVTOOLSEXT
// NSight perf markers - take the whole stuff from "C:\Program Files (x86)\NVIDIA GPU Computing Toolkit\nvToolsExt"
#include <nvtx3/nvToolsExt.h>
typedef int(NVTX_API* nvtxRangePushEx_Pfn)(const nvtxEventAttributes_t* eventAttrib);
typedef int(NVTX_API* nvtxRangePush_Pfn)(const char* message);
typedef int(NVTX_API* nvtxRangePop_Pfn)();
extern nvtxRangePushEx_Pfn nvtxRangePushEx_dyn;
extern nvtxRangePush_Pfn nvtxRangePush_dyn;
extern nvtxRangePop_Pfn nvtxRangePop_dyn;
extern nvtxEventAttributes_t eventAttr;
#define NX_RANGE nvtxRangeId_t
#define NX_MARK(name) nvtxMark(name)
#define NX_RANGESTART(name) nvtxRangeStart(name)
#define NX_RANGEEND(id) nvtxRangeEnd(id)
#define NX_RANGEPUSH(name) nvtxRangePush(name)
#define NX_RANGEPUSHCOL(name, c) \
{ \
nvtxEventAttributes_t eventAttrib = {0}; \
eventAttrib.version = NVTX_VERSION; \
eventAttrib.size = NVTX_EVENT_ATTRIB_STRUCT_SIZE; \
eventAttrib.colorType = NVTX_COLOR_ARGB; \
eventAttrib.color = c; \
eventAttrib.messageType = NVTX_MESSAGE_TYPE_ASCII; \
eventAttrib.message.ascii = name; \
nvtxRangePushEx(&eventAttrib); \
}
#define NX_RANGEPOP() nvtxRangePop()
struct NXProfileFunc
{
NXProfileFunc(const char* name, uint32_t c, /*int64_t*/ uint32_t p = 0)
{
nvtxEventAttributes_t eventAttrib = {0};
// set the version and the size information
eventAttrib.version = NVTX_VERSION;
eventAttrib.size = NVTX_EVENT_ATTRIB_STRUCT_SIZE;
// configure the attributes. 0 is the default for all attributes.
eventAttrib.colorType = NVTX_COLOR_ARGB;
eventAttrib.color = c;
eventAttrib.messageType = NVTX_MESSAGE_TYPE_ASCII;
eventAttrib.message.ascii = name;
eventAttrib.payloadType = NVTX_PAYLOAD_TYPE_INT64;
eventAttrib.payload.llValue = (int64_t)p;
eventAttrib.category = (uint32_t)p;
nvtxRangePushEx(&eventAttrib);
}
~NXProfileFunc() { nvtxRangePop(); }
};
#ifdef NXPROFILEFUNC
#undef NXPROFILEFUNC
#undef NXPROFILEFUNCCOL
#undef NXPROFILEFUNCCOL2
#endif
#define NXPROFILEFUNC(name) NXProfileFunc nxProfileMe(name, 0xFF0000FF)
#define NXPROFILEFUNCCOL(name, c) NXProfileFunc nxProfileMe(name, c)
#define NXPROFILEFUNCCOL2(name, c, p) NXProfileFunc nxProfileMe(name, c, p)
#else
#define NX_RANGE int
#define NX_MARK(name)
#define NX_RANGESTART(name) 0
#define NX_RANGEEND(id)
#define NX_RANGEPUSH(name)
#define NX_RANGEPUSHCOL(name, c)
#define NX_RANGEPOP()
#define NXPROFILEFUNC(name)
#define NXPROFILEFUNCCOL(name, c)
#define NXPROFILEFUNCCOL2(name, c, a)
#endif
#endif //__NSIGHTEVENTS__

@ -0,0 +1,609 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2021-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: LicenseRef-NvidiaProprietary
*
* NVIDIA CORPORATION, its affiliates and licensors retain all intellectual
* property and proprietary rights in and to this material, related
* documentation and any modifications thereto. Any use, reproduction,
* disclosure or distribution of this material and related documentation
* without an express license agreement from NVIDIA CORPORATION or
* its affiliates is strictly prohibited.
*/
#ifdef _WIN32
#include <windows.h>
#endif
#include "nvh/nvml_monitor.hpp"
#include <iostream>
#include <string>
#include <vector>
#include <chrono>
#if defined(NVP_SUPPORTS_NVML)
#define NVML_NO_UNVERSIONED_FUNC_DEFS
#include <nvml.h>
#ifdef _WIN32
// The cfgmgr32 header is necessary for interrogating driver information in the registry.
#include <cfgmgr32.h>
// For convenience the library is also linked in automatically using the #pragma command.
#pragma comment(lib, "Cfgmgr32.lib")
#endif
#define CHECK_NVML_CALL() \
if(res != NVML_SUCCESS) \
{ \
LOGE("NVML Error %s\n", nvmlErrorString(res)); \
}
#define CHECK_NVML(fun) \
{ \
nvmlReturn_t res = fun; \
if(res != NVML_SUCCESS) \
{ \
LOGE("NVML Error in %s: %s\n", #fun, nvmlErrorString(res)); \
} \
}
#define CHECK_NVML_SUPPORT(fun, field) \
{ \
nvmlReturn_t res = fun; \
if(res != NVML_SUCCESS) \
{ \
field.isSupported = false; \
} \
else \
{ \
field.isSupported = true; \
} \
}
static const std::string brandToString(nvmlBrandType_t brand)
{
switch(brand)
{
case NVML_BRAND_UNKNOWN:
return "Unknown";
case NVML_BRAND_QUADRO:
return "Quadro";
case NVML_BRAND_TESLA:
return "Tesla";
case NVML_BRAND_NVS:
return "NVS";
case NVML_BRAND_GRID:
return "Grid";
case NVML_BRAND_GEFORCE:
return "GeForce";
case NVML_BRAND_TITAN:
return "Titan";
case NVML_BRAND_NVIDIA_VAPPS:
return "NVIDIA Virtual Applications";
case NVML_BRAND_NVIDIA_VPC:
return "NVIDIA Virtual PC";
case NVML_BRAND_NVIDIA_VCS:
return "NVIDIA Virtual Compute Server";
case NVML_BRAND_NVIDIA_VWS:
return "NVIDIA RTX Virtual Workstation";
case NVML_BRAND_NVIDIA_CLOUD_GAMING:
return "NVIDIA Cloud Gaming";
case NVML_BRAND_QUADRO_RTX:
return "Quadro RTX";
case NVML_BRAND_NVIDIA_RTX:
return "NVIDIA RTX";
case NVML_BRAND_NVIDIA:
return "NVIDIA";
case NVML_BRAND_GEFORCE_RTX:
return "GeForce RTX";
case NVML_BRAND_TITAN_RTX:
return "Titan RTX";
}
return "Unknown";
}
static const std::string computeModeToString(nvmlComputeMode_t computeMode)
{
switch(computeMode)
{
case NVML_COMPUTEMODE_DEFAULT:
return "Default";
case NVML_COMPUTEMODE_EXCLUSIVE_THREAD:
return "Exclusive thread";
case NVML_COMPUTEMODE_PROHIBITED:
return "Compute prohibited";
case NVML_COMPUTEMODE_EXCLUSIVE_PROCESS:
return "Exclusive process";
default:
return "Unknown";
}
}
#endif
//-------------------------------------------------------------------------------------------------
//
//
nvvkhl::NvmlMonitor::NvmlMonitor(uint32_t interval /*= 100*/, uint32_t limit /*= 100*/)
: m_maxElements(limit) // limit : number of measures
, m_minInterval(interval) // interval : ms between sampling
{
#if defined(NVP_SUPPORTS_NVML)
nvmlReturn_t result;
result = nvmlInit();
if(result != NVML_SUCCESS)
return;
if(nvmlDeviceGetCount(&m_physicalGpuCount) != NVML_SUCCESS)
return;
m_deviceInfo.resize(m_physicalGpuCount);
m_deviceMemory.resize(m_physicalGpuCount);
m_deviceUtilization.resize(m_physicalGpuCount);
m_devicePerformanceState.resize(m_physicalGpuCount);
m_devicePowerState.resize(m_physicalGpuCount);
// System Info
m_sysInfo.cpu.resize(m_maxElements);
// Get driver version
char driverVersion[80];
result = nvmlSystemGetDriverVersion(driverVersion, 80);
if(result == NVML_SUCCESS)
m_sysInfo.driverVersion = driverVersion;
// Loop over all GPUs
for(int i = 0; i < (int)m_physicalGpuCount; i++)
{
// Sizing the data
m_deviceMemory[i].init(m_maxElements);
m_deviceUtilization[i].init(m_maxElements);
m_devicePerformanceState[i].init(m_maxElements);
m_devicePowerState[i].init(m_maxElements);
// Retrieving general capabilities
nvmlDevice_t device;
result = nvmlDeviceGetHandleByIndex(i, &device);
if(result != NVML_SUCCESS)
continue;  // Skip devices whose handle cannot be retrieved
m_deviceInfo[i].refresh(device);
}
m_valid = true;
#endif
}
//-------------------------------------------------------------------------------------------------
// Destructor: shutting down NVML
//
nvvkhl::NvmlMonitor::~NvmlMonitor()
{
#if defined(NVP_SUPPORTS_NVML)
nvmlShutdown();
#endif
}
#if defined(NVP_SUPPORTS_NVML)
//-------------------------------------------------------------------------------------------------
// Returns the amount of memory currently used by the device, or 0 on failure
static uint64_t getMemory(nvmlDevice_t device)
{
// NVML is a C API: failures are reported through the return code, not exceptions
nvmlMemory_t memory{};
if(nvmlDeviceGetMemoryInfo(device, &memory) != NVML_SUCCESS)
return 0ULL;
return memory.used;
}
static float getLoad(nvmlDevice_t device)
{
nvmlUtilization_t utilization{};
nvmlReturn_t result = nvmlDeviceGetUtilizationRates(device, &utilization);
if(result != NVML_SUCCESS)
return 0.0f;
return static_cast<float>(utilization.gpu);
}
static float getCpuLoad()
{
#ifdef _WIN32
static uint64_t s_previousTotalTicks = 0;
static uint64_t s_previousIdleTicks = 0;
FILETIME idleTime, kernelTime, userTime;
if(!GetSystemTimes(&idleTime, &kernelTime, &userTime))
return 0.0f;
auto fileTimeToInt64 = [](const FILETIME& ft) {
return (((uint64_t)(ft.dwHighDateTime)) << 32) | ((uint64_t)ft.dwLowDateTime);
};
auto totalTicks = fileTimeToInt64(kernelTime) + fileTimeToInt64(userTime);
auto idleTicks = fileTimeToInt64(idleTime);
uint64_t totalTicksSinceLastTime = totalTicks - s_previousTotalTicks;
uint64_t idleTicksSinceLastTime = idleTicks - s_previousIdleTicks;
float result = 1.0f - ((totalTicksSinceLastTime > 0) ? ((float)idleTicksSinceLastTime) / totalTicksSinceLastTime : 0);
s_previousTotalTicks = totalTicks;
s_previousIdleTicks = idleTicks;
return result * 100.f;
#else
return 0;
#endif
}
#endif
//-------------------------------------------------------------------------------------------------
// Pulls the information from NVML and stores the data
// Note: the interval is important, as NVML should not be queried too frequently
//
void nvvkhl::NvmlMonitor::refresh()
{
#if defined(NVP_SUPPORTS_NVML)
static std::chrono::high_resolution_clock::time_point s_startTime;
if(!m_valid)
return;
// Pulling the information only when it is over the defined interval
const auto now = std::chrono::high_resolution_clock::now();
const auto t = std::chrono::duration_cast<std::chrono::milliseconds>(now - s_startTime).count();
if(t < m_minInterval)
return;
s_startTime = now;
// Increasing where to store the value
m_offset = (m_offset + 1) % m_maxElements;
// System
m_sysInfo.cpu[m_offset] = getCpuLoad();
// All GPUs
for(unsigned int gpu_id = 0; gpu_id < m_physicalGpuCount; gpu_id++)
{
nvmlDevice_t device;
if(nvmlDeviceGetHandleByIndex(gpu_id, &device) != NVML_SUCCESS)
continue;  // No valid handle for this GPU: skip it for this sample
m_deviceMemory[gpu_id].refresh(device, m_offset);
m_deviceUtilization[gpu_id].refresh(device, m_offset);
m_devicePerformanceState[gpu_id].refresh(device, m_offset);
m_devicePowerState[gpu_id].refresh(device, m_offset);
}
#endif // NVP_SUPPORTS_NVML
}
void nvvkhl::NvmlMonitor::DeviceInfo::refresh(void* dev)
{
#if defined(NVP_SUPPORTS_NVML)
nvmlDevice_t device = reinterpret_cast<nvmlDevice_t>(dev);
CHECK_NVML_SUPPORT(nvmlDeviceGetBoardId(device, &boardId.get()), boardId);
partNumber.get().resize(NVML_DEVICE_PART_NUMBER_BUFFER_SIZE);
CHECK_NVML_SUPPORT(
nvmlDeviceGetBoardPartNumber(device, partNumber.get().data(), static_cast<uint32_t>(partNumber.get().size())), partNumber);
nvmlBrandType_t brandType;
CHECK_NVML_SUPPORT(nvmlDeviceGetBrand(device, &brandType), brand);
brand.get() = brandToString(brandType);
nvmlBridgeChipHierarchy_t bridgeChipHierarchy{};
CHECK_NVML_SUPPORT(nvmlDeviceGetBridgeChipInfo(device, &bridgeChipHierarchy), bridgeHierarchy);
bridgeHierarchy.get().resize(bridgeChipHierarchy.bridgeCount);
for(int i = 0; i < bridgeChipHierarchy.bridgeCount; i++)
{
bridgeHierarchy.get()[i].first = ((bridgeChipHierarchy.bridgeChipInfo[i].type == NVML_BRIDGE_CHIP_PLX) ? "PLX" : "BRO4");
bridgeHierarchy.get()[i].second = fmt::format("#{}", bridgeChipHierarchy.bridgeChipInfo[i].fwVersion);
}
CHECK_NVML_SUPPORT(nvmlDeviceGetCpuAffinity(device, 1, (unsigned long*)&cpuAffinity.get()), cpuAffinity);
nvmlComputeMode_t cMode;
CHECK_NVML_SUPPORT(nvmlDeviceGetComputeMode(device, &cMode), computeMode);
computeMode = computeModeToString(cMode);
CHECK_NVML_SUPPORT(nvmlDeviceGetCudaComputeCapability(device, &computeCapabilityMajor.get(), &computeCapabilityMinor.get()),
computeCapabilityMajor);
computeCapabilityMinor.isSupported = computeCapabilityMajor.isSupported;
CHECK_NVML_SUPPORT(nvmlDeviceGetCurrPcieLinkGeneration(device, &pcieLinkGen.get()), pcieLinkGen);
CHECK_NVML_SUPPORT(nvmlDeviceGetCurrPcieLinkWidth(device, &pcieLinkWidth.get()), pcieLinkWidth);
CHECK_NVML_SUPPORT(nvmlDeviceGetDefaultApplicationsClock(device, NVML_CLOCK_GRAPHICS, &clockDefaultGraphics.get()),
clockDefaultGraphics);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxClockInfo(device, NVML_CLOCK_GRAPHICS, &clockMaxGraphics.get()), clockMaxGraphics);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxCustomerBoostClock(device, NVML_CLOCK_GRAPHICS, &clockBoostGraphics.get()), clockBoostGraphics);
CHECK_NVML_SUPPORT(nvmlDeviceGetDefaultApplicationsClock(device, NVML_CLOCK_SM, &clockDefaultSM.get()), clockDefaultSM);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxClockInfo(device, NVML_CLOCK_SM, &clockMaxSM.get()), clockMaxSM);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxCustomerBoostClock(device, NVML_CLOCK_SM, &clockBoostSM.get()), clockBoostSM);
CHECK_NVML_SUPPORT(nvmlDeviceGetDefaultApplicationsClock(device, NVML_CLOCK_MEM, &clockDefaultMem.get()), clockDefaultMem);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxClockInfo(device, NVML_CLOCK_MEM, &clockMaxMem.get()), clockMaxMem);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxCustomerBoostClock(device, NVML_CLOCK_MEM, &clockBoostMem.get()), clockBoostMem);
CHECK_NVML_SUPPORT(nvmlDeviceGetDefaultApplicationsClock(device, NVML_CLOCK_VIDEO, &clockDefaultVideo.get()), clockDefaultVideo);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxClockInfo(device, NVML_CLOCK_VIDEO, &clockMaxVideo.get()), clockMaxVideo);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxCustomerBoostClock(device, NVML_CLOCK_VIDEO, &clockBoostVideo.get()), clockBoostVideo);
#ifdef _WIN32
nvmlDriverModel_t currentDM, pendingDM;
CHECK_NVML_SUPPORT(nvmlDeviceGetDriverModel(device, &currentDM, &pendingDM), currentDriverModel);
currentDriverModel = (currentDM == NVML_DRIVER_WDDM) ? "WDDM" : "TCC";
pendingDriverModel = (pendingDM == NVML_DRIVER_WDDM) ? "WDDM" : "TCC";
pendingDriverModel.isSupported = currentDriverModel.isSupported;
#endif
nvmlEnableState_t currentES, pendingES;
CHECK_NVML_SUPPORT(nvmlDeviceGetEccMode(device, &currentES, &pendingES), currentEccMode);
currentEccMode = (currentES == NVML_FEATURE_ENABLED);
pendingEccMode = (pendingES == NVML_FEATURE_ENABLED);
pendingEccMode.isSupported = currentEccMode.isSupported;
CHECK_NVML_SUPPORT(nvmlDeviceGetEncoderCapacity(device, NVML_ENCODER_QUERY_H264, &encoderCapacityH264.get()), encoderCapacityH264);
CHECK_NVML_SUPPORT(nvmlDeviceGetEncoderCapacity(device, NVML_ENCODER_QUERY_HEVC, &encoderCapacityHEVC.get()), encoderCapacityHEVC);
infoROMImageVersion.get().resize(NVML_DEVICE_INFOROM_VERSION_BUFFER_SIZE);
CHECK_NVML_SUPPORT(nvmlDeviceGetInforomImageVersion(device, infoROMImageVersion.get().data(),
static_cast<uint32_t>(infoROMImageVersion.get().size())),
infoROMImageVersion);
infoROMOEMVersion.get().resize(NVML_DEVICE_INFOROM_VERSION_BUFFER_SIZE);
infoROMECCVersion.get().resize(NVML_DEVICE_INFOROM_VERSION_BUFFER_SIZE);
infoROMPowerVersion.get().resize(NVML_DEVICE_INFOROM_VERSION_BUFFER_SIZE);
CHECK_NVML_SUPPORT(nvmlDeviceGetInforomVersion(device, NVML_INFOROM_OEM, infoROMOEMVersion.get().data(),
static_cast<uint32_t>(infoROMOEMVersion.get().size())),
infoROMOEMVersion);
CHECK_NVML_SUPPORT(nvmlDeviceGetInforomVersion(device, NVML_INFOROM_ECC, infoROMECCVersion.get().data(),
static_cast<uint32_t>(infoROMECCVersion.get().size())),
infoROMECCVersion);
CHECK_NVML_SUPPORT(nvmlDeviceGetInforomVersion(device, NVML_INFOROM_POWER, infoROMPowerVersion.get().data(),
static_cast<uint32_t>(infoROMPowerVersion.get().size())),
infoROMPowerVersion);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxPcieLinkGeneration(device, &maxLinkGen.get()), maxLinkGen);
CHECK_NVML_SUPPORT(nvmlDeviceGetMaxPcieLinkWidth(device, &maxLinkWidth.get()), maxLinkWidth);
CHECK_NVML_SUPPORT(nvmlDeviceGetMinorNumber(device, &minorNumber.get()), minorNumber);
CHECK_NVML_SUPPORT(nvmlDeviceGetMultiGpuBoard(device, &multiGpuBool.get()), multiGpuBool);
deviceName.get().resize(NVML_DEVICE_NAME_V2_BUFFER_SIZE);
CHECK_NVML_SUPPORT(nvmlDeviceGetName(device, deviceName.get().data(), static_cast<uint32_t>(deviceName.get().size())), deviceName);
CHECK_NVML_SUPPORT(nvmlDeviceGetSupportedClocksThrottleReasons(device, reinterpret_cast<long long unsigned int*>(
&supportedClocksThrottleReasons.get())),
supportedClocksThrottleReasons);
vbiosVersion.get().resize(NVML_DEVICE_VBIOS_VERSION_BUFFER_SIZE);
CHECK_NVML_SUPPORT(
nvmlDeviceGetVbiosVersion(device, vbiosVersion.get().data(), static_cast<uint32_t>(vbiosVersion.get().size())), vbiosVersion);
CHECK_NVML_SUPPORT(nvmlDeviceGetTemperatureThreshold(device, NVML_TEMPERATURE_THRESHOLD_SHUTDOWN,
&tempThresholdShutdown.get()),
tempThresholdShutdown);
CHECK_NVML_SUPPORT(nvmlDeviceGetTemperatureThreshold(device, NVML_TEMPERATURE_THRESHOLD_SLOWDOWN,
&tempThresholdHWSlowdown.get()),
tempThresholdHWSlowdown);
CHECK_NVML_SUPPORT(nvmlDeviceGetTemperatureThreshold(device, NVML_TEMPERATURE_THRESHOLD_MEM_MAX,
&tempThresholdSWSlowdown.get()),
tempThresholdSWSlowdown);
CHECK_NVML_SUPPORT(nvmlDeviceGetTemperatureThreshold(device, NVML_TEMPERATURE_THRESHOLD_GPU_MAX,
&tempThresholdDropBelowBaseClock.get()),
tempThresholdDropBelowBaseClock);
CHECK_NVML_SUPPORT(nvmlDeviceGetPowerManagementLimit(device, &powerLimit.get()), powerLimit);
// Milliwatt to watt
powerLimit.get() /= 1000;
uint32_t supportedClockCount = 0;
if(nvmlDeviceGetSupportedMemoryClocks(device, &supportedClockCount, nullptr) == NVML_ERROR_INSUFFICIENT_SIZE)
{
supportedMemoryClocks.isSupported = true;
supportedMemoryClocks.get().resize(supportedClockCount);
nvmlDeviceGetSupportedMemoryClocks(device, &supportedClockCount, supportedMemoryClocks.get().data());
}
for(size_t i = 0; i < supportedMemoryClocks.get().size(); i++)
{
supportedClockCount = 0;
if(nvmlDeviceGetSupportedGraphicsClocks(device, supportedMemoryClocks.get()[i], &supportedClockCount, nullptr) == NVML_ERROR_INSUFFICIENT_SIZE)
{
supportedGraphicsClocks.isSupported = true;
auto& graphicsClocks = supportedGraphicsClocks.get()[supportedMemoryClocks.get()[i]];
graphicsClocks.resize(supportedClockCount);
nvmlDeviceGetSupportedGraphicsClocks(device, supportedMemoryClocks.get()[i], &supportedClockCount, graphicsClocks.data());
}
}
#endif
}
void nvvkhl::NvmlMonitor::DeviceMemory::init(uint32_t maxElements)
{
memoryFree.get().resize(maxElements);
memoryUsed.get().resize(maxElements);
bar1Free.get().resize(maxElements);
bar1Used.get().resize(maxElements);
}
void nvvkhl::NvmlMonitor::DeviceMemory::refresh(void* dev, uint32_t offset)
{
#if defined(NVP_SUPPORTS_NVML)
nvmlDevice_t device = reinterpret_cast<nvmlDevice_t>(dev);
nvmlBAR1Memory_t bar1Memory{};
nvmlMemory_t memory{};
CHECK_NVML_SUPPORT(nvmlDeviceGetBAR1MemoryInfo(device, &bar1Memory), bar1Total);
bar1Total = bar1Memory.bar1Total;
bar1Used.get()[offset] = bar1Memory.bar1Used;
bar1Used.isSupported = bar1Total.isSupported;
bar1Free.get()[offset] = bar1Memory.bar1Free;
bar1Free.isSupported = bar1Total.isSupported;
CHECK_NVML_SUPPORT(nvmlDeviceGetMemoryInfo(device, &memory), memoryTotal);
memoryTotal = memory.total;
memoryUsed.get()[offset] = memory.used;
memoryUsed.isSupported = memoryTotal.isSupported;
memoryFree.get()[offset] = memory.free;
memoryFree.isSupported = memoryTotal.isSupported;
#endif
}
void nvvkhl::NvmlMonitor::DeviceUtilization::init(uint32_t maxElements)
{
gpuUtilization.get().resize(maxElements);
memUtilization.get().resize(maxElements);
computeProcesses.get().resize(maxElements);
graphicsProcesses.get().resize(maxElements);
}
void nvvkhl::NvmlMonitor::DeviceUtilization::refresh(void* dev, uint32_t offset)
{
#if defined(NVP_SUPPORTS_NVML)
nvmlDevice_t device = reinterpret_cast<nvmlDevice_t>(dev);
nvmlUtilization_t utilization{};  // Zero-initialized so a failed query does not store garbage
CHECK_NVML_SUPPORT(nvmlDeviceGetUtilizationRates(device, &utilization), gpuUtilization);
gpuUtilization.get()[offset] = utilization.gpu;
memUtilization.get()[offset] = utilization.memory;
memUtilization.isSupported = gpuUtilization.isSupported;
computeProcesses.get()[offset] = 0;
graphicsProcesses.get()[offset] = 0;
CHECK_NVML_SUPPORT(nvmlDeviceGetComputeRunningProcesses(device, &computeProcesses.get()[offset], nullptr), computeProcesses);
CHECK_NVML_SUPPORT(nvmlDeviceGetGraphicsRunningProcesses(device, &graphicsProcesses.get()[offset], nullptr), graphicsProcesses);
#endif
}
void nvvkhl::NvmlMonitor::DevicePerformanceState::init(uint32_t maxElements)
{
clockGraphics.get().resize(maxElements);
clockSM.get().resize(maxElements);
clockMem.get().resize(maxElements);
clockVideo.get().resize(maxElements);
throttleReasons.get().resize(maxElements);
}
void nvvkhl::NvmlMonitor::DevicePerformanceState::refresh(void* dev, uint32_t offset)
{
#if defined(NVP_SUPPORTS_NVML)
nvmlDevice_t device = reinterpret_cast<nvmlDevice_t>(dev);
CHECK_NVML_SUPPORT(nvmlDeviceGetClockInfo(device, NVML_CLOCK_GRAPHICS, &clockGraphics.get()[offset]), clockGraphics);
CHECK_NVML_SUPPORT(nvmlDeviceGetClockInfo(device, NVML_CLOCK_SM, &clockSM.get()[offset]), clockSM);
CHECK_NVML_SUPPORT(nvmlDeviceGetClockInfo(device, NVML_CLOCK_MEM, &clockMem.get()[offset]), clockMem);
CHECK_NVML_SUPPORT(nvmlDeviceGetClockInfo(device, NVML_CLOCK_VIDEO, &clockVideo.get()[offset]), clockVideo);
CHECK_NVML_SUPPORT(nvmlDeviceGetCurrentClocksThrottleReasons(device, reinterpret_cast<unsigned long long*>(
&throttleReasons.get()[offset])),
throttleReasons);
#endif
}
std::vector<std::string> nvvkhl::NvmlMonitor::DevicePerformanceState::getThrottleReasonStrings(uint64_t reason)
{
std::vector<std::string> reasonStrings;
#if defined(NVP_SUPPORTS_NVML)
if(reason & nvmlClocksThrottleReasonGpuIdle)
{
reasonStrings.push_back("Idle");
}
if(reason & nvmlClocksThrottleReasonApplicationsClocksSetting)
{
reasonStrings.push_back("App clock setting");
}
if(reason & nvmlClocksThrottleReasonSwPowerCap)
{
reasonStrings.push_back("SW power cap");
}
if(reason & nvmlClocksThrottleReasonHwSlowdown)
{
reasonStrings.push_back("HW slowdown");
}
if(reason & nvmlClocksThrottleReasonSyncBoost)
{
reasonStrings.push_back("Sync boost");
}
if(reason & nvmlClocksThrottleReasonSwThermalSlowdown)
{
reasonStrings.push_back("SW Thermal slowdown");
}
if(reason & nvmlClocksThrottleReasonHwThermalSlowdown)
{
reasonStrings.push_back("HW Thermal slowdown");
}
if(reason & nvmlClocksThrottleReasonHwPowerBrakeSlowdown)
{
reasonStrings.push_back("Power brake slowdown");
}
if(reasonStrings.empty())
{
reasonStrings.push_back("Full speed");
}
#endif
return reasonStrings;
}
const std::vector<uint64_t>& nvvkhl::NvmlMonitor::DevicePerformanceState::getAllThrottleReasonList()
{
static std::vector<uint64_t> s_reasonList =
#if defined(NVP_SUPPORTS_NVML)
{nvmlClocksThrottleReasonGpuIdle,
nvmlClocksThrottleReasonApplicationsClocksSetting,
nvmlClocksThrottleReasonSwPowerCap,
nvmlClocksThrottleReasonHwSlowdown,
nvmlClocksThrottleReasonSyncBoost,
nvmlClocksThrottleReasonSwThermalSlowdown,
nvmlClocksThrottleReasonHwThermalSlowdown,
nvmlClocksThrottleReasonHwPowerBrakeSlowdown,
nvmlClocksThrottleReasonNone};
#else
{};
#endif
return s_reasonList;
}
void nvvkhl::NvmlMonitor::DevicePowerState::init(uint32_t maxElements)
{
power.get().resize(maxElements);
temperature.get().resize(maxElements);
fanSpeed.get().resize(maxElements);
}
void nvvkhl::NvmlMonitor::DevicePowerState::refresh(void* dev, uint32_t offset)
{
#if defined(NVP_SUPPORTS_NVML)
nvmlDevice_t device = reinterpret_cast<nvmlDevice_t>(dev);
CHECK_NVML_SUPPORT(nvmlDeviceGetTemperature(device, NVML_TEMPERATURE_GPU, &temperature.get()[offset]), temperature);
CHECK_NVML_SUPPORT(nvmlDeviceGetPowerUsage(device, &power.get()[offset]), power);
// Milliwatt to watt
power.get()[offset] /= 1000;
CHECK_NVML_SUPPORT(nvmlDeviceGetFanSpeed(device, &fanSpeed.get()[offset]), fanSpeed);
#endif
}

@ -0,0 +1,220 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2021-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: LicenseRef-NvidiaProprietary
*
* NVIDIA CORPORATION, its affiliates and licensors retain all intellectual
* property and proprietary rights in and to this material, related
* documentation and any modifications thereto. Any use, reproduction,
* disclosure or distribution of this material and related documentation
* without an express license agreement from NVIDIA CORPORATION or
* its affiliates is strictly prohibited.
*/
#pragma once
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>
/** @DOC_START
Capture the GPU load and memory for all GPUs on the system.
Usage:
- There should be only one instance of NvmlMonitor
- Call refresh() every frame. It takes at most one measurement per interval (in ms)
- isValid() : returns whether it can be used
- nbGpu() : returns the number of GPUs in the computer
- getGpuInfo() : static info about the GPU
- getDeviceMemory() : memory consumption info
- getDeviceUtilization() : GPU and memory utilization
- getDevicePerformanceState() : clock speeds and throttle reasons
- getDevicePowerState() : power, temperature and fan speed
Measurements:
- Uses a circular buffer.
- The offset is the index of the most recent measurement
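
The buffer indexing described above can be sketched as follows. This is a minimal illustration under assumptions, not the class itself: `RingSamplesSketch` and its members are hypothetical names, and it only mirrors the "advance the offset, then store" order used when a new measurement is taken.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one measurement series kept in a circular buffer.
struct RingSamplesSketch
{
  std::vector<uint32_t> samples;     // Slots that were never written hold 0
  uint32_t              offset = 0;  // Index of the most recent measurement

  explicit RingSamplesSketch(uint32_t maxElements)
      : samples(maxElements, 0)
  {
    assert(maxElements > 0);
  }

  // Advance the offset first, then store the value at the new offset.
  void push(uint32_t value)
  {
    offset          = (offset + 1) % static_cast<uint32_t>(samples.size());
    samples[offset] = value;
  }

  uint32_t latest() const { return samples[offset]; }

  // Returns the samples ordered from oldest to newest.
  std::vector<uint32_t> ordered() const
  {
    std::vector<uint32_t> out;
    out.reserve(samples.size());
    for(size_t i = 1; i <= samples.size(); i++)
      out.push_back(samples[(offset + i) % samples.size()]);
    return out;
  }
};
```

When plotting a history, reading from offset + 1 and wrapping around gives the measurements in chronological order, while the entry at the offset is always the newest one.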
@DOC_END */
namespace nvvkhl {
class NvmlMonitor
{
public:
NvmlMonitor(uint32_t interval = 100, uint32_t limit = 100);
~NvmlMonitor();
template <typename T>
struct NVMLField
{
T data{};
bool isSupported = false;
operator T&() { return data; }
T& get() { return data; }
const T& get() const { return data; }
T& operator=(const T& rhs)
{
data = rhs;
return data;
}
};
// Static device information
struct DeviceInfo
{
NVMLField<std::string> currentDriverModel;
NVMLField<std::string> pendingDriverModel;
NVMLField<uint32_t> boardId;
NVMLField<std::string> partNumber;
NVMLField<std::string> brand;
// Ordered list of bridge chips, each with a type and firmware version strings
NVMLField<std::vector<std::pair<std::string, std::string>>> bridgeHierarchy;
NVMLField<uint64_t> cpuAffinity;
NVMLField<std::string> computeMode;
NVMLField<int32_t> computeCapabilityMajor;
NVMLField<int32_t> computeCapabilityMinor;
NVMLField<uint32_t> pcieLinkGen;
NVMLField<uint32_t> pcieLinkWidth;
NVMLField<uint32_t> clockDefaultGraphics;
NVMLField<uint32_t> clockDefaultSM;
NVMLField<uint32_t> clockDefaultMem;
NVMLField<uint32_t> clockDefaultVideo;
NVMLField<uint32_t> clockMaxGraphics;
NVMLField<uint32_t> clockMaxSM;
NVMLField<uint32_t> clockMaxMem;
NVMLField<uint32_t> clockMaxVideo;
NVMLField<uint32_t> clockBoostGraphics;
NVMLField<uint32_t> clockBoostSM;
NVMLField<uint32_t> clockBoostMem;
NVMLField<uint32_t> clockBoostVideo;
NVMLField<bool> currentEccMode;
NVMLField<bool> pendingEccMode;
NVMLField<uint32_t> encoderCapacityH264;
NVMLField<uint32_t> encoderCapacityHEVC;
NVMLField<std::string> infoROMImageVersion;
NVMLField<std::string> infoROMOEMVersion;
NVMLField<std::string> infoROMECCVersion;
NVMLField<std::string> infoROMPowerVersion;
NVMLField<uint64_t> supportedClocksThrottleReasons;
NVMLField<std::string> vbiosVersion;
NVMLField<uint32_t> maxLinkGen;
NVMLField<uint32_t> maxLinkWidth;
NVMLField<uint32_t> minorNumber;
NVMLField<uint32_t> multiGpuBool;
NVMLField<std::string> deviceName;
NVMLField<uint32_t> tempThresholdShutdown;
NVMLField<uint32_t> tempThresholdHWSlowdown;
NVMLField<uint32_t> tempThresholdSWSlowdown;
NVMLField<uint32_t> tempThresholdDropBelowBaseClock;
NVMLField<uint32_t> powerLimit;
NVMLField<std::vector<uint32_t>> supportedMemoryClocks;
NVMLField<std::map<uint32_t, std::vector<uint32_t>>> supportedGraphicsClocks;
void refresh(void* device);
};
// Device memory usage
struct DeviceMemory
{
NVMLField<uint64_t> bar1Total;
NVMLField<std::vector<uint64_t>> bar1Used;
NVMLField<std::vector<uint64_t>> bar1Free;
NVMLField<uint64_t> memoryTotal;
NVMLField<std::vector<uint64_t>> memoryUsed;
NVMLField<std::vector<uint64_t>> memoryFree;
void init(uint32_t maxElements);
void refresh(void* device, uint32_t offset);
};
// Device utilization ratios
struct DeviceUtilization
{
NVMLField<std::vector<uint32_t>> gpuUtilization;
NVMLField<std::vector<uint32_t>> memUtilization;
NVMLField<std::vector<uint32_t>> computeProcesses;
NVMLField<std::vector<uint32_t>> graphicsProcesses;
void init(uint32_t maxElements);
void refresh(void* device, uint32_t offset);
};
// Device performance state: clocks and throttling
struct DevicePerformanceState
{
NVMLField<std::vector<uint32_t>> clockGraphics;
NVMLField<std::vector<uint32_t>> clockSM;
NVMLField<std::vector<uint32_t>> clockMem;
NVMLField<std::vector<uint32_t>> clockVideo;
NVMLField<std::vector<uint64_t>> throttleReasons;
void init(uint32_t maxElements);
void refresh(void* device, uint32_t offset);
static std::vector<std::string> getThrottleReasonStrings(uint64_t reason);
static const std::vector<uint64_t>& getAllThrottleReasonList();
};
// Device power and temperature
struct DevicePowerState
{
NVMLField<std::vector<uint32_t>> power;
NVMLField<std::vector<uint32_t>> temperature;
NVMLField<std::vector<uint32_t>> fanSpeed;
void init(uint32_t maxElements);
void refresh(void* device, uint32_t offset);
};
// Other information
struct SysInfo
{
std::vector<float> cpu; // Load measurement [0, 100]
std::string driverVersion;
};
void refresh(); // Take measurement
bool isValid() { return m_valid; }
uint32_t getGpuCount() { return m_physicalGpuCount; }
const DeviceInfo& getDeviceInfo(int gpu) { return m_deviceInfo[gpu]; }
const DeviceMemory& getDeviceMemory(int gpu) { return m_deviceMemory[gpu]; }
const DeviceUtilization& getDeviceUtilization(int gpu) { return m_deviceUtilization[gpu]; }
const DevicePerformanceState& getDevicePerformanceState(int gpu) { return m_devicePerformanceState[gpu]; }
const DevicePowerState& getDevicePowerState(int gpu) { return m_devicePowerState[gpu]; }
const SysInfo& getSysInfo() { return m_sysInfo; }
int getOffset() { return m_offset; }
private:
std::vector<DeviceInfo> m_deviceInfo;
std::vector<DeviceMemory> m_deviceMemory;
std::vector<DeviceUtilization> m_deviceUtilization;
std::vector<DevicePerformanceState> m_devicePerformanceState;
std::vector<DevicePowerState> m_devicePowerState;
SysInfo m_sysInfo; // CPU and driver information
bool m_valid = false;
uint32_t m_physicalGpuCount = 0; // Number of NVIDIA GPUs
uint32_t m_offset = 0; // Index of the most recent cpu load sample
uint32_t m_maxElements = 100; // Maximum number of stored measurements
uint32_t m_minInterval = 100; // Minimum interval between measurements (ms)
};
} // namespace nvvkhl

@ -0,0 +1,334 @@
/*
* Copyright (c) 2014-2023, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#include "nvprint.hpp"
#include <limits.h>
#include <mutex>
#include <vector>
#ifdef _WIN32
#include <io.h>
#include <windows.h>
#else
#include <signal.h>
#include <unistd.h>
#endif
enum class TriState
{
eUnknown,
eFalse,
eTrue
};
static std::string s_logFileName = "log_nvprosample.txt";
static std::vector<char> s_strBuffer; // Persistent allocation for formatted text.
static FILE* s_fd = nullptr;
static bool s_bLogReady = false;
static bool s_bPrintLogging = true;
static uint32_t s_bPrintFileLogging = LOGBITS_ALL;
static uint32_t s_bPrintConsoleLogging = LOGBITS_ALL;
static uint32_t s_bPrintBreakpoints = 0;
static int s_printLevel = -1; // <0 means no level prefix
static PFN_NVPRINTCALLBACK s_printCallback = nullptr;
static TriState s_consoleSupportsColor = TriState::eUnknown;
// Lock this when modifying any static variables.
// Because it is a recursive mutex, its owner can lock it multiple times.
static std::recursive_mutex s_mutex;
void nvprintSetLogFileName(const char* name) noexcept
{
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
if(name == NULL || s_logFileName == name)
return;
try
{
s_logFileName = name;
}
catch(const std::exception& e)
{
nvprintLevel(LOGLEVEL_ERROR, "nvprintSetLogFileName could not allocate space for new file name. Additional info below:");
nvprintLevel(LOGLEVEL_ERROR, e.what());
}
if(s_fd)
{
fclose(s_fd);
s_fd = nullptr;
s_bLogReady = false;
}
}
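void nvprintSetCallback(PFN_NVPRINTCALLBACK callback)
{
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
s_printCallback = callback;
}
void nvprintSetLevel(int l)
{
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
s_printLevel = l;
}
int nvprintGetLevel()
{
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
return s_printLevel;
}
void nvprintSetLogging(bool b)
{
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
s_bPrintLogging = b;
}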
void nvprintSetFileLogging(bool state, uint32_t mask)
{
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
if(state)
{
s_bPrintFileLogging |= mask;
}
else
{
s_bPrintFileLogging &= ~mask;
}
}
void nvprintSetConsoleLogging(bool state, uint32_t mask)
{
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
if(state)
{
s_bPrintConsoleLogging |= mask;
}
else
{
s_bPrintConsoleLogging &= ~mask;
}
}
void nvprintSetBreakpoints(bool state, uint32_t mask)
{
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
if(state)
{
s_bPrintBreakpoints |= mask;
}
else
{
s_bPrintBreakpoints &= ~mask;
}
}
void nvprintfV(va_list& vlist, const char* fmt, int level) noexcept
{
if(s_bPrintLogging == false)
{
return;
}
// Format the inputs into s_strBuffer.
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
{
// Copy vlist as it may be modified by vsnprintf.
va_list vlistCopy;
va_copy(vlistCopy, vlist);
const int charactersNeeded = vsnprintf(s_strBuffer.data(), s_strBuffer.size(), fmt, vlistCopy);
va_end(vlistCopy);
// Check that:
// * vsnprintf did not return an error;
// * The string (plus null terminator) could fit in a vector.
if((charactersNeeded < 0) || (size_t(charactersNeeded) > s_strBuffer.max_size() - 1))
{
// Formatting error
nvprintLevel(LOGLEVEL_ERROR, "nvprintfV: Internal message formatting error.");
return;
}
// Increase the size of s_strBuffer as needed if there wasn't enough space.
if(size_t(charactersNeeded) >= s_strBuffer.size())
{
try
{
// Make sure to add 1, because vsnprintf doesn't count the terminating
// null character. This can potentially throw an exception.
s_strBuffer.resize(size_t(charactersNeeded) + 1, '\0');
}
catch(const std::exception& e)
{
nvprintLevel(LOGLEVEL_ERROR, "nvprintfV: Error resizing buffer to hold message. Additional info below:");
nvprintLevel(LOGLEVEL_ERROR, e.what());
return;
}
// Now format it; we know this will succeed.
(void)vsnprintf(s_strBuffer.data(), s_strBuffer.size(), fmt, vlist);
}
}
nvprintLevel(level, s_strBuffer.data());
}
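/* Example (illustration only, not part of this file).
The measure-then-format pattern used by nvprintfV above can be sketched in
isolation as follows. `formatToString` is a hypothetical helper; unlike
nvprintfV it allocates a fresh buffer on every call instead of reusing a
persistent one, and it returns an empty string on a formatting error.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats printf-style arguments into a std::string.
std::string formatToString(const char* fmt, ...)
{
  va_list args;
  va_start(args, fmt);

  // First pass: ask vsnprintf how many characters are needed. A copy is used
  // because vsnprintf may leave the va_list in an indeterminate state.
  va_list argsCopy;
  va_copy(argsCopy, args);
  const int needed = std::vsnprintf(nullptr, 0, fmt, argsCopy);
  va_end(argsCopy);

  std::string result;
  if(needed > 0)
  {
    // Second pass: add 1 because the count excludes the terminating null.
    std::vector<char> buffer(size_t(needed) + 1);
    std::vsnprintf(buffer.data(), buffer.size(), fmt, args);
    result.assign(buffer.data(), size_t(needed));
  }
  va_end(args);
  return result;
}
```

Reusing one buffer, as nvprintfV does with s_strBuffer, avoids an allocation
per message at the cost of requiring the lock for the whole call.
*/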
void nvprintLevel(int level, const std::string& msg) noexcept
{
nvprintLevel(level, msg.c_str());
}
void nvprintLevel(int level, const char* msg) noexcept
{
std::lock_guard<std::recursive_mutex> lockGuard(s_mutex);
#ifdef _WIN32
// Note: Maybe we could consider changing to a text encoding of UTF-8 in
// the future, bring in calls to Windows' MultiByteToWideChar, and call
// OutputDebugStringW.
OutputDebugStringA(msg);
#endif
if(s_bPrintFileLogging & (1 << level))
{
if(s_bLogReady == false)
{
s_fd = fopen(s_logFileName.c_str(), "wt");
s_bLogReady = true;
}
if(s_fd)
{
fputs(msg, s_fd);
}
}
if(s_printCallback)
{
s_printCallback(level, msg);
}
if(s_bPrintConsoleLogging & (1 << level))
{
// Determine if the output supports ANSI color sequences only once to avoid
// many calls to isatty.
if(TriState::eUnknown == s_consoleSupportsColor)
{
// Determining this perfectly is difficult; terminfo does it by storing
// a large table of all consoles it knows about. For now, we assume
// all consoles support colors, and all pipes do not.
#ifdef _WIN32
bool supportsColor = _isatty(_fileno(stderr)) && _isatty(_fileno(stdout));
// This enables ANSI escape codes from the app side.
// We do this because on Windows 10, cmd.exe is a console, but only
// supports ANSI escape codes by default if the
// HKEY_CURRENT_USER\Console\VirtualTerminalLevel registry key is
// nonzero, which we don't want to assume.
// See https://github.com/nvpro-samples/vk_raytrace/issues/28.
// On failure, turn off colors.
if(supportsColor)
{
for(DWORD stdHandleIndex : {STD_OUTPUT_HANDLE, STD_ERROR_HANDLE})
{
const HANDLE consoleHandle = GetStdHandle(stdHandleIndex);
if(INVALID_HANDLE_VALUE == consoleHandle)
{
supportsColor = false;
break;
}
DWORD consoleMode = 0;
if(0 == GetConsoleMode(consoleHandle, &consoleMode))
{
supportsColor = false;
break;
}
SetConsoleMode(consoleHandle, consoleMode | ENABLE_VIRTUAL_TERMINAL_PROCESSING);
}
}
#else
const bool supportsColor = isatty(fileno(stderr)) && isatty(fileno(stdout));
#endif
s_consoleSupportsColor = (supportsColor ? TriState::eTrue : TriState::eFalse);
}
FILE* outStream = (((1 << level) & LOGBITS_ERRORS) ? stderr : stdout);
if(TriState::eTrue == s_consoleSupportsColor)
{
// Set the foreground color depending on level:
// https://en.wikipedia.org/wiki/ANSI_escape_code#SGR_(Select_Graphic_Rendition)_parameters
if(level == LOGLEVEL_OK)
{
fputs("\033[32m", outStream); // Green
}
else if(level == LOGLEVEL_ERROR)
{
fputs("\033[31m", outStream); // Red
}
else if(level == LOGLEVEL_WARNING)
{
fputs("\033[33m", outStream); // Yellow
}
else if(level == LOGLEVEL_DEBUG)
{
fputs("\033[36m", outStream); // Cyan
}
}
fputs(msg, outStream);
if(TriState::eTrue == s_consoleSupportsColor)
{
// Reset all attributes
fputs("\033[0m", outStream);
}
}
if(s_bPrintBreakpoints & (1 << level))
{
#ifdef _WIN32
DebugBreak();
#else
raise(SIGTRAP);
#endif
}
}
void nvprintf(
#ifdef _MSC_VER
_Printf_format_string_
#endif
const char* fmt,
...) noexcept
{
// int r = 0;
va_list vlist;
va_start(vlist, fmt);
nvprintfV(vlist, fmt, s_printLevel);
va_end(vlist);
}
void nvprintfLevel(int level,
#ifdef _MSC_VER
_Printf_format_string_
#endif
const char* fmt,
...) noexcept
{
va_list vlist;
va_start(vlist, fmt);
nvprintfV(vlist, fmt, level);
va_end(vlist);
}

@ -0,0 +1,241 @@
/*
* Copyright (c) 2014-2023, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2023 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef __NVPRINT_H__
#define __NVPRINT_H__
#include <cstdarg>
#include <fmt/format.h>
#include <functional>
#include <stdint.h>
#include <string>
/** @DOC_START
Multiple functions and macros that should be used for logging purposes,
rather than printf. These can print to multiple places at once.
# Function nvprintf etc
Configuration:
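- nvprintSetLevel : sets default loglevel
- nvprintGetLevel : gets default loglevel
- nvprintSetLogFileName : sets log filename
- nvprintSetLogging : globally enables/disables all printing and logging
- nvprintSetFileLogging : enables/disables file logging for a bitmask of loglevels
- nvprintSetConsoleLogging : enables/disables console logging for a bitmask of loglevels
- nvprintSetBreakpoints : enables/disables breakpoints for a bitmask of loglevels
- nvprintSetCallback : sets custom callback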
Printf-style functions and macros.
These take printf-style specifiers.
- nvprintf : prints at default loglevel
- nvprintfLevel : prints at a certain loglevel
- LOGI : macro that does nvprintfLevel(LOGLEVEL_INFO)
- LOGW : macro that does nvprintfLevel(LOGLEVEL_WARNING)
- LOGE : macro that does nvprintfLevel(LOGLEVEL_ERROR)
- LOGE_FILELINE : macro that does nvprintfLevel(LOGLEVEL_ERROR) combined with filename/line
- LOGD : macro that does nvprintfLevel(LOGLEVEL_DEBUG) combined with filename/line (only in debug builds)
- LOGOK : macro that does nvprintfLevel(LOGLEVEL_OK)
- LOGSTATS : macro that does nvprintfLevel(LOGLEVEL_STATS)
std::print-style functions and macros.
These take std::format-style specifiers
(https://en.cppreference.com/w/cpp/utility/format/formatter#Standard_format_specification).
- nvprintLevel : print at a certain loglevel
- PRINTI : macro that does nvprintLevel(LOGLEVEL_INFO)
- PRINTW : macro that does nvprintLevel(LOGLEVEL_WARNING)
- PRINTE : macro that does nvprintLevel(LOGLEVEL_ERROR)
- PRINTE_FILELINE : macro that does nvprintLevel(LOGLEVEL_ERROR) combined with filename/line
- PRINTD : macro that does nvprintLevel(LOGLEVEL_DEBUG) combined with filename/line (only in debug builds)
- PRINTOK : macro that does nvprintLevel(LOGLEVEL_OK)
- PRINTSTATS : macro that does nvprintLevel(LOGLEVEL_STATS)
Safety:
On error, all functions print an error message.
All functions are thread-safe.
Printf-style functions have annotations that should produce warnings for
mismatched arguments at compile-time or when performing static analysis.
Their format strings may be dynamic, but this is dangerous if an adversary
can choose the content of the format string.
std::print-style functions are safer: invalid format strings produce
compile-time errors, and format strings must be compile-time constants.
Dynamic formatting should be performed outside of printing, like this:
```cpp
ImGui::InputText("Enter a format string: ", userFormat, sizeof(userFormat));
std::string formatted; // declared before the try block so it is still in scope below
try
{
formatted = fmt::vformat(userFormat, ...);
}
catch (const std::exception& e)
{
(error handling...)
}
PRINTI("{}", formatted);
```
Text encoding:
Printing to the Windows debug console is the only operation that assumes a
text encoding, which is ANSI. In all other cases, strings are copied into
the output.
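Level bitmasks:
nvprintSetFileLogging, nvprintSetConsoleLogging and nvprintSetBreakpoints
(declared below) take a bitmask with one bit per loglevel. The sketch below
illustrates how such a mask is updated and tested. All names in it are
hypothetical stand-ins, and the range check on the level is a choice of this
sketch rather than something the library is documented to do.

```cpp
#include <cstdint>

// Hypothetical stand-ins mirroring the LOGLEVEL_ and LOGBITS_ALL definitions below.
constexpr int      kLevelInfo    = 0;
constexpr int      kLevelWarning = 1;
constexpr int      kLevelError   = 2;
constexpr uint32_t kBitsAll      = 0xffffffffu;

// Enables (state == true) or disables (state == false) the bits in `mask`.
inline uint32_t updateMask(uint32_t current, bool state, uint32_t mask)
{
  return state ? (current | mask) : (current & ~mask);
}

// A message at `level` passes the filter if its bit is set in `mask`.
// Levels outside [0, 31] are rejected because shifting by them is undefined.
inline bool levelEnabled(uint32_t mask, int level)
{
  return level >= 0 && level < 32 && (mask & (1u << level)) != 0;
}
```

For example, starting from kBitsAll and disabling the info and warning bits
leaves errors enabled while info and warning messages are filtered out.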
@DOC_END */
// trick for pragma message so we can write:
// #pragma message(__FILE__"("S__LINE__"): blah")
#define S__(x) #x
#define S_(x) S__(x)
#define S__LINE__ S_(__LINE__)
#ifndef LOGLEVEL_INFO
#define LOGLEVEL_INFO 0
#define LOGLEVEL_WARNING 1
#define LOGLEVEL_ERROR 2
#define LOGLEVEL_DEBUG 3
#define LOGLEVEL_STATS 4
#define LOGLEVEL_OK 7
#define LOGBIT_INFO (1 << LOGLEVEL_INFO)
#define LOGBIT_WARNING (1 << LOGLEVEL_WARNING)
#define LOGBIT_ERROR (1 << LOGLEVEL_ERROR)
#define LOGBIT_DEBUG (1 << LOGLEVEL_DEBUG)
#define LOGBIT_STATS (1 << LOGLEVEL_STATS)
#define LOGBIT_OK (1 << LOGLEVEL_OK)
#define LOGBITS_ERRORS LOGBIT_ERROR
#define LOGBITS_WARNINGS (LOGBITS_ERRORS | LOGBIT_WARNING)
#define LOGBITS_INFO (LOGBITS_WARNINGS | LOGBIT_INFO)
#define LOGBITS_DEBUG (LOGBITS_INFO | LOGBIT_DEBUG)
#define LOGBITS_STATS (LOGBITS_DEBUG | LOGBIT_STATS)
#define LOGBITS_OK (LOGBITS_WARNINGS | LOGBIT_OK)
#define LOGBITS_ALL 0xffffffffu
#endif
// Set/get the default level for calls to nvprintf(). Use LOGLEVEL_*.
void nvprintSetLevel(int l);
int nvprintGetLevel();
void nvprintSetLogFileName(const char* name) noexcept;
// Globally enable/disable all nvprint output and logging
void nvprintSetLogging(bool b);
// Update the bitmask of which levels receive file and console output, or
// trigger breakpoints. `state` controls whether to enable or disable the bits
// in `mask`. Use LOGBITS_*.
void nvprintSetFileLogging(bool state, uint32_t mask = ~0);
void nvprintSetConsoleLogging(bool state, uint32_t mask = ~0);
void nvprintSetBreakpoints(bool state, uint32_t mask = LOGBITS_ERRORS);
// Set a custom print handler. Called in addition to file and console logging.
using PFN_NVPRINTCALLBACK = std::function<void(int level, const char* msg)>;
void nvprintSetCallback(PFN_NVPRINTCALLBACK callback);
// Printf-style macros and functions.
#define LOGI(...) \
{ \
nvprintfLevel(LOGLEVEL_INFO, __VA_ARGS__); \
}
#define LOGW(...) \
{ \
nvprintfLevel(LOGLEVEL_WARNING, __VA_ARGS__); \
}
#define LOGE(...) \
{ \
nvprintfLevel(LOGLEVEL_ERROR, __VA_ARGS__); \
}
#define LOGE_FILELINE(...) \
{ \
nvprintfLevel(LOGLEVEL_ERROR, __FILE__ "(" S__LINE__ "): **ERROR**:\n" __VA_ARGS__); \
}
#ifndef NDEBUG
#define LOGD(...) \
{ \
nvprintfLevel(LOGLEVEL_DEBUG, __FILE__ "(" S__LINE__ "): Debug Info:\n" __VA_ARGS__); \
}
#else
#define LOGD(...)
#endif
#define LOGOK(...) \
{ \
nvprintfLevel(LOGLEVEL_OK, __VA_ARGS__); \
}
#define LOGSTATS(...) \
{ \
nvprintfLevel(LOGLEVEL_STATS, __VA_ARGS__); \
}
void nvprintf(
#ifdef _MSC_VER
_Printf_format_string_
#endif
const char* fmt,
...) noexcept
#if defined(__GNUC__) || defined(__clang__)
__attribute__((format(printf, 1, 2)));
#endif
;
void nvprintfLevel(int level,
#ifdef _MSC_VER
_Printf_format_string_
#endif
const char* fmt,
...) noexcept
#if defined(__GNUC__) || defined(__clang__)
__attribute__((format(printf, 2, 3)));
#endif
;
// std::print-style macros and functions.
// Use fmt::format's built-in checking if the compiler supports consteval,
// which cleans up how the macros appear in Intellisense. Otherwise, use
// FMT_STRING; this will be messier. In either case, the last line of the
// compiler error will point to the line with the incorrect print specifier.
#ifdef FMT_HAS_CONSTEVAL
#define PRINT_CHECK_FMT
#else
#define PRINT_CHECK_FMT FMT_STRING
#endif
// This macro catches exceptions from fmt::format. This gives us compile-time
// checking, while still making these functions have the same noexcept
// semantics as nvprintf.
#define PRINT_CATCH(lvl, fmtstr, ...) \
{ \
try \
{ \
nvprintLevel(lvl, fmt::format(PRINT_CHECK_FMT(fmtstr), __VA_ARGS__)); \
} \
catch(const std::exception&) \
{ \
nvprintLevel(LOGLEVEL_ERROR, "PRINT_CATCH: Could not format string.\n"); \
} \
}
#define PRINTI(fmtstr, ...) PRINT_CATCH(LOGLEVEL_INFO, fmtstr, __VA_ARGS__)
#define PRINTW(fmtstr, ...) PRINT_CATCH(LOGLEVEL_WARNING, fmtstr, __VA_ARGS__)
#define PRINTE(fmtstr, ...) PRINT_CATCH(LOGLEVEL_ERROR, fmtstr, __VA_ARGS__)
#define PRINTE_FILELINE(fmtstr, ...) \
PRINT_CATCH(LOGLEVEL_ERROR, __FILE__ "(" S__LINE__ "): **ERROR**:\n" fmtstr, __VA_ARGS__)
#ifndef NDEBUG
#define PRINTD(fmtstr, ...) PRINT_CATCH(LOGLEVEL_DEBUG, __FILE__ "(" S__LINE__ "): Debug Info:\n" fmtstr, __VA_ARGS__)
#else
#define PRINTD(...)
#endif
#define PRINTOK(fmtstr, ...) PRINT_CATCH(LOGLEVEL_OK, fmtstr, __VA_ARGS__)
#define PRINTSTATS(fmtstr, ...) PRINT_CATCH(LOGLEVEL_STATS, fmtstr, __VA_ARGS__)
// Directly prints a message at the given level, without formatting.
void nvprintLevel(int level, const std::string& msg) noexcept;
void nvprintLevel(int level, const char* msg) noexcept;
#endif

@ -0,0 +1,152 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#pragma once
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>
namespace nvh {
/* @DOC_START
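Distributes the processing of numItems items across numThreads threads. Each
thread repeatedly claims the next batch of BATCHSIZE consecutive items through
an atomic counter until all items are processed. If numThreads <= 1,
numItems < numThreads or numItems < BATCHSIZE, the work runs serially on the
calling thread with threadIndex 0.
parallel_batches: fn (uint64_t itemIndex) or fn (uint64_t itemIndex, uint32_t threadIndex)
callback processes a single item
parallel_ranges: fn (uint64_t itemBegin, uint64_t itemEnd, uint32_t threadIndex)
callback does loop `for (uint64_t itemIndex = itemBegin; itemIndex < itemEnd; itemIndex++)`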
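Example:
A standalone sketch of the batch-claiming pattern, written without this header
so that it can be compiled on its own. `batchedSum` is a hypothetical function
and assumes numThreads >= 1 and batchSize >= 1.

```cpp
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical example: sums the indices 0 .. numItems-1 with several threads.
uint64_t batchedSum(uint64_t numItems, uint32_t numThreads, uint64_t batchSize)
{
  std::atomic<uint64_t> next{0};                 // first index of the next unclaimed batch
  std::vector<uint64_t> partial(numThreads, 0);  // one slot per thread, so no locking is needed

  auto worker = [&](uint32_t threadIdx) {
    uint64_t begin;
    // fetch_add hands every batch to exactly one thread.
    while((begin = next.fetch_add(batchSize)) < numItems)
    {
      const uint64_t end = std::min(numItems, begin + batchSize);
      for(uint64_t i = begin; i < end; i++)
      {
        partial[threadIdx] += i;
      }
    }
  };

  std::vector<std::thread> threads;
  for(uint32_t t = 0; t < numThreads; t++)
  {
    threads.emplace_back(worker, t);
  }
  for(std::thread& t : threads)
  {
    t.join();
  }

  uint64_t total = 0;
  for(uint64_t p : partial)
  {
    total += p;
  }
  return total;
}
```

With the functions in this header, the worker loop should reduce to a call such as
`nvh::parallel_batches<128>(numItems, [&](uint64_t i, uint32_t threadIdx) { partial[threadIdx] += i; }, numThreads);`.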
@DOC_END */
template <uint64_t BATCHSIZE = 128>
inline void parallel_batches(uint64_t numItems, std::function<void(uint64_t)> fn, uint32_t numThreads)
{
if(numThreads <= 1 || numItems < numThreads || numItems < BATCHSIZE)
{
for(uint64_t idx = 0; idx < numItems; idx++)
{
fn(idx);
}
}
else
{
std::atomic_uint64_t counter = 0;
auto worker = [&]() {
uint64_t idx;
while((idx = counter.fetch_add(BATCHSIZE)) < numItems)
{
uint64_t last = std::min(numItems, idx + BATCHSIZE);
for(uint64_t i = idx; i < last; i++)
{
fn(i);
}
}
};
std::vector<std::thread> threads(numThreads);
for(uint32_t i = 0; i < numThreads; i++)
{
threads[i] = std::thread(worker);
}
for(uint32_t i = 0; i < numThreads; i++)
{
threads[i].join();
}
}
}
template <uint64_t BATCHSIZE = 128>
inline void parallel_batches(uint64_t numItems, std::function<void(uint64_t, uint32_t threadIdx)> fn, uint32_t numThreads)
{
if(numThreads <= 1 || numItems < numThreads || numItems < BATCHSIZE)
{
for(uint64_t idx = 0; idx < numItems; idx++)
{
fn(idx, 0);
}
}
else
{
std::atomic_uint64_t counter = 0;
auto worker = [&](uint32_t threadIdx) {
uint64_t idx;
while((idx = counter.fetch_add(BATCHSIZE)) < numItems)
{
uint64_t last = std::min(numItems, idx + BATCHSIZE);
for(uint64_t i = idx; i < last; i++)
{
fn(i, threadIdx);
}
}
};
std::vector<std::thread> threads(numThreads);
for(uint32_t i = 0; i < numThreads; i++)
{
threads[i] = std::thread(worker, i);
}
for(uint32_t i = 0; i < numThreads; i++)
{
threads[i].join();
}
}
}
template <uint64_t BATCHSIZE = 128>
inline void parallel_ranges(uint64_t numItems, std::function<void(uint64_t idxBegin, uint64_t idxEnd, uint32_t threadIdx)> fn, uint32_t numThreads)
{
if(numThreads <= 1 || numItems < numThreads || numItems < BATCHSIZE)
{
fn(0, numItems, 0);
}
else
{
std::atomic_uint64_t counter = 0;
auto worker = [&](uint32_t threadIdx) {
uint64_t idx;
while((idx = counter.fetch_add(BATCHSIZE)) < numItems)
{
uint64_t last = std::min(numItems, idx + BATCHSIZE);
fn(idx, last, threadIdx);
}
};
std::vector<std::thread> threads(numThreads);
for(uint32_t i = 0; i < numThreads; i++)
{
threads[i] = std::thread(worker, i);
}
for(uint32_t i = 0; i < numThreads; i++)
{
threads[i].join();
}
}
}
} // namespace nvh

@ -0,0 +1,500 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#include "parametertools.hpp"
#include "nvprint.hpp"
#include <algorithm>
#include <cstdlib>
#include <cstring>
namespace nvh {
ParameterList::ParameterList()
{
Callback helpCallback = [&](uint32_t) { print(); };
setHelp(add("help", helpCallback), "Print help");
setHelp(add("h", helpCallback), "Print help");
}
void ParameterList::tokenizeString(std::string& content, std::vector<const char*>& args)
{
bool wasSpace = true;
bool inQuotes = false;
bool inComment = false;
bool wasQuote = false;
bool wasEscape = false;
for(size_t i = 0; i < content.size(); i++)
{
char* token = &content[i];
char current = content[i];
bool isEndline = current == '\n';
bool isSpace = (current == ' ' || current == '\t' || current == '\n');
bool isQuote = current == '"';
bool isComment = current == '#';
bool isEscape = current == '\\';
if(isEndline && inComment)
{
inComment = false;
}
if(isComment && !inQuotes)
{
content[i] = 0;
inComment = true;
}
if(inComment)
continue;
if(inQuotes)
{
if(wasEscape && (current == 'n' || current == 't'))
{
content[i] = current == 'n' ? '\n' : '\t';
content[i - 1] = ' ';
}
}
if(isQuote)
{
inQuotes = !inQuotes;
// treat as space
content[i] = 0;
isSpace = true;
}
else if(isSpace)
{
// turn space to a terminator
if(!inQuotes)
{
content[i] = 0;
}
}
else if(wasSpace && (!inQuotes || wasQuote))
{
// start a new arg unless comment
args.push_back(token);
}
wasSpace = isSpace;
wasQuote = isQuote;
wasEscape = isEscape;
}
}
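/* Example (illustration only, not part of this file).
A simplified sketch of the in-place tokenizing technique used above: separators
are overwritten with null terminators and pointers to the token starts are
collected. `tokenizeInPlace` is a hypothetical function; it handles double
quotes and '#' comments but, unlike tokenizeString, no escape sequences.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified tokenizer. The returned pointers refer to the
// storage of `content`, which must stay alive and unmodified while they are used.
std::vector<const char*> tokenizeInPlace(std::string& content)
{
  std::vector<const char*> args;
  bool inQuotes  = false;
  bool inComment = false;
  bool tokenOpen = false;  // true while the current token has not been terminated

  for(size_t i = 0; i < content.size(); i++)
  {
    const char c = content[i];
    if(inComment)
    {
      inComment  = (c != '\n');  // a comment runs until the end of the line
      content[i] = '\0';
    }
    else if(c == '#' && !inQuotes)
    {
      inComment  = true;
      tokenOpen  = false;
      content[i] = '\0';
    }
    else if(c == '"')
    {
      inQuotes   = !inQuotes;  // quotes delimit a token but are not part of it
      tokenOpen  = false;
      content[i] = '\0';
    }
    else if(!inQuotes && (c == ' ' || c == '\t' || c == '\n'))
    {
      tokenOpen  = false;
      content[i] = '\0';
    }
    else if(!tokenOpen)
    {
      args.push_back(&content[i]);  // first character of a new token
      tokenOpen = true;
    }
  }
  return args;
}
```

Because the tokens alias the input buffer, no per-token string is allocated,
which is why tokenizeString takes `content` by non-const reference.
*/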
ParameterList::Parameter::Parameter(Type atype, const char* aname, Callback acallback, void* adestination, uint32_t areadLength, uint32_t awriteLength)
{
type = atype;
callback = acallback;
readLength = areadLength;
writeLength = awriteLength;
destination.ptr = adestination;
// Set name and if specified, helptext
// Split at delimiter '|'
std::string sname = std::string(aname);
size_t delimiterPos = sname.find_first_of('|');
if(delimiterPos != std::string::npos)
{
name = sname.substr(0, delimiterPos);
helptext = sname.substr(delimiterPos + 1);
}
else
{
name = sname;
helptext = "";
}
}
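/* Example (illustration only, not part of this file).
The name/helptext convention handled by the constructor above can be sketched
as a small standalone function. `splitNameAndHelp` is a hypothetical helper.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: splits "name|helptext" at the first '|'.
// Returns {name, helptext}; helptext is empty when no delimiter is present.
std::pair<std::string, std::string> splitNameAndHelp(const std::string& spec)
{
  const size_t pos = spec.find('|');
  if(pos == std::string::npos)
  {
    return {spec, ""};
  }
  return {spec.substr(0, pos), spec.substr(pos + 1)};
}
```

A parameter registered under a name such as "vsync|Enable vertical sync" (a
made-up example) would therefore be matched as "vsync" and carry the rest as
its help text.
*/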
uint32_t ParameterList::append(const ParameterList& list)
{
uint32_t index = uint32_t(m_parameters.size());
m_parameters.insert(m_parameters.end(), list.m_parameters.begin(), list.m_parameters.end());
return index;
}
uint32_t ParameterList::add(const char* name, float* destination, Callback callback, uint32_t length /*= 1*/, float min /*= -FLT_MAX*/, float max /*= FLT_MAX*/)
{
uint32_t index = uint32_t(m_parameters.size());
Parameter param(TYPE_FLOAT, name, callback, destination, length, length);
param.minmax[0].f32 = min;
param.minmax[1].f32 = max;
m_parameters.push_back(param);
return index;
}
uint32_t ParameterList::add(const char* name, int32_t* destination, Callback callback, uint32_t length /*= 1*/, int32_t min /*= -INT_MAX*/, int32_t max /*= +INT_MAX*/)
{
uint32_t index = uint32_t(m_parameters.size());
Parameter param(TYPE_INT, name, callback, destination, length, length);
param.minmax[0].s32 = min;
param.minmax[1].s32 = max;
m_parameters.push_back(param);
return index;
}
uint32_t ParameterList::add(const char* name, uint32_t* destination, Callback callback, uint32_t length /*= 1*/, uint32_t min /*= 0*/, uint32_t max /*= 0xFFFFFFFF*/)
{
uint32_t index = uint32_t(m_parameters.size());
Parameter param(TYPE_UINT, name, callback, destination, length, length);
param.minmax[0].u32 = min;
param.minmax[1].u32 = max;
m_parameters.push_back(param);
return index;
}
uint32_t ParameterList::add(const char* name, bool* destination, Callback callback, uint32_t length /*= 1*/)
{
uint32_t index = uint32_t(m_parameters.size());
Parameter param(TYPE_BOOL, name, callback, destination, length, length);
m_parameters.push_back(param);
return index;
}
uint32_t ParameterList::add(const char* name, bool* destination, bool value, Callback callback /*= nullptr*/, uint32_t length /*= 1*/)
{
uint32_t index = uint32_t(m_parameters.size());
Parameter param(TYPE_BOOL_VALUE, name, callback, destination, 0, length);
param.minmax[0].b = value;
m_parameters.push_back(param);
return index;
}
uint32_t ParameterList::add(const char* name, std::string* destination, Callback callback, uint32_t length)
{
uint32_t index = uint32_t(m_parameters.size());
Parameter param(TYPE_STRING, name, callback, destination, length, length);
m_parameters.push_back(param);
return index;
}
uint32_t ParameterList::add(const char* name, Callback callback, uint32_t length)
{
uint32_t index = uint32_t(m_parameters.size());
Parameter param(TYPE_TRIGGER, name, callback, nullptr, length, length);
m_parameters.push_back(param);
return index;
}
uint32_t ParameterList::addFilename(const char* name, std::string* destination, Callback callback)
{
uint32_t index = uint32_t(m_parameters.size());
// special case: a name starting with "." is treated as a file ending. It reads
// no extra argument and matches any argument that ends with it.
Parameter param(TYPE_FILENAME, name, callback, destination, name[0] == '.' ? 0 : 1, 1);
m_parameters.push_back(param);
return index;
}
uint32_t ParameterList::setHelp(uint32_t parameterIndex, const char* helptext)
{
this->m_parameters[parameterIndex].helptext = helptext;
return parameterIndex;
}
static bool endsWith(std::string const& s, std::string const& end)
{
if(s.length() >= end.length())
{
return (0 == s.compare(s.length() - end.length(), end.length(), end));
}
else
{
return false;
}
}
bool ParameterList::applyParameters(uint32_t argc,
const char** argv,
uint32_t& a,
const char* paramPrefix /*= nullptr*/,
const char* defaultFilePath /*= nullptr*/) const
{
std::string prefixStr(paramPrefix ? paramPrefix : "");
std::string defaultPathStr(defaultFilePath ? defaultFilePath : "");
for(uint32_t p = 0; p < uint32_t(m_parameters.size()); p++)
{
const Parameter& param = m_parameters[p];
std::string combined = prefixStr + param.name;
bool searchFileEnding = (param.type == TYPE_FILENAME) && (param.readLength == 0);
bool matched = searchFileEnding ? endsWith(argv[a], param.name) : (strcmp(argv[a], combined.c_str()) == 0);
if(matched && a + param.readLength < argc)
{
switch(param.type)
{
case TYPE_FLOAT: {
for(uint32_t i = 0; i < param.writeLength; i++)
{
param.destination.f32[i] =
std::min(std::max(float(atof(argv[a + i + 1])), param.minmax[0].f32), param.minmax[1].f32);
}
}
break;
case TYPE_UINT: {
for(uint32_t i = 0; i < param.writeLength; i++)
{
param.destination.u32[i] =
std::min(std::max(uint32_t(atoi(argv[a + i + 1])), param.minmax[0].u32), param.minmax[1].u32);
}
}
break;
case TYPE_INT: {
for(uint32_t i = 0; i < param.writeLength; i++)
{
param.destination.s32[i] =
std::min(std::max(int32_t(atoi(argv[a + i + 1])), param.minmax[0].s32), param.minmax[1].s32);
}
}
break;
case TYPE_BOOL: {
for(uint32_t i = 0; i < param.writeLength; i++)
{
param.destination.b[i] = atoi(argv[a + i + 1]) != 0;
}
}
break;
case TYPE_BOOL_VALUE: {
for(uint32_t i = 0; i < param.writeLength; i++)
{
param.destination.b[i] = param.minmax[0].b;
}
}
break;
case TYPE_STRING: {
for(uint32_t i = 0; i < param.writeLength; i++)
{
param.destination.str[i] = std::string(argv[a + i + 1]);
}
}
break;
case TYPE_FILENAME: {
std::string filename(argv[a + param.readLength]);
if(
#ifdef _WIN32
filename.find(':') != std::string::npos
#else
!filename.empty() && filename[0] == '/'
#endif
)
{
param.destination.str[0] = filename;
}
else
{
param.destination.str[0] = defaultPathStr + "/" + filename;
}
}
break;
case TYPE_TRIGGER: {
}
break;
}
if(param.callback)
{
param.callback(p);
}
if(searchFileEnding)
{
LOGI(" %s \"%s\"\n", param.name.c_str(), argv[a]);
}
else
{
LOGI(" ");
for(uint32_t i = 0; i < param.readLength + 1; i++)
{
bool isString = i > 0 && (param.type == TYPE_FILENAME || param.type == TYPE_STRING);
if(isString)
{
LOGI(" \"%s\"", argv[a + i]);
}
else
{
LOGI(" %s", argv[a + i]);
}
}
LOGI("\n");
}
a += param.readLength;
return true;
}
}
return false;
}
const char* ParameterList::toString(Type typ)
{
switch(typ)
{
case TYPE_FLOAT:
return "float ";
case TYPE_INT:
return "int ";
case TYPE_UINT:
return "uint ";
case TYPE_BOOL:
return "bool ";
case TYPE_BOOL_VALUE:
return "value ";
case TYPE_STRING:
return "string ";
case TYPE_FILENAME:
return "filename";
case TYPE_TRIGGER:
return "trigger ";
}
return "unknown";
}
void ParameterList::print() const
{
// Get maximum parameter name length
uint32_t maxParamNameLength = 0;
for(const auto& it : m_parameters)
{
maxParamNameLength = std::max(uint32_t(it.name.size()), maxParamNameLength);
}
// Print header
LOGI("parameterlist:\n");
LOGI(" type [args] %-*s helptext\n", maxParamNameLength, "name"); // Pad name column with blanks
// Print underline. Format: -----
LOGI(" ");
for(uint32_t i = 0; i < maxParamNameLength + 23; i++)
{
LOGI("-");
}
LOGI("\n");
// Print command line arguments
for(const auto& it : m_parameters)
{
// Log param type, [args], name
LOGI(" %s[%d] %-*s", toString(it.type), it.readLength, maxParamNameLength, it.name.c_str());
if(it.helptext != "")
{
// Log helptext
LOGI(" - %s", it.helptext.c_str());
}
// Newline
LOGI("\n");
}
LOGI("\n");
}
uint32_t ParameterList::applyTokens(uint32_t argc, const char** argv, const char* prefix, const char* defaultPath) const
{
uint32_t found = 0;
for(uint32_t a = 0; a < argc; a++)
{
if(applyParameters(argc, argv, a, prefix, defaultPath))
{
found++;
}
else
{
LOGI(" unhandled argument: %s\n", argv[a]);
}
}
return found;
}
bool ParameterSequence::advanceIteration(const char* separator, uint32_t separatorArgLength, uint32_t& argBegin, uint32_t& argCount)
{
if(!m_list || m_index >= m_tokens.size())
return true;
size_t begin = m_index;
size_t end = begin;
m_separator = ~0;
for(size_t i = m_index; i < m_tokens.size(); i++)
{
if(strcmp(m_tokens[i], separator) == 0 && i + separatorArgLength < m_tokens.size())
{
end = i - 1;
m_separator = i;
m_index = i + separatorArgLength + 1;
break;
}
end = i;
}
if(m_separator == ~0)
return true;
uint32_t count = uint32_t(1 + end - begin);
if(count)
{
argCount = count;
argBegin = uint32_t(begin);
m_iteration++;
return false;
}
else
{
return true;
}
}
bool ParameterSequence::applyIteration(const char* separator,
uint32_t separatorLength,
const char* paramPrefix /*= nullptr*/,
const char* defaultFilePath /*= nullptr*/)
{
uint32_t argBegin;
uint32_t argCount;
if(!advanceIteration(separator, separatorLength, argBegin, argCount))
{
m_list->applyTokens(argCount, (const char**)&m_tokens[argBegin], paramPrefix, defaultFilePath);
return false;
}
else
{
// apply any parameters left after the last separator
if(m_index < m_tokens.size())
{
argBegin = uint32_t(m_index);
argCount = uint32_t(m_tokens.size() - m_index);
m_list->applyTokens(argCount, (const char**)&m_tokens[argBegin], paramPrefix, defaultFilePath);
}
return true;
}
}
void ParameterSequence::resetIteration()
{
m_index = 0;
m_iteration = 0;
}
} // namespace nvh

View file

@ -0,0 +1,231 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef __NVPARMETERTOOLS_H__
#define __NVPARMETERTOOLS_H__
#include "platform.h"
#include <cfloat>
#include <climits>
#include <functional>
#include <string>
#include <vector>
namespace nvh {
//////////////////////////////////////////////////////////////////////////
/** @DOC_START
# class nvh::ParameterList
The nvh::ParameterList helps parse command-line arguments, either passed
directly or stored within ASCII config files.
Parameters always update the values they point to, and optionally
can trigger a callback that can be provided per-parameter.
```cpp
ParameterList list;
std::string modelFilename;
float modelScale;
list.addFilename(".gltf|model filename", &modelFilename);
list.add("scale|model scale", &modelScale);
const char* args[] = {"blah.gltf", "-scale", "4"};
list.applyTokens(3, args, "-", "/assets/");
```
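The matching rules used above can be illustrated with a minimal, self-contained sketch.
It is not the class's implementation: `tokenEndsWith`, `tokenMatches` and `resolveFilename`
are hypothetical helpers written for this example, and only the POSIX absolute-path rule
(a leading '/') is shown.
```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: true if `token` ends with `ending` (for example ".gltf").
inline bool tokenEndsWith(const char* token, const std::string& ending)
{
  assert(token != nullptr);
  const std::size_t len = std::strlen(token);
  return len >= ending.size() && std::strcmp(token + (len - ending.size()), ending.c_str()) == 0;
}

// Hypothetical helper: a filename parameter registered with a leading "." is matched by
// file ending; every other parameter needs a token equal to "prefix + name".
inline bool tokenMatches(const char* token, const std::string& name, const char* prefix, bool matchByEnding)
{
  assert(token != nullptr);
  if(matchByEnding)
    return tokenEndsWith(token, name);
  const std::string combined = std::string(prefix ? prefix : "") + name;
  return combined == token;
}

// Hypothetical helper: absolute paths are kept, relative ones get the default path prepended.
inline std::string resolveFilename(const std::string& filename, const std::string& defaultPath)
{
  if(!filename.empty() && filename[0] == '/')
    return filename;
  return defaultPath + "/" + filename;
}
```
In this sketch, "blah.gltf" matches a parameter named ".gltf" by its ending, "-scale" matches
the parameter "scale" only together with the "-" prefix, and a relative filename is joined to
the default path with a "/" in between.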
Use in combination with the ParameterSequence class to iterate
sequences of parameter changes for benchmarking/automation.
@DOC_END */
class ParameterList
{
public:
typedef std::function<void(uint32_t)> Callback;
enum Type
{
TYPE_FLOAT,
TYPE_INT,
TYPE_UINT,
TYPE_BOOL,
TYPE_BOOL_VALUE,
TYPE_STRING,
TYPE_FILENAME,
TYPE_TRIGGER,
};
ParameterList();
uint32_t append(const ParameterList& list);
// Add a parameter. The name can be given in format: name[|help text], for example: "winsize|Set window size"
uint32_t add(const char* name, float* destination, Callback callback = nullptr, uint32_t length = 1, float min = -FLT_MAX, float max = FLT_MAX);
uint32_t add(const char* name, int32_t* destination, Callback callback = nullptr, uint32_t length = 1, int32_t min = -INT_MAX, int32_t max = +INT_MAX);
uint32_t add(const char* name, uint32_t* destination, Callback callback = nullptr, uint32_t length = 1, uint32_t min = 0, uint32_t max = 0xFFFFFFFF);
uint32_t add(const char* name, bool* destination, Callback callback = nullptr, uint32_t length = 1);
uint32_t add(const char* name, bool* destination, bool value, Callback callback = nullptr, uint32_t length = 1);
uint32_t add(const char* name, std::string* destination, Callback callback = nullptr, uint32_t length = 1);
uint32_t add(const char* name, Callback callback, uint32_t length = 0);
// If the parameter name starts with ".", it is treated as a file ending rather than as a
// command-line option: any argument that ends with it (for example ".blah") triggers this parameter
uint32_t addFilename(const char* name, std::string* destination, Callback callback = nullptr);
// Set help of a parameter, returns the parameterIndex
uint32_t setHelp(uint32_t parameterIndex, const char* helptext);
// returns number of tokens found
// paramPrefix is typically "-"
// relative filenames get the defaultFilePath prepended
uint32_t applyTokens(uint32_t argCount, const char** argv, const char* paramPrefix = nullptr, const char* defaultFilePath = nullptr) const;
// tests only single argument, increases arg by appropriate length on success (returns true)
bool applyParameters(uint32_t argCount, const char** argv, uint32_t& arg, const char* paramPrefix = nullptr, const char* defaultFilePath = nullptr) const;
// prints all registered parameters and optional help strings
void print() const;
// separators are all space (tab, newline etc.) characters
// preserves quotes based on "", converts backslashes, uses # as line comment
// modifies content string by setting 0 at separators
static void tokenizeString(std::string& content, std::vector<const char*>& args);
static const char* toString(Type typ);
private:
struct Parameter
{
Type type = TYPE_FLOAT;
std::string name;
uint32_t readLength = 0;
uint32_t writeLength = 0;
union
{
uint32_t u32;
int32_t s32;
float f32;
bool b;
} minmax[2] = {0, 0};
union
{
uint32_t* u32;
int32_t* s32;
float* f32;
bool* b;
std::string* str;
void* ptr;
} destination = {nullptr};
Callback callback = nullptr;
std::string helptext;
Parameter() {}
Parameter(Type type, const char* name, Callback callback, void* destination, uint32_t readLength, uint32_t writeLength);
};
std::vector<Parameter> m_parameters;
Parameter makeParam(Type type, const char* name, Callback callback, void* destination, uint32_t readLength, uint32_t writeLength);
};
//////////////////////////////////////////////////////////////////////////
/** @DOC_START
# class nvh::ParameterSequence
The nvh::ParameterSequence processes provided tokens in sequences.
The sequences are terminated by a special "separator" token.
All tokens between the last iteration and the separator are applied
to the provided ParameterList.
Useful to process commands in sequences (automation, benchmarking etc.).
Example:
```cpp
ParameterSequence sequence;
ParameterList list;
int mode;
list.add("mode", &mode);
std::vector<const char*> tokens;
// tokenizeString modifies its input and the tokens point into it, so keep the string alive
std::string content = "-mode 10 benchmark simple -mode 20 benchmark complex";
ParameterList::tokenizeString(content, tokens);
sequence.init(&list, tokens);
// 1 means our separator is followed by one argument (simple/complex)
// "-" as parameters in the string are prefixed with -
// the parameters that precede a separator are applied in that iteration
while(!sequence.applyIteration("benchmark", 1, "-")) {
printf("%d %s mode %d\n", sequence.getIteration(), sequence.getSeparatorArg(0), mode);
}
// would print:
// 1 simple mode 10
// 2 complex mode 20
```
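How a token list is cut into iterations can be sketched without the class itself. This is a
simplified illustration under stated assumptions, not the code of ParameterSequence:
`SequenceRange` and `splitSequence` are hypothetical names, and the sketch also records an
empty range when a separator has no tokens in front of it.
```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical record of one iteration: the tokens to apply and the separator that ends it.
struct SequenceRange
{
  std::size_t begin;      // index of the first token applied in this iteration
  std::size_t count;      // number of tokens applied
  std::size_t separator;  // index of the separator token
};

// Splits `tokens` into ranges, each ended by `separator` followed by `separatorArgLength`
// arguments. Tokens after the last separator do not belong to any range.
inline std::vector<SequenceRange> splitSequence(const std::vector<const char*>& tokens,
                                                const char*                     separator,
                                                std::size_t                     separatorArgLength)
{
  assert(separator != nullptr);
  std::vector<SequenceRange> ranges;
  std::size_t                begin = 0;
  for(std::size_t i = 0; i < tokens.size(); i++)
  {
    // A separator only counts if all of its arguments are present
    const bool isSeparator = std::strcmp(tokens[i], separator) == 0 && i + separatorArgLength < tokens.size();
    if(!isSeparator)
      continue;
    ranges.push_back({begin, i - begin, i});
    begin = i + separatorArgLength + 1;  // skip the separator and its arguments
    i     = begin - 1;                   // the loop increment then continues at `begin`
  }
  return ranges;
}
```
The separator argument of a range can be read at `tokens[range.separator + 1]`, which is the
same position that getSeparatorArg(0) reads relative to the stored separator index.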
@DOC_END */
class ParameterSequence
{
public:
ParameterSequence()
: m_list(nullptr)
, m_index(0)
, m_separator(0)
, m_iteration(0)
{
}
void init(const ParameterList* list, const std::vector<const char*>& tokens)
{
m_tokens = tokens;
m_list = list;
}
// returns true if finished with all tokens, otherwise processes until next separator token is found
bool advanceIteration(const char* separator, uint32_t separatorArgLength, uint32_t& argBegin, uint32_t& argCount);
// also applies parameterlist
bool applyIteration(const char* separator,
uint32_t separatorArgLength = 0,
const char* paramPrefix = nullptr,
const char* defaultFilePath = nullptr);
// sets iteration to beginning
void resetIteration();
bool isActive() const { return m_list && m_index && m_iteration; }
uint32_t getIteration() const { return m_iteration; }
const char* getSeparatorArg(uint32_t offset) const
{
return m_separator != ~0ULL ? m_tokens[m_separator + offset + 1] : "";
}
private:
const ParameterList* m_list;
std::vector<const char*> m_tokens;
size_t m_index;
size_t m_separator;
uint32_t m_iteration;
};
} // namespace nvh
#endif

View file

@ -0,0 +1,801 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2022 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#include <array>
#define _USE_MATH_DEFINES
#include <math.h>
#include <unordered_map>
#include <unordered_set>
#include <random>
#include "primitives.hpp"
#include "container_utils.hpp"
namespace nvh {
static uint32_t addPos(PrimitiveMesh& mesh, glm::vec3 p)
{
PrimitiveVertex v{};
v.p = p;
mesh.vertices.emplace_back(v);
return static_cast<uint32_t>(mesh.vertices.size()) - 1;
}
static void addTriangle(PrimitiveMesh& mesh, uint32_t a, uint32_t b, uint32_t c)
{
mesh.triangles.push_back({{a, b, c}});
}
static void addTriangle(PrimitiveMesh& mesh, glm::vec3 a, glm::vec3 b, glm::vec3 c)
{
mesh.triangles.push_back({{addPos(mesh, a), addPos(mesh, b), addPos(mesh, c)}});
}
// Computes one flat normal per triangle and assigns it to the triangle's three vertices
static void generateFacetedNormals(PrimitiveMesh& mesh)
{
auto num_triangles = static_cast<int>(mesh.triangles.size());
for(int i = 0; i < num_triangles; i++)
{
auto& v0 = mesh.vertices[mesh.triangles[i].v[0]];
auto& v1 = mesh.vertices[mesh.triangles[i].v[1]];
auto& v2 = mesh.vertices[mesh.triangles[i].v[2]];
glm::vec3 n = glm::normalize(glm::cross(glm::normalize(v1.p - v0.p), glm::normalize(v2.p - v0.p)));
v0.n = n;
v1.n = n;
v2.n = n;
}
}
// Generates spherical texture coordinates from the direction of each vertex position
static void generateTexCoords(PrimitiveMesh& mesh)
{
for(auto& vertex : mesh.vertices)
{
glm::vec3 n = normalize(vertex.p);
float u = 0.5f + std::atan2(n.z, n.x) / (2.0F * float(M_PI));
float v = 0.5f - std::asin(n.y) / float(M_PI);
vertex.t = {u, v};
}
}
// Generates a tetrahedron mesh (four triangular faces)
PrimitiveMesh createTetrahedron()
{
PrimitiveMesh mesh;
// choose coordinates on the unit sphere
float a = 1.0F / 3.0F;
float b = sqrt(8.0F / 9.0F);
float c = sqrt(2.0F / 9.0F);
float d = sqrt(2.0F / 3.0F);
// 4 vertices, scaled by 0.5 so that they lie on a sphere of radius 0.5
glm::vec3 v0 = glm::vec3{0.0F, 1.0F, 0.0F} * 0.5F;
glm::vec3 v1 = glm::vec3{-c, -a, d} * 0.5F;
glm::vec3 v2 = glm::vec3{-c, -a, -d} * 0.5F;
glm::vec3 v3 = glm::vec3{b, -a, 0.0F} * 0.5F;
// 4 triangles
addTriangle(mesh, v0, v2, v1);
addTriangle(mesh, v0, v3, v2);
addTriangle(mesh, v0, v1, v3);
addTriangle(mesh, v3, v1, v2);
generateFacetedNormals(mesh);
generateTexCoords(mesh);
return mesh;
}
// Generates an icosahedron mesh (twenty equilateral triangular faces)
PrimitiveMesh createIcosahedron()
{
PrimitiveMesh mesh;
float sq5 = sqrt(5.0F);
float a = 2.0F / (1.0F + sq5);
float b = sqrt((3.0F + sq5) / (1.0F + sq5));
a /= b;
float r = 0.5F;
std::vector<glm::vec3> v;
v.emplace_back(0.0F, r * a, r / b);
v.emplace_back(0.0F, r * a, -r / b);
v.emplace_back(0.0F, -r * a, r / b);
v.emplace_back(0.0F, -r * a, -r / b);
v.emplace_back(r * a, r / b, 0.0F);
v.emplace_back(r * a, -r / b, 0.0F);
v.emplace_back(-r * a, r / b, 0.0F);
v.emplace_back(-r * a, -r / b, 0.0F);
v.emplace_back(r / b, 0.0F, r * a);
v.emplace_back(r / b, 0.0F, -r * a);
v.emplace_back(-r / b, 0.0F, r * a);
v.emplace_back(-r / b, 0.0F, -r * a);
addTriangle(mesh, v[1], v[6], v[4]);
addTriangle(mesh, v[0], v[4], v[6]);
addTriangle(mesh, v[0], v[10], v[2]);
addTriangle(mesh, v[0], v[2], v[8]);
addTriangle(mesh, v[1], v[9], v[3]);
addTriangle(mesh, v[1], v[3], v[11]);
addTriangle(mesh, v[2], v[7], v[5]);
addTriangle(mesh, v[3], v[5], v[7]);
addTriangle(mesh, v[6], v[11], v[10]);
addTriangle(mesh, v[7], v[10], v[11]);
addTriangle(mesh, v[4], v[8], v[9]);
addTriangle(mesh, v[5], v[9], v[8]);
addTriangle(mesh, v[0], v[6], v[10]);
addTriangle(mesh, v[0], v[8], v[4]);
addTriangle(mesh, v[1], v[11], v[6]);
addTriangle(mesh, v[1], v[4], v[9]);
addTriangle(mesh, v[3], v[7], v[11]);
addTriangle(mesh, v[3], v[9], v[5]);
addTriangle(mesh, v[2], v[10], v[7]);
addTriangle(mesh, v[2], v[5], v[8]);
generateFacetedNormals(mesh);
generateTexCoords(mesh);
return mesh;
}
// Generates an octahedron mesh (eight faces), which is like two four-sided pyramids placed base to base.
PrimitiveMesh createOctahedron()
{
PrimitiveMesh mesh;
std::vector<glm::vec3> v;
v.emplace_back(0.5F, 0.0F, 0.0F);
v.emplace_back(-0.5F, 0.0F, 0.0F);
v.emplace_back(0.0F, 0.5F, 0.0F);
v.emplace_back(0.0F, -0.5F, 0.0F);
v.emplace_back(0.0F, 0.0F, 0.5F);
v.emplace_back(0.0F, 0.0F, -0.5F);
addTriangle(mesh, v[0], v[2], v[4]);
addTriangle(mesh, v[0], v[4], v[3]);
addTriangle(mesh, v[0], v[5], v[2]);
addTriangle(mesh, v[0], v[3], v[5]);
addTriangle(mesh, v[1], v[4], v[2]);
addTriangle(mesh, v[1], v[3], v[4]);
addTriangle(mesh, v[1], v[5], v[3]);
addTriangle(mesh, v[2], v[5], v[1]);
generateFacetedNormals(mesh);
generateTexCoords(mesh);
return mesh;
}
// Generates a flat plane mesh with the specified number of steps, width, and depth.
// The plane is essentially a grid with the specified number of subdivisions (steps)
// in both the X and Z directions. It creates vertices, normals, and texture coordinates
// for each point on the grid and forms triangles to create the plane's surface.
PrimitiveMesh createPlane(int steps, float width, float depth)
{
PrimitiveMesh mesh;
float increment = 1.0F / static_cast<float>(steps);
for(int sz = 0; sz <= steps; sz++)
{
for(int sx = 0; sx <= steps; sx++)
{
PrimitiveVertex v{};
v.p = glm::vec3(-0.5F + (static_cast<float>(sx) * increment), 0.0F, -0.5F + (static_cast<float>(sz) * increment));
v.p *= glm::vec3(width, 1.0F, depth);
v.n = glm::vec3(0.0F, 1.0F, 0.0F);
v.t = glm::vec2(static_cast<float>(sx) / static_cast<float>(steps),
static_cast<float>(steps - sz) / static_cast<float>(steps));
mesh.vertices.emplace_back(v);
}
}
for(int sz = 0; sz < steps; sz++)
{
for(int sx = 0; sx < steps; sx++)
{
addTriangle(mesh, sx + sz * (steps + 1), sx + 1 + (sz + 1) * (steps + 1), sx + 1 + sz * (steps + 1));
addTriangle(mesh, sx + sz * (steps + 1), sx + (sz + 1) * (steps + 1), sx + 1 + (sz + 1) * (steps + 1));
}
}
return mesh;
}
// Generates a cube mesh with the specified width, height, and depth
// Starts from 8 positions, 6 normals and 4 UVs, and produces 12 triangles and 24
// unique PrimitiveVertex entries
PrimitiveMesh createCube(float width /*= 1*/, float height /*= 1*/, float depth /*= 1*/)
{
PrimitiveMesh mesh;
glm::vec3 s = glm::vec3(width, height, depth) * 0.5F;
std::vector<glm::vec3> pnt = {{-s.x, -s.y, -s.z}, {-s.x, -s.y, s.z}, {-s.x, s.y, -s.z}, {-s.x, s.y, s.z},
{s.x, -s.y, -s.z}, {s.x, -s.y, s.z}, {s.x, s.y, -s.z}, {s.x, s.y, s.z}};
std::vector<glm::vec3> nrm = {{-1.0F, 0.0F, 0.0F}, {0.0F, 0.0F, 1.0F}, {1.0F, 0.0F, 0.0F},
{0.0F, 0.0F, -1.0F}, {0.0F, -1.0F, 0.0F}, {0.0F, 1.0F, 0.0F}};
std::vector<glm::vec2> uv = {{0.0F, 0.0F}, {0.0F, 1.0F}, {1.0F, 1.0F}, {1.0F, 0.0F}};
// cube topology
std::vector<std::vector<int>> cube_polygons = {{0, 1, 3, 2}, {1, 5, 7, 3}, {5, 4, 6, 7},
{4, 0, 2, 6}, {4, 5, 1, 0}, {2, 3, 7, 6}};
for(int i = 0; i < 6; ++i)
{
auto index = static_cast<int>(mesh.vertices.size());
for(int j = 0; j < 4; ++j)
mesh.vertices.push_back({pnt[cube_polygons[i][j]], nrm[i], uv[j]});
addTriangle(mesh, index, index + 1, index + 2);
addTriangle(mesh, index, index + 2, index + 3);
}
return mesh;
}
// Generates a UV-sphere mesh with the specified radius, number of sectors (horizontal subdivisions)
// and stacks (vertical subdivisions). It uses latitude-longitude grid generation to create vertices
// with proper positions, normals, and texture coordinates.
PrimitiveMesh createSphereUv(float radius, int sectors, int stacks)
{
PrimitiveMesh mesh;
float omega{0.0F}; // height (y) of the current stack ring
float phi{0.0F}; // radius of the current stack ring in the XZ plane
float length_inv = 1.0F / radius; // scales a position to a unit-length normal
const float math_pi = static_cast<float>(M_PI);
float sector_step = 2.0F * math_pi / static_cast<float>(sectors);
float stack_step = math_pi / static_cast<float>(stacks);
float sector_angle{0.0F};
float stack_angle{0.0F};
for(int i = 0; i <= stacks; ++i)
{
stack_angle = math_pi / 2.0F - static_cast<float>(i) * stack_step; // starting from pi/2 to -pi/2
phi = radius * cosf(stack_angle); // r * cos(u)
omega = radius * sinf(stack_angle); // r * sin(u)
// add (sectors + 1) vertices per stack
// the first and last vertices have same position and normal, but different tex coords
for(int j = 0; j <= sectors; ++j)
{
PrimitiveVertex v{};
sector_angle = static_cast<float>(j) * sector_step; // starting from 0 to 2pi
// vertex position (x, y, z)
v.p.x = phi * cosf(sector_angle); // r * cos(u) * cos(v)
v.p.z = phi * sinf(sector_angle); // r * cos(u) * sin(v)
v.p.y = omega;
// normalized vertex normal
v.n = v.p * length_inv;
// vertex tex coord (s, t) range between [0, 1]
v.t.x = 1.0F - static_cast<float>(j) / static_cast<float>(sectors);
v.t.y = static_cast<float>(i) / static_cast<float>(stacks);
mesh.vertices.emplace_back(v);
}
}
// indices
// k2---k2+1
// | \ |
// | \ |
// k1---k1+1
int k1{0};
int k2{0};
for(int i = 0; i < stacks; ++i)
{
k1 = i * (sectors + 1); // beginning of current stack
k2 = k1 + sectors + 1; // beginning of next stack
for(int j = 0; j < sectors; ++j, ++k1, ++k2)
{
// 2 triangles per sector excluding 1st and last stacks
if(i != 0)
{
addTriangle(mesh, k1, k1 + 1, k2); // k1---k1+1---k2
}
if(i != (stacks - 1))
{
addTriangle(mesh, k1 + 1, k2 + 1, k2); // k1+1---k2+1---k2
}
}
}
return mesh;
}
// Generates a cone mesh centered on the origin, with its tip pointing along +Y
// radius : radius of the base circle
// height : distance from the base to the tip
// segments : number of segments forming the base circle
PrimitiveMesh createConeMesh(float radius, float height, int segments)
{
PrimitiveMesh mesh;
float halfHeight = height * 0.5f;
const float math_pi = static_cast<float>(M_PI);
float sector_step = 2.0F * math_pi / static_cast<float>(segments);
float sector_angle{0.0F};
// length of the flank of the cone
float flank_len = sqrtf(radius * radius + height * height);
// unit vector along the flank of the cone
float cone_x = radius / flank_len;
float cone_y = -height / flank_len;
glm::vec3 tip = {0.0F, halfHeight, 0.0F};
// Sides
for(int i = 0; i <= segments; ++i)
{
PrimitiveVertex v{};
sector_angle = static_cast<float>(i) * sector_step;
// Position
v.p.x = radius * cosf(sector_angle); // r * cos(angle)
v.p.z = radius * sinf(sector_angle); // r * sin(angle)
v.p.y = -halfHeight;
// Normal
v.n.x = -cone_y * cosf(sector_angle);
v.n.y = cone_x;
v.n.z = -cone_y * sinf(sector_angle);
// TexCoord
v.t.x = static_cast<float>(i) / static_cast<float>(segments);
v.t.y = 0.0F;
mesh.vertices.emplace_back(v);
// Tip point
v.p = tip;
// Normal
sector_angle += 0.5F * sector_step; // Half way to next triangle
v.n.x = -cone_y * cosf(sector_angle);
v.n.y = cone_x;
v.n.z = -cone_y * sinf(sector_angle);
// TexCoord
v.t.x += 0.5F / static_cast<float>(segments);
v.t.y = 1.0F;
mesh.vertices.emplace_back(v);
}
for(int j = 0; j < segments; ++j)
{
int k1 = j * 2;
addTriangle(mesh, k1, k1 + 1, k1 + 2);
}
// Bottom plate (separate vertices, because the normals point straight down)
for(int i = 0; i <= segments; ++i)
{
PrimitiveVertex v{};
sector_angle = static_cast<float>(i) * sector_step; // starting from 0 to 2pi
v.p.x = radius * cosf(sector_angle); // r * cos(angle)
v.p.z = radius * sinf(sector_angle); // r * sin(angle)
v.p.y = -halfHeight;
// Normal
v.n = {0.0F, -1.0F, 0.0F};
// TexCoord
v.t.x = static_cast<float>(i) / static_cast<float>(segments);
v.t.y = 0.0F;
mesh.vertices.emplace_back(v);
v.p = -tip;
v.t.x += 0.5F / static_cast<float>(segments);
v.t.y = 1.0F;
mesh.vertices.emplace_back(v);
}
for(int j = 0; j < segments; ++j)
{
int k1 = (j + segments + 1) * 2;
addTriangle(mesh, k1, k1 + 2, k1 + 1);
}
return mesh;
}
// Generates a sphere mesh with the specified radius and subdivisions (level of detail).
// It uses the icosahedron subdivision technique to iteratively refine the mesh by
// subdividing triangles into smaller triangles to approximate a more spherical shape.
// It calculates vertex positions, normals, and texture coordinates for each vertex
// and constructs triangles accordingly.
// Note: There will be duplicated vertices with this method.
// Use removeDuplicateVertices to avoid duplicated vertices.
PrimitiveMesh createSphereMesh(float radius, int subdivisions)
{
const float t = (1.0F + std::sqrt(5.0F)) / 2.0F; // Golden ratio
std::vector<glm::vec3> vertices = {{-1, t, 0}, {1, t, 0}, {-1, -t, 0}, {1, -t, 0}, {0, -1, t}, {0, 1, t},
{0, -1, -t}, {0, 1, -t}, {t, 0, -1}, {t, 0, 1}, {-t, 0, -1}, {-t, 0, 1}};
// Function to calculate the midpoint between two vertices
auto midpoint = [](const glm::vec3& v1, const glm::vec3& v2) { return (v1 + v2) * 0.5f; };
auto texCoord = [](const glm::vec3& v1) {
return glm::vec2{0.5f + std::atan2(v1.z, v1.x) / (2 * M_PI), 0.5f - std::asin(v1.y) / M_PI};
};
std::vector<PrimitiveVertex> primitiveVertices;
for(const auto& vertex : vertices)
{
glm::vec3 n = normalize(vertex);
primitiveVertices.push_back({n * radius, n, texCoord(n)});
}
std::vector<PrimitiveTriangle> triangles = {{{0, 11, 5}}, {{0, 5, 1}}, {{0, 1, 7}}, {{0, 7, 10}}, {{0, 10, 11}},
{{1, 5, 9}}, {{5, 11, 4}}, {{11, 10, 2}}, {{10, 7, 6}}, {{7, 1, 8}},
{{3, 9, 4}}, {{3, 4, 2}}, {{3, 2, 6}}, {{3, 6, 8}}, {{3, 8, 9}},
{{4, 9, 5}}, {{2, 4, 11}}, {{6, 2, 10}}, {{8, 6, 7}}, {{9, 8, 1}}};
for(int i = 0; i < subdivisions; ++i)
{
std::vector<PrimitiveTriangle> subTriangles;
for(const auto& tri : triangles)
{
// Subdivide each triangle into 4 sub-triangles
glm::vec3 mid1 = midpoint(primitiveVertices[tri.v[0]].p, primitiveVertices[tri.v[1]].p);
glm::vec3 mid2 = midpoint(primitiveVertices[tri.v[1]].p, primitiveVertices[tri.v[2]].p);
glm::vec3 mid3 = midpoint(primitiveVertices[tri.v[2]].p, primitiveVertices[tri.v[0]].p);
glm::vec3 mid1Normalized = normalize(mid1);
glm::vec3 mid2Normalized = normalize(mid2);
glm::vec3 mid3Normalized = normalize(mid3);
glm::vec2 mid1Uv = texCoord(mid1Normalized);
glm::vec2 mid2Uv = texCoord(mid2Normalized);
glm::vec2 mid3Uv = texCoord(mid3Normalized);
primitiveVertices.push_back({mid1Normalized * radius, mid1Normalized, mid1Uv});
primitiveVertices.push_back({mid2Normalized * radius, mid2Normalized, mid2Uv});
primitiveVertices.push_back({mid3Normalized * radius, mid3Normalized, mid3Uv});
uint32_t m1 = static_cast<uint32_t>(primitiveVertices.size()) - 3U;
uint32_t m2 = m1 + 1U;
uint32_t m3 = m2 + 1U;
// Create 4 new triangles from the subdivided triangle
subTriangles.push_back({{tri.v[0], m1, m3}});
subTriangles.push_back({{m1, tri.v[1], m2}});
subTriangles.push_back({{m2, tri.v[2], m3}});
subTriangles.push_back({{m1, m2, m3}});
}
triangles = subTriangles;
}
return {primitiveVertices, triangles};
}
// Generates a torus mesh, which is a 3D geometric shape resembling a donut
// majorRadius: This represents the distance from the center of the torus to the center of the tube (the larger circle's radius).
// minorRadius: This represents the radius of the tube (the smaller circle's radius).
// majorSegments: The number of segments used to approximate the larger circle that forms the torus.
// minorSegments: The number of segments used to approximate the smaller circle (tube) within the torus.
nvh::PrimitiveMesh createTorusMesh(float majorRadius, float minorRadius, int majorSegments, int minorSegments)
{
nvh::PrimitiveMesh mesh;
float majorStep = 2.0f * float(M_PI) / float(majorSegments);
float minorStep = 2.0f * float(M_PI) / float(minorSegments);
for(int i = 0; i <= majorSegments; ++i)
{
float angle1 = i * majorStep;
glm::vec3 center = {majorRadius * std::cos(angle1), 0.0f, majorRadius * std::sin(angle1)};
for(int j = 0; j <= minorSegments; ++j)
{
float angle2 = j * minorStep;
glm::vec3 position = {center.x + minorRadius * std::cos(angle2) * std::cos(angle1), minorRadius * std::sin(angle2),
center.z + minorRadius * std::cos(angle2) * std::sin(angle1)};
glm::vec3 normal = {std::cos(angle2) * std::cos(angle1), std::sin(angle2), std::cos(angle2) * std::sin(angle1)};
glm::vec2 texCoord = {static_cast<float>(i) / majorSegments, static_cast<float>(j) / minorSegments};
mesh.vertices.push_back({position, normal, texCoord});
}
}
for(int i = 0; i < majorSegments; ++i)
{
for(int j = 0; j < minorSegments; ++j)
{
uint32_t idx1 = i * (minorSegments + 1) + j;
uint32_t idx2 = (i + 1) * (minorSegments + 1) + j;
uint32_t idx3 = idx1 + 1;
uint32_t idx4 = idx2 + 1;
mesh.triangles.push_back({{idx1, idx3, idx2}});
mesh.triangles.push_back({{idx3, idx4, idx2}});
}
}
return mesh;
}
//------------------------------------------------------------------------
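// Create a vector of nodes that represent the Menger Sponge
// Each node has its own translation and scale, so the nodes can be used with
// different objects.
// A negative probability produces the regular sponge (20 of the 27 sub-cubes are kept
// at each level); otherwise each sub-cube is kept with the given probability.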
std::vector<nvh::Node> mengerSpongeNodes(int level, float probability, int seed)
{
srand(seed);
struct MengerSponge
{
glm::vec3 m_topLeftFront;
float m_size;
void split(std::vector<MengerSponge>& cubes)
{
float size = m_size / 3.f;
glm::vec3 topLeftFront = m_topLeftFront;
for(int x = 0; x < 3; x++)
{
topLeftFront[0] = m_topLeftFront[0] + static_cast<float>(x) * size;
for(int y = 0; y < 3; y++)
{
if(x == 1 && y == 1)
continue;
topLeftFront[1] = m_topLeftFront[1] + static_cast<float>(y) * size;
for(int z = 0; z < 3; z++)
{
if(x == 1 && z == 1)
continue;
if(y == 1 && z == 1)
continue;
topLeftFront[2] = m_topLeftFront[2] + static_cast<float>(z) * size;
cubes.push_back({topLeftFront, size});
}
}
}
}
void splitProb(std::vector<MengerSponge>& cubes, float prob)
{
float size = m_size / 3.f;
glm::vec3 topLeftFront = m_topLeftFront;
for(int x = 0; x < 3; x++)
{
topLeftFront[0] = m_topLeftFront[0] + static_cast<float>(x) * size;
for(int y = 0; y < 3; y++)
{
topLeftFront[1] = m_topLeftFront[1] + static_cast<float>(y) * size;
for(int z = 0; z < 3; z++)
{
float sample = rand() / static_cast<float>(RAND_MAX);
if(sample > prob)
continue;
topLeftFront[2] = m_topLeftFront[2] + static_cast<float>(z) * size;
cubes.push_back({topLeftFront, size});
}
}
}
}
};
// Starting element
MengerSponge element = {glm::vec3(-0.5, -0.5, -0.5), 1.f};
std::vector<MengerSponge> elements1 = {element};
std::vector<MengerSponge> elements2 = {};
auto previous = &elements1;
auto next = &elements2;
for(int i = 0; i < level; i++)
{
for(MengerSponge& c : *previous)
{
if(probability < 0.f)
c.split(*next);
else
c.splitProb(*next, probability);
}
auto temp = previous;
previous = next;
next = temp;
next->clear();
}
std::vector<nvh::Node> nodes;
for(MengerSponge& c : *previous)
{
nvh::Node node{};
node.translation = c.m_topLeftFront;
node.scale = glm::vec3(c.m_size);
node.mesh = 0; // default to the first mesh
nodes.push_back(node);
}
return nodes;
}
//-------------------------------------------------------------------------------------------------
// Create a list of nodes whose positions are arranged like the seeds of a sunflower;
// the seeds grow slightly the further they are from the center.
std::vector<nvh::Node> sunflower(int seeds)
{
constexpr double goldenRatio = glm::golden_ratio<double>();
std::vector<nvh::Node> flower;
for(int i = 1; i <= seeds; ++i)
{
double r = pow(i, goldenRatio) / seeds;
double theta = 2 * glm::pi<double>() * goldenRatio * i;
nvh::Node seed;
seed.translation = glm::vec3(r * sin(theta), 0, r * cos(theta));
seed.scale = glm::vec3(10.0f * i / (1.0f * seeds));
seed.mesh = 0;
flower.push_back(seed);
}
return flower;
}
//---------------------------------------------------------------------------
// Merge the meshes of all nodes into a single mesh
// - nodes: the nodes to merge
// - meshes: the mesh array that the nodes refer to
nvh::PrimitiveMesh mergeNodes(const std::vector<nvh::Node>& nodes, const std::vector<nvh::PrimitiveMesh> meshes)
{
nvh::PrimitiveMesh resultMesh;
// Find how many triangles and vertices the merged mesh will have
size_t nb_triangles = 0;
size_t nb_vertices = 0;
for(const auto& n : nodes)
{
nb_triangles += meshes[n.mesh].triangles.size();
nb_vertices += meshes[n.mesh].vertices.size();
}
resultMesh.triangles.reserve(nb_triangles);
resultMesh.vertices.reserve(nb_vertices);
// Merge all nodes meshes into a single one
for(const auto& n : nodes)
{
const glm::mat4 mat = n.localMatrix();
uint32_t tIndex = static_cast<uint32_t>(resultMesh.vertices.size());
const nvh::PrimitiveMesh& mesh = meshes[n.mesh];
for(auto v : mesh.vertices)
{
v.p = glm::vec3(mat * glm::vec4(v.p, 1));
resultMesh.vertices.push_back(v);
}
for(auto t : mesh.triangles)
{
t.v += tIndex;
resultMesh.triangles.push_back(t);
}
}
return resultMesh;
}
// Takes a 3D mesh as input and modifies its vertices by adding random displacements within a
// specified `amplitude` range to create a wobbling effect. The intensity of the wobbling effect
// can be controlled by adjusting the `amplitude` parameter.
// The function returns the modified mesh.
nvh::PrimitiveMesh wobblePrimitive(const nvh::PrimitiveMesh& mesh, float amplitude)
{
// Seed the random number generator with a random device
std::random_device rd;
std::mt19937 gen(rd());
// Define the range for the random number generation (-1.0 to 1.0)
std::uniform_real_distribution<float> distribution(-1.0, 1.0);
// Our random function
auto rand = [&] { return distribution(gen); };
std::vector<PrimitiveVertex> newVertices;
for(auto& vertex : mesh.vertices)
{
glm::vec3 originalPosition = vertex.p;
glm::vec3 displacement = glm::vec3(rand(), rand(), rand());
displacement *= amplitude;
glm::vec3 newPosition = originalPosition + displacement;
newVertices.push_back({newPosition, vertex.n, vertex.t});
}
return {newVertices, mesh.triangles};
}
// Takes a 3D mesh as input and returns a new mesh with duplicate vertices removed.
// This function iterates through each triangle in the original PrimitiveMesh,
// compares its vertices, and creates a new set of unique vertices in uniqueVertices.
// We use an unordered_map called vertexIndexMap to keep track of the mapping between
// the original vertices and their corresponding indices in the uniqueVertices vector.
PrimitiveMesh removeDuplicateVertices(const PrimitiveMesh& mesh, bool testNormal, bool testUv)
{
auto hash = [&](const PrimitiveVertex& v) {
if(testNormal)
{
if(testUv)
return nvh::hashVal(v.p.x, v.p.y, v.p.z, v.n.x, v.n.y, v.n.z, v.t.x, v.t.y);
else
return nvh::hashVal(v.p.x, v.p.y, v.p.z, v.n.x, v.n.y, v.n.z);
}
else if(testUv)
return nvh::hashVal(v.p.x, v.p.y, v.p.z, v.t.x, v.t.y);
return nvh::hashVal(v.p.x, v.p.y, v.p.z);
};
auto equal = [&](const PrimitiveVertex& l, const PrimitiveVertex& r) {
return (l.p == r.p) && (testNormal ? l.n == r.n : true) && (testUv ? l.t == r.t : true);
};
std::unordered_map<PrimitiveVertex, uint32_t, decltype(hash), decltype(equal)> vertexIndexMap(0, hash, equal);
std::vector<PrimitiveVertex> uniqueVertices;
std::vector<PrimitiveTriangle> uniqueTriangles;
for(const auto& triangle : mesh.triangles)
{
PrimitiveTriangle uniqueTriangle = {};
for(int i = 0; i < 3; i++)
{
const PrimitiveVertex& vertex = mesh.vertices[triangle.v[i]];
// Check if the vertex is already in the uniqueVertices list
auto it = vertexIndexMap.find(vertex);
if(it == vertexIndexMap.end())
{
// Vertex not found, add it to uniqueVertices and update the index map
uint32_t newIndex = static_cast<uint32_t>(uniqueVertices.size());
vertexIndexMap[vertex] = newIndex;
uniqueVertices.push_back(vertex);
uniqueTriangle.v[i] = newIndex;
}
else
{
// Vertex found, use its index in uniqueVertices
uniqueTriangle.v[i] = it->second;
}
}
uniqueTriangles.push_back(uniqueTriangle);
}
// nvprintf("Before: %d vertex, %d triangles\n", mesh.vertices.size(), mesh.triangles.size());
// nvprintf("After: %d vertex, %d triangles\n", uniqueVertices.size(), uniqueTriangles.size());
return {uniqueVertices, uniqueTriangles};
}
} // namespace nvh

@ -0,0 +1,112 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2022 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#pragma once
#include <vector>
#include <cstdint>
#include <glm/glm.hpp>
#include <glm/gtc/quaternion.hpp>
/* @DOC_START
# struct `nvh::PrimitiveMesh`
- Common primitive type, made of vertices: position, normal and texture coordinates.
- All primitives are triangles, and every three indices form a triangle.
# struct `nvh::Node`
- Structure to hold a reference to a mesh, with a material and transformation.
Primitives that can be created:
* Tetrahedron
* Icosahedron
* Octahedron
* Plane
* Cube
* SphereUv
* Cone
* SphereMesh
* Torus
Node creators: each returns a list of nodes (mesh reference plus transformation)
* MengerSponge
* SunFlower
Other utilities
* mergeNodes
* removeDuplicateVertices
* wobblePrimitive
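The index offsetting that `mergeNodes` relies on can be sketched in isolation. This is a minimal illustration, not the library code: `Vec3`, `Mesh`, `Instance` and `mergeInstances` are hypothetical names, and only a translation is applied where the real function applies the full node matrix.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal types; the real structs use glm vectors and also
// carry normals and texture coordinates.
struct Vec3
{
  float x, y, z;
};
struct Mesh
{
  std::vector<Vec3>                    positions;
  std::vector<std::array<uint32_t, 3>> triangles;
};
struct Instance
{
  Vec3 translation;
  int  mesh;  // index into the mesh array
};

// Appends every instance's mesh to one output mesh. Triangle indices are
// shifted by the number of vertices already written, so each triangle keeps
// pointing at its own vertices.
Mesh mergeInstances(const std::vector<Instance>& instances, const std::vector<Mesh>& meshes)
{
  Mesh result;
  for(const Instance& inst : instances)
  {
    const Mesh&    src        = meshes[inst.mesh];
    const uint32_t baseVertex = static_cast<uint32_t>(result.positions.size());
    for(const Vec3& p : src.positions)
      result.positions.push_back({p.x + inst.translation.x, p.y + inst.translation.y, p.z + inst.translation.z});
    for(std::array<uint32_t, 3> tri : src.triangles)
    {
      for(uint32_t& index : tri)
        index += baseVertex;
      result.triangles.push_back(tri);
    }
  }
  return result;
}
```

Merging two instances of a one-triangle mesh yields six vertices, and the second triangle then refers to indices 3, 4 and 5.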
@DOC_END */
namespace nvh {
struct PrimitiveVertex
{
glm::vec3 p; // Position
glm::vec3 n; // Normal
glm::vec2 t; // Texture Coordinates
};
struct PrimitiveTriangle
{
glm::uvec3 v; // vertex indices
};
struct PrimitiveMesh
{
std::vector<PrimitiveVertex> vertices; // Array of all vertices
std::vector<PrimitiveTriangle> triangles; // Indices forming triangles
};
struct Node
{
glm::vec3 translation{}; // Translation
glm::quat rotation{}; // Rotation
glm::vec3 scale{1.0F}; // Scale
glm::mat4 matrix{1}; // Extra matrix, applied before the scale, rotation and translation above
int material{0};
int mesh{-1};
glm::mat4 localMatrix() const
{
glm::mat4 translationMatrix = glm::translate(glm::mat4(1.0f), translation);
glm::mat4 rotationMatrix = glm::mat4_cast(rotation);
glm::mat4 scaleMatrix = glm::scale(glm::mat4(1.0f), scale);
glm::mat4 combinedMatrix = translationMatrix * rotationMatrix * scaleMatrix * matrix;
return combinedMatrix;
}
};
PrimitiveMesh createTetrahedron();
PrimitiveMesh createIcosahedron();
PrimitiveMesh createOctahedron();
PrimitiveMesh createPlane(int steps = 1, float width = 1.0F, float depth = 1.0F);
PrimitiveMesh createCube(float width = 1.0F, float height = 1.0F, float depth = 1.0F);
PrimitiveMesh createSphereUv(float radius = 0.5F, int sectors = 20, int stacks = 20);
PrimitiveMesh createConeMesh(float radius = 0.5F, float height = 1.0F, int segments = 16);
PrimitiveMesh createSphereMesh(float radius = 0.5F, int subdivisions = 3);
PrimitiveMesh createTorusMesh(float majorRadius = 0.5F, float minorRadius = 0.25F, int majorSegments = 32, int minorSegments = 16);
std::vector<Node> mengerSpongeNodes(int level = 3, float probability = -1.f, int seed = 1);
std::vector<Node> sunflower(int seeds = 3000);
// Utilities
PrimitiveMesh mergeNodes(const std::vector<Node>& nodes, const std::vector<PrimitiveMesh> meshes);
PrimitiveMesh removeDuplicateVertices(const PrimitiveMesh& mesh, bool testNormal = true, bool testUv = true);
PrimitiveMesh wobblePrimitive(const PrimitiveMesh& mesh, float amplitude = 0.05F);
} // namespace nvh

@ -0,0 +1,459 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#include "profiler.hpp"
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
//////////////////////////////////////////////////////////////////////////
namespace nvh {
const uint32_t Profiler::CONFIG_DELAY;
const uint32_t Profiler::FRAME_DELAY;
const uint32_t Profiler::START_SECTIONS;
const uint32_t Profiler::MAX_NUM_AVERAGE;
Profiler::Profiler(Profiler* master)
{
m_data = master ? master->m_data : std::shared_ptr<Data>(new Data);
grow(START_SECTIONS);
}
Profiler::Profiler(uint32_t startSections)
{
m_data = std::shared_ptr<Data>(new Data);
grow(startSections);
}
void Profiler::setAveragingSize(uint32_t num)
{
assert(num <= MAX_NUM_AVERAGE);
m_data->numAveraging = num;
for(size_t i = 0; i < m_data->entries.size(); i++)
{
m_data->entries[i].cpuTime.init(num);
m_data->entries[i].gpuTime.init(num);
}
m_data->cpuTime.init(num);
}
void Profiler::beginFrame()
{
m_data->level = 0;
m_data->nextSection = 0;
m_data->frameSections.clear();
m_data->cpuCurrentTime = -m_clock.getMicroSeconds();
}
void Profiler::endFrame()
{
assert(m_data->level == 0);
m_data->cpuCurrentTime += m_clock.getMicroSeconds();
if(!m_data->frameSections.empty() && ((uint32_t)m_data->frameSections.size() != m_data->numLastEntries))
{
m_data->numLastEntries = (uint32_t)m_data->frameSections.size();
m_data->numLastSections = m_data->frameSections.back() + 1;
m_data->resetDelay = CONFIG_DELAY;
}
if(m_data->resetDelay)
{
m_data->resetDelay--;
for(uint32_t i = 0; i < m_data->entries.size(); i++)
{
Entry& entry = m_data->entries[i];
if(entry.level != LEVEL_SINGLESHOT)
{
entry.numTimes = 0;
entry.cpuTime.reset();
entry.gpuTime.reset();
}
}
m_data->cpuTime.reset();
m_data->numFrames = 0;
}
if(m_data->numFrames > FRAME_DELAY)
{
for(uint32_t i : m_data->frameSections)
{
Entry& entry = m_data->entries[i];
if(entry.splitter)
continue;
uint32_t queryFrame = (m_data->numFrames + 1) % FRAME_DELAY;
bool available = entry.api.empty() || entry.gpuTimeProvider(i, queryFrame, entry.gpuTimes[queryFrame]);
if(available)
{
entry.cpuTime.add(entry.cpuTimes[queryFrame]);
entry.gpuTime.add(entry.gpuTimes[queryFrame]);
entry.numTimes++;
}
}
for(uint32_t i : m_data->singleSections)
{
Entry& entry = m_data->entries[i];
uint32_t queryFrame = entry.subFrame;
// query once
bool available = entry.cpuTime.numValid == 0
&& (entry.api.empty() || entry.gpuTimeProvider(i, queryFrame, entry.gpuTimes[queryFrame]));
if(available)
{
entry.cpuTime.add(entry.cpuTimes[queryFrame]);
entry.gpuTime.add(entry.gpuTimes[queryFrame]);
entry.numTimes++;
}
}
m_data->cpuTime.add(m_data->cpuCurrentTime);
}
m_data->numFrames++;
}
void Profiler::grow(uint32_t newsize)
{
size_t oldsize = m_data->entries.size();
if(oldsize == newsize)
{
return;
}
m_data->entries.resize(newsize);
for(size_t i = oldsize; i < newsize; i++)
{
m_data->entries[i].cpuTime.init(m_data->numAveraging);
m_data->entries[i].gpuTime.init(m_data->numAveraging);
}
}
void Profiler::clear()
{
m_data->entries.clear();
m_data->singleSections.clear();
}
void Profiler::reset(uint32_t delay)
{
m_data->resetDelay = delay;
}
static std::string format(const char* msg, ...)
{
std::size_t const STRING_BUFFER(8192);
char text[STRING_BUFFER];
va_list list;
if(msg == 0)
return std::string();
va_start(list, msg);
// Bounded formatting: never writes past the end of text
vsnprintf(text, sizeof(text), msg, list);
va_end(list);
return std::string(text);
}
bool Profiler::getTimerInfo(uint32_t i, TimerInfo& info)
{
Entry& entry = m_data->entries[i];
if(!entry.numTimes || entry.accumulated)
{
return false;
}
info.gpu.average = entry.gpuTime.getAveraged();
info.cpu.average = entry.cpuTime.getAveraged();
info.cpu.absMinValue = entry.cpuTime.absMinValue;
info.cpu.absMaxValue = entry.cpuTime.absMaxValue;
info.gpu.absMinValue = entry.gpuTime.absMinValue;
info.gpu.absMaxValue = entry.gpuTime.absMaxValue;
bool found = false;
for(uint32_t n = i + 1; n < m_data->numLastSections; n++)
{
Entry& otherentry = m_data->entries[n];
if(otherentry.name == entry.name && otherentry.level == entry.level && otherentry.api == entry.api && !otherentry.accumulated)
{
found = true;
info.gpu.average += otherentry.gpuTime.getAveraged();
info.cpu.average += otherentry.cpuTime.getAveraged();
info.cpu.absMinValue += otherentry.cpuTime.absMinValue;
info.cpu.absMaxValue += otherentry.cpuTime.absMaxValue;
info.gpu.absMinValue += otherentry.gpuTime.absMinValue;
info.gpu.absMaxValue += otherentry.gpuTime.absMaxValue;
otherentry.accumulated = true;
}
if(otherentry.splitter && otherentry.level <= entry.level)
break;
}
info.accumulated = found;
info.numAveraged = entry.cpuTime.numValid;
return true;
}
bool Profiler::getTimerInfo(const char* name, TimerInfo& info)
{
if(name == nullptr)
{
info = TimerInfo();
if(!m_data->cpuTime.numValid)
{
return false;
}
info.cpu.average = m_data->cpuTime.getAveraged();
info.cpu.absMaxValue = m_data->cpuTime.absMaxValue;
info.cpu.absMinValue = m_data->cpuTime.absMinValue;
info.numAveraged = m_data->cpuTime.numValid;
return true;
}
for(uint32_t i = 0; i < m_data->numLastSections; i++)
{
Entry& entry = m_data->entries[i];
entry.accumulated = false;
}
for(uint32_t i = 0; i < (uint32_t)m_data->entries.size(); i++)
{
Entry& entry = m_data->entries[i];
if(entry.name.empty())
continue;
if(name != entry.name)
continue;
return getTimerInfo(i, info);
}
return false;
}
void Profiler::print(std::string& stats)
{
stats.clear();
for(uint32_t i = 0; i < m_data->numLastSections; i++)
{
Entry& entry = m_data->entries[i];
entry.accumulated = false;
}
stats += format("Timer null;\t N/A %6d; CPU %6d;\n", 0, (uint32_t)m_data->cpuTime.getAveraged());
for(uint32_t i = 0; i < m_data->numLastSections; i++)
{
static const char* spaces = "        "; // 8
Entry& entry = m_data->entries[i];
if(entry.level == LEVEL_SINGLESHOT)
continue;
uint32_t level = 7 - (entry.level > 7 ? 7 : entry.level);
TimerInfo info;
if(!getTimerInfo(i, info))
continue;
const char* gpuname = !entry.api.empty() ? entry.api.c_str() : "N/A";
const char* entryname = !entry.name.empty() ? entry.name.c_str() : "N/A";
if(info.accumulated)
{
stats += format("%sTimer %s;\t %s %6d; CPU %6d; (microseconds, accumulated loop)\n", &spaces[level], entryname,
gpuname, (uint32_t)(info.gpu.average), (uint32_t)(info.cpu.average));
}
else
{
stats += format("%sTimer %s;\t %s %6d; CPU %6d; (microseconds, avg %d)\n", &spaces[level], entryname, gpuname,
(uint32_t)(info.gpu.average), (uint32_t)(info.cpu.average), (uint32_t)entry.cpuTime.numValid);
}
}
}
uint32_t Profiler::getTotalFrames() const
{
return m_data->numFrames;
}
void Profiler::accumulationSplit()
{
SectionID sec = getSectionID(false, nullptr);
if(sec >= m_data->entries.size())
{
grow((uint32_t)(m_data->entries.size() * 2));
}
m_data->entries[sec].level = m_data->level;
m_data->entries[sec].splitter = true;
}
Profiler::SectionID Profiler::getSectionID(bool singleShot, const char* name)
{
uint32_t numEntries = (uint32_t)m_data->entries.size();
if(singleShot)
{
// find empty slot or with same name
for(uint32_t i = 0; i < numEntries; i++)
{
Entry& entry = m_data->entries[i];
if((name && entry.name == name) || entry.name.empty())  // name may be nullptr
{
m_data->singleSections.push_back(i);
return i;
}
}
m_data->singleSections.push_back(numEntries);
return numEntries;
}
else
{
// find non-single shot slot
while(m_data->nextSection < numEntries && m_data->entries[m_data->nextSection].level == LEVEL_SINGLESHOT)
{
m_data->nextSection++;
}
m_data->frameSections.push_back(m_data->nextSection);
return m_data->nextSection++;
}
}
Profiler::SectionID Profiler::beginSection(const char* name, const char* api, gpuTimeProvider_fn gpuTimeProvider, bool singleShot)
{
uint32_t subFrame = m_data->numFrames % FRAME_DELAY;
SectionID sec = getSectionID(singleShot, name);
if(sec >= m_data->entries.size())
{
grow((uint32_t)(m_data->entries.size() * 2));
}
Entry& entry = m_data->entries[sec];
uint32_t level = singleShot ? LEVEL_SINGLESHOT : (m_data->level++);
const std::string name_str = (name ? name : "");
const std::string api_str = (api ? api : "");
if(entry.name != name_str || entry.api != api_str || entry.level != level)
{
entry.name = name_str;
entry.api = api_str;
if(!singleShot)
{
m_data->resetDelay = CONFIG_DELAY;
}
}
entry.subFrame = subFrame;
entry.level = level;
entry.splitter = false;
entry.gpuTimeProvider = gpuTimeProvider;
#ifdef NVP_SUPPORTS_NVTOOLSEXT
{
nvtxEventAttributes_t eventAttrib = {0};
eventAttrib.version = NVTX_VERSION;
eventAttrib.size = NVTX_EVENT_ATTRIB_STRUCT_SIZE;
eventAttrib.colorType = NVTX_COLOR_ARGB;
unsigned char color[4];
color[0] = 255;
color[1] = 0;
color[2] = sec % 2 ? 127 : 255;
color[3] = 255;
color[2] -= level * 16;
color[3] -= level * 16;
eventAttrib.color = *(uint32_t*)(color);
eventAttrib.messageType = NVTX_MESSAGE_TYPE_ASCII;
eventAttrib.message.ascii = name;
nvtxRangePushEx(&eventAttrib);
}
#endif
entry.cpuTimes[subFrame] = -getMicroSeconds();
entry.gpuTimes[subFrame] = 0;
if(singleShot)
{
entry.cpuTime.init(1);
entry.gpuTime.init(1);
}
return sec;
}
void Profiler::endSection(SectionID sec)
{
Entry& entry = m_data->entries[sec];
entry.cpuTimes[entry.subFrame] += getMicroSeconds();
#ifdef NVP_SUPPORTS_NVTOOLSEXT
nvtxRangePop();
#endif
if(entry.level != LEVEL_SINGLESHOT)
{
m_data->level--;
}
}
Profiler::Clock::Clock()
{
m_init = std::chrono::high_resolution_clock::now();
}
double Profiler::Clock::getMicroSeconds() const
{
return double(std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::high_resolution_clock::now() - m_init).count())
/ double(1000);
}
} // namespace nvh

@ -0,0 +1,369 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef NV_PROFILER_INCLUDED
#define NV_PROFILER_INCLUDED
#include <algorithm>
#include <chrono>
#include <float.h> // DBL_MAX
#include <functional>
#include <memory>
#include <stdint.h>
#include <stdio.h>
#include <string.h> //memset
#include <string>
#include <vector>
#ifdef NVP_SUPPORTS_NVTOOLSEXT
#define NVTX_STDINT_TYPES_ALREADY_DEFINED
#include <nvtx3/nvToolsExt.h>
#endif
namespace nvh {
//////////////////////////////////////////////////////////////////////////
/** @DOC_START
# class nvh::Profiler
> The nvh::Profiler class is designed to measure timed sections.
Each section has a cpu and gpu time. Gpu times are typically provided
by derived classes for each individual api (e.g. OpenGL, Vulkan etc.).
There is functionality to pretty print the sections with their nesting level.
Multiple profilers can reference the same database, so one profiler
can serve as master that the others contribute to. Typically the
base class measuring only CPU time could be the master, and the api
derived classes reference it to share the same database.
Profiler::Clock can be used standalone for time measuring.
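
The scoped-section idea can be sketched without the profiler itself. This is a hedged, self-contained illustration that uses only `std::chrono`; `TimingTable`, `ScopedTimer` and `runExample` are hypothetical names and not part of `nvh::Profiler`.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the profiler database: total microseconds and
// call count per section name.
struct TimingTable
{
  struct Stats
  {
    double   totalMicroseconds = 0.0;
    unsigned count             = 0;
  };
  std::map<std::string, Stats> sections;
};

// Starts timing on construction and records the elapsed time on destruction.
class ScopedTimer
{
public:
  ScopedTimer(TimingTable& table, std::string name)
      : m_table(table)
      , m_name(std::move(name))
      , m_start(std::chrono::steady_clock::now())
  {
  }
  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;
  ~ScopedTimer()
  {
    const auto          elapsed = std::chrono::steady_clock::now() - m_start;
    TimingTable::Stats& stats   = m_table.sections[m_name];
    stats.totalMicroseconds += std::chrono::duration<double, std::micro>(elapsed).count();
    stats.count++;
  }

private:
  TimingTable&                          m_table;
  std::string                           m_name;
  std::chrono::steady_clock::time_point m_start;
};

// Times the same named section three times, as a loop body would.
TimingTable runExample()
{
  TimingTable table;
  for(int i = 0; i < 3; i++)
  {
    ScopedTimer  timer(table, "update");
    volatile int sink = 0;
    for(int k = 0; k < 1000; k++)
      sink = sink + k;
  }
  return table;
}
```

Tying the end of the measurement to the destructor means a section is closed on every exit path from the scope, which is the same reason `Profiler::Section` calls `endSection` in its destructor.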
@DOC_END */
class Profiler
{
public:
/// if we detect a change in timers (api/name change), we trigger a reset after that many frames
static const uint32_t CONFIG_DELAY = 8;
/// gpu times are queried after that amount of frames
static const uint32_t FRAME_DELAY = 4;
/// by default we start with space for that many begin/end sections per-frame
static const uint32_t START_SECTIONS = 64;
/// cyclic window for averaging
static const uint32_t MAX_NUM_AVERAGE = 128;
public:
typedef uint32_t SectionID;
typedef uint32_t OnceID;
class Clock
{
// generic utility class for measuring time
// uses high resolution timer provided by OS
public:
Clock();
double getMicroSeconds() const;
private:
std::chrono::time_point<std::chrono::high_resolution_clock> m_init;
};
//////////////////////////////////////////////////////////////////////////
// utility class for automatic calling of begin/end within a local scope
class Section
{
public:
Section(Profiler& profiler, const char* name, bool singleShot = false)
: m_profiler(profiler)
{
m_id = profiler.beginSection(name, nullptr, nullptr, singleShot);
}
~Section() { m_profiler.endSection(m_id); }
private:
SectionID m_id;
Profiler& m_profiler;
};
// recurring, must be within beginFrame/endFrame
Section timeRecurring(const char* name) { return Section(*this, name, false); }
// single shot, results are available after FRAME_DELAY calls to endFrame
Section timeSingle(const char* name) { return Section(*this, name, true); }
//////////////////////////////////////////////////////////////////////////
// num <= MAX_NUM_AVERAGE
void setAveragingSize(uint32_t num);
//////////////////////////////////////////////////////////////////////////
// gpu times for a section are queried at "endFrame" with the use of this optional function.
// It returns true if the queried result was available, and writes the microseconds into gpuTime.
typedef std::function<bool(SectionID, uint32_t subFrame, double& gpuTime)> gpuTimeProvider_fn;
// must be called every frame
void beginFrame();
void endFrame();
// there are two types of sections
// singleShot = true, means the timer can exist outside begin/endFrame and is non-recurring
// results of previous singleShot with same name will be overwritten.
// singleShot = false, sections can be nested, but must be within begin/endFrame
//
SectionID beginSection(const char* name, const char* api = nullptr, gpuTimeProvider_fn gpuTimeProvider = nullptr, bool singleShot = false);
void endSection(SectionID slot);
// When a section is used within a loop (same nesting level), and the same arguments for name and api are
// passed, we normally accumulate the results of those sections when printing the stats or using the
// getTimerInfo/getAveragedValues functions below.
// Calling the splitter (outside of a section) inserts a split point that the accumulation will not
// pass.
void accumulationSplit();
inline double getMicroSeconds() const { return m_clock.getMicroSeconds(); }
//////////////////////////////////////////////////////////////////////////
// resets all stats
void clear();
// resets recurring sections
// in case averaging should be reset after a few frames (warm-up cache, hide early heavier frames after
// configuration changes)
// implicit resets are triggered if the frame's configuration of timer section changes compared to
// previous frame.
void reset(uint32_t delay = CONFIG_DELAY);
// pretty print current averaged timers
void print(std::string& stats);
// returns number of frames since reset
uint32_t getTotalFrames() const;
struct TimerStats
{
// time in microseconds
double average = 0;
double absMinValue = DBL_MAX;
double absMaxValue = 0;
};
struct TimerInfo
{
// number of averaged values, <= MAX_NUM_AVERAGE
uint32_t numAveraged = 0;
// accumulation happens for example in loops:
// for (..) { auto scopeTimer = timeRecurring("blah"); ... }
// then the reported values are the accumulated sum of all those timers.
bool accumulated = false;
TimerStats cpu;
TimerStats gpu;
};
// query functions for current gathered cyclic averages ( <= MAX_NUM_AVERAGE)
// use nullptr name to get the cpu timing of the outermost scope (beginFrame/endFrame)
// returns true if found timer and it had valid values
bool getTimerInfo(const char* name, TimerInfo& info);
// simplified wrapper
bool getAveragedValues(const char* name, double& cpuTime, double& gpuTime)
{
TimerInfo info;
if(getTimerInfo(name, info))
{
cpuTime = info.cpu.average;
gpuTime = info.gpu.average;
return true;
}
else
{
cpuTime = 0;
gpuTime = 0;
return false;
}
}
//////////////////////////////////////////////////////////////////////////
// if a master is provided we use its database
// otherwise our own
Profiler(Profiler* master = nullptr);
Profiler(uint32_t startSections);
protected:
//////////////////////////////////////////////////////////////////////////
// Utility functions for derived classes that provide gpu times.
// We assume most apis use a big pool of api-specific events/timers,
// the functions below help manage such pool.
inline uint32_t getSubFrame(SectionID slot) const { return m_data->entries[slot].subFrame; }
inline uint32_t getRequiredTimers() const { return (uint32_t)(m_data->entries.size() * FRAME_DELAY * 2); }
static inline uint32_t getTimerIdx(SectionID slot, uint32_t subFrame, bool begin)
{
// must not change order of begin/end
return ((slot * FRAME_DELAY) + subFrame) * 2 + (begin ? 0 : 1);
}
inline bool isSectionRecurring(SectionID slot) const { return m_data->entries[slot].level != LEVEL_SINGLESHOT; }
protected:
//////////////////////////////////////////////////////////////////////////
static const uint32_t LEVEL_SINGLESHOT = ~0;
struct TimeValues
{
double times[MAX_NUM_AVERAGE] = {0};
double valueTotal = 0;
double absMinValue = DBL_MAX;
double absMaxValue = 0;
uint32_t index = 0;
uint32_t numCycle = MAX_NUM_AVERAGE;
uint32_t numValid = 0;
TimeValues(uint32_t cycleSize = MAX_NUM_AVERAGE) { init(cycleSize); }
void init(uint32_t cycleSize)
{
numCycle = std::min(cycleSize, MAX_NUM_AVERAGE);
reset();
}
void reset()
{
valueTotal = 0;
absMinValue = DBL_MAX;
absMaxValue = 0;
index = 0;
numValid = 0;
memset(times, 0, sizeof(times));
}
void add(double time)
{
valueTotal += time - times[index];
times[index] = time;
index = (index + 1) % numCycle;
numValid = std::min(numValid + 1, numCycle);
absMinValue = std::min(time, absMinValue);
absMaxValue = std::max(time, absMaxValue);
}
double getAveraged()
{
if(numValid)
{
return valueTotal / double(numValid);
}
else
{
return 0;
}
}
};
struct Entry
{
std::string name = {};
std::string api = {};
gpuTimeProvider_fn gpuTimeProvider = nullptr;
// level == ~0 used for "singleShot"
uint32_t level = 0;
uint32_t subFrame = 0;
#ifdef NVP_SUPPORTS_NVTOOLSEXT
nvtxRangeId_t m_nvrange;
#endif
double cpuTimes[FRAME_DELAY] = {0};
double gpuTimes[FRAME_DELAY] = {0};
// number of times summed since last reset
uint32_t numTimes = 0;
TimeValues gpuTime;
TimeValues cpuTime;
// splitter is used to prevent accumulated case below
// when same depth level is used
// {section("BLAH"); ... }
// splitter
// {section("BLAH"); ...}
// now the result of "BLAH" is not accumulated
bool splitter = false;
// if the same timer name is used within a loop (same
// depth level), e.g.:
//
// for () { section("BLAH"); ... }
//
// we accumulate the timing values of all of them
bool accumulated = false;
};
struct Data
{
uint32_t numAveraging = MAX_NUM_AVERAGE;
uint32_t resetDelay = 0;
uint32_t numFrames = 0;
uint32_t level = 0;
uint32_t nextSection = 0;
uint32_t numLastSections = 0;
uint32_t numLastEntries = 0;
std::vector<uint32_t> frameSections;
std::vector<uint32_t> singleSections;
double cpuCurrentTime = 0;
TimeValues cpuTime;
std::vector<Entry> entries;
};
std::shared_ptr<Data> m_data = nullptr;
Clock m_clock;
SectionID getSectionID(bool singleShot, const char* name);
bool getTimerInfo(uint32_t i, TimerInfo& info);
void grow(uint32_t newsize);
};
} // namespace nvh
#endif

@ -0,0 +1,108 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef NV_RADIXSORT_INCLUDED
#define NV_RADIXSORT_INCLUDED
#include <assert.h>  // assert
#include <stdint.h>  // uint8_t, uint32_t
namespace nvh {
/** @DOC_START
# function nvh::radixsort
The radixsort function sorts the provided keys based on
BYTES many bytes stored inside TKey starting at BYTEOFFSET.
The sorting result is returned as indices into the keys array.
For example:
```cpp
struct MyData {
uint32_t objectIdentifier;
uint16_t objectSortKey;
};
// 4-byte offset of objectSortKey within MyData
// 2-byte size of sorting key
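// (bytes are processed starting at BYTEOFFSET, so a multi-byte key sorts numerically on little-endian hosts)
result = radixsort<4,2>(numIndices, keys, indicesIn, indicesTemp);
// after sorting the following is true
keys[result[i]].objectSortKey <= keys[result[i + 1]].objectSortKey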
// result can point either to indicesIn or indicesTemp (we swap the arrays
// after each byte iteration)
```
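
The same index-returning, least-significant-byte-first scheme can be sketched as a standalone function. This is an illustration under stated assumptions and does not call `nvh::radixsort`: `Item` and `sortIndicesByKey` are hypothetical names, and the key bytes are extracted with shifts so the sketch does not depend on host byte order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record type; only sortKey takes part in the ordering.
struct Item
{
  uint32_t id;
  uint16_t sortKey;
};

// Least-significant-byte-first radix sort that returns indices into items
// and leaves items itself untouched.
std::vector<uint32_t> sortIndicesByKey(const std::vector<Item>& items)
{
  std::vector<uint32_t> in(items.size()), out(items.size());
  for(uint32_t i = 0; i < in.size(); i++)
    in[i] = i;

  for(uint32_t pass = 0; pass < sizeof(uint16_t); pass++)
  {
    // Count how often each byte value occurs in this pass.
    std::array<uint32_t, 256> bins{};
    for(uint32_t idx : in)
      bins[(items[idx].sortKey >> (8 * pass)) & 0xFF]++;

    // Exclusive prefix sum: bins[b] becomes the first output slot of byte b.
    uint32_t offset = 0;
    for(uint32_t& bin : bins)
    {
      const uint32_t count = bin;
      bin                  = offset;
      offset += count;
    }

    // Stable scatter, then swap the roles of the two index arrays.
    for(uint32_t idx : in)
      out[bins[(items[idx].sortKey >> (8 * pass)) & 0xFF]++] = idx;
    in.swap(out);
  }
  return in;
}
```

Because each pass is stable, items with equal keys keep their original relative order in the returned index list.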
@DOC_END */
template <uint32_t BYTEOFFSET, uint32_t BYTES, typename TKey>
uint32_t* radixsort(uint32_t numIndices, const TKey* keys, uint32_t* indicesIn, uint32_t* indicesTemp)
{
uint32_t histogram[BYTES][256] = {0};
for(uint32_t i = 0; i < numIndices; i++)
{
uint32_t idx = indicesIn[i];
const uint8_t* bytes = (const uint8_t*)&keys[idx];
for(uint32_t p = 0; p < BYTES; p++)
{
uint8_t curbyte = bytes[BYTEOFFSET + p];
histogram[p][curbyte]++;
}
}
uint32_t* tempIn = indicesIn;
uint32_t* tempOut = indicesTemp;
for(uint32_t p = 0; p < BYTES; p++)
{
uint32_t offset = 0;
for(int32_t i = 0; i < 256; i++)
{
uint32_t numBin = histogram[p][i];
histogram[p][i] = offset;
offset += numBin;
}
for(uint32_t i = 0; i < numIndices; i++)
{
uint32_t idx = tempIn[i];
const uint8_t* bytes = (const uint8_t*)&keys[idx];
uint8_t curbyte = bytes[BYTEOFFSET + p];
uint32_t pos = histogram[p][curbyte]++;
tempOut[pos] = idx;
}
assert(histogram[p][255] == offset);
// swap
uint32_t* temp = tempIn;
tempIn = tempOut;
tempOut = temp;
}
// post swap tempIn is last tempOut
return tempIn;
}
} // namespace nvh
#endif

@ -0,0 +1,320 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
/*
* This file contains code derived from glf by Christophe Riccio, www.g-truc.net
* Copyright (c) 2005 - 2015 G-Truc Creation (www.g-truc.net)
* https://github.com/g-truc/ogl-samples/blob/master/framework/compiler.cpp
*/
#include "shaderfilemanager.hpp"
#include <algorithm>
#include <assert.h>
#include <fstream>
#include <iostream>
#include <sstream>
#include <stdarg.h>
#include <stdio.h>
#include "fileoperations.hpp"
namespace nvh {
std::string ShaderFileManager::format(const char* msg, ...)
{
char text[8192];
va_list list;
if(msg == 0)
return std::string();
va_start(list, msg);
vsnprintf(text, sizeof(text), msg, list);
va_end(list);
return std::string(text);
}
inline std::string ShaderFileManager::markerString(int line, std::string const& filename, int fileid)
{
if(m_supportsExtendedInclude || m_forceLineFilenames)
{
#if defined(_WIN32) && 1
std::string fixedname;
for(size_t i = 0; i < filename.size(); i++)
{
char c = filename[i];
if(c == '/' || c == '\\')
{
fixedname.append("\\\\");
}
else
{
fixedname.append(1, c);
}
}
#else
std::string fixedname = filename;
#endif
return ShaderFileManager::format("#line %d \"", line) + fixedname + std::string("\"\n");
}
else
{
return ShaderFileManager::format("#line %d %d\n", line, fileid);
}
}
std::string ShaderFileManager::getIncludeContent(IncludeID idx, std::string& filename)
{
IncludeEntry& entry = m_includes[idx];
filename = entry.filename;
if(m_forceIncludeContent)
{
return entry.content;
}
if(!entry.content.empty() && !findFile(entry.filename, m_directories).empty())
{
return entry.content;
}
std::string content = loadFile(entry.filename, false, m_directories, filename, true);
return content.empty() ? entry.content : content;
}
std::string ShaderFileManager::getContent(std::string const& filename, std::string& filenameFound)
{
if(filename.empty())
{
return std::string();
}
IncludeID idx = findInclude(filename);
if(idx.isValid())
{
return getIncludeContent(idx, filenameFound);
}
// fall back
filenameFound = filename;
return loadFile(filename, false, m_directories, filenameFound, true);
}
std::string ShaderFileManager::getContentWithRequestingSourceDirectory(std::string const& filename,
std::string& filenameFound,
std::string const& requestingSource)
{
if(filename.empty())
{
return std::string();
}
IncludeID idx = findInclude(filename);
if(idx.isValid())
{
return getIncludeContent(idx, filenameFound);
}
// fall back; check requestingSource's directory first.
filenameFound = filename;
m_extendedDirectories.resize(m_directories.size() + 1);
m_extendedDirectories[0] = getDirectoryComponent(requestingSource);
for(size_t i = 0; i < m_directories.size(); ++i)
{
m_extendedDirectories[i + 1] = m_directories[i];
}
return loadFile(filename, false, m_extendedDirectories, filenameFound, true);
}
std::string ShaderFileManager::getDirectoryComponent(std::string filename)
{
while(!filename.empty())
{
auto popped = filename.back();
filename.pop_back();
switch(popped)
{
case '/':
goto exitLoop;
#if defined(_WIN32)
case '\\':
goto exitLoop;
#endif
}
}
exitLoop:
if(filename.empty())
filename.push_back('.');
return filename;
}
std::string ShaderFileManager::manualInclude(std::string const& filename, std::string& filenameFound, std::string const& prepend, bool foundVersion)
{
std::string source = getContent(filename, filenameFound);
return manualIncludeText(source, filenameFound, prepend, foundVersion);
}
std::string ShaderFileManager::manualIncludeText(std::string const& sourceText,
std::string const& textFilename,
std::string const& prepend,
bool foundVersion)
{
if(sourceText.empty())
{
return std::string();
}
std::stringstream stream;
stream << sourceText;
std::string line, text;
// Handle command line defines
text += prepend;
if(m_lineMarkers)
{
text += markerString(1, textFilename, 0);
}
int lineCount = 0;
while(std::getline(stream, line))
{
std::size_t offset = 0;
lineCount++;
// Version
offset = line.find("#version");
if(offset != std::string::npos)
{
std::size_t commentOffset = line.find("//");
if(commentOffset != std::string::npos && commentOffset < offset)
continue;
if(foundVersion)
{
// someone else already set the version, so just comment out
text += std::string("//") + line + std::string("\n");
}
else
{
// Reorder so that the #version line is always the first of a shader text
text = line + std::string("\n") + text + std::string("//") + line + std::string("\n");
foundVersion = true;
}
continue;
}
// Handle replacing #include with text if configured to do so.
// Otherwise just insert the #include command verbatim, for shaderc to handle.
if(m_handleIncludePasting)
{
offset = line.find("#include");
if(offset != std::string::npos)
{
std::size_t commentOffset = line.find("//");
if(commentOffset != std::string::npos && commentOffset < offset)
continue;
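size_t firstQuote = line.find("\"", offset);
size_t secondQuote = (firstQuote == std::string::npos) ? std::string::npos : line.find("\"", firstQuote + 1);
if(secondQuote == std::string::npos)
{
// No quoted filename to resolve (e.g. a malformed directive); keep the line verbatim.
text += line + "\n";
continue;
}
std::string include = line.substr(firstQuote + 1, secondQuote - firstQuote - 1);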
std::string includeFound;
std::string includeContent = manualInclude(include, includeFound, std::string(), foundVersion);
if(!includeContent.empty())
{
text += includeContent;
if(m_lineMarkers)
{
text += std::string("\n") + markerString(lineCount + 1, textFilename, 0);
}
}
continue; // Skip adding the original #include line.
}
}
text += line + "\n";
}
return text;
}
ShaderFileManager::IncludeID ShaderFileManager::registerInclude(std::string const& name, std::string const& filename, std::string const& content)
{
// find if already registered
for(size_t i = 0; i < m_includes.size(); i++)
{
if(m_includes[i].name == name)
{
m_includes[i].content = content;
return i;
}
}
IncludeEntry entry;
entry.name = name;
entry.filename = filename.empty() ? name : filename;
entry.content = content;
m_includes.push_back(entry);
return m_includes.size() - 1;
}
ShaderFileManager::IncludeID ShaderFileManager::findInclude(std::string const& name) const
{
// check registered includes first
for(std::size_t i = 0; i < m_includes.size(); ++i)
{
if(m_includes[i].name == name)
{
return IncludeID(i);
}
}
return IncludeID();
}
bool ShaderFileManager::loadIncludeContent(IncludeID idx)
{
std::string filenameFound;
m_includes[idx].content = getIncludeContent(idx, filenameFound);
return !m_includes[idx].content.empty();
}
const ShaderFileManager::IncludeEntry& ShaderFileManager::getIncludeEntry(IncludeID idx) const
{
return m_includes[idx];
}
std::string ShaderFileManager::getProcessedContent(std::string const& filename, std::string& filenameFound)
{
return manualInclude(filename, filenameFound, "", false);
}
} // namespace nvh

@ -0,0 +1,204 @@
/*
* Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#ifndef NV_SHADERFILEMANAGER_INCLUDED
#define NV_SHADERFILEMANAGER_INCLUDED
#include <stdint.h>
#include <stdio.h>
#include <string>
#include <vector>
namespace nvh {
class ShaderFileManager
{
//////////////////////////////////////////////////////////////////////////
/** @DOC_START
# class nvh::ShaderFileManager
The nvh::ShaderFileManager class is meant to serve as a base class for the actual API-specific
shader/program managers.
The ShaderFileManager provides a system to find/load shader files.
It also allows resolving #include instructions in HLSL/GLSL source files.
Includes can also be registered up front, optionally pointing to strings in memory instead of files on disk.
If m_handleIncludePasting is true, then `#include`s are replaced by
the include file contents (recursively) before presenting the
loaded shader source code to the caller. Otherwise, the include file
loader is still available but `#include`s are left unchanged.
Furthermore it handles injecting prepended strings (typically used
for #defines) after the #version statement of GLSL files,
regardless of m_handleIncludePasting's value.
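As a rough illustration of the `#version` handling described above, the following
self-contained sketch moves the `#version` line to the top, inserts the prepend text
right after it, and leaves a commented-out copy in place so line numbers stay stable.
It is not the class's implementation: `injectPrepend` is a hypothetical helper, it
only recognizes `#version` at the start of a line, and it omits line markers and
include handling.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, for illustration only (not part of nvh::ShaderFileManager).
// Returns `source` with its first #version line moved to the top and `prepend`
// inserted directly after it. The original #version line is kept as a comment.
inline std::string injectPrepend(const std::string& source, const std::string& prepend)
{
  std::istringstream stream(source);
  std::string        line;
  std::string        version;  // the #version line, once found
  std::string        body;     // all remaining lines, in original order
  while(std::getline(stream, line))
  {
    // Simplification: only a #version at the very start of a line is recognized.
    if(version.empty() && line.rfind("#version", 0) == 0)
    {
      version = line + "\n";
      body += "//" + line + "\n";
    }
    else
    {
      body += line + "\n";
    }
  }
  return version + prepend + body;
}
```

Calling `injectPrepend(source, "#define USE_FOG 1\n")` would therefore yield a text
that starts with the `#version` line, followed by the define, followed by the rest of
the shader (the define name is only an example).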
@DOC_END */
public:
enum FileType
{
FILETYPE_DEFAULT,
FILETYPE_GLSL,
FILETYPE_HLSL,
FILETYPE_SPIRV,
};
struct IncludeEntry
{
std::string name;
std::string filename;
std::string content;
};
typedef std::vector<IncludeEntry> IncludeRegistry;
static std::string format(const char* msg, ...);
public:
class IncludeID
{
public:
size_t m_value;
IncludeID()
: m_value(size_t(~0))
{
}
IncludeID(size_t b)
: m_value(b)
{
}
IncludeID& operator=(size_t b)
{
m_value = b;
return *this;
}
bool isValid() const { return m_value != size_t(~0); }
operator bool() const { return isValid(); }
operator size_t() const { return m_value; }
friend bool operator==(const IncludeID& lhs, const IncludeID& rhs) { return rhs.m_value == lhs.m_value; }
};
struct Definition
{
Definition() {}
Definition(uint32_t type, std::string const& prepend, std::string const& filename)
: type(type)
, filename(filename)
, prepend(prepend)
{
}
Definition(uint32_t type, std::string const& filename)
: type(type)
, filename(filename)
{
}
uint32_t type = 0;
std::string filename;
std::string prepend;
std::string entry = "main";
FileType filetype = FILETYPE_DEFAULT;
std::string filenameFound;
std::string content;
};
// optionally register files to be included, optionally provide content directly rather than from disk
//
// name: name used within shader files
// diskname: filename on disk (defaults to name if not set)
// content: content provided as a string rather than loaded from disk
IncludeID registerInclude(std::string const& name,
std::string const& diskname = std::string(),
std::string const& content = std::string());
// Use m_prepend to pass global #defines
// Derived api classes will use this as global prepend to the per-definition prepends in combination
// with the source files
// actualSource = m_prepend + definition.prepend + definition.content
std::string m_prepend;
// per file state, used when FILETYPE_DEFAULT is provided in the Definition
FileType m_filetype;
// add search directories
void addDirectory(const std::string& dir) { m_directories.push_back(dir); }
ShaderFileManager(bool handleIncludePasting = true)
: m_filetype(FILETYPE_GLSL)
, m_lineMarkers(true)
, m_forceLineFilenames(false)
, m_forceIncludeContent(false)
, m_supportsExtendedInclude(false)
, m_handleIncludePasting(handleIncludePasting)
{
m_directories.push_back(".");
}
//////////////////////////////////////////////////////////////////////////
// in rare cases you may want to access the included content in detail yourself
IncludeID findInclude(std::string const& name) const;
bool loadIncludeContent(IncludeID);
const IncludeEntry& getIncludeEntry(IncludeID idx) const;
std::string getProcessedContent(std::string const& filename, std::string& filenameFound);
protected:
std::string markerString(int line, std::string const& filename, int fileid);
std::string getIncludeContent(IncludeID idx, std::string& filenameFound);
std::string getContent(std::string const& filename, std::string& filenameFound);
std::string getContentWithRequestingSourceDirectory(std::string const& filename,
std::string& filenameFound,
std::string const& requestingSource);
static std::string getDirectoryComponent(std::string filename);
std::string manualInclude(std::string const& filename, std::string& filenameFound, std::string const& prepend, bool foundVersion);
std::string manualIncludeText(std::string const& sourceText, std::string const& textFilename, std::string const& prepend, bool foundVersion);
bool m_lineMarkers;
bool m_forceLineFilenames;
bool m_forceIncludeContent;
bool m_supportsExtendedInclude;
bool m_handleIncludePasting;
std::vector<std::string> m_directories;
IncludeRegistry m_includes;
// Used as temporary storage in getContentWithRequestingSourceDirectory; saves on dynamic allocation.
std::vector<std::string> m_extendedDirectories;
};
} // namespace nvh
#endif // NV_SHADERFILEMANAGER_INCLUDED

@ -0,0 +1,172 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2014-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#pragma once
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>
#include <type_traits>
#include <utility>
namespace nvh {
using DefaultDelayClock = std::chrono::steady_clock;
using DefaultDelayDuration = std::chrono::nanoseconds;
/** @DOC_START
# class nvh::delayed_call
Class returned by delay_noreturn_for to track the thread created and possibly reset the
delay timer.
@DOC_END */
template <class Clock = DefaultDelayClock, class Duration = std::chrono::duration<double>>
class delayed_call
{
template <class ClockT, class DurationT, class Function, class... Args>
friend delayed_call<ClockT, DurationT> delay_noreturn_for(const DurationT& sleep_duration, Function&& f, Args&&... args);
public:
/** Update the thread to make the call sleep_duration from now
*
* \return True if the delay was updated before the callback was called. False otherwise.
*/
bool delay_for(const Duration& sleep_duration)
{
bool result = false;
if(m_delay)
{
std::lock_guard<std::mutex> lock(m_delay->mutex);
if(!m_delay->started)
{
auto prevUntil = m_delay->until;
m_delay->until = Clock::now() + sleep_duration;
// No need to wake up the other thread if the delay is longer. It'll keep looping while dirty is set.
if(prevUntil < m_delay->until)
m_delay->dirty = true;
else
m_delay->cv.notify_all();
}
result = !m_delay->started;
}
return result;
}
/** Cancel a delayed call
*
* \return True if the call was cancelled before running. False otherwise.
*/
bool cancel()
{
bool result = false;
if(m_delay)
{
std::lock_guard<std::mutex> lock(m_delay->mutex);
if(!m_delay->started)
{
m_delay->cancelled = true;
m_delay->cv.notify_all();
}
result = !m_delay->started;
}
return result;
}
delayed_call() = default;
delayed_call(delayed_call&& other) { *this = std::move(other); }
~delayed_call() = default;
delayed_call& operator=(delayed_call&& other)
{
m_delay = std::move(other.m_delay);
return *this;
}
// This class is movable only
delayed_call(const delayed_call& other) = delete;
delayed_call& operator=(const delayed_call& other) = delete;
private:
struct DelayData
{
std::chrono::time_point<Clock, Duration> until;
std::thread thread;
std::mutex mutex;
std::condition_variable cv;
bool dirty = true;
bool cancelled = false;
bool started = false;
~DelayData()
{
if(thread.joinable())
thread.join();
}
};
template <class Function, class... Args>
static void delayEntry(DelayData* delay, Function&& f, Args&&... args)
{
{
std::unique_lock<std::mutex> lock(delay->mutex);
std::cv_status status = std::cv_status::no_timeout;
while(!delay->cancelled && (delay->dirty || status == std::cv_status::no_timeout))
{
delay->dirty = false;
status = delay->cv.wait_until(lock, delay->until);
}
if(delay->cancelled)
return;
delay->started = true;
}
// Ignore the return value. Need to keep a std::future object if not.
(void)f(std::forward<Args>(args)...);
}
std::unique_ptr<DelayData> m_delay;
template <class Function, class... Args>
delayed_call(const Duration& sleep_duration, Function&& f, Args&&... args)
: m_delay(std::make_unique<DelayData>())
{
m_delay->until = Clock::now() + sleep_duration;
m_delay->thread =
std::thread(delayed_call::delayEntry<std::remove_reference_t<Function>&&, std::remove_reference_t<Args>&&...>,
m_delay.get(), std::forward<Function>(f), std::forward<Args>(args)...);
}
};
/** @DOC_START
Delay a call to a void function for sleep_duration.
`return`: A delayed_call object that holds the running thread.
Example:
```cpp
// Create or update a delayed call to callback. Useful to consolidate multiple events into one call.
if(!m_delayedCall.delay_for(delay))
m_delayedCall = nvh::delay_noreturn_for(delay, callback);
```
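The mechanism behind this can be sketched in isolation. The following is a simplified,
self-contained stand-in, not the nvh implementation: `SimpleDelayedCall` is a
hypothetical class that waits on a condition variable until a deadline and runs the
callback unless it was cancelled first. It omits the deadline-update path (`delay_for`)
and argument forwarding.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical, simplified stand-in for nvh::delayed_call (illustration only).
class SimpleDelayedCall
{
public:
  SimpleDelayedCall(std::chrono::milliseconds delay, std::function<void()> fn)
      : m_until(std::chrono::steady_clock::now() + delay)
      , m_thread([this, fn = std::move(fn)] { run(fn); })
  {
  }
  // Blocks until the worker thread has finished, like DelayData's destructor.
  ~SimpleDelayedCall()
  {
    if(m_thread.joinable())
      m_thread.join();
  }
  // Returns true if the call was cancelled before it started running.
  bool cancel()
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    if(!m_started)
    {
      m_cancelled = true;
      m_cv.notify_all();
    }
    return !m_started;
  }

private:
  void run(const std::function<void()>& fn)
  {
    {
      std::unique_lock<std::mutex> lock(m_mutex);
      // Sleeps until the deadline; wakes early only if cancel() set the flag.
      m_cv.wait_until(lock, m_until, [this] { return m_cancelled; });
      if(m_cancelled)
        return;
      m_started = true;
    }
    fn();  // Called outside the lock so cancel() never blocks on the callback.
  }

  std::mutex                            m_mutex;
  std::condition_variable               m_cv;
  bool                                  m_cancelled = false;
  bool                                  m_started   = false;
  std::chrono::steady_clock::time_point m_until;
  std::thread                           m_thread;  // declared last: starts after all other members exist
};
```

As with `delayed_call`, destroying the object joins the worker thread, so letting an
uncancelled instance go out of scope blocks until the callback has run.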
@DOC_END */
template <class Clock = DefaultDelayClock, class Duration = DefaultDelayDuration, class Function, class... Args>
delayed_call<Clock, Duration> delay_noreturn_for(const Duration& sleep_duration, Function&& f, Args&&... args)
{
return delayed_call<Clock, Duration>(sleep_duration, std::forward<Function>(f), std::forward<Args>(args)...);
}
} // namespace nvh

@ -0,0 +1,207 @@
/*
* Copyright (c) 2013-2023, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2013 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
//--------------------------------------------------------------------
#pragma once
#include <chrono>
#include <string>
#include <cstdarg>
#include <cstdio>
#include <cassert>
#include "nvprint.hpp"
/* @DOC_START -----------------------------------------------------------------------------
# struct TimeSampler
TimeSampler averages the frame time over a window of frames and derives the
frames-per-second value from it.
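The arithmetic performed once per sampling window can be sketched on its own. This is
a minimal illustration that mirrors the constants used in `update()`; `FrameStats` and
`averageFrames` are hypothetical names, not part of this header.

```cpp
#include <algorithm>

// Hypothetical result type, for illustration only.
struct FrameStats
{
  double frameDT;      // clamped average time per frame, in seconds
  bool   glitch;       // true if the unclamped average exceeded the upper bound
  int    nextSamples;  // number of frames to average over in the next window
};

// Mirrors the per-window computation in TimeSampler::update().
inline FrameStats averageFrames(double elapsedSeconds, int sampleCount)
{
  const double minDT = 1.0 / 3000.0;  // fastest frame time accepted
  const double maxDT = 1.0 / 40.0;    // slowest frame time accepted
  FrameStats   stats{};
  stats.frameDT = elapsedSeconds / sampleCount;
  stats.glitch  = stats.frameDT > maxDT;
  stats.frameDT = std::min(std::max(stats.frameDT, minDT), maxDT);
  // Aim for roughly 0.15 seconds per window, capped at 50 frames.
  stats.nextSamples = std::min(static_cast<int>(0.15 / stats.frameDT), 50);
  return stats;
}
```

A window that took 10 seconds for 10 frames is reported as a glitch and clamped to
1/40 s per frame, while very fast frames are clamped to 1/3000 s and averaged over the
maximum of 50 frames.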
@DOC_END ----------------------------------------------------------------------------- */
struct TimeSampler
{
using Clock = std::chrono::steady_clock;
using TimePoint = typename Clock::time_point;
bool bNonStopRendering;
int renderCnt;
TimePoint start_time, end_time;
int timing_counter;
int maxTimeSamples;
int frameFPS;
double frameDT;
TimeSampler()
{
bNonStopRendering = true;
renderCnt = 1;
timing_counter = 0;
maxTimeSamples = 60;
frameDT = 1.0 / 60.0;
frameFPS = 0;
start_time = end_time = Clock::now();
}
inline double getFrameDT() { return frameDT; }
inline int getFPS() { return frameFPS; }
void resetSampling(int i = 10) { maxTimeSamples = i; }
bool update(bool bContinueToRender, bool* glitch = nullptr)
{
if(glitch)
*glitch = false;
bool updated = false;
if((timing_counter >= maxTimeSamples) && (maxTimeSamples > 0))
{
timing_counter = 0;
end_time = Clock::now();
// Get delta in seconds
frameDT = std::chrono::duration_cast<std::chrono::duration<double>>(end_time - start_time).count();
// Average over the number of frames sampled
frameDT /= maxTimeSamples;
constexpr double MAXDT = 1.0 / 40.0;
constexpr double MINDT = 1.0 / 3000.0;
if(frameDT < MINDT)
{
frameDT = MINDT;
}
else if(frameDT > MAXDT)
{
frameDT = MAXDT;
if(glitch)
*glitch = true;
}
frameFPS = (int)(1.0 / frameDT);
// update the amount of samples to average, depending on the speed of the scene
maxTimeSamples = (int)(0.15 / (frameDT));
if(maxTimeSamples > 50)
maxTimeSamples = 50;
updated = true;
}
if(bContinueToRender || bNonStopRendering)
{
if(timing_counter == 0)
start_time = Clock::now();
timing_counter++;
}
return updated;
}
};
/** @DOC_START
# struct nvh::Stopwatch
> Timer in milliseconds.
Starts the timer at creation and the elapsed time is retrieved by calling `elapsed()`.
The timer can be reset if it needs to start timing later in the code execution.
Usage:
````cpp
{
nvh::Stopwatch sw;
... work ...
LOGI("Elapsed: %f ms\n", sw.elapsed()); // --> Elapsed: 128.157 ms
}
````
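The same measurement can be wrapped around a callable. The sketch below is
self-contained and uses only `std::chrono`; `measureMilliseconds` is a hypothetical
helper, not part of this header.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper, for illustration only: runs `work` once and returns the
// elapsed wall time in milliseconds, measured with the monotonic steady_clock.
template <class Function>
double measureMilliseconds(Function&& work)
{
  const auto start = std::chrono::steady_clock::now();
  std::forward<Function>(work)();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Using `std::chrono::duration<double, std::milli>` converts directly to fractional
milliseconds, which avoids the explicit `* 1000.` scaling.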
@DOC_END */
namespace nvh {
struct Stopwatch
{
Stopwatch() { reset(); }
void reset() { startTime = std::chrono::steady_clock::now(); }
double elapsed()
{
return std::chrono::duration<double>(std::chrono::steady_clock::now() - startTime).count() * 1000.;
}
std::chrono::time_point<std::chrono::steady_clock> startTime;
};
// Logging the time spent while alive in a scope.
// Usage: at beginning of a function:
// auto stimer = ScopedTimer("Time for doing X");
// Nesting timers is handled, but since the time is printed when it goes out of
// scope, printing anything else will break the output formatting.
struct ScopedTimer
{
ScopedTimer(const std::string& str) { init_(str); }
ScopedTimer(const char* fmt, ...)
{
std::string str(256, '\0'); // initial guess. ideally the first try fits
va_list args1, args2;
va_start(args1, fmt);
va_copy(args2, args1); // make a backup as vsnprintf may consume args1
int rc = vsnprintf(str.data(), str.size(), fmt, args1);
if(rc >= 0 && static_cast<size_t>(rc + 1) > str.size())
{
str.resize(rc + 1); // include storage for '\0'
rc = vsnprintf(str.data(), str.size(), fmt, args2);
}
va_end(args1);
va_end(args2);
assert(rc >= 0 && "vsnprintf error");
str.resize(rc >= 0 ? static_cast<size_t>(rc) : 0);
init_(str);
}
void init_(const std::string& str)
{
// If nesting timers, break the newline of the previous one
if(s_openNewline)
{
assert(s_nesting > 0);
LOGI("\n");
}
m_manualIndent = !str.empty() && (str[0] == ' ' || str[0] == '-' || str[0] == '|');
// Add indentation automatically if not already in str.
if(s_nesting > 0 && !m_manualIndent)
{
LOGI("%s", indent().c_str());
}
LOGI("%s", str.c_str());
s_openNewline = str.empty() || str[str.size() - 1] != '\n';
++s_nesting;
}
~ScopedTimer()
{
--s_nesting;
// If nesting timers and this is the second destructor in a row, indent and
// print "Total" as it won't be on the same line.
if(!s_openNewline && !m_manualIndent)
{
LOGI("%s|", indent().c_str());
}
else
{
LOGI(" ");
}
LOGI("-> %.3f ms\n", m_stopwatch.elapsed());
s_openNewline = false;
}
static std::string indent()
{
std::string result(static_cast<size_t>(s_nesting * 2), ' ');
for(int i = 0; i < s_nesting * 2; i += 2)
result[i] = '|';
return result;
}
nvh::Stopwatch m_stopwatch;
bool m_manualIndent = false;
static inline thread_local int s_nesting = 0;
static inline thread_local bool s_openNewline = false;
};
} // namespace nvh

@ -0,0 +1,553 @@
/*
* Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-FileCopyrightText: Copyright (c) 2019-2021 NVIDIA CORPORATION
* SPDX-License-Identifier: Apache-2.0
*/
#pragma once
#include <algorithm>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h> // for malloc, realloc and free
#include <string.h> // for memcpy and memmove
#include <NvFoundation.h> // for NV_X86 and NV_X64
#if(defined(NV_X86) || defined(NV_X64)) && defined(_MSC_VER)
#include <intrin.h>
#endif
namespace nvh {
/** @DOC_START
# class nvh::TRangeAllocator
The nvh::TRangeAllocator<GRANULARITY> template sub-allocates ranges from a fixed
maximum size. Ranges are allocated in multiples of GRANULARITY and are merged back on freeing.
Its primary use is within allocators that sub-allocate from fixed-size blocks.
The implementation is based on [MakeID by Emil Persson](http://www.humus.name/3D/MakeID.h).
Example :
```cpp
TRangeAllocator<256> range;
// initialize to a certain range
range.init(range.alignedSize(128 * 1024 * 1024));
...
// allocate a sub range
// example
uint32_t size = vertexBufferSize;
uint32_t alignment = vertexAlignment;
uint32_t allocOffset;
uint32_t allocSize;
uint32_t alignedOffset;
if (range.subAllocate(size, alignment, allocOffset, alignedOffset, allocSize)) {
... use the allocation space
// [alignedOffset, alignedOffset + size) is guaranteed to lie within [allocOffset, allocOffset + allocSize)
}
// give back the memory range for re-use
range.subFree(allocOffset, allocSize);
...
// at the end cleanup
range.deinit();
```
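The rounding arithmetic behind `alignedSize` and the aligned offset can be tried out
in isolation. The free functions below are hypothetical (they are not part of this
header) and only mirror the expressions used by the class.

```cpp
#include <stdint.h>

// Hypothetical free functions, for illustration only.

// Rounds size up to the next multiple of granularity.
// granularity must be a power of two, because the rounding uses a bit mask.
constexpr uint32_t roundUpToGranularity(uint32_t size, uint32_t granularity)
{
  return (size + granularity - 1) & ~(granularity - 1);
}

// Rounds offset up to the next multiple of align.
// align may be any non-zero value, because the rounding uses division.
constexpr uint32_t alignUp(uint32_t offset, uint32_t align)
{
  return ((offset + align - 1) / align) * align;
}
```

With a granularity of 256, a size of 257 rounds up to 512, and an offset of 256
aligned to 192 becomes 384. The difference between the aligned offset and the page
offset is what `subAllocate` may hand back as whole pages at the front.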
@DOC_END */
// GRANULARITY must be power of two
template <uint32_t GRANULARITY = 256>
class TRangeAllocator
{
private:
uint32_t m_size;
uint32_t m_used;
public:
TRangeAllocator()
: m_size(0)
, m_used(0)
{
}
TRangeAllocator(uint32_t size) { init(size); }
~TRangeAllocator() { deinit(); }
static uint32_t alignedSize(uint32_t size) { return (size + GRANULARITY - 1) & (~(GRANULARITY - 1)); }
void init(uint32_t size)
{
assert(size % GRANULARITY == 0 && "managed total size must be aligned to GRANULARITY");
uint32_t pages = ((size + GRANULARITY - 1) / GRANULARITY);
rangeInit(pages - 1);
m_used = 0;
m_size = size;
}
void deinit() { rangeDeinit(); }
bool isEmpty() const { return m_used == 0; }
bool isAvailable(uint32_t size, uint32_t align) const
{
uint32_t alignRest = align - 1;
uint32_t sizeReserved = size;
if(m_used >= m_size)
{
return false;
}
if(m_used != 0 && align > GRANULARITY)
{
sizeReserved += alignRest;
}
uint32_t countReserved = (sizeReserved + GRANULARITY - 1) / GRANULARITY;
return isRangeAvailable(countReserved);
}
bool subAllocate(uint32_t size, uint32_t align, uint32_t& outOffset, uint32_t& outAligned, uint32_t& outSize)
{
if(align == 0)
{
align = 1;
}
uint32_t alignRest = align - 1;
uint32_t sizeReserved = size;
#if(defined(NV_X86) || defined(NV_X64)) && defined(_MSC_VER)
bool alignIsPOT = __popcnt(align) == 1;
#else
bool alignIsPOT = __builtin_popcount(align) == 1;
#endif
if(m_used >= m_size)
{
outSize = 0;
outOffset = 0;
outAligned = 0;
return false;
}
if(m_used != 0 && (alignIsPOT ? (align > GRANULARITY) : ((alignRest + size) > GRANULARITY)))
{
sizeReserved += alignRest;
}
uint32_t countReserved = (sizeReserved + GRANULARITY - 1) / GRANULARITY;
uint32_t startID;
if(createRangeID(startID, countReserved))
{
outOffset = startID * GRANULARITY;
outAligned = ((outOffset + alignRest) / align) * align;
// due to custom alignment, we may be able to give
// pages back that we over-allocated
//
// reserved: [ | | | ] (GRANULARITY spacing)
// used: [ ] (custom alignment/size)
// corrected: [ | ] (GRANULARITY spacing)
// correct start (warning could yield more fragmentation)
uint32_t skipFront = (outAligned - outOffset) / GRANULARITY;
if(skipFront)
{
destroyRangeID(startID, skipFront);
outOffset += skipFront * GRANULARITY;
startID += skipFront;
countReserved -= skipFront;
}
assert(outOffset <= outAligned);
// correct end
uint32_t outLast = alignedSize(outAligned + size);
outSize = outLast - outOffset;
uint32_t usedCount = outSize / GRANULARITY;
assert(usedCount <= countReserved);
if(usedCount < countReserved)
{
destroyRangeID(startID + usedCount, countReserved - usedCount);
}
assert((outAligned + size) <= (outOffset + outSize));
m_used += outSize;
//checkRanges();
return true;
}
else
{
outSize = 0;
outOffset = 0;
outAligned = 0;
return false;
}
}
void subFree(uint32_t offset, uint32_t size)
{
assert(offset % GRANULARITY == 0);
assert(size % GRANULARITY == 0);
m_used -= size;
destroyRangeID(offset / GRANULARITY, size / GRANULARITY);
//checkRanges();
}
TRangeAllocator& operator=(const TRangeAllocator& other)
{
if(this == &other)
{
return *this;
}
// Release the ranges currently owned before taking a copy of other's.
rangeDeinit();
m_size = other.m_size;
m_used = other.m_used;
m_Count = other.m_Count;
m_Capacity = other.m_Capacity;
m_MaxID = other.m_MaxID;
if(other.m_Ranges)
{
m_Ranges = static_cast<Range*>(::malloc(m_Capacity * sizeof(Range)));
assert(m_Ranges); // Make sure allocation succeeded
memcpy(m_Ranges, other.m_Ranges, m_Capacity * sizeof(Range));
}
return *this;
}
TRangeAllocator(const TRangeAllocator& other)
{
m_size = other.m_size;
m_used = other.m_used;
m_Ranges = other.m_Ranges;
m_Count = other.m_Count;
m_Capacity = other.m_Capacity;
m_MaxID = other.m_MaxID;
if(m_Ranges)
{
m_Ranges = static_cast<Range*>(::malloc(m_Capacity * sizeof(Range)));
assert(m_Ranges); // Make sure allocation succeeded
memcpy(m_Ranges, other.m_Ranges, m_Capacity * sizeof(Range));
}
}
TRangeAllocator& operator=(TRangeAllocator&& other)
{
if(this == &other)
{
return *this;
}
// Release the ranges currently owned before taking over other's.
rangeDeinit();
m_size = other.m_size;
m_used = other.m_used;
m_Ranges = other.m_Ranges;
m_Count = other.m_Count;
m_Capacity = other.m_Capacity;
m_MaxID = other.m_MaxID;
other.m_Ranges = nullptr;
return *this;
}
TRangeAllocator(TRangeAllocator&& other)
{
m_size = other.m_size;
m_used = other.m_used;
m_Ranges = other.m_Ranges;
m_Count = other.m_Count;
m_Capacity = other.m_Capacity;
m_MaxID = other.m_MaxID;
other.m_Ranges = nullptr;
}
private:
//////////////////////////////////////////////////////////////////////////
// most of the following code is taken from Emil Persson's MakeID
// http://www.humus.name/3D/MakeID.h (v1.02)
struct Range
{
uint32_t m_First;
uint32_t m_Last;
};
Range* m_Ranges = nullptr; // Sorted array of ranges of free IDs
uint32_t m_Count = 0; // Number of ranges in list
uint32_t m_Capacity = 0; // Total capacity of range list
uint32_t m_MaxID = 0;
public:
void rangeInit(const uint32_t max_id)
{
// Start with a single range, from 0 to max allowed ID (specified)
m_Ranges = static_cast<Range*>(::malloc(sizeof(Range)));
assert(m_Ranges != nullptr); // Make sure allocation succeeded
m_Ranges[0].m_First = 0;
m_Ranges[0].m_Last = max_id;
m_Count = 1;
m_Capacity = 1;
m_MaxID = max_id;
}
void rangeDeinit()
{
if(m_Ranges)
{
::free(m_Ranges);
m_Ranges = nullptr;
}
}
bool createID(uint32_t& id)
{
if(m_Ranges[0].m_First <= m_Ranges[0].m_Last)
{
id = m_Ranges[0].m_First;
// If current range is full and there is another one, that will become the new current range
if(m_Ranges[0].m_First == m_Ranges[0].m_Last && m_Count > 1)
{
destroyRange(0);
}
else
{
++m_Ranges[0].m_First;
}
return true;
}
// No available ID left
return false;
}
bool createRangeID(uint32_t& id, const uint32_t count)
{
uint32_t i = 0;
do
{
const uint32_t range_count = 1 + m_Ranges[i].m_Last - m_Ranges[i].m_First;
if(count <= range_count)
{
id = m_Ranges[i].m_First;
// If current range is full and there is another one, that will become the new current range
if(count == range_count && i + 1 < m_Count)
{
destroyRange(i);
}
else
{
m_Ranges[i].m_First += count;
}
return true;
}
++i;
} while(i < m_Count);
// No range of free IDs was large enough to create the requested continuous ID sequence
return false;
}
bool destroyID(const uint32_t id) { return destroyRangeID(id, 1); }
bool destroyRangeID(const uint32_t id, const uint32_t count)
{
const uint32_t end_id = id + count;
assert(end_id <= m_MaxID + 1);
// Binary search of the range list
uint32_t i0 = 0;
uint32_t i1 = m_Count - 1;
for(;;)
{
const uint32_t i = (i0 + i1) / 2;
if(id < m_Ranges[i].m_First)
{
// Before current range, check if neighboring
if(end_id >= m_Ranges[i].m_First)
{
if(end_id != m_Ranges[i].m_First)
return false; // Overlaps a range of free IDs, thus (at least partially) invalid IDs
// Neighbor id, check if neighboring previous range too
if(i > i0 && id - 1 == m_Ranges[i - 1].m_Last)
{
// Merge with previous range
m_Ranges[i - 1].m_Last = m_Ranges[i].m_Last;
destroyRange(i);
}
else
{
// Just grow range
m_Ranges[i].m_First = id;
}
return true;
}
else
{
// Non-neighbor id
if(i != i0)
{
// Cull upper half of list
i1 = i - 1;
}
else
{
// Found our position in the list, insert the deleted range here
insertRange(i);
m_Ranges[i].m_First = id;
m_Ranges[i].m_Last = end_id - 1;
return true;
}
}
}
else if(id > m_Ranges[i].m_Last)
{
// After current range, check if neighboring
if(id - 1 == m_Ranges[i].m_Last)
{
// Neighbor id, check if neighboring next range too
if(i < i1 && end_id == m_Ranges[i + 1].m_First)
{
// Merge with next range
m_Ranges[i].m_Last = m_Ranges[i + 1].m_Last;
destroyRange(i + 1);
}
else
{
// Just grow range
m_Ranges[i].m_Last += count;
}
return true;
}
else
{
// Non-neighbor id
if(i != i1)
{
// Cull bottom half of list
i0 = i + 1;
}
else
{
// Found our position in the list, insert the deleted range here
insertRange(i + 1);
m_Ranges[i + 1].m_First = id;
m_Ranges[i + 1].m_Last = end_id - 1;
return true;
}
}
}
else
{
// Inside a free block, not a valid ID
return false;
}
}
}
bool isRangeAvailable(uint32_t searchCount) const
{
uint32_t i = 0;
do
{
uint32_t count = m_Ranges[i].m_Last - m_Ranges[i].m_First + 1;
if(count >= searchCount)
return true;
++i;
} while(i < m_Count);
return false;
}
void printRanges() const
{
uint32_t i = 0;
for(;;)
{
if(m_Ranges[i].m_First < m_Ranges[i].m_Last)
printf("%u-%u", m_Ranges[i].m_First, m_Ranges[i].m_Last);
else if(m_Ranges[i].m_First == m_Ranges[i].m_Last)
printf("%u", m_Ranges[i].m_First);
else
printf("-");
++i;
if(i >= m_Count)
{
printf("\n");
return;
}
printf(", ");
}
}
void checkRanges() const
{
for(uint32_t i = 0; i < m_Count; i++)
{
assert(m_Ranges[i].m_Last <= m_MaxID);
if(m_Ranges[i].m_First == m_Ranges[i].m_Last + 1)
{
continue;
}
assert(m_Ranges[i].m_First <= m_Ranges[i].m_Last);
assert(m_Ranges[i].m_First <= m_MaxID);
}
}
void insertRange(const uint32_t index)
{
if(m_Count >= m_Capacity)
{
m_Capacity += m_Capacity;
m_Ranges = static_cast<Range*>(::realloc(m_Ranges, m_Capacity * sizeof(Range)));
assert(m_Ranges); // Make sure reallocation succeeded
}
::memmove(m_Ranges + index + 1, m_Ranges + index, (m_Count - index) * sizeof(Range));
++m_Count;
}
void destroyRange(const uint32_t index)
{
--m_Count;
::memmove(m_Ranges + index, m_Ranges + index + 1, (m_Count - index) * sizeof(Range));
}
};
} // namespace nvh