SVO ray marcher devblog

Project specification

Voxel engines are used in 3D rendering and simulations for various different applications such as games or digital environments. These engines use volumetric data to represent a 3D space where each volumetric pixel (voxel) corresponds to a cubic chunk of space.

The challenge with traditional cartesian voxel grids is that they tend to use and access a lot of memory that it technically doesn’t need, making it inefficient. This issue is especially visible for environments that are sparse or where big areas are empty.

One way to tackle this is to use a more efficient spatial data structure, called a Sparse Voxel Octree (SVO). A normal octree works by dividing space into 8 different child octants. These octants can then contain 8 new child octants themselves. This continues on until a certain level of detail. For SVOs, all octants don't have to contain 8 new child octants, so for more sparse environments there will be less voxels than if all octants would have to go to a certain depth. This enables very fast hierarchical travel through nodes and drastically reduces memory usage.

My project aims to implement a basic ray-marcher that uses an SVO as its data structure for rendering voxel scenes. This implementation will then be evaluated on its performance and image quality. This will be done by running the implementation under varyingly complex environments and looking at things like frames per second, amount of voxels compared to a cartesian grid and memory usage. If there is time I could also compare this performance to a more naive cartesian grid-based approach which would give an insight into how much more effective it actually is.

Defining the data structure

To begin the development of the ray marcher I first had to define the Sparse Voxel Octree data structure that it will be traversing. To do this I both researched a few different papers on the topic to get a grasp of what an Octree is, how it should be defined and also how it is used. I found it especially hard to understand how to define the octree in such a way that it represents the 3D volume in a way where it would then be easy to traverse using a ray marcher. Furthermore, the concept of the voxel octree being sparse was an additional challenge. During the first period of development I completely misunderstood what sparse actually entailed which led to me not getting how to actually ray march through the octree. I spent a lot of time sketching and making sure I understood how it would work before I actually started developing it.

When I felt comfortable I understood how this would work I started defining the structs that would hold the data as well as a SparseVoxelOctree class. In this class I also outlined the functions that would be needed to create and traverse the Octree. This included inserting a node, fetching a node at a certain position and the function for actually ray marching until a collision is found or the bounding box is exited. Since this is a ray marcher I decided to base the code on the code that we had already developed in the ray tracing lab of this course.

Simply explained, the data structure is defined as a 8-ary tree (binary, but with 8 branches) where the children of a node are contained in octants within the parent node. So the definition of the data structure is kind of simple and the more complicated parts arise when you have to actually march through this data structure.

Ray marching algorithm

To ray trace this data structure I decided to implement a dynamic step size ray marcher. What this means is that instead of calculating if a line intersects a triangle you shoot a ray in a certain direction and take steps until the ray collides with something. This can be done either with a fixed step size, or with a dynamic one where you jump different lengths depending on the environment. For an SVO ray marcher this is especially useful since you can jump further distances if you know the data is sparse, which the data structure allows for.

To begin the ray marching algorithm we first have to find what node holds the starting position of the camera. Therefore I began by defining the function for getting a node at a certain position. I found this youtube video that explains how you can very easily know what child index to traverse down by seeing each octant as a cube of size 1. Thus we get coordinates at each of the corners that consist of zeroes and ones (hint hint). By seeing this position as a binary number instead of a vector we can get the child index. Codewise this can be done quite simply by defining each bit as a boolean representing each axis, that says whether the position is higher than the center of the parent node. This essentially simulates the same logic of having a cube of size 1 but generalizes it for all node positions. I then implemented the below code on this logic.

										float nodeSize = svoSize / (float)(1 << depth);
vec3 center = offset + vec3(nodeSize, nodeSize, nodeSize) * 0.5f;

ivec3 childPos;
childPos.x = pos.x >= center.x;
childPos.y = pos.y >= center.y;
childPos.z = pos.z >= center.z;
int childIndex = (childPos.x << 0) | (childPos.y << 1) | (childPos.z << 2);

First we just make sure that the node we are in is not a leaf, but if not we now know what node we are in, and can begin actually marching the ray forward. Firstly we have to calculate how far to step to reach into the next voxel boundary. To do this we snap the position in each axis to the closest voxel edge and save this as gridPos. This means that we either round the position up or down to the nearest grid, depending on if the direction of the ray is positive or negative for each axis. By elementwise subtracting our original position from this gridposition, and then elementwise dividing this new vector by the direction vector from the ray, we get a new vector that contains how many steps of d to take in each axis to reach that axis boundary. By taking the minimum value of the x, y and z of this vector we get the shortest step that will make us reach a new voxel.

Here I encountered quite a few issues with inaccuracies and artifacts, some of which I still haven’t solved that cause the engine to look a bit strange if you look closely.
Face bias issues
However, I was able to implement some fixes that got rid of some of these inaccuracies. The biggest fix was probably the face bias. By adding a small factor that stepped a bit backwards in comparison to the normal of the face of the octant you are passing through, it made sure the algorithm didn’t land on the boundary between two octants. This avoided some major artifacts that were very prevalent otherwise as can be seen in the image below.
Face bias issues
I also added some epsilons and safety checks to different operations, to make sure that very very small numbers passed as zeroes for equals operations and that the code never divided by numbers very close to zeroes. This also helped fix some artifacts that I was not able to reproduce while writing this blog however.

Lighting and coloring

Since I mostly wanted to compare the performance difference between an SVO ray marcher and a cartesian grid ray marcher I didn’t put too much effort into the lighting and coloring of the engine. Instead I implemented quite a simple lighting model called Lambertian shading that we also used in the course. By using the normal that I calculated for the face bias previously, I could compare how much the normal vector aligned with a vector from the face of the voxel to the light source. This is of course done using the dot product between these two vectors, and is essentially a rough representation of how aligned the face is towards the light source. This makes the material of the object look very matte and the apparent brightness of the surface is the same regardless of the angle of view. This diffuse shading value is then multiplied by the color of the voxel to get the final color of the pixel.

Lambertian shading explanation

GPU fail

Originally this project was aimed to first be developed using the library SDL2 and then when I fully understood how it worked I was going to flatten the data structure for the SVO and send it into a fragment shader for GPU rendering. However, this turned out to be much more difficult than what I anticipated. I learned how OpenGL worked and developed some small things with it to get an understanding of how it worked. I then transferred the C++ code for the SVO into this OpenGL environment I created. Here I structured the code a bit better for the planned OpenGL implementation of the ray marching shader. Since you can’t send more complex data structures like structs into a fragment shader through the buffer I also had to flatten the data structure. This meant coming up with a way to store the needed data for the nodes in an integer of x amount of bits. I decided to store it like this:

8 bits: Color
8 bits: Child mask (Each bit represents if that child out of 8 exists)
1 bit: isLeaf (boolean)
24 bits: firstChildIndex (The index of this nodes first child in the flat array of ints)

Since this is 41 bits I couldn’t send it using a 32 bit int and so I sent it as a 64 bit uint with this structure:
(Color) Bits 63 … 56 │(Child mask) Bits 55 … 48│(isLeaf) Bit 47│(firstChildIndex) Bits 46 … 0

In the shader I then turned these back into structs and tried implementing the functions for the ray marching algorithm again but wasn’t able to. Both due to the difficulty of debugging shader files, and time constraints and so I eventually decided to just evaluate the C++ implementation as it still worked as a fair comparison of an SVO.

Results

Here are some gifs showing of the result of this project in a few different terrains!
Flying through perlin terrain Flying through ball terrain Flying through detailed perlin terrain