Raydiance - Blog


Adding sheen

2023-04-28

This release implements the Disney sheen model, which adds two new material parameters: sheen and sheen tint. The model adds a new diffuse lobe that attempts to model clothing and textile materials. The effect is subtle but noticeable.

Below we can see the effect of interpolating sheen and sheen tint from 0 to 1.

Disney defines the sheen lobe as follows.
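Our reconstruction of the definition from the Disney course notes (the notation is ours; θ_d is the angle between the light direction and the half vector):

```latex
f_{\text{sheen}} = \mathrm{sheen} \cdot C_{\text{sheen}} \cdot (1 - \cos\theta_d)^5,
\qquad
C_{\text{sheen}} = (1 - \mathrm{sheenTint}) + \mathrm{sheenTint} \cdot C_{\text{tint}}
```

Here C_tint is the base color normalized to unit luminance, so the tint keeps the hue and saturation of the base color while discarding its brightness.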

We also added the new parameters to our debug visualization system.

(Videos: interpolating the incoming direction; the effect of sheen tint.)

Modeling cloth geometry

We made the cloth model in Blender with these steps:

  1. Create a subdivided, one-sided plane surface.
  2. Apply cloth simulation with self-collision turned on.
  3. Freeze the simulation into a static mesh.
  4. Apply smooth shading.
  5. Apply the Solidify modifier to turn the open cloth surface into a closed mesh with non-zero thickness. We need this because Raydiance cannot render thin, double-sided surfaces.

Commit: 43a57981


Tweenable materials, Disney model bug fix

2023-02-11

Expanding on the skylight model post, we have also made all material properties in a scene tweenable with keyframes. The title video animates multiple material properties and sky model properties simultaneously.

In each video below, we interpolate one material property across its range while keeping the other parameters fixed. In this release, we also exposed two material parameters that were previously hardcoded.

Disney model bug fix

We fixed how one of the parameters worked in our Disney BRDF implementation. Previously, sliding the parameter between 0 and 1 barely changed the material's appearance, even though in the Disney paper the amount of reflection changes significantly across that range. The problem was in how we mixed the diffuse and specular BRDFs: the blend only accounted for a single parameter. Unsurprisingly, this bug meant that the other parameter had almost no influence on the material's appearance. After the fix, the mix accounts for both parameters and our implementation looks noticeably better.

Additionally, we added the specular tint parameter, which tints incident specular towards the base color. It is intended for additional artistic control.

Commit: 66ca414c


Releasing skylight model as a crate

2023-02-11

We have released the skylight model from the previous post as a standalone crate named hw-skymodel. The implementation is almost identical, except that the new() function returns a Result instead of panicking with assert!.

Publishing to crates.io was straightforward; we had to get an API key and fill in the required fields in the Cargo.toml file, as described in this article.

Commit: 3069b84b


New skylight model, tone mapping, new visualizations

2023-02-09

This release implements a new skylight model, tone mapping, and a simple keyframe animation system. This release also unifies all visualization systems under a single set of tools.

Hosek-Wilkie skylight model

Our previous lighting was a straightforward, ad-hoc directional light.

In this release, we replaced it with the model from An Analytic Model for Full Spectral Sky-Dome Radiance by Lukas Hosek and Alexander Wilkie. The paper proposes a physically based analytical model for dynamic daytime sky domes. We can now animate the sun's position in real time and have the model produce realistic sunsets.

The authors wrote a brute-force path tracing simulation to gather reference data on atmospheric scattering and fit their analytical model to this data.
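Our transcription of the model's sky radiance formula (consult the paper for the authoritative definition; A through I are the nine fitted parameters per color channel, θ is the view angle measured from the zenith, and γ is the angle between the view direction and the sun):

```latex
F(\theta, \gamma) =
  \left(1 + A\, e^{B / (\cos\theta + 0.01)}\right)
  \left(C + D\, e^{E\gamma} + F \cos^2\gamma + G\, \chi(H, \gamma) + I \sqrt{\cos\theta}\right),
\qquad
\chi(H, \alpha) = \frac{1 + \cos^2\alpha}{\left(1 + H^2 - 2H\cos\alpha\right)^{3/2}}
```

The final sky radiance is F(θ, γ) scaled by a per-channel expected radiance value; both come from the datasets described below.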

To evaluate the model, we query the model parameters and radiances from the datasets provided by the authors. Since our renderer only supports RGB, we use their RGB datasets.

The sky model runs fast enough to be interactive. Here's a short real-time demonstration:

Implementation details

The original implementation is roughly 1000 lines of ANSI C. We could have linked it into our Rust project, but most of the code was unnecessary for our use case. Instead, we re-implemented the original C version in Rust with the following changes:

  • We removed the spectral and CIE XYZ data sets since our renderer only supports RGB.
  • We removed the solar radiance functionality since its API only supported spectral radiances. However, having a solar disc increases realism, so we might have to revisit this later.
  • We removed the "Alien World" functionality since we are only concerned with terrestrial skies.
  • We switched from double-precision to single-precision computations and storage. Reduced precision did not seem to have any visual impact and performed better than doubles.
  • The model state size went down from 1088 bytes to only 120 bytes.

With these changes, our implementation ended up being just ~200 LOC of Rust.

Simple exposure control

With the new skylight model, we now have a problem: the scene is too bright. We can scale the brightness down by a constant exposure factor.

We apply this scaling to every pixel as the final step of the render.

Tone mapping

(Comparison images: dim scene vs. bright scene.)

We can make the image "pop" more with tone mapping, which is a technique for mapping high-dynamic range (HDR) images into low-dynamic range (LDR) to be shown on a display device. Our implementation uses the "ACES Filmic Tone Mapping Curve," which is currently the default tone mapping curve in Unreal Engine. More specifically, we used Krzysztof Narkowicz's simple curve fit, which seemed simple, fast enough, and visually pleasing.
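As a rough illustration of the whole post-processing step, here is a sketch that combines the exposure scaling with Narkowicz's published curve fit; the function names are ours, and the per-channel application is an assumption rather than Raydiance's exact code:

```rust
/// Krzysztof Narkowicz's ACES filmic curve fit, applied per channel.
fn aces_filmic(x: f32) -> f32 {
    let (a, b, c, d, e) = (2.51, 0.03, 2.43, 0.59, 0.14);
    ((x * (a * x + b)) / (x * (c * x + d) + e)).clamp(0.0, 1.0)
}

/// Final per-pixel step: scale by a constant exposure, then tone map to LDR.
fn post_process(hdr: [f32; 3], exposure: f32) -> [f32; 3] {
    [
        aces_filmic(hdr[0] * exposure),
        aces_filmic(hdr[1] * exposure),
        aces_filmic(hdr[2] * exposure),
    ]
}
```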

(Plot: output color versus input color on a logarithmic input axis from 0.001 to 10.0, comparing the linear and ACES curves.)

We can see in the log plot that the ACES curve boosts mid-range colors and then gracefully dims the brightest colors, whereas the linear curve simply clips once the input exceeds 1.0.

New sky dome visualizations

(Videos: interpolating the sun elevation; interpolating the turbidity.)

Building on BRDF visualizations from the previous posts, we can also visualize the sky dome under different parameter combinations.

Offline rendering improvements

The offline rendering mode from the previous post gained some improvements:

  • Parameters now support simple keyframe animations. Instead of hard-coding animations like the swinging camera in previous posts, users can now define keyframes for all sky model parameters. We will continue to expand the set of animatable parameters in the future, such as object transforms and material properties.
  • Offline renders can now have the same text annotations as the BRDF and the new skydome visualizations.

We also reviewed all our existing visualization systems and combined their different capabilities under a single set of tools.

Commit: 9f73353b


Offline rendering mode

2023-02-02

The raytracer can now run "offline," which means that we never start up the Vulkan rasterizer, and the program exits after rendering finishes. This mode can generate offline rendered animations at high sample counts. The title animation contains 60 frames, each rendered at 256 samples per pixel.

Commit: 17e72dad


Multithreaded raytracer

2023-02-02

The raytracer is now multithreaded with rayon. We split the image into 16x16 pixel tiles and use into_par_iter() to render the tiles in parallel, as sketched below. On an AMD Ryzen 5950X processor, we can render the cube scene at 66 MRays/s, up from the 4.6 MRays/s of our previous single-threaded raytracer. At one sample per pixel, this would run at slightly over 60 fps. Of course, the image would be very noisy, but at least it would be interactive.
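A minimal sketch of the tiling scheme; the Tile type and the trace_tile callback are illustrative stand-ins rather than Raydiance's actual code:

```rust
use rayon::prelude::*;

/// A 16x16 tile identified by its top-left pixel coordinate (illustrative type).
struct Tile {
    x: u32,
    y: u32,
}

/// Split the image into tiles and trace them in parallel on rayon's thread pool.
fn render_tiles<F>(width: u32, height: u32, trace_tile: F) -> Vec<Vec<[f32; 3]>>
where
    F: Fn(&Tile) -> Vec<[f32; 3]> + Sync,
{
    const TILE: u32 = 16;
    let tiles: Vec<Tile> = (0..height)
        .step_by(TILE as usize)
        .flat_map(|y| (0..width).step_by(TILE as usize).map(move |x| Tile { x, y }))
        .collect();
    // into_par_iter() distributes the tiles across worker threads.
    tiles.into_par_iter().map(|tile| trace_tile(&tile)).collect()
}
```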

To retain our previous single-threaded debugging capabilities, we can set RAYON_NUM_THREADS=1 to force rayon to use only one thread.

With multithreading, there is a subtle issue with our current random number generator: we can no longer share the same RNG across all threads without locking. We can sidestep the whole problem by initializing the RNG with a unique seed at each pixel.
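A sketch of the idea, assuming a rand_pcg generator; the seeding scheme and parameter names are illustrative, not necessarily what Raydiance does:

```rust
use rand::SeedableRng;
use rand_pcg::Pcg64Mcg;

/// Build an independent RNG for one (pixel, sample) pair.
fn pixel_rng(pixel_index: u64, sample_index: u64, pixel_count: u64) -> Pcg64Mcg {
    // Offsetting by sample_index * pixel_count guarantees a unique seed
    // for every pixel at every sample.
    let seed = pixel_index + sample_index * pixel_count;
    Pcg64Mcg::seed_from_u64(seed)
}
```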

Deriving the seed from both the pixel index and the sample index ensures that every sample gets a different random sequence. With this strategy, we assume that the cost of creating an RNG is negligible compared to the rest of the raytracer, which holds true for rand_pcg.

Commit: d129b260


A new shiny specular BRDF

2023-01-31

A very brief overview of microfacet models

A popular way to model physically based reflective surfaces is to imagine that such surfaces are built of small perfect mirrors called microfacets. Conceptually, each facet has a random height and orientation. The randomness of these properties determines the roughness of the microsurface. These microfacets are so tiny that they can be modeled with functions, as opposed to being modeled with geometry or with normal maps, for example.

Let's define some common terms first:

The popular Cook-Torrance model is defined as follows.
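Our transcription in a common notation, where ω_i and ω_o are the incoming and outgoing directions, ω_h is the half vector, and n is the surface normal:

```latex
f_{\text{specular}}(\omega_i, \omega_o) =
  \frac{D(\omega_h)\, F(\omega_i, \omega_h)\, G(\omega_i, \omega_o)}
       {4\, (n \cdot \omega_i)\, (n \cdot \omega_o)}
```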

D is the normal distribution function, or NDF. It describes how the microfacet normals are distributed relative to the macrosurface normal. The Disney model uses the widely popular GGX distribution, so that is what we are going to use as well.

F is the Fresnel term, which describes how much light is reflected from a microfacet. We use the same Schlick approximation as we did with the Disney diffuse BRDF.

G is the masking-shadowing function. It describes the fraction of microfacets that are masked or shadowed when viewed from a given pair of incoming and outgoing directions. We implemented Smith's height-correlated masking-shadowing function from the excellent paper by Eric Heitz, Understanding the Masking-Shadowing Function in Microfacet-Based BRDFs.

Microfacet models in practice

What happens if we use a wrong coordinate system

Translating math into source code has a couple of gotchas we need to be aware of:

  • Different papers have different naming conventions (incoming vs. light, outgoing vs. view), and different coordinate systems (z is up vs. y is up), which can quickly get confusing if we are not being consistent.
  • Floating point inaccuracies can make various terms blow up to infinity or become NaN. For example, if the incoming or outgoing ray is close to perpendicular to the surface normal, the cosine of its angle relative to the normal approaches zero, and any expression divided by that value explodes. The program won't crash, but the image will slowly become corrupted with black or white pixels. We must take extra care to clamp such values to a small positive number to avoid dividing by zero.
  • Sometimes the sampled vector appears below the hemisphere. In these cases, we discard the whole sample because those samples have zero reflectance.

We also use the trick from pbrt of performing all BRDF calculations in a local space where the surface normal points along the positive y-axis. In this local space, many computations simplify a lot; for example, the dot product between a vector and the surface normal is simply the y-component of the vector. We can use the same orthonormal basis from the previous posts to go from world to local space, and once we are done with the BRDF math, we transform the results back to world space.
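A minimal sketch of that local shading space, assuming nalgebra vectors; the type and function names are illustrative rather than Raydiance's actual code:

```rust
use nalgebra::Vector3;

/// Orthonormal basis in which the surface normal maps to the local +y axis.
struct OrthonormalBasis {
    tangent: Vector3<f32>,
    normal: Vector3<f32>,
    bitangent: Vector3<f32>,
}

impl OrthonormalBasis {
    fn new(normal: Vector3<f32>) -> Self {
        // Pick a helper axis that is not parallel to the normal.
        let helper = if normal.x.abs() > 0.9 { Vector3::y() } else { Vector3::x() };
        let tangent = normal.cross(&helper).normalize();
        let bitangent = normal.cross(&tangent);
        Self { tangent, normal, bitangent }
    }

    /// World -> local. In local space the normal is (0, 1, 0), so the cosine
    /// against the normal is simply the y-component of the transformed vector.
    fn world_to_local(&self, v: &Vector3<f32>) -> Vector3<f32> {
        Vector3::new(v.dot(&self.tangent), v.dot(&self.normal), v.dot(&self.bitangent))
    }

    /// Local -> world, used once the BRDF math is done.
    fn local_to_world(&self, v: &Vector3<f32>) -> Vector3<f32> {
        v.x * self.tangent + v.y * self.normal + v.z * self.bitangent
    }
}
```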

Integrating microfacets with Disney diffuse BRDF

The new specular BRDF introduced three new parameters to our material:

  • metallic is a linear blend between the dielectric and metallic models. In the metallic model, the specular color is derived from the base color.
  • specular replaces the explicit index of refraction. It is currently fixed to a default value because we don't have a way to get it from GLB yet.
  • anisotropic defines the degree of anisotropy and controls the aspect ratio of the specular highlight. It's currently disabled because our model does not have tangents.

The Disney paper states that their model allows artists to blend between any two parameter combinations and get good results. In the example, we interpolate the metallic parameter across its range.

We now have an interesting problem: choosing which BRDF to sample from. The Disney paper doesn't describe a method for this, so in our implementation we draw a new random variable that selects between the diffuse and specular BRDFs based on the metallic parameter, as sketched below. For example, if the metallic value is 0.5, both BRDFs are equally likely to be chosen.
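A sketch of the selection, assuming a rand-style RNG; the names are illustrative:

```rust
use rand::Rng;

enum Lobe {
    Diffuse,
    Specular,
}

/// Pick which BRDF lobe to sample based on the metallic parameter:
/// 0.0 always picks diffuse, 1.0 always picks specular, and 0.5
/// picks either with equal probability.
fn select_lobe(metallic: f32, rng: &mut impl Rng) -> Lobe {
    if rng.gen::<f32>() < metallic {
        Lobe::Specular
    } else {
        Lobe::Diffuse
    }
}
```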

Animated BRDF visualizations

(Animations: with a fixed incoming direction, the roughness interpolates across its range; with a fixed roughness, the incoming direction interpolates along the x-axis.)

We dramatically improved the capabilities of the sample placement visualizer from the previous post. The visualizations are now animated and can render different text for each frame, and reflectance is directly visualized separately from probability density functions.

The animations are encoded in APNG format. We chose APNG because:

  • GIF quality is too low due to its limited 256-color palette
  • The WebP crate takes a long time to build, and WebP has slightly worse support than APNG
  • Traditional video formats are not as convenient for short looping animations

We used a handful of crates to create the animations.

A simple interactive material editor

Having to recompile the program or export a new scene from Blender every time we needed to change a roughness or metallic value quickly became a significant bottleneck. Since our raytracer already renders progressively, we could quickly implement simple material edits and have it re-render the image after each change.

We will rewrite this utility after a more extensive user interface overhaul.

Visualizing normals

(Comparison images: raytraced vs. rasterized normals.)

While hunting for bugs in our specular BRDFs, we added a simple way to visualize shading normals in raytraced and rasterized scenes. We will add visualizations for texture coordinates and tangents in the future.

Commit: bf578f68


Visualizing sample placement

2023-01-19

Importance sampling Disney specular models is more challenging than our current diffuse models. To improve our chances, we created a small tool that visualizes where the samples are placed around the hemisphere.

We used our existing uniform and cosine-weighted hemisphere samplers to ensure the tool worked. We also added two additional sample sequences for comparison:

  1. grid sequence, which uniformly samples the unit square.
  2. sobol sequence, which is provided by the sobol_burley crate. The implementation is based on Practical Hash-based Owen Scrambling.

In terms of coordinate spaces, the top plots view the hemisphere from above in Cartesian space, while the side plots view it from the side in hemispherical space.

For both sets of plots, the background brightness corresponds to the magnitude of the function being visualized.

Looking at the plots, we can intuitively say that cosine performs better than uniform sampling because it places samples closer to the bright spots. Similarly, sobol performs better than random and grid.

Note that the new sequences are not currently available for rendering. We will revisit low-discrepancy sequences later.

Commit: f5b80674


Implementing Disney BRDF - Diffuse model

2023-01-19

Disney principled BRDF is a popular physically based reflectance model developed by Brent Burley et al. at Disney. It is adopted by Blender, Unreal Engine, Frostbite, and many other productions. The team at Disney analyzed the MERL BRDF Database and fit their model based on MERL's empirical measurements. Their goal was to create an artist-friendly model with as few parameters as possible. These are the design principles from the course notes:

  1. Intuitive rather than physical parameters should be used.
  2. There should be as few parameters as possible.
  3. Parameters should be zero to one over their plausible range.
  4. Parameters should be allowed to be pushed beyond their plausible range where it makes sense.
  5. All combinations of parameters should be as robust and plausible as possible.

The full Disney model combines multiple scattering models, some of which we need more experience with. To avoid getting overwhelmed, we will study and implement one model at a time, starting with the diffuse model.
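Our transcription of the Disney diffuse BRDF from the course notes, where θ_l and θ_v are the angles of the light and view directions against the surface normal and θ_d is the angle between the light direction and the half vector:

```latex
f_d = \frac{\mathrm{baseColor}}{\pi}
      \left(1 + (F_{D90} - 1)(1 - \cos\theta_l)^5\right)
      \left(1 + (F_{D90} - 1)(1 - \cos\theta_v)^5\right),
\qquad
F_{D90} = 0.5 + 2\,\mathrm{roughness}\,\cos^2\theta_d
```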

Disney's diffuse model is a novel empirical model that attempts to solve the over-darkening of the Lambertian diffuse model. This darkening happens at grazing angles, i.e., when the incoming or outgoing direction approaches 90 degrees from the surface normal. Disney models this by adding a Fresnel factor, which they approximate with Schlick's approximation.

The difference in the comparison above can be subtle. The main difference is that the cube's edges are slightly brighter compared to the Lambert model, and the right side of the cube also appears brighter.

Our implementation ignores the sheen term and the subsurface scattering approximation; we will come back to these later. Also, low roughness values look incorrect because our current implementation has no specular term.


Commit: c43f282e


Texture support

2023-01-15

Raydiance now supports texture-mapped surfaces. We used multiple shortcuts to get a basic implementation going:

  • Only nearest-neighbor filtering is supported.
  • No mipmaps.
  • No anisotropic filtering.
  • Only the R8G8B8A8_UNORM pixel format is supported.

We will revisit these shortcuts later once our scenes get more complicated.

Commit: 41a05a78


New user interface

2023-01-15

It was time to replace the window title hacks and random keybindings with a real graphical user interface. We use the excellent Dear ImGui library. Since our project is written in Rust, we use the imgui and imgui-winit-support crates to wrap the original C++ library and interface with winit.

Commit: 9fb5a380


Cosine-weighted hemisphere sampling

2023-01-13

To get a clearer picture, we could increase the number of samples, which would increase render times, forcing us to find ways to make the renderer run faster. Alternatively, we could be smarter at using our limited number of samples. This way of reducing noise in Monte Carlo simulations is called importance sampling. One of the most impactful techniques for our simple diffuse cube scene is cosine-weighted hemisphere sampling. Since the rendering equation has a cosine term, it makes sense to sample from a similar distribution. We based our implementation on pbrt.
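A minimal sketch of the technique, following the concentric-disk (Malley's method) approach described in pbrt; the function names are ours, and for simplicity the hemisphere here is oriented around the +z axis:

```rust
use std::f32::consts::{FRAC_PI_2, FRAC_PI_4, PI};

/// Map a uniform point in [0, 1)^2 onto the unit disk (concentric mapping).
fn sample_disk_concentric(u: (f32, f32)) -> (f32, f32) {
    let offset = (2.0 * u.0 - 1.0, 2.0 * u.1 - 1.0);
    if offset == (0.0, 0.0) {
        return (0.0, 0.0);
    }
    let (r, theta) = if offset.0.abs() > offset.1.abs() {
        (offset.0, FRAC_PI_4 * (offset.1 / offset.0))
    } else {
        (offset.1, FRAC_PI_2 - FRAC_PI_4 * (offset.0 / offset.1))
    };
    (r * theta.cos(), r * theta.sin())
}

/// Cosine-weighted direction on the hemisphere around +z, with pdf = cos(theta) / pi.
fn sample_hemisphere_cosine(u: (f32, f32)) -> [f32; 3] {
    let (x, y) = sample_disk_concentric(u);
    let z = (1.0 - x * x - y * y).max(0.0).sqrt();
    [x, y, z]
}

fn pdf_hemisphere_cosine(cos_theta: f32) -> f32 {
    cos_theta / PI
}
```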

(Comparison images: uniform sampling vs. cosine-weighted sampling.)

Here is the comparison against uniform sampling. It is clear that, with identical sample counts, cosine-weighted sampling results in a much cleaner picture than uniform sampling, and it does so in roughly the same amount of time.

Commit: 2e0e3199


Progressive rendering

2023-01-12

Previously we had to wait until the renderer completed the entire image before displaying it on the screen. In this commit, we redesigned the path tracing loop to render progressively and submit intermediate frames as soon as they are finished. This change significantly improves interactivity.

Commit: f50c3b6f


Interactive CPU path tracer

2023-01-12

This commit merges the CPU path tracer with the Vulkan renderer and makes the camera interactive. As soon as the path tracer finishes rendering, the image is uploaded to the GPU and rendered on the window. We can also toggle between raytraced and rasterized images to confirm that both renderers are in sync.

To keep the Vulkan renderer running while the CPU is busy path tracing, we run the path tracer on a separate thread. To communicate across the thread boundary, we use the Rust standard library's std::sync::mpsc::channel, as sketched below. The main thread sends camera transforms to the path tracer, and the path tracer sends rendered images back to the main thread. The path tracer thread blocks on its channel to prevent busy looping.
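A minimal sketch of the thread setup, with simplified stand-in types and a hypothetical render entry point (not Raydiance's actual code):

```rust
use std::sync::mpsc;
use std::thread;

struct CameraTransform;
struct RenderedImage;

fn spawn_path_tracer() -> (mpsc::Sender<CameraTransform>, mpsc::Receiver<RenderedImage>) {
    let (camera_tx, camera_rx) = mpsc::channel::<CameraTransform>();
    let (image_tx, image_rx) = mpsc::channel::<RenderedImage>();
    thread::spawn(move || {
        // Blocking on recv() keeps the thread idle until a new camera arrives.
        while let Ok(camera) = camera_rx.recv() {
            let image = render(&camera); // hypothetical render entry point
            let _ = image_tx.send(image);
        }
    });
    (camera_tx, image_rx)
}

fn render(_camera: &CameraTransform) -> RenderedImage {
    RenderedImage
}
```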

For displaying path traced images on the window, we set up both the uploader and the rendering pipeline for the image.

We used two tricks to render the image:

  • To render a fullscreen textured quad, you don't need to create vertex buffers, set up vertex inputs, etc. Instead, you can use the gl_VertexIndex intrinsic in the vertex shader to build one huge triangle that covers the screen and calculate its UVs from the index. This technique saves a lot of boilerplate code.
  • In Vulkan, if you want to sample a texture in your fragment shader, you need to create descriptor pools and descriptor set layouts, allocate descriptor sets, make sure pipeline layouts are correct, bind the descriptor sets, and so on. With the VK_KHR_push_descriptor extension, it is possible to simplify this process significantly: enabling it allows us to push the descriptor right before issuing the draw call, saving a lot of boilerplate. We still have to create one descriptor set layout for the pipeline layout object, but that is pretty good compared to what we had to do before just to bind one texture to a shader.

As an aside, vulkan.rs is approaching 2000 LOC, which is getting challenging to work with. We will have to break it up soon.

The path tracing performance could be better because we are still using only one thread. That is also why the image is noisier than in the previous post: we had to lower the sample count to get barely interactive frame rates. We will address the noise and the performance in upcoming commits.

Commit: 956e4bf6


Path tracing on CPU

2023-01-11

Finally, we are getting into the main feature of raydiance: rendering pretty images using path tracing. We start with a pure CPU implementation. The plan is to develop and maintain the CPU version as the reference implementation for the future GPU version, mainly because it is much easier to work with, especially when debugging shaders. The Vulkan renderer we've built so far serves as the visual interface for raydiance, and later, we will use Vulkan's ray tracing extensions to create the GPU version.

We built the path tracer from a handful of simple components, described below.

We put this together into a loop, where we bounce rays until they hit the sky or have bounced too many times. Each pixel in the image does this several times, averages all the samples, and writes out the final color to the image buffer.

For materials, we start with the simplest one: the Lambertian material, which scatters incoming light equally in all directions. One subtle detail of the Lambertian BRDF is that you have to divide the base color by π. Here's the explanation from pbrt.
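The short version of that explanation (a standard energy-conservation argument, not specific to Raydiance): with a constant BRDF of ρ/π, the reflected energy over the hemisphere integrates back to the albedo ρ, because the cosine term integrates to π:

```latex
\int_{\Omega} \frac{\rho}{\pi} \cos\theta \, d\omega
  = \frac{\rho}{\pi} \int_{0}^{2\pi} \!\! \int_{0}^{\pi/2} \cos\theta \sin\theta \, d\theta \, d\phi
  = \frac{\rho}{\pi} \cdot \pi
  = \rho
```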

For lights, we assume that every ray that bounces off the scene will hit “the sky.” In that case, we return a bright white color.

For anti-aliasing, we randomly shift the subpixel position of each primary ray and apply the box filter over the samples. With enough samples, this naturally resolves into a nice image with no aliasing. pbrt's image reconstruction chapter has better alternatives for the box filter, which we might look into later.

We currently run the path tracer in a single CPU thread. This could be better, but the rendering only takes a couple of seconds for such a tiny image and a low sample count. We will return to this once we need to make the path tracer run at interactive speeds.

Currently, raydiance doesn't display the path traced image anywhere. For this post, we wrote the image directly to disk. We will fix this soon.

Commit: 4ade2d5b


Adding multisampled anti-aliasing (MSAA)

2023-01-09

Implementing MSAA was easy. Similarly to the depth buffer, we create a new color buffer which will be multisampled. The depth buffer is also updated to support multisampling. Then we update all the resolve* fields in VkRenderingAttachmentInfo, and finally, we add the multisampling state to our pipeline. No more jagged edges.

Commit: ca2a23ca


More triangles, cameras, light, and depth

2023-01-09

spinning cube

A lot has happened since our single hardcoded triangle. We can now render shaded, depth tested, transformed, and indexed triangle lists with perspective projection.

Loading and rendering GLTF scenes

We created a simple "cube on a plane" scene in Blender. Each object has a "Principled BSDF" material attached to it. This material is well supported by Blender's GLTF exporter, which is what we use for our application. GLTF also supports a text format, but we export the scene in binary (.glb) for efficiency.

To load the .glb file, we use the gltf crate. Immediately after loading, we pick out the interesting fields (cameras, meshes, materials) and convert them into our internal data format, which we designed to be easy to upload to the GPU. We also do aggressive validation to catch any properties we don't support yet, such as textures or meshes without normals. Our internal formats represent matrices and vectors with types from the nalgebra crate, and to turn them into byte slices, we use the bytemuck crate.

Before we can render, we need to upload the geometry data to the GPU. We assume the number of meshes is much less than 4096 (on most Windows hosts, maxMemoryAllocationCount is 4096). This assumption allows us to cheat and make a dedicated allocation for each mesh. The better way to handle allocations is to make a few large allocations and sub-allocate within them, either ourselves or with a library like VulkanMemoryAllocator. We will come back to memory allocators in the future.

To render, we work out the perspective projection, the view transform, and the object transforms from GLTF. We also add a rotation transform to animate the cube. We pre-multiply all transforms and upload the final matrix to the vertex shader using push constants, packing the base color into the push constant as well. Push constants are great for small data because we can avoid the following:

  1. Descriptor set layouts, descriptor pools, descriptor sets
  2. Uniform buffers, which would have to be double buffered to avoid pipeline stalls
  3. Synchronizing updates to uniform buffers

As a side, while looking into push constants, we learned about VK_KHR_push_descriptor. This extension could simplify working with Vulkan, which is exciting. We will return to it once we get into texture mapping.

Depth testing with VK_KHR_dynamic_rendering

Depth testing requires a depth texture, which we create at startup, and re-create when the window changes size. To enable depth testing with VK_KHR_dynamic_rendering, we had to extend our graphics pipeline with a new structure called VkPipelineRenderingCreateInfo and add a color blend state which was previously left out. One additional pipeline barrier was required to transition the depth texture for rendering.

Commit: cb1bcc19


The first triangle

2023-01-08

first triangle

This is the simplest triangle example, rendered without any device memory allocations. The triangle is hardcoded in the vertex shader, and we index into its attributes with the vertex index.

We added a simple shader compiling step in build.rs which builds .glsl source code into .spv binary format using Google's glslc, which is included in LunarG's Vulkan SDK.

Commit: c8f9ef2c


Clearing window with VK_KHR_dynamic_rendering

2023-01-08

resizable color window

After around 1000 LOC, we have a barebones Vulkan application which:

  1. Loads Vulkan with the ash crate.
  2. Creates a Vulkan instance with VK_LAYER_KHRONOS_validation and debug utilities.
  3. Creates window surface with ash-window and raw-window-handle crates.
  4. Creates a logical device and queues.
  5. Creates command pool and buffers.
  6. Creates the swapchain.
  7. Creates semaphores and fences for host-to-host and host-to-device synchronization.
  8. Clears the screen with a different color for every frame.

We also handle tricky situations, such as the user resizing the window and minimizing the window.

We don't have to create render passes or framebuffers, thanks to the VK_KHR_dynamic_rendering extension. We do still have to specify some render pass parameters when recording command buffers, but reducing the number of API abstractions simplifies the implementation. We used this example by Sascha Willems as a reference.

We wrote everything under main() with minimal abstractions and liberal use of the unsafe keyword. We will do a semantic compression pass later, once we learn more about how to structure the program.

Next we will continue with more Vulkan code to get a triangle on the screen.

Commit: 0f6d7f1b


Hello, winit!

2023-01-07

empty window

Before anything interesting can happen, we need a window to draw on. We use the winit crate for windowing and handling inputs. For convenience, we bound the Escape key to close the window and center the window in the middle of the primary monitor.

For simple logging, we use log and env_logger, and for application-level error handling, we use anyhow.

Next, we will slog through a huge Vulkan boilerplate to draw something on our blank window.

Commit: ff4c31c2