Gradient-based optimization (steady state)

mitransient’s integrators can be used to solve inverse problems just like in Mitsuba 3. Importantly, mitransient extends this idea into the transient domain, where the signal that you obtain from the scene is time-dependent.

This tutorial deals with one of the two forms of inverse rendering: backward inverse rendering, in the transient domain. Mathematically, consider \(\textbf{x}\) as your set of scene parameters (e.g. material properties, albedos…). Our transient path tracing algorithms are functions \(f\) that transform them into a time-resolved image \(\textbf{y}\). Given that we are in the transient domain:

\[\textbf{y}(t) = f(\textbf{x}, t)\]

The goal of the problem is to match time-resolved image \(\textbf{y}\) to an observation \(\textbf{y}^\ast\). Thus, you can define a loss function:

\[L = \text{loss}(\textbf{y}, \textbf{y}^\ast)\]

This tutorial has two variants: steady-state and transient. The first one assumes that \(\textbf{y}\) is steady-state i.e. there is no time dimension. The second one can handle time-dependent estimations \(\textbf{y}(t)\). This way, the temporal information could give extra information to the optimization if handled correctly.

With this formulation, backward mode differentiation allows you to compute \(\nabla_{\textbf{x}_i}L\) for a subset \(\textbf{x}_i \subseteq \textbf{x}\) of the scene parameters. As you will see, the backward mode is very useful to optimize scene parameters:

\[\textbf{x}_i \leftarrow \textbf{x}_i - \eta \nabla_{\textbf{x}_i}L\]

With a learning rate \(\eta\). If your goal is instead to generate visualizations of the effect of individual scene parameters over time, you should look at forward mode automatic differentiation in mitransient.

🚀 You will learn:

  • What is backward inverse rendering and how to use it for gradient-based optimization

  • How to compute gradients of scene parameters from the render

  • Set up an optimization with one of Mitsuba’s optimizers

  • Visualize and understand the results of the optimization

Importing mitransient

You need to use a Mitsuba variant that enables Automatic Differentiation (AD). Any *_ad_* variant will do. Common choices should be llvm_ad_rgb for CPU and cuda_ad_rgb for GPU.

[1]:
# If you have compiled Mitsuba 3 yourself, you will need to specify the path
# to the compilation folder
# import sys
# sys.path.insert(0, '<mitsuba-path>/mitsuba3/build/python')
import mitsuba as mi
# To set a variant, you need to have set it in the mitsuba.conf file
# https://mitsuba.readthedocs.io/en/latest/src/key_topics/variants.html
mi.set_variant('llvm_ad_rgb')

import mitransient as mitr
import drjit as dr

print('Using mitsuba version:', mi.__version__)
print('Using Dr.JIT version:', dr.__version__)
print('Using mitransient version:', mitr.__version__)
Using mitsuba version: 3.7.0
Using Dr.JIT version: 1.1.0
Using mitransient version: 1.2.0

Reference image

We render a reference image of the original scene that will later be used in the objective function for the optimization. Ideally, this reference image should expose very little noise as it will pertube optimization process otherwise. For best results, we should render it with an even larger sample count.

[2]:
scene = mi.load_file('cornell-box/cbox_volumetric.xml')
[3]:
image_ref, transient_ref = mi.render(scene, spp=128)

# Preview the reference image
mi.util.convert_to_bitmap(image_ref)
[3]:
[4]:
transient_ref_tonemap = mitr.vis.tonemap_transient(transient_ref)
mitr.vis.show_video(transient_ref_tonemap)

Using the mi.traverse function will give a list of all parameters that can be differentiated. You should look for parameters with the symbol. D means ns that there are discontinuities involved and, in practice, computation of these gradients can be very noisy/incorrect sometimes.

[5]:
params = mi.traverse(scene)

params
[5]:
SceneParameters[
  -------------------------------------------------------------------------------------------------------
  Name                                                Flags    Type              Parent
  -------------------------------------------------------------------------------------------------------
  allow_thread_reordering                                      bool              Scene
  sensor.near_clip                                             float             PerspectiveCamera
  sensor.far_clip                                              float             PerspectiveCamera
  sensor.shutter_open                                          float             PerspectiveCamera
  sensor.shutter_open_time                                     float             PerspectiveCamera
  sensor.film.size                                             ScalarVector2u    Film
  sensor.film.crop_size                                        ScalarVector2u    Film
  sensor.film.crop_offset                                      ScalarPoint2u     Film
  sensor.film.temporal_bins                                    int               Film
  sensor.film.bin_width_opl                                    float             Film
  sensor.film.start_opl                                        float             Film
  sensor.x_fov                                                 Float             PerspectiveCamera
  sensor.principal_point_offset_x                              Float             PerspectiveCamera
  sensor.principal_point_offset_y                              Float             PerspectiveCamera
  sensor.to_world                                              AffineTransform4f PerspectiveCamera
  gray.reflectance.value                              ∂        Color3f           SRGBReflectanceSpectrum
  white.reflectance.value                             ∂        Color3f           SRGBReflectanceSpectrum
  red.reflectance.value                               ∂        Color3f           SRGBReflectanceSpectrum
  largebox.interior_medium.scale                               float             HomogeneousMedium
  largebox.interior_medium.albedo.value.value         ∂        Color3f           SRGBReflectanceSpectrum
  largebox.interior_medium.sigma_t.value.value        ∂        Float             UniformSpectrum
  largebox.interior_medium.phase_function.g           ∂, D     Float             HGPhaseFunction
  largebox.silhouette_sampling_weight                          float             OBJMesh
  largebox.faces                                               UInt              OBJMesh
  largebox.vertex_positions                           ∂, D     Float             OBJMesh
  largebox.vertex_normals                             ∂, D     Float             OBJMesh
  largebox.vertex_texcoords                           ∂        Float             OBJMesh
  smallbox.interior_medium.scale                               float             HomogeneousMedium
  smallbox.interior_medium.albedo.value.value         ∂        Color3f           SRGBReflectanceSpectrum
  smallbox.interior_medium.sigma_t.value.value        ∂        Float             UniformSpectrum
  smallbox.interior_medium.phase_function.g           ∂, D     Float             HGPhaseFunction
  smallbox.silhouette_sampling_weight                          float             OBJMesh
  smallbox.faces                                               UInt              OBJMesh
  smallbox.vertex_positions                           ∂, D     Float             OBJMesh
  smallbox.vertex_normals                             ∂, D     Float             OBJMesh
  smallbox.vertex_texcoords                           ∂        Float             OBJMesh
  redwall.silhouette_sampling_weight                           float             OBJMesh
  redwall.faces                                                UInt              OBJMesh
  redwall.vertex_positions                            ∂, D     Float             OBJMesh
  redwall.vertex_normals                              ∂, D     Float             OBJMesh
  redwall.vertex_texcoords                            ∂        Float             OBJMesh
  greenwall.bsdf.reflectance.value                    ∂        Color3f           SRGBReflectanceSpectrum
  greenwall.silhouette_sampling_weight                         float             OBJMesh
  greenwall.faces                                              UInt              OBJMesh
  greenwall.vertex_positions                          ∂, D     Float             OBJMesh
  greenwall.vertex_normals                            ∂, D     Float             OBJMesh
  greenwall.vertex_texcoords                          ∂        Float             OBJMesh
  object_94277027071808.silhouette_sampling_weight             float             Mesh
  object_94277027071808.faces                                  UInt              Mesh
  object_94277027071808.vertex_positions              ∂, D     Float             Mesh
  object_94277027071808.vertex_normals                ∂, D     Float             Mesh
  object_94277027071808.vertex_texcoords              ∂        Float             Mesh
  light.emitter.sampling_weight                                float             AreaLight
  light.emitter.radiance.value                        ∂        Color3f           SRGBReflectanceSpectrum
  light.silhouette_sampling_weight                             float             OBJMesh
  light.faces                                                  UInt              OBJMesh
  light.vertex_positions                              ∂, D     Float             OBJMesh
  light.vertex_normals                                ∂, D     Float             OBJMesh
  light.vertex_texcoords                              ∂        Float             OBJMesh
]

We select the red.reflectance.value, which refers to how much light the red wall (on the left of the scene) reflects.

Our inverse problem goes as follows:

  1. We have an observation (Cornell Box with a red wall)

  2. We perturb the original scene and change the red wall to a blue wall. This is our starting point for the optimization

  3. The optimization tries to match the rendered image (our first estimation will be blue) to the original red wall.

[6]:
key = 'red.reflectance.value'

# Save the original value
param_ref = mi.Color3f(params[key])

# Set another color value and update the scene
params[key] = mi.Color3f(0.01, 0.2, 0.9)
params.update();

As expected, when rendering the scene again, the wall has changed color.

[7]:
image_init, transient = mi.render(scene, spp=128)
mi.util.convert_to_bitmap(image_init)
[7]:

We set up an Optimizer (in this case Adam), and tell it to optimize the left wall’s color

[8]:
opt = mi.ad.Adam(lr=0.05)
opt[key] = params[key]
params.update(opt);

At every iteration of the gradient descent, we will compute the derivatives of the scene parameters with respect to the objective function. In this simple experiment, we use the mean square error, or \(L_2\) error, between the current image and the reference created above.

[9]:
def mse(image):
    return dr.mean(dr.square(image - image_ref))

In the following cell we define the hyper parameters controlling our optimization loop, such as the number of iterations:

[10]:
iteration_count = 50

It is now time to actually perform the gradient-descent loop that executes 50 differentiable rendering iterations.

[11]:
errors = []
for it in range(50):
    # Perform a (noisy) differentiable rendering of the scene
    image, transient = mi.render(scene, params, spp=32)

    # Evaluate the objective function from the current rendered image
    loss = mse(image)

    # Backpropagate through the rendering process
    dr.backward(loss)

    # Optimizer: take a gradient descent step
    opt.step()

    # Post-process the optimized parameters to ensure legal color values.
    opt[key] = dr.clip(opt[key], 0.0, 1.0)

    # Update the scene state to the new optimized values
    params.update(opt)

    # Track the difference between the current color and the true value
    err_ref = dr.sum(dr.square(param_ref - params[key]))
    print(f"Iteration {it:02d}: parameter error = {err_ref}", end='\r')
    errors.append(dr.slice(err_ref))
print('\nOptimization complete.')
Iteration 49: parameter error = [0.00121203]
Optimization complete.

Results

We can now render the scene again to check whether the optimization process successfully recovered the color of the red wall.

[12]:
image_final_steady, image_final_transient = mi.render(scene, spp=128)
mi.util.convert_to_bitmap(image_final_steady)
[12]:

It worked!

Note visualizing the objective value directly sometimes gives limited information, since differences between image and image_ref can be dominated by Monte Carlo noise that is not related to the parameter being optimized.

Since we know the “true” target parameter in this scene, we can validate the convergence of the optimization by checking the difference to the true color at each iteration:

[14]:
import matplotlib.pyplot as plt
plt.plot(errors)
plt.xlabel('Iteration'); plt.ylabel('MSE(param)'); plt.title('Parameter error plot');
plt.show()
../../../_images/src_examples_diff-transient_backward_steady_24_0.png