\newcommand{\CLASSINPUTbottomtextmargin}{2.5cm}
\documentclass[conference, a4paper, hidelinks]{IEEEtran}

% BEGIN: DO NOT CHANGE HERE!
\pagenumbering{gobble}
\makeatletter
\def\@maketitle{%
\newpage
\null{}
\begin{center}%
{\LARGE \textsf \@title{} \par} % Adjust the font size and style of the title
\vskip 0.5em % Adjust the vertical spacing after the title
{\small \textsf{Proseminar Differentiable Programming} \par}
\vskip 0.5em % Adjust the vertical spacing after the subtitle
{\small \textsf \@author{} \par }
\vskip 0.5em % Adjust the vertical spacing after the author
{\small \textsf \@date{} \par} % Adjust the font size and style of the date
\end{center}%
\par
\vskip -1em % Adjust the vertical spacing after the title block
}
\makeatother
% END: DO NOT CHANGE HERE!

\title{Differentiable Monte Carlo Ray Tracing through Edge Sampling}
\author{Clemens Dautermann}
\date{July 26, 2023} % Put the date of your talk here

% Bibliography setup
\usepackage[citestyle=numeric,bibstyle=numeric,hyperref,backend=biber]{biblatex}
\addbibresource{bibliography.bib}

\usepackage[english]{babel}
\usepackage{csquotes}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{caption}
\DeclareCaptionType{eqcap}[Equation][List of equations]

\usepackage{tikz}
\usetikzlibrary{calc,
patterns,
angles,
quotes,
shapes,
arrows,
positioning,
overlay-beamer-styles,
backgrounds}

\usepackage{subfig}

\usepackage{xcolor}
\usepackage{graphicx}

\usepackage{hyperref}
% Do not add anything here
\begin{document}

\maketitle
\begin{abstract}
Differentiable programming is a technique frequently used to solve optimization problems by minimizing an error function. This requires the error function to be differentiable with respect to the parameters that are to be optimized, a condition which usually does not hold for ray tracing. This report explains why this problem occurs and presents the method developed in~\cite{ACM:diffable_raytracing} to tackle it.
\end{abstract}

\section{Introduction}

One of the main tasks in computer graphics is image synthesis: given a 3D scene, output an image depicting that scene. Often the image is required to be as realistic as possible, i.e.\ as close as possible to a photograph of the scene as it would appear in the real world. This is most commonly achieved using the ray tracing algorithm and is a well-studied problem. Doing it in a differentiable way, however, is much less trivial. This stems from the fact that the rendering integral (equation~\ref{eq:rendering_integral}) is not differentiable in certain well-defined places. This report illustrates why this is the case, how it can be mitigated, and what interesting applications differentiable ray tracing enables.
\section{Ray Tracing}
To formalize the problem of photorealistic image synthesis, Kajiya proposed an equation in 1986~\cite{ACM:rendering_equation}. This equation captures physical light transport for a scene and, if solved, yields the color of a given point in the scene, accounting for most physical light transport phenomena.

\begin{eqcap}[h]
\centering
\begin{equation}
I(x,x^{\prime})=
g(x,x')
\left[
\epsilon(x,x')+\int_S \rho(x,x',x'')I(x',x'')dx''
\right]
\notag
\end{equation}
\captionof{eqcap}{The rendering equation capturing physical light transport. It assigns a value to the intensity of light transported from a point $x$ to a point $x^\prime$. The geometry term $g$ is discussed later. The term $\epsilon$ accounts for the emissivity of the material at point $x$. The integral term represents all light scattered from any other point in the scene towards the point $x$. The integration domain $S$ contains all points in the scene.}\label{eq:rendering_integral}
\end{eqcap}

This equation (equation~\ref{eq:rendering_integral}) is now widely recognized as ``the rendering integral''. It cannot be solved analytically and is therefore most commonly approximated using Monte Carlo integration, i.e.\ ray tracing. Ray tracing works by tracing light rays backwards, from the camera towards the light sources in the scene, and thus simulates physically realistic lighting.\\
To do this, rays are cast from the camera through each pixel of the image plane. For each ray, the intersection point $x$ with the scene geometry is calculated, and the material properties at that point (e.g.\ color, emissivity) are taken into account to calculate the pixel color. From this point, further rays are traced towards each light source; if a light source is visible, its light contributes to the pixel color as well. To account for indirect lighting, the ray ``bounces around'' the scene, yielding a color for some of the points that scatter light towards $x$. This approximates the integral term in equation~\ref{eq:rendering_integral}.\\
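Written out, the Monte Carlo approximation of the scattering integral takes the following form (the sample count $N$, the samples $x''_k$ and the sampling density $p$ are introduced here only for illustration and do not appear in the original report):
\begin{equation}
\int_S \rho(x,x',x'')I(x',x'')dx''
\;\approx\;
\frac{1}{N}\sum_{k=1}^{N}\frac{\rho(x,x',x''_k)\,I(x',x''_k)}{p(x''_k)},
\notag
\end{equation}
where the samples $x''_k$ are drawn from the probability density $p$ over the scene points $S$. The estimator is unbiased, and its variance decreases as more samples are taken.\\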
Differentiable ray tracing is the task of calculating the gradient of this process with respect to \emph{any} scene parameter.

\section{Problems with differentiability}
The geometry term $g(x,x^\prime)$ in equation~\ref{eq:rendering_integral} is the main obstacle when it comes to differentiating the rendering integral. This term equals 1 if $x$ is visible from $x^\prime$ and 0 otherwise.
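Written as a formula (this compact notation is added here for illustration and is not taken from the report):
\begin{equation}
g(x,x^\prime)=
\begin{cases}
1 & \text{if } x \text{ is visible from } x^\prime,\\
0 & \text{otherwise.}
\end{cases}
\notag
\end{equation}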
\begin{figure}[h]
\centering
\input{diagrams/diagramm_occlusion}
\caption{A simple occlusion scenario. An infinitesimal change in the ray angle $d\omega$ decides whether the wall is visible or occluded by the blocker.}\label{fig:occlusion_rays}
\end{figure}

\begin{figure}
\centering
\input{diagrams/geometry_term_diff.tex}
\caption{The visibility of a point and its derivative.}\label{fig:visibility_term}
\end{figure}

As illustrated in figure~\ref{fig:occlusion_rays}, an infinitesimal change of the ray angle $d\omega$ can decide whether the blocker obstructs the wall or the wall is visible. The geometry term is thus a Heaviside step function, which when differentiated yields a Dirac delta (see figure~\ref{fig:visibility_term}). Since the Dirac delta differs from 0 only at the single point $\omega_0$, the probability of sampling it with uniformly distributed Monte Carlo integration is 0.
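In terms of the ray angle $\omega$ this can be written as follows, up to the orientation of the step ($\omega_0$ denotes the angle at which the ray grazes the blocker edge; the notation is added here for illustration):
\begin{equation}
g(\omega)=H(\omega-\omega_0)
\qquad\Longrightarrow\qquad
\frac{\partial g}{\partial\omega}=\delta(\omega-\omega_0),
\notag
\end{equation}
where $H$ is the Heaviside step function and $\delta$ the Dirac delta. A uniform Monte Carlo estimator evaluates the integrand at finitely many points drawn from a continuous distribution, and the event of hitting exactly $\omega_0$ has probability zero, so the estimated derivative is zero almost surely.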

\section{Proposed solution: Edge sampling}
The key observation is that these discontinuities only occur at the edges of meshes.
Thus the problem can be mitigated by not sampling a pixel only uniformly, but by explicitly importance sampling the edges of the meshes, hence capturing the Dirac delta spikes.
\begin{figure}[h]
\subfloat[area sampling]{
\input{diagrams/area_sampling.tex}
}
\hfil
\subfloat[edge sampling]{
\input{diagrams/edge_sampling.tex}
}
\caption{The figure depicts the sampling situation within a single pixel. The quantity to be calculated is the gradient of the pixel color with respect to the gray triangle moving up. Area sampling does not account for this color change, whereas edge sampling does. Primary occlusion is handled correctly.}\label{fig:area_vs_edge_sampling}
\end{figure}

Consider a situation as depicted in figure~\ref{fig:area_vs_edge_sampling}. When the gray triangle moves up, the share of the white area decreases and the share of the gray area increases, so white should contribute less to the pixel color. Area sampling does not account for this change, since all samples land on an area with the same color before and after the movement. Edge sampling does account for it, because samples are placed precisely on the areas of discontinuity, i.e.\ the edges; these samples have a different color after the triangle moves up, which correctly captures the gradient. Primary occlusion is also handled correctly: a sample that intersected the blocker before the movement will still intersect the blocker afterwards, since the blocker is independent of the triangle movement.

\subsection{Half spaces}

To understand why edge sampling correctly captures the gradient, it is necessary to look at what exactly happens at the point where a ray intersects an edge.

\begin{figure}[h]
\centering
\input{diagrams/halfspaces.tex}
\caption{An edge separates the domain into two regions whose contributions can be considered separately. The contribution of the upper half space increases while the contribution of the lower half space decreases as the gray triangle moves right. Occlusion is handled correctly because there the contributions of the two half spaces are equal.}\label{fig:halfspaces}
\end{figure}

As seen in figure~\ref{fig:halfspaces}, the edge separates the space into two half spaces. These two half spaces can be considered separately, and a gradient can be calculated from their color difference. It is also evident that occlusion is handled correctly: where occlusion occurs, the two half spaces contribute equally to the color, so the gradient is 0 there.\\
This allows the integral to be separated into two parts: one part that represents the discontinuous edges and one part that accounts for the original pixel integral over the continuous regions.
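Schematically, the gradient of a pixel integral with respect to a scene parameter $\Phi$ then consists of an area term over the continuous regions and a line integral along the edges (the symbols $f$, $f_u$, $f_l$ and $v_\Phi$ are introduced here as a simplified sketch and are not the exact formulation of~\cite{ACM:diffable_raytracing}):
\begin{equation}
\nabla_\Phi I
=\int_{P}\nabla_\Phi f(s;\Phi)\,ds
+\sum_{\text{edges } E}\int_{E}\bigl(f_u(s)-f_l(s)\bigr)\,v_\Phi(s)\,ds,
\notag
\end{equation}
where $f$ is the pixel integrand, $f_u$ and $f_l$ are its limits in the upper and lower half space of an edge, and $v_\Phi(s)$ measures how fast the edge point $s$ moves perpendicular to the edge when $\Phi$ changes.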

\section{Results}
The main application of differentiable ray tracing discussed in the paper is inverse rendering, that is: given images of a scene, synthesize a 3D representation of that scene. To do this, an error function $e: \Phi\to\mathbb{R}$ is required that measures how close an approximate render is to a target image with respect to a set of scene parameters $\Phi$.\\
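A simple example of such an error function (chosen here for illustration; the report does not prescribe a particular one) is the squared pixel-wise difference between the rendered image $R(\Phi)$ and the target image $T$:
\begin{equation}
e(\Phi)=\sum_{p\in\text{pixels}}\bigl(R_p(\Phi)-T_p\bigr)^2 .
\notag
\end{equation}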
Given this function and a set of target images, the inverse rendering problem can be solved using differentiable ray tracing. The first step is to render an approximation of the scene differentiably. The resulting images are then compared to the target images by calculating the error. Since the rendering process is differentiable, the gradient of the error function can be calculated, which allows the error to be minimized using a method such as stochastic gradient descent.
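The structure of such an optimization loop is sketched below. The sketch uses PyTorch and its Adam optimizer; the function \texttt{render} is only a stand-in for a differentiable renderer and is replaced by a trivial differentiable function so that the snippet is self-contained. It illustrates the loop of rendering, error computation and gradient step, not the implementation used in~\cite{ACM:diffable_raytracing}.
\begin{verbatim}
import torch

# Stand-in for a differentiable renderer:
# maps scene parameters to an "image".
def render(params):
    return params.sum() * torch.ones(8, 8)

target = 3.0 * torch.ones(8, 8)  # target image
params = torch.zeros(4, requires_grad=True)
opt = torch.optim.Adam([params], lr=0.1)

for step in range(200):
    opt.zero_grad()
    image = render(params)
    # squared pixel-wise error
    error = ((image - target) ** 2).sum()
    error.backward()  # gradient via autodiff
    opt.step()
\end{verbatim}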

\begin{figure}[h]
\centering
\subfloat[initial guess]{
\includegraphics[width=0.3\linewidth]{../presentation/img/results/guess.png}
}
\subfloat[optimized result]{
\includegraphics[width=0.3\linewidth]{../presentation/img/results/result.png}\label{fig:result_image}
}
\subfloat[target]{
\includegraphics[width=0.3\linewidth]{../presentation/img/results/photo.png}
}
\caption{A generic example of how differentiable ray tracing can be used to approximate a solution to the inverse rendering problem.}\label{fig:inverse_rendering_example}
\end{figure}

An example of how this might look is given in figure~\ref{fig:inverse_rendering_example}. The optimized image (figure~\ref{fig:inverse_rendering_example}, image~\ref{sub@fig:result_image}) has been generated by the authors using differentiable ray tracing and the Adam optimizer.
From this image it is evident that the algorithm can handle a multitude of rendering phenomena, such as primary occlusion, emission and glossy receivers.

% BIBLIOGRAPHY
\nocite{*} % List all entries of the .bib file, even those not cited in the main body
\printbibliography{}

\end{document} |