NEURAL FIELD FOR HEAT SHIMMERING VISUALIZATION AND
REFRACTIVE INDEX FIELD RECONSTRUCTION

By

Lijiang Xu

A THESIS

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

Computer Science — Master of Science

2024

ABSTRACT

This thesis addresses the challenging task of synthesizing novel views of natural scenes

influenced by heat-shimmering effects due to variations in refractive indices. We develop a

novel computational model that employs Neural Radiance Fields (NeRF) for accurate simu-

lation of light refraction and reconstruction of three-dimensional refractive index fields. Our

approach integrates two ray marching techniques: Iterative Bending (IB) for high accuracy

in dataset generation, and Non-Translating (NT) to enhance training efficiency by assuming

nearly straight ray paths, facilitating faster computations while maintaining accuracy. This

methodology adeptly captures the dynamic visual effects and physical scene details.

Validation involved creating diverse temperature fields from sinusoidal distributions modulated by

Gaussian envelopes, known as Gabor wavelets. The Gaussian envelope of these wavelets lets the

field decay to a constant ambient temperature outside the heated region, giving stable testing conditions.

Rigorous evaluations using various boundary conditions demonstrated the model’s robust

performance in structured environments like urban scenes and smooth noisy backgrounds.

However, smooth gradient backgrounds posed challenges due to a lack of distinct features

necessary for accurate refraction predictions. This research highlights the potential of com-

plex optical simulations and suggests applications in diverse fields, setting a foundation for

further advancements in computer vision and realistic environmental rendering.

Dedicated to my husband Zhite, advisor Dr. Yiying Tong, parents Jinghui & Mu, and my
maternal grandfather Fuxiang Jiang.

ACKNOWLEDGMENTS

It is hardly conceivable that I could have accomplished this thesis without the guidance and

support from Dr. Yiying Tong. In retrospect, I am thrilled to see how I have turned from not

even knowing how to create a skybox using an environment map, to having pursued the idea

of generating novel views with the heat shimmering effect and successfully reconstructing

the 3D field using NeRF. While it was particularly challenging for me as I changed my major

from CE to CS, Dr. Tong has been consistently encouraging, supporting, and helping me

since even before the idea of this project was formed. His patient and insightful guidance has

allowed me to grow as a researcher throughout my master’s program. I feel very grateful that

he provided me with continuous financial support via TA and RA opportunities, allowing

me to also work remotely while I faced the two-body problem with my husband. His support

also extended beyond academics and especially filled me with strength during some very

difficult times.

I would also like to thank Dr. Junlin Yuan for the introductory course on Fluid Mechanics,

which opened new perspectives for me. Her careful instruction and thoughtful guidance

helped me develop foundational skills and encouraged me to pursue further learning. I am

also grateful to Dr. Xiaoming Liu; through his Computer Vision course, I realized the vast

potential for exploration and improvement in this field. His expertise and dedication have

inspired me to delve deeper into these areas in my future studies.

Last but not least, I would like to express my gratitude to my husband for all the

unwavering love he has given me from the very beginning of this journey. Thank you for

believing in me, for accompanying me on countless adventures, for introducing me to new

people, and for standing by my side through every challenge. I am especially grateful for

the time and space you provided so that I can confront my struggles and find our path

forward in building a family together.

In addition, I am deeply grateful to my mother,

Jinghui, and my maternal grandfather, Fuxiang, for their constant encouragement and faith

in me. Their understanding and reassurance, especially during our phone calls, have been

a tremendous source of comfort and strength. Finally, I am forever indebted to my father,

whose unwavering support made this journey possible; I know he would be proud of this

accomplishment.

TABLE OF CONTENTS

Chapter 1 Introduction

Chapter 2 Background

Chapter 3 Refraction Modeling Method

Chapter 4 Results and Discussion

Chapter 5 Conclusion

BIBLIOGRAPHY

Chapter 1

Introduction

While light typically travels in straight lines in nature, it bends when passing through me-

dia with varying refractive indices, creating visual distortions or rippling effects known as

shimmering effects. Such distorted views can be widely observed in various natural scenes,

ranging from those involving heat transfer and dynamic fluid flows to massive gravitational

fields near black holes, where light curves due to the gradient of the refractive index field or

spacetime curvature. There is thus a substantial need to synthesize views that convincingly

capture these refraction phenomena and to reconstruct the underlying refractive index field

for a wide range of applications.

For instance, in remote sensing and meteorology, accurate modeling of atmospheric re-

fractive indices is crucial for enhancing radar data interpretation, which is fundamental to

weather forecasting and earth observation [Ottersten, 1969]. Similarly, in environmental

science, characterizing the interaction of light with airborne particles improves the accuracy

of climate models [Di Biagio et al., 2017] and satellite data [Prasad et al., 2015]. Detailed

knowledge of refractive indices also aids astronomers in understanding gravitational lens-

ing [Ye and Lin, 2008], while advances in optical microscopy rely on precise refractive index

measurements for cell imaging [Liu et al., 2016; Schürmann et al., 2018]. Furthermore, the

development of innovative polymers [Liu and Ueda, 2009] and glasses [Töpfer et al., 2000]

enhances the performance of optical devices and sensors, showcasing the broad implications

of this research across multiple disciplines.

In this work, we address the challenge of synthesizing novel views of natural scenes with

heat shimmering effects, while concurrently constructing a three-dimensional (3D) refractive

index field. The key point for this task is to model the light refraction, detailed in Ch. 3.

There are two distinct ray marching methodologies. The first method calculates the bending

direction at the current location and marches iteratively with a constant step size, referred

to as the Iterative Bending (IB) method detailed in Sec. 3.2. This method allows for a

precise determination of the bending ray’s trajectory, continuously updating the path based

on changes in the refractive index field and ensuring high accuracy. In contrast, the second

method, described in Sec. 3.4, assumes that the bending trajectory can be approximated by a

straight line.

It involves predetermining evenly spaced points along each ray, estimating

directional changes at these points, and combining them to determine the final ray direction.

We refer to the second method as the Non-Translating (NT) method because it omits the

transverse translation typically considered in the IB method. The NT method significantly

enhances computational efficiency by facilitating parallel processing of these predetermined

points, while retaining the computational accuracy of the IB method to a good extent.

Owing to the absence of real-world datasets that precisely capture multi-view heat shim-

mering effects, this study is conducted using exclusively synthetic data. For accuracy, we

employ the first, more precise IB method in our dataset generation detailed in Sec. 3.3. Con-

versely, the NT method’s predetermined points are used to compute the directional changes

in parallel, utilizing the estimated refractive index field from our neural network as described

in Sec. 3.5. The same spherical environment renderer is used during both dataset generation

and neural network training to ensure consistency and comparability in rendered images.

Utilizing the differentiable nature of this process, we employ gradient descent to optimize

our model, aiming to minimize the discrepancy between the images generated by our neural

representation and their corresponding ground truth. In addition, the images generated by the

IB method and NT method are compared in Sec. 3.4.1 to show the effectiveness of using the

NT method in our system.

To rigorously evaluate our approach, we have meticulously designed a variety of tem-

perature fields to generate datasets that accurately simulate the heat shimmering effect,

leveraging the monotonic relationship between temperature and refractive index [Haynes,

2016]. To mimic natural conditions, a constant ambient temperature was set as the bound-

ary condition surrounding the lattice field, and various methods of boundary implementation

were explored in Ch. 4. We utilize sinusoidal distributions of varying wave numbers, comple-

mented by Gaussian-distributed boundary conditions, embodying the Gabor wavelet. This

input field selection thoroughly tests our system’s robustness and adaptability, detailed in

Sec. 4.2 and Sec. 4.3. Additionally, we conducted experiments using diverse background set-

tings, ranging from normal street scenes to smooth color gradients and subtly noisy images,

to assess the system’s performance under different visual contexts, detailed in Sec. 4.2.3.

These tests demonstrate the effectiveness of our approach in accurately reconstructing re-

fractive index fields and synthesizing visually coherent views. The results confirm that our

system not only captures the intricate dynamics of light refraction but also robustly adapts

to varying environmental conditions, making significant strides in both the precision of 3D

field reconstruction and the enhancement of novel view synthesis.

Chapter 2

Background

Recently, a technique known as Neural Radiance Field (NeRF) has emerged as a scene rep-

resentation method for novel-view synthesis [Mildenhall et al., 2021]. NeRF deploys straight

camera rays from multiple viewpoints into the scene, utilizing a multilayer perceptron (MLP)

to estimate the density and color at sampled points in 3D space. This model is meticulously

trained to ensure consistency with the input images through volume rendering techniques.

While NeRF cannot be directly applied to refraction because its volume rendering assumes straight

rays, its architecture provides a robust platform for generating novel views and concurrently

reconstructing underlying 3D fields.

Leveraging NeRF’s capabilities, bending rays are modeled as offsets at sampled points

along straight rays [Fujitomi et al., 2022; Kim et al., 2023]. This method is particularly

useful for modeling transparent objects where refraction occurs a discrete number of times,

such as in glass cases with a constant refractive index surrounded by air. Furthermore, for

scenarios requiring a continuous refractive index, the Eikonal equation has been utilized to

efficiently simulate light bending trajectories [Ihrke et al., 2007; Teh et al., 2022]. Inspired by

these advancements, Eikonal ray-tracing combined with neural fields has been adapted for

generating refractive fields in the context of novel-view synthesis involving glassware [Bemana

et al., 2022]. While the reconstructed refractive index fields do not match the ground truth,

this approach has yielded impressive results in generating novel views of scenes characterized

by piecewise constant refractive indices.

The framework of NeRF has also been extended to complex challenges in astronomy

such as reconstructing the tomography of emission flares near black holes [Levis et al., 2024]

and simulating refractive fields from single viewpoints to model dark matter distribution in

the weak lensing regime [Zhao et al., 2024]. The reconstruction through a single viewpoint

needs to leverage prior knowledge of light sources scattered within the refractive medium,

since many different refractive index fields can produce the same image measurements.

Additionally, that work employs adaptive step sizes in iterative ray marching,

enhancing the precision of ray-bending calculations across vast spatial domains.

These developments highlight NeRF’s adaptability in addressing refraction and tomog-

raphy challenges across varied disciplines. Our work propels NeRF into new territories,

particularly focusing on heat shimmering effects where refractive indices are continuously

influenced by temperature changes. Due to the lack of available real-world datasets that

accurately represent multi-view heat shimmering effects, this study exclusively utilizes syn-

thetic data. We leverage multiple viewpoints to reconstruct the refractive field, maximizing

the benefits of broad accessibility while circumventing the challenges associated with ac-

quiring adequate constraints for single-view tomography. Our findings also reveal that the

discrepancy between trajectories computed through iterative ray marching and those esti-

mated using predetermined points along the incident ray is minimal. This observation allows

us to achieve greater computational efficiency while preserving high accuracy in our models.

Ultimately, our approach not only addresses the continuous variations in the refractive field

but also significantly improves the efficiency of its reconstruction.

Chapter 3

Refraction Modeling Method

In order to synthesize novel views with a heat-shimmering effect, we march rays from different

viewpoints through the 3D refractive index field produced by our neural network. Through

this ray marching process, the light refraction is simulated and a spherical environment

renderer is adopted to generate images by mapping the directions of the refracted rays to

their corresponding pixels on a panoramic image, described in Sec. 3.1. By minimizing the

error of a set of rendered images with heat-induced visual distortions, we optimize the model

to reconstruct the refractive index field. This allows the generation of novel views with an

accurate depiction of heat-shimmering patterns.

To accurately replicate the refraction of light through media with varying refractive

indices, we derive the bending-ray equation based on Snell’s law in Sec. 3.2. This equation

is extensively used in the ray marching procedure to determine the refractive direction with

an incident ray direction at any location in the field. There are two ray marching methods:

1) calculate the ray bending direction at the current step and march iteratively with a

constant step size, referred to as the Iterative Bending (IB) method in Sec. 3.2; and 2)

calculate the ray bending directions in parallel at all the predetermined evenly spaced points

along a straight line in the initial viewing direction, referred to as the Non-Translating (NT)

method in Sec. 3.4. The NT method assumes the bending ray is almost aligned with the

straight line along the viewing direction and thus neglects their relative translation. The IB

method is more precise but more time-consuming than the NT method. As we will demonstrate

in Sec. 3.4, however, the NT method can achieve sufficiently accurate image renderings at

a much lower computation price.

It is therefore well-suited to the application in neural

networks for reconstructing the 3D field and generating novel views.

Because of the absence of a real-world dataset that adequately captures the multi-view

heat-shimmering scenes, this study exclusively utilizes synthetic images. As described

in Sec. 3.3, we first introduce a temperature field and convert it to a refractive index field

using an empirical formula. We then use the IB method—for better precision—to determine

the directions of the output refractive rays and render the images as the ground truth. We

will then set up our Temperature Neural Network (TempNN) in Sec. 3.5 to reconstruct the

3D neural refractive index field for synthesizing the novel views of the heat-shimmering effect.

3.1 Spherical environment renderer

A spherical environment renderer [Shirley et al., 2009], often used in computer graph-

ics, relies on environment mapping techniques to simulate how an environment affects the

appearance of a scene without explicitly rendering the surrounding geometry. As shown in

Fig. 3.1, any direction (x, y, z) has a one-to-one correspondence to a pixel with the texture

coordinates (θ, ϕ) on a panoramic image, where θ = arcsin (z) and ϕ = arctan (y/x) using

the spherical coordinate system. These texture coordinates (θ, ϕ) resemble the latitude and

longitude we usually refer to as a location in the map. Therefore, the spherical environment

renderer is the technique that produces an image using spherical environment mapping.
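The direction-to-pixel mapping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the use of `atan2` in place of arctan(y/x) to resolve the quadrant, and the pixel layout (longitude −π at the left edge, latitude +π/2 at the top row) are hypothetical choices and are not taken from the thesis.

```python
import math

def direction_to_texcoords(x, y, z):
    """Map a 3D direction to (theta, phi) in radians: latitude and longitude."""
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("direction must be nonzero")
    x, y, z = x / norm, y / norm, z / norm
    theta = math.asin(max(-1.0, min(1.0, z)))  # latitude in [-pi/2, pi/2]
    # Assumption: atan2 replaces arctan(y/x) so that all four quadrants
    # (and x = 0) are handled.
    phi = math.atan2(y, x)                      # longitude in [-pi, pi]
    return theta, phi

def texcoords_to_pixel(theta, phi, width, height):
    """Hypothetical panorama layout: phi = -pi at the left edge,
    theta = +pi/2 at the top row."""
    u = (phi + math.pi) / (2.0 * math.pi)       # horizontal fraction in [0, 1]
    v = (math.pi / 2.0 - theta) / math.pi       # vertical fraction in [0, 1]
    col = min(int(u * width), width - 1)
    row = min(int(v * height), height - 1)
    return row, col

# A ray leaving along +x lands in the middle of a 360 x 180 panorama.
theta, phi = direction_to_texcoords(1.0, 0.0, 0.0)
row, col = texcoords_to_pixel(theta, phi, width=360, height=180)
```

Only the direction enters the lookup, which is why the renderer needs the outgoing ray direction but not the viewing location.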

Figure 3.1: Spherical environment mapping.

We chose to use this renderer because it typically assumes the environment (like the sky

or distant mountains) is infinitely far away from the viewer, which aligns with the heat-

shimmering scenes, where a great distance is needed to observe the distorted view of the

scene. Because of this infinite distance assumption, the changes in the viewer’s location

within a typical scene scale do not meaningfully change the direction in which the environ-

mental light arrives. Thus, the renderer does not need to recalibrate for different viewing

locations within the scene. This simplifies the input to only the viewing direction because

the viewing location doesn’t significantly affect how rays intersect with the environment

map. The output directions of the rays passing through the field are all that is needed to

render images in the spherical environment renderer, while the bending ray trajectories are

not necessary.

The spherical environment renderer is consistently used to generate images throughout

this work. This ensures the rendered images are the same as long as our neural network is

optimized to produce the underlying refractive index field.

3.2 Iterative Bending method

Physically, a light ray changes its direction according to Snell’s law when it passes through

the interface of two media with different refractive indices n.

In a varying temperature

field, the refractive index of the air, n, also changes continuously according to the tem-

perature [Haynes, 2016]. This variation causes an incoming ray to continuously change its

direction as it travels, resulting in distorted images. For example, as illustrated in Fig. 3.2,

the ray maintains its original direction when passing through a constant temperature field,

producing an undistorted image, whereas it bends and generates a distorted image when the

gradient of the temperature field is nonzero.

Figure 3.2: Light trajectories in media with constant (Left) or varying (Right) refraction
indices, resulting in undistorted or distorted view.

To numerically simulate the continuous light refraction, we discretize the medium and

divide it into many lattice cells. As the light enters a cell along the direction i (normalized),

it travels inside the cell in the same straight line along i, as if the cell is a homogeneous

medium. It is the difference in refractive indices among adjacent cells that causes the light

to bend. This bending is proportional to the gradient of the index of refraction, ∇n, which

can be discretized by taking differences of n among neighboring cells. As shown in Fig. 3.3,

the incident light is along i and the refracted light is along i′. The light direction is changed

from i to i′ (also normalized) when the light leaves the cell. The two vertices in Fig. 3.3

indicate the places where the light enters and leaves the discretized cell, within which it travels by

$d\boldsymbol{\ell} = \mathbf{i}\,d\ell$.

We treat $d\ell \equiv |d\boldsymbol{\ell}|$ as a small quantity (which can be chosen sufficiently small by making

increasingly finer lattice grids) and work only up to its first order. To this accuracy, we can

write i′ ∝ i + λ∇n, where the coefficient λ is proportional to dℓ. By normalizing it and only

Figure 3.3: Refraction of light in a discretized medium. The light ray enters the cell along $\mathbf{i}$,
travels for $d\boldsymbol{\ell} = \mathbf{i}\,d\ell$, and leaves along $\mathbf{i}'$, being bent in the direction of $\nabla n$.

keeping to the first order of λ, we have

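\[
\begin{aligned}
\mathbf{i}' &= \frac{\mathbf{i} + \lambda\nabla n}{\sqrt{(\mathbf{i} + \lambda\nabla n)^2}}
\simeq \frac{\mathbf{i} + \lambda\nabla n}{\sqrt{1 + 2\lambda\,(\mathbf{i}\cdot\nabla n)}} \\
&\simeq (\mathbf{i} + \lambda\nabla n)\,\bigl(1 - \lambda\,(\mathbf{i}\cdot\nabla n)\bigr) \\
&\simeq \mathbf{i} + \lambda\nabla n - \lambda\,\mathbf{i}\,(\mathbf{i}\cdot\nabla n)
= \mathbf{i} + \lambda\bigl[\nabla n - \mathbf{i}\,(\mathbf{i}\cdot\nabla n)\bigr],
\end{aligned}
\tag{3.1}
\]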

where at each step we consistently keep only the first-order terms of λ.

Now we use Snell’s law to solve for the value of λ.

In the simplest case with two neighboring homogeneous media, we have n sin θ = n′ sin θ′,

where θ and θ′ are the angles of the incident and refracted lights with respect to the normal.

In the language here, we do not want to introduce more surfaces besides those defined by

the cells. Therefore, we can express the θ and θ′ more conveniently as the angles between

the light directions i or i′ and the direction ∇n. Indeed, ∇n is the normal to the surface

between two media. That is, we have

\[
\cos\theta = \frac{\mathbf{i}\cdot\nabla n}{|\nabla n|}, \qquad
\cos\theta' = \frac{\mathbf{i}'\cdot\nabla n}{|\nabla n|}.
\tag{3.2}
\]

Correspondingly, the indices of refraction of the two media (the neighboring cells that the

light passes through) are, to the first order of $d\ell$, respectively,

\[
n, \qquad n' = n + \nabla n \cdot d\boldsymbol{\ell}.
\tag{3.3}
\]

Substituting Eq. (3.2) for the angles and Eq. (3.3) for the refractive indices into Snell’s

law,

\[
n \sin\theta = n' \sin\theta', \quad \text{or} \quad
n^2 (1 - \cos^2\theta) = n'^2 (1 - \cos^2\theta'),
\tag{3.4}
\]

we get

\[
n^2 \bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr]
= (n + \nabla n \cdot d\boldsymbol{\ell})^2 \bigl[(\nabla n)^2 - (\mathbf{i}'\cdot\nabla n)^2\bigr].
\tag{3.5}
\]

The left-hand side of Eq. (3.5) is already at zeroth order in dℓ. Now we expand the

right-hand side to the first order of dℓ, where the first factor is

\[
(n + \nabla n \cdot d\boldsymbol{\ell})^2 \simeq n^2 + 2n\,\nabla n \cdot d\boldsymbol{\ell}
= n^2 + 2n\,d\ell\,(\mathbf{i}\cdot\nabla n),
\tag{3.6}
\]

and the second factor is, by using Eq. (3.1),

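\[
\begin{aligned}
(\nabla n)^2 - (\mathbf{i}'\cdot\nabla n)^2
&\simeq (\nabla n)^2 - \bigl\{\mathbf{i}\cdot\nabla n + \lambda\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr]\bigr\}^2 \\
&\simeq (\nabla n)^2 - \bigl\{(\mathbf{i}\cdot\nabla n)^2 + 2\lambda\,(\mathbf{i}\cdot\nabla n)\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr]\bigr\} \\
&= \bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr] - 2\lambda\,(\mathbf{i}\cdot\nabla n)\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr].
\end{aligned}
\tag{3.7}
\]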

Multiplying Eqs. (3.6) and (3.7) together, the leading-order (zeroth-order) term in dℓ or λ is the same as

the left-hand side of Eq. (3.5). The terms linear in dℓ and λ must therefore cancel,

\[
-n^2\, 2\lambda\,(\mathbf{i}\cdot\nabla n)\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr]
+ 2n\,d\ell\,(\mathbf{i}\cdot\nabla n)\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr] = 0,
\tag{3.8}
\]

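which gives, after canceling the common factors,

\[
\lambda = \frac{d\ell}{n}.
\tag{3.9}
\]

Hence we obtain the ray-bending equation, returning to Eq. (3.1),

\[
\mathbf{i}' = \mathbf{i} + \frac{\nabla n - (\mathbf{i}\cdot\nabla n)\,\mathbf{i}}{n}\,d\ell.
\tag{3.10}
\]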
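As a quick numerical sanity check of the ray-bending equation, the sketch below applies Eq. (3.10) once and compares the Snell invariant n sin θ before and after the step. The helper names and the numerical values (direction, gradient, step length) are hypothetical and chosen only for illustration; θ is measured from the direction of ∇n, as in Eq. (3.2).

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def bend(i, grad_n, n, dl):
    """One application of Eq. (3.10): i' = i + (grad_n - (i . grad_n) i) / n * dl."""
    proj = dot(i, grad_n)
    return tuple(ic + (g - proj * ic) / n * dl for ic, g in zip(i, grad_n))

def snell_invariant(i, grad_n, n):
    """n * sin(theta), with theta the angle between the direction i and grad_n."""
    cos_t = dot(i, grad_n) / (math.sqrt(dot(grad_n, grad_n)) * math.sqrt(dot(i, i)))
    return n * math.sqrt(max(0.0, 1.0 - cos_t * cos_t))

# Hypothetical values for illustration only.
i = (0.6, 0.8, 0.0)             # unit incident direction
grad_n = (0.0, 0.01, 0.0)       # gradient of the refractive index
n, dl = 1.0, 0.01

i_new = bend(i, grad_n, n, dl)
n_new = n + dot(grad_n, i) * dl                     # Eq. (3.3)
before = snell_invariant(i, grad_n, n)              # n sin(theta)
after = snell_invariant(i_new, grad_n, n_new)       # n' sin(theta')
unbent = snell_invariant(i, grad_n, n_new)          # n' sin(theta), no bending
```

With these values, `before` and `after` should agree up to terms of second order in dℓ, whereas `unbent` differs from `before` already at first order. The updated direction also stays normalized to first order, as assumed in Eq. (3.1).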

We note that in this formulation, dℓ represents the distance that light travels within a

cell, extending from its entry to its exit, with each turning point located on the boundaries

of the lattice cell. This is not convenient to use literally because one would need extra effort

to determine from which sides of the cells the light enters and leaves. Therefore, instead of

calculating the refracted direction on the boundaries using the varying step size, we march

the ray with a fixed step size ds and calculate the refracted direction at each marching point

in the field. This approach effectively mimics the behavior of ray bending at the boundaries of

lattice cells, providing an equivalent outcome with simplified processing. The computational

error is negligible in a fine-grid lattice.

The detail is illustrated in Fig. 3.4, where the ray with initial direction $\mathbf{i}_0$ proceeds step

by step with a predetermined constant step size $ds$, which is chosen to be equal to the

grid cell length $dx$ for accuracy and efficiency. This yields a total of $M$ sampling points

$\{\mathbf{x}_0, \mathbf{x}_1, \cdots, \mathbf{x}_{M-1}\}$ along the ray. At each step, we calculate the refracted ray direction $\mathbf{i}_{k+1}$

Figure 3.4: Schematic of the Iterative Bending method in simulating the trajectory of a light
ray passing through a medium with continuously varying refractive indices.

from the current ray direction $\mathbf{i}_k$ (which is itself derived from $\mathbf{i}_{k-1}$ in the previous

step) using Eq. (3.10), in which the refractive index $n$ and its gradient $\nabla n$ at the current

location $\mathbf{x}_k$ (labeled by $k$ in Fig. 3.4) are computed by trilinear interpolation,

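\[
\mathbf{i}_{k+1} = \mathbf{i}_k + \frac{\nabla n(\mathbf{x}_k) - [\mathbf{i}_k\cdot\nabla n(\mathbf{x}_k)]\,\mathbf{i}_k}{n(\mathbf{x}_k)}\,ds,
\qquad (k = 0, 1, \cdots, M-1).
\tag{3.11}
\]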

This allows us to iteratively calculate the ray trajectory, by using

\[
\mathbf{x}_{k+1} = \mathbf{x}_k + \mathbf{i}_{k+1}\,ds,
\qquad (k = 0, 1, \cdots, M-2).
\tag{3.12}
\]

We refer to this method as the Iterative Bending (IB) method, which is described by

Eqs. (3.11) and (3.12). It starts from an initial direction $\mathbf{i}_0$ (entering the medium at $\mathbf{x}_0$)

and ends with a final direction $\mathbf{i}_M$, which is then fed to the spherical environment map for

rendering. The computation iteratively builds the ray trajectory array $\{\mathbf{x}_0, \mathbf{x}_1, \cdots, \mathbf{x}_{M-1}\}$,

which is necessary for obtaining the ray direction array $\{\mathbf{i}_1, \cdots, \mathbf{i}_M\}$ but is not needed in

the image rendering in the spherical environment map.
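The IB iteration of Eqs. (3.11) and (3.12) can be sketched as a short loop. This is a minimal illustration, not the implementation used in this work: the refractive index and its gradient are supplied as analytic callables (`n_func`, `grad_n_func`, both hypothetical names) rather than by trilinear interpolation on a lattice, and the linear test field with constants `N0` and `A` is invented for the example.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def ib_march(x0, i0, n_func, grad_n_func, ds, num_points):
    """Iterative Bending: returns the positions x_0..x_{M-1} and the final direction i_M."""
    x, i = tuple(x0), tuple(i0)
    xs = [x]
    for k in range(num_points):
        n = n_func(x)
        g = grad_n_func(x)
        proj = dot(i, g)
        # Eq. (3.11): update the direction using n and grad n at the current point.
        i = tuple(ic + (gc - proj * ic) / n * ds for ic, gc in zip(i, g))
        if k < num_points - 1:
            # Eq. (3.12): advance along the updated direction.
            x = tuple(xc + ic * ds for xc, ic in zip(x, i))
            xs.append(x)
    return xs, i

# Hypothetical test field: n grows linearly along y.
N0, A = 1.0003, 1.0e-3

def n_linear(x):
    return N0 + A * x[1]

def grad_n_linear(x):
    return (0.0, A, 0.0)

xs, i_out = ib_march((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                     n_linear, grad_n_linear, ds=0.01, num_points=181)
```

In this example the ray starts along +x and acquires a small positive y component, since it bends toward increasing n. Each direction update depends on the position produced by the previous step, so the loop is inherently sequential.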

3.3 Synthetic data generation

Our method requires a dataset consisting of images of a static scene that is affected by

heat shimmering. However, it is difficult to acquire a real-world dataset that adequately

captures the multi-view heat-shimmering scene. Therefore, we only utilize synthetic data in

this work. To do that, we first introduce a temperature field on a discretized domain and

then employ the ray-bending equation Eq. (3.11) to simulate the light refraction in the field.

The trajectories of these rays from different viewpoints are calculated using Eq. (3.12) in the

IB method and the final output directions of the refracted rays are passed to the spherical

environment renderer to produce the images that correspond to the input temperature field.

This approach allows for an accurate depiction of heat-induced visual distortions in the

rendered scene.

Specifically, the temperature field is constructed in a cube of dimension L × L × L, with

N = 101 points sampled along each dimension, evenly spaced by dx = L/(N − 1). This will

be taken as an input in our model. The refractive index field is determined on each lattice

point by using the empirical formula for the air’s index of refraction [Haynes, 2016] based

on the Ciddor Equation [Ciddor, 1996],

\[
n(T) = 1 + (n_{\mathrm{air}} - 1)\,
\frac{c_1 P \left[1.0 + P\,(60.1 - 0.972\,T)\cdot 10^{-10}\right]}{1.0 + c_2 T},
\tag{3.13}
\]

which depends on the temperature T (in Celsius) and pressure P of the air (in Pascals),

and where $c_1 = 0.0000104$ and $c_2 = 0.00366$ are two constants, and $n_{\mathrm{air}} = 1.000293$ is

the air’s index of refraction at 0 °C and 1 atm. We take $P = 101325$ Pa. Although

Eq. (3.13) only applies to the temperature range −40 °C to 100 °C, we simply extend it to

arbitrary temperature values since our goal is to prove the effectiveness of our model instead

of simulating the precise relationship between the refractive index and temperature.
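Eq. (3.13) translates directly into code. The sketch below uses the constants quoted in the text; the function and constant names are hypothetical, and no range check is applied, in line with the decision above to extend the formula to arbitrary temperatures.

```python
N_AIR = 1.000293      # n_air as quoted in the text
C1 = 0.0000104        # c1 in Eq. (3.13)
C2 = 0.00366          # c2 in Eq. (3.13)
P_ATM = 101325.0      # pressure in pascals

def refractive_index(T, P=P_ATM):
    """Refractive index of air from Eq. (3.13); T in degrees Celsius, P in pascals."""
    pressure_term = C1 * P * (1.0 + P * (60.1 - 0.972 * T) * 1e-10)
    return 1.0 + (N_AIR - 1.0) * pressure_term / (1.0 + C2 * T)

n_cold = refractive_index(0.0)
n_room = refractive_index(20.0)
n_hot = refractive_index(100.0)
```

Evaluating the function at a few temperatures shows n decreasing with T while staying within a few parts in 10^4 of unity, which is the behavior the heat-shimmering simulation relies on.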

Figure 3.5: The air’s refractive index n (a) and its derivative dn/dT (b) as functions of the
temperature T (in °C).

From Eq. (3.13), the gradient of the refractive index field can then be determined at each

lattice point from that of the temperature field,

\[
\nabla n = \frac{dn}{dT}\,\nabla T.
\tag{3.14}
\]

For a typical range of T , the n(T ) and its derivative dn/dT are given in Fig. 3.5. As one

can immediately notice, the value of n is close to nair for a wide range of T , and thus dn/dT

has a very small magnitude. This poses a severe challenge to our finite-precision simulation.

Our purpose, however, is not to simulate a fully realistic temperature field, but rather

to examine how well our model can reconstruct the 3D refractive index

field from multi-view images; it is the light refraction effect itself that is more important.

Therefore, in our image rendering, we manually multiply the gradient ∇n by 10 to make the

heat-shimmering effect more noticeable throughout this work.
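The chain rule of Eq. (3.14) and the factor of 10 can be sketched as follows. The derivative dn/dT is obtained here by differentiating Eq. (3.13) analytically with the quotient rule. The function names and the `scale` parameter are hypothetical, and this work may well compute the derivative differently.

```python
N_AIR, C1, C2, P_ATM = 1.000293, 0.0000104, 0.00366, 101325.0

def refractive_index(T, P=P_ATM):
    """Eq. (3.13); T in degrees Celsius, P in pascals."""
    return 1.0 + (N_AIR - 1.0) * C1 * P * (1.0 + P * (60.1 - 0.972 * T) * 1e-10) / (1.0 + C2 * T)

def dn_dT(T, P=P_ATM):
    """Analytic derivative of Eq. (3.13) with respect to T (quotient rule)."""
    amp = (N_AIR - 1.0) * C1 * P
    f = 1.0 + P * (60.1 - 0.972 * T) * 1e-10    # numerator bracket
    df = -0.972 * P * 1e-10
    g = 1.0 + C2 * T                            # denominator
    dg = C2
    return amp * (df * g - f * dg) / (g * g)

def grad_n(T, grad_T, scale=10.0, P=P_ATM):
    """Eq. (3.14) multiplied by the visualization factor (10 in the text)."""
    slope = scale * dn_dT(T, P)
    return tuple(slope * c for c in grad_T)

# Compare the analytic derivative with a central finite difference at 20 degrees Celsius.
h = 1e-3
fd = (refractive_index(20.0 + h) - refractive_index(20.0 - h)) / (2.0 * h)
analytic = dn_dT(20.0)
g_scaled = grad_n(20.0, (1.0, 0.0, 0.0))
```

At room temperature the derivative is negative and of order 10^-6 per degree Celsius, which illustrates why the unscaled gradient produces distortions too small to see.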

For the generation of our dataset, we render 516 images for the training set, 18 images

for the validation set for parameter tuning, and 216 images for the testing set. Each of these

images is captured by a camera consistently oriented around the center of the field with a

rotation radius of L, as depicted in Fig. 3.6. It captures images from various positions, with

its polar angle ranging from −90° to 90° and azimuthal angle ranging from 0° to 360°.

Figure 3.6: Schematic of the ground-truth dataset generation.

At each viewpoint, we march camera rays using the IB method through the refractive

index field corresponding to the input temperature field. As described in Fig. 3.4, the step

size ds is selected as the grid size dx for both accuracy and efficiency. This results in M = 181

points on each ray from the near plane of 0.1 × L to the far plane of 1.9 × L. The output

directions $\mathbf{i}_M$ for all rays from this viewpoint are rendered by the spherical environment

renderer and the produced image is taken as a ground truth image of our model.
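The sample count M = 181 follows from the lattice spacing and the near and far planes, as the short sketch below reproduces. The function name and the use of rounding to absorb floating-point error are assumptions of the sketch.

```python
def ray_sample_points(L, N=101, near=0.1, far=1.9):
    """Distances of the sample points along a ray for a cube of side L with N lattice points per side."""
    dx = L / (N - 1)                              # grid spacing, dx = L / (N - 1)
    ds = dx                                       # marching step equals the grid size
    M = round((far - near) * L / ds) + 1          # number of points from near to far plane
    distances = [near * L + k * ds for k in range(M)]
    return ds, M, distances

ds, M, distances = ray_sample_points(L=1.0)
```

For L = 1 this gives ds = 0.01 and M = 181, with the last sample at a distance of 1.9 L from the camera. One plausible reading of the chosen planes: since the camera sits at distance L from the center, the interval from 0.1 L to 1.9 L spans a sphere of radius 0.9 L around the center, slightly larger than the half-diagonal (√3/2) L ≈ 0.866 L of the cube.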

One example of the rendered images in our dataset is shown in Fig. 3.7 (a), which has

been rendered with the gradient field ∇n multiplied by 10, as previously discussed, to make

the heat shimmering effect more visible. Without it, the image in Fig. 3.7(b) rendered with

the original ∇n field is not distinguishable from the undistorted image within our finite

numerical precision.

Figure 3.7: An example of the synthetic image rendered using the IB method, with the
gradient ∇n of the refractive index field multiplied by (a) 10 or (b) 1.

3.4 Non-Translating method

We can gain more insights into the rendering of shimmering effects by visualizing the light

ray trajectories in the IB method. This is shown in Fig. 3.8, where we find that, strikingly,

the points sampled along the curved light rays nearly align with those on straight lines.

This suggests that while the bending effects are sufficient to render the shimmering view in

Fig. 3.7, their deviations are minimal. This observation motivates a simplified treatment for

the ray bending simulation, which we refer to as the Non-Translating (NT) method, where

we disregard the translations of the points on the bending rays with respect to the ones on

straight rays.

Physically, the light ray bends according to the IB method, as described in Eqs. (3.11)

and (3.12). At each iterative step $k$, the ray direction changes from $\mathbf{i}_k$ to $\mathbf{i}_{k+1}$ by

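\[
\boldsymbol{\delta}_k = \mathbf{i}_{k+1} - \mathbf{i}_k
= \frac{\nabla n(\mathbf{x}_k) - [\mathbf{i}_k\cdot\nabla n(\mathbf{x}_k)]\,\mathbf{i}_k}{n(\mathbf{x}_k)}\,ds,
\qquad (k = 0, 1, \cdots, M-1),
\tag{3.15}
\]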

where ds is the marching step size. As emphasized in Eq. (3.15), in the IB method, the n and

Figure 3.8: Points sampled in the light ray trajectories simulated by the IB method (with
the gradient ∇n of the refractive index field multiplied by 10). Different trajectories are
labeled by different colors, with bigger colorful scatter points representing the bent rays and
smaller black ones the straight ray counterparts.

$\nabla n$ at step $k$ are evaluated at the current position $\mathbf{x}_k$, which is again iteratively calculated

together with $\mathbf{i}_k$, cf. Eq. (3.12),

\[
\mathbf{x}_k = \mathbf{x}_{k-1} + \mathbf{i}_k\,ds = \mathbf{x}_0 + ds \sum_{j=1}^{k} \mathbf{i}_j.
\tag{3.16}
\]

Since we have found that the light ray $\mathbf{x}_0 \to \mathbf{x}_1 \to \cdots \to \mathbf{x}_{M-1}$ does not appreciably

differ from a straight line, we can neglect this difference when evaluating $n$ and $\nabla n$. That is,

we evaluate $n$ and $\nabla n$ at evenly spaced points along the straight line determined by $(\mathbf{x}_0, \mathbf{i}_0)$,

\[ n(x_k) \simeq n(x_0 + k\, i_0\, ds), \qquad \nabla n(x_k) \simeq \nabla n(x_0 + k\, i_0\, ds). \tag{3.17} \]


Furthermore, we can neglect the difference of ik from i0 in Eq. (3.15),

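\[ \delta_k \simeq \frac{\nabla n(x_k) - [i_0 \cdot \nabla n(x_k)]\, i_0}{n(x_k)}\, ds \simeq \frac{\nabla n(x_0 + k\, i_0\, ds) - [i_0 \cdot \nabla n(x_0 + k\, i_0\, ds)]\, i_0}{n(x_0 + k\, i_0\, ds)}\, ds. \tag{3.18} \]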

The errors caused by Eqs. (3.17) and (3.18) are only of second order in ds.

Eqs. (3.17) and (3.18) constitute the NT method.

It has the advantage that the set

of locations {xk} to evaluate n and ∇n can be predetermined, thereby allowing a parallel

computation of the set {δk|k = 0, 1, · · · , M − 1}. Therefore, the final outgoing ray direction

iM can be calculated in a single (vectorized) step,

\[ i_M = i_0 + \sum_{k=0}^{M-1} \delta_k. \tag{3.19} \]
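To make the contrast between the two schemes concrete, the following sketch marches a single ray through a toy index field, once with the iterative update of Eq. (3.15) and once with the vectorized NT sum of Eqs. (3.18) and (3.19). The Gaussian bump, its amplitude `A` and width `S`, the starting point, and all function names are illustrative assumptions rather than the fields or code used in this thesis, and the direction vector is left unnormalized for simplicity.

```python
import numpy as np

A, S = 1e-4, 1.0  # hypothetical amplitude and width of a toy index bump


def n_field(x):
    """Toy refractive index n(x) = 1 + A exp(-|x|^2 / 2S^2); returns shape (..., 1)."""
    r2 = np.sum(x**2, axis=-1, keepdims=True)
    return 1.0 + A * np.exp(-r2 / (2 * S**2))


def grad_n_field(x):
    """Analytic gradient of the toy field; returns shape (..., 3)."""
    return -(n_field(x) - 1.0) * x / S**2


def bend(x, i, ds):
    """Directional change: part of grad n perpendicular to i, divided by n, times ds."""
    g = grad_n_field(x)
    along = np.sum(i * g, axis=-1, keepdims=True)
    return (g - along * i) / n_field(x) * ds


def march_ib(x0, i0, ds, M):
    """Iterative scheme: each step uses the current position and direction."""
    x, i = x0.copy(), i0.copy()
    for _ in range(M):
        i = i + bend(x, i, ds)   # Eq. (3.15)
        x = x + i * ds           # Eq. (3.16)
    return i


def march_nt(x0, i0, ds, M):
    """NT scheme: all deltas are evaluated on the straight line, then summed."""
    k = np.arange(M)[:, None]
    points = x0 + k * i0 * ds        # predetermined sample points, shape (M, 3)
    deltas = bend(points, i0, ds)    # Eq. (3.18), one vectorized call
    return i0 + deltas.sum(axis=0)   # Eq. (3.19)


x0 = np.array([-5.0, 0.3, 0.0])
i0 = np.array([1.0, 0.0, 0.0])
i_ib = march_ib(x0, i0, ds=0.01, M=1000)
i_nt = march_nt(x0, i0, ds=0.01, M=1000)
print(i_nt[1], np.linalg.norm(i_ib - i_nt))
```

For this weak toy field the two output directions agree to well below the size of the deflection itself, which is the behavior the NT approximation relies on.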

While this is all one needs in the spherical environment renderer as described in Sec. 3.1, one

does not lose the information of ray trajectories, which are simply given by Eq. (3.16), just

with the ij replaced by that calculated in the NT method,

\[ \begin{aligned} x_k &= x_0 + ds \sum_{j=1}^{k} (i_0 + \delta_0 + \cdots + \delta_{j-1}) \\ &= x_0 + k\, i_0\, ds + ds\, (k, k - 1, \cdots, 1) \cdot (\delta_0, \delta_1, \cdots, \delta_{k-1}), \end{aligned} \tag{3.20} \]

where in the second line, the first two terms just represent the straight line, and the rest

is the deviation which has been written in a manifestly vectorized way. In fact, Eq. (3.20)

applies to both IB and NT methods; the only difference is that in the NT method, the set

{δj} is calculated using Eq. (3.18) as a predetermined set, whereas the IB method relies on

an iterative use of Eq. (3.15) so that Eq. (3.20) is not particularly useful.
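The two lines of Eq. (3.20) can be checked against each other numerically. The sketch below is illustrative only: the function names are hypothetical and the bending vectors are random stand-ins rather than values computed from a refractive index field. It builds the trajectory once by accumulating directions and positions, and once from the straight line plus the weighted sum with weights (k, k − 1, ..., 1).

```python
import numpy as np


def trajectory_cumsum(x0, i0, deltas, ds):
    """First line of Eq. (3.20): accumulate directions, then positions."""
    directions = i0 + np.cumsum(deltas, axis=0)      # i_1, ..., i_M
    return x0 + ds * np.cumsum(directions, axis=0)   # x_1, ..., x_M


def trajectory_weighted(x0, i0, deltas, ds):
    """Second line of Eq. (3.20): straight line plus weighted sum of deltas."""
    M = len(deltas)
    k = np.arange(1, M + 1)[:, None]     # k = 1, ..., M as a column
    m = np.arange(M)[None, :]            # index m of delta_m as a row
    weights = np.clip(k - m, 0, None)    # row k holds (k, k-1, ..., 1, 0, ..., 0)
    return x0 + k * i0 * ds + ds * (weights @ deltas)


rng = np.random.default_rng(0)
deltas = 1e-5 * rng.standard_normal((50, 3))   # stand-in for NT bending vectors
x0, i0 = np.zeros(3), np.array([0.0, 0.0, 1.0])
xa = trajectory_cumsum(x0, i0, deltas, ds=10.0)
xb = trajectory_weighted(x0, i0, deltas, ds=10.0)
print(np.allclose(xa, xb))
```

The weighted form needs an M × M weight matrix, so for long rays the cumulative-sum form is the cheaper of the two equivalent expressions.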


Figure 3.9: Schematic of simulating the trajectory of a light ray passing through a medium
with continuously varying refractive indices using both the IB (red circular points labeled)
and the NT (yellow circular points labeled) methods.

To illustrate the difference of trajectories between the IB method and the NT method,

we provide a schematic representation in Fig. 3.9, where the bending effect, and hence the

trajectory difference, have been exaggerated for clarity. The δk’s indicated in Fig. 3.9 refer

to those calculated in the NT method at the evenly spaced points (labeled as blue solid

dots) on the straight line along i0, as given in Eq. (3.18). They are used to construct the

NT ray trajectory using Eq. (3.20). In Fig. 3.9, the δ4, δ5, and δ6 are all zero because the

corresponding points on the straight lines are located outside the varying temperature field.

As a result, the ray trajectory of the NT method remains straight after x3. In contrast,

the trajectory of the IB method is constructed iteratively with the δk evaluated locally at

the actual point xk along the trajectory. This gives a nonzero δ4 in Fig. 3.9 because the x4

is situated inside the temperature field, and the resultant trajectory becomes straight only

after x4.

In practice, the situation in this example rarely arises in heat-induced shimmering.

First, if the temperature field varies so abruptly that the sampled points can fall in the

field in one method but not in the other, we should refine the discretization lattice and sample

more points. Second, since we have found that the actual IB trajectories almost align with


straight lines, there is no reason for the NT trajectories to differ significantly from them.

We therefore expect the NT method to work for both the output ray directions and the

constructed ray trajectories.

3.4.1 Comparative analysis of the IB and NT methods

The accuracy of images generated by the IB and the NT methods is compared in Fig. 3.10

using a specific example, where images (a)–(c) use the output ray directions from

the IB method, and (d)–(f) use those from the NT method. All other conditions such as

the input temperature field, the number of sampled points on each ray (or the incremental

distance ds), the rendering factor applied to ∇n, and the spherical environment renderer,

are kept consistent for an effective comparison.


Figure 3.10: Images generated by the spherical environment renderer using the output ray
directions from (a–c) the IB method and (d–f) the NT method, respectively, both with a
rendering factor of 10.


As is evident from Fig. 3.10, the visual difference between the images generated from the

two methods is negligible. Their quantitative differences, as measured using MSE, PSNR,

MAE, LPIPS, and SSIM metrics, are detailed in Table 3.1. These underscore the accuracy

of the NT method which obtains the final output ray directions in a single step based on

predetermined bending vectors {δk} along a straight line. As discussed in Sec. 3.3, the

small gradient of the refractive index field ensures only tiny deviations of the bending rays.

The NT approach not only simplifies the computations but also preserves accuracy. It is

therefore well suited for use in neural network training.

Table 3.1: Performance metrics comparing the IB and NT methods across different images.

Image                    MSE             PSNR    MAE       LPIPS      SSIM
Building                 1.67 × 10^−6    57.76   0.00056   0.000133   0.9999
People on a street       3.12 × 10^−6    55.05   0.00070   0.000235   0.9998
Trees on a street        3.52 × 10^−6    54.54   0.00079   0.000408   0.9997
Mean of validation set   1.15 × 10^−6    60.31   0.00050   0.00010    0.9999
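As a reading aid for Table 3.1, the pixel-wise metrics can be sketched as follows. The helper names are hypothetical, and the peak value of 1 (images scaled to [0, 1]) is an assumption, although the per-image MSE and PSNR entries in the table are consistent with it. LPIPS and SSIM require dedicated implementations and are not reproduced here.

```python
import numpy as np


def mse(a, b):
    """Mean squared error over all pixels and channels."""
    return float(np.mean((a - b) ** 2))


def mae(a, b):
    """Mean absolute error over all pixels and channels."""
    return float(np.mean(np.abs(a - b)))


def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB; peak=1.0 assumes images in [0, 1]."""
    return float(10.0 * np.log10(peak**2 / mse(a, b)))


# Synthetic pair whose MSE equals the 'Building' entry of Table 3.1.
img_a = np.zeros((4, 4, 3))
img_b = np.full((4, 4, 3), np.sqrt(1.67e-6))
print(mse(img_a, img_b), psnr(img_a, img_b))
```

Note that the PSNR of the mean MSE generally differs from the mean of the per-image PSNRs, which is why the last row of the table need not satisfy the same relation.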

3.5 Neural Refractive index field representation

We aim to reconstruct the 3D neural field of refractive index using the dataset generated

from multiple views. From each viewing direction, we march the rays through the field as

shown in Fig. 3.11(a). As discussed in Sec. 3.4, the IB method is considerably slower

than the NT method, and the ray trajectories bend only minimally. We therefore

use as inputs to our Temperature Neural Network (TempNN) the predetermined

points along the camera’s (straight) viewing rays, linearly spaced according to the grid cell

length ds = dx. This network is specifically designed to estimate the temperature T at each

of these points and then convert it to the refractive index n using Eq. (3.13) as shown in

Fig. 3.11(a). The refractive index n at each point and its neighboring points generated by

TempNN are then used to calculate the gradient ∇n by Eq. (3.14).


Figure 3.11: An overview of our neural refractive index field representation and differentiable
rendering procedure.

As shown in Fig. 3.11(b), the directional changes δ at all points along each ray are

calculated in parallel using the NT formula in Eq. (3.18). This enhances the

computational efficiency by avoiding an iterative calculation of the directional changes step by

step. The output ray direction is calculated based on Eq. (3.19), which sums the directional

changes δ at every point along its path to the initial ray direction i0. In Fig. 3.11 (c), these

output ray directions are fed to the spherical environment renderer to generate an image.

By minimizing the discrepancy between the rendered image and its corresponding ground

truth, the underlying refractive index field can be optimized through the process illustrated

in Fig. 3.11 (d), which thereby validates the effectiveness of the NT method.

3.5.1 Temperature Neural Network - TempNN

In contrast to the NeRF, which requires complex input data including both locations and

directions, we only input a continuous 3D location x = (x, y, z) to our TempNN, because

the viewing direction does not influence the temperature. The TempNN utilizes a multilayer

perceptron (MLP) architecture to estimate a single scalar output, the temperature T , as

shown in Fig. 3.12. This temperature value is then transformed into the refractive index n,

which is subsequently used in ray bending calculations. The choice to model the temperature

field directly, rather than the refractive index field, ensures consistency with the generation of


ground truth images, thereby facilitating a more intuitive understanding and alignment with

the input data. It is important to note that the underlying model to predict the temperature

or refractive index remains the same; the key is to fine-tune the learning rate for optimal

performance. In this work, we use a compact MLP with 6 fully connected layers, each with

128 units employing ReLU activations.

Figure 3.12: Schematic of the architecture of the fully connected network.

It has been shown that encoding coordinates instead of directly using them as inputs

enhances the model’s ability to capture continuous fields more accurately [Tancik et al.,

2020]. Therefore, we employ positional encoding that projects each coordinate onto a series

of sinusoids with exponentially increasing frequencies:

\[ \gamma(x) = \left[ \sin(x), \cos(x), \ldots, \sin\left(2^{n-1} x\right), \cos\left(2^{n-1} x\right) \right]^{T}. \tag{3.21} \]

The positional encoding significantly influences the interpolation kernel of the MLP, with

the parameter n setting the kernel’s bandwidth. The input dimension in Fig. 3.12 is

derived from both the location x and its positional encoding with n = 10. This specific

setting helps TempNN more effectively capture and adapt to higher frequency variations

within the data, enhancing its overall performance in modeling the temperature field.
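A minimal sketch of the encoding in Eq. (3.21) is given below, assuming that it is applied to each of the three coordinates and that the raw location is concatenated with the encoded features, which gives 3 + 3 · 2 · 10 = 63 input values for n = 10. The function name, the feature ordering, and the absence of any coordinate normalization are assumptions of this sketch, not details confirmed by the text.

```python
import numpy as np


def positional_encoding(x, n_freqs=10, include_input=True):
    """Encode points of shape (..., 3) with sinusoids at frequencies 2^0 ... 2^(n-1)."""
    x = np.asarray(x, dtype=float)
    freqs = 2.0 ** np.arange(n_freqs)                            # (n,)
    scaled = x[..., None, :] * freqs[:, None]                    # (..., n, 3)
    enc = np.stack([np.sin(scaled), np.cos(scaled)], axis=-2)    # (..., n, 2, 3)
    enc = enc.reshape(*x.shape[:-1], -1)                         # (..., 6n)
    return np.concatenate([x, enc], axis=-1) if include_input else enc


pts = np.array([[0.0, 0.5, -1.0]])
feats = positional_encoding(pts)
print(feats.shape)
```

The features are grouped by frequency rather than by coordinate here; since the first MLP layer is fully connected, the ordering does not change what the network can represent.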


An important aspect to highlight is the calculation of the temperature gradient ∇T ,

which is subsequently used to determine ∇n using Eq. (3.14). The temperature gradient at

a point x is computed by the central differencing method between its two neighboring points

as shown in Fig. 3.11(a),

\[ (\nabla T)_i = \frac{T(x + a_i e_i) - T(x - a_i e_i)}{2 a_i}, \qquad (i = x, y, z), \tag{3.22} \]

where ei represents the unit vector, and ai is selected as the cell length along the i-axis for

consistency. This approach allows us to accurately compute the gradient at any location in

the field.
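The central difference of Eq. (3.22) can be sketched for any callable temperature field as follows. Here `T` is a stand-in analytic function rather than TempNN, and the function name and the choice a = 10 (the cell length used later in Sec. 4.2.1) are illustrative assumptions.

```python
import numpy as np


def central_gradient(T, x, a):
    """Central-difference gradient of a scalar field T at points x of shape (..., 3)."""
    x = np.asarray(x, dtype=float)
    a = np.broadcast_to(np.asarray(a, dtype=float), (3,))   # step a_i per axis
    grad = np.empty(x.shape)
    for axis in range(3):
        step = np.zeros(3)
        step[axis] = a[axis]                                 # a_i e_i
        grad[..., axis] = (T(x + step) - T(x - step)) / (2.0 * a[axis])
    return grad


def toy_T(p):
    """Hypothetical smooth field: T = x^2 + 3y - 2z."""
    return p[..., 0] ** 2 + 3.0 * p[..., 1] - 2.0 * p[..., 2]


pts = np.array([[1.0, 2.0, 3.0], [-4.0, 0.0, 5.0]])
print(central_gradient(toy_T, pts, a=10.0))
```

Each gradient costs six extra field evaluations per point, but the expression is built only from field outputs, so it remains differentiable with respect to the network weights when `T` is a neural network.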

3.5.2 Optimization

As outlined in Section 3.5.1, TempNN employs an MLP architecture to produce a continuous

temperature field, denoted as T̂θ. This field and its gradient ∇x T̂θ, which correspond to the

refractive index n̂θ and its gradient ∇x n̂θ (as per Eqs. (3.13) and (3.14)), are parameterized by

weights θ. These parameters directly influence the directional changes δ in the NT method

and consequently, the rendered image Îθ. The workflow, governed by θ, is illustrated in

Figure 3.11. Our training objective is to minimize the mean square error between the images

produced by TempNN using the NT method and the ground truth images obtained by the

IB method. The loss function is defined as

\[ L(\theta) = \frac{1}{N} \sum_{i,j} \| \hat{I}_\theta(i, j) - I(i, j) \|_2^2 + \lambda R(\theta), \tag{3.23} \]

which amounts to taking an average of the squared difference between the images Îθ generated

by TempNN and the ground truth images I at each pixel (i, j). N is the total number of

pixels in these images. The regularization term R(θ) imposes a boundary condition (BC) on

specific regions of the temperature field produced by TempNN, with λ controlling the strength

of the regularization applied.
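The structure of Eq. (3.23) can be illustrated with a few lines of NumPy. This is a sketch under assumptions: images are taken to be arrays of shape (H, W, 3), the function name is hypothetical, and the regularization value is passed in as a precomputed number rather than derived from a network.

```python
import numpy as np


def image_loss(rendered, target, reg=0.0, lam=0.0):
    """Mean over pixels of the squared L2 color difference, plus lam * reg."""
    per_pixel = np.sum((rendered - target) ** 2, axis=-1)   # ||I_hat - I||_2^2 per pixel
    return float(np.mean(per_pixel) + lam * reg)            # 1/N sum + lambda R


rendered = np.zeros((2, 2, 3))
target = np.ones((2, 2, 3))
print(image_loss(rendered, target))                    # data term only (lambda = 0)
print(image_loss(rendered, target, reg=2.0, lam=0.5))  # with a regularization term
```

In actual training the same expression would be written with differentiable tensors so that gradients can flow back to θ.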

In optimizing TempNN, we prioritize the implementation and evaluation of various BCs

to boost both the performance and stability. As the ray bending is mainly governed by the

refractive index gradient ∇n, a uniform shift of the field across the entire domain leaves the visual output nearly unchanged.

Thus, we can implement specific BCs, such as setting the ambient temperature, to resolve

ambiguities related to the refractive index effectively.

• No BC (λ = 0).

Employing an ill-posed BC can hinder the learning process, potentially leading to

suboptimal or distorted results. To test the ability of TempNN to capture the field

pattern of the refractive index, we use the loss function L(θ) without any BC as

a reference. Furthermore, the refractive index ratio, a measurable physical quantity,

allows for the rescaling of the TempNN-generated refractive index to match the ground

truth average:

\[ \hat{n}_s(i, j, k) = \hat{n}(i, j, k) \times \frac{\langle n \rangle}{\langle \hat{n} \rangle}, \tag{3.24} \]

where n̂s and n̂ represent the rescaled and original refractive indices at each grid point

(i, j, k). The factor ⟨n⟩ / ⟨n̂⟩ is calculated as the mean ⟨n⟩ of the ground truth refractive

index divided by the mean ⟨n̂⟩ of the generated refractive index. This adjustment is

feasible because the simulation typically assumes a stable ambient temperature, as

usually encountered when observing heat shimmering effects from a distance.

• Ambient temperature on the grid points of the grid boundary.

We apply a fixed ambient temperature to the grid points on all six sides of the

cube-shaped lattice field, enhancing stability and ensuring consistency with physical boundaries:

\[ R_{(i,j,k) \in \Omega_1}(\theta) = \frac{1}{N} \sum_{i,j,k} \| \hat{T}_\theta(i, j, k) - T(i, j, k) \|_2^2, \tag{3.25} \]

where Ω1 is the set of grid points of the grid boundary. The T and T̂θ are the temperature

fields in the ground truth and TempNN, respectively.

• Ambient temperature on all the points on and outside of the grid.

For points on and outside the cube’s boundaries, we enforce the ambient boundary

temperature using mean square loss to simulate an extended stable environment:

\[ R_{(i,j,k) \in \Omega_2}(\theta) = \frac{1}{N} \sum_{i,j,k} \| \hat{T}_\theta(i, j, k) - T(i, j, k) \|_2^2, \tag{3.26} \]

where Ω2 is the set of all points on and outside the grid boundary.
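The boundary set Ω1 and the regularizer of Eq. (3.25) can be sketched on a small lattice as follows. The function names are hypothetical, and averaging over the number of points in Ω1 is an assumption about the normalization N in Eq. (3.25). The Ω2 variant of Eq. (3.26) would use the same regularizer with additional sample points outside the cube.

```python
import numpy as np


def boundary_mask(N):
    """Boolean (N, N, N) mask that is True on the six faces of the lattice cube."""
    idx = np.arange(N)
    edge = (idx == 0) | (idx == N - 1)
    return edge[:, None, None] | edge[None, :, None] | edge[None, None, :]


def bc_regularizer(T_pred, T_true, mask):
    """Mean squared temperature mismatch restricted to the masked points."""
    diff = (T_pred - T_true)[mask]
    return float(np.mean(diff ** 2))


N = 5                                  # small lattice for illustration
mask = boundary_mask(N)
T_true = np.full((N, N, N), 10.0)      # hypothetical ambient temperature field
T_pred = T_true + 2.0 * mask           # prediction that is off by 2 on the boundary only
print(mask.sum(), bc_regularizer(T_pred, T_true, mask))
```

For the N = 101 lattice used later in this thesis, the same mask would select 101³ − 99³ = 60002 boundary points.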

The entire optimization process is differentiable, allowing for efficient use of gradient

descent methods. The neural network is implemented using PyTorch and trained with the

Adam optimizer, with an exponential learning rate decay from 3 × 10−4 to 3 × 10−6 over 600

to 800 epochs. This setup gives stable training and helps the model achieve accurate novel

view synthesis while converging to a solution that faithfully reproduces the refractive index

fields.
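The learning rate schedule described above can be sketched in plain Python. The per-epoch decay and the choice of 600 total epochs are assumptions made for illustration; the text specifies only the start and end values and a range of 600 to 800 epochs. The resulting factor `gamma` is the kind of value one would pass to PyTorch's `torch.optim.lr_scheduler.ExponentialLR`, although the text does not state which scheduler was used.

```python
def exponential_lr(epoch, lr_start=3e-4, lr_end=3e-6, total_epochs=600):
    """Learning rate after `epoch` epochs of exponential decay from lr_start to lr_end."""
    gamma = (lr_end / lr_start) ** (1.0 / total_epochs)   # per-epoch decay factor
    return lr_start * gamma ** epoch


for epoch in (0, 300, 600):
    print(epoch, exponential_lr(epoch))
```

Because the decay is geometric, the rate at the halfway point is the geometric mean of the two endpoints, 3 × 10^−5.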

3.5.3 Evaluation of reconstructed fields

While the primary focus of training TempNN involves minimizing image discrepancies as

previously discussed, we additionally assess the accuracy of physical field reconstructions

during post-training. The performance of the reconstructed refractive index field is evaluated


at the lattice grid points using the root mean square error (RMSE), which is defined as:

\[ L_n(\theta) = \sqrt{ \frac{1}{N} \sum_{i,j,k} \| \hat{n}_\theta(i, j, k) - n(i, j, k) \|_2^2 }, \tag{3.27} \]

where n̂θ denotes the refractive index predicted by TempNN, n is the ground truth refractive

index, and N represents the total number of grid points on the lattice cube.

This metric provides critical insight into the model’s effectiveness in accurately

simulating the refractive index field, offering a comprehensive view of TempNN’s capabilities

beyond image synthesis. Such evaluations are crucial for validating the physical accuracy of

the simulations produced by TempNN.
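The interplay between the RMSE of Eq. (3.27) and the mean-value rescaling of Eq. (3.24) can be illustrated with synthetic data. The arrays below are made up for the example: the "prediction" is simply the ground truth multiplied by a constant factor, the kind of global ambiguity that the rescaling is meant to remove. The function names are hypothetical.

```python
import numpy as np


def rmse(n_pred, n_true):
    """Root mean square error over all grid points, Eq. (3.27)."""
    return float(np.sqrt(np.mean((n_pred - n_true) ** 2)))


def rescale_to_mean(n_pred, n_true_mean):
    """Mean-value rescaling, Eq. (3.24): multiply by <n> / <n_hat>."""
    return n_pred * (n_true_mean / np.mean(n_pred))


rng = np.random.default_rng(0)
n_true = 1.00027 + 1e-5 * rng.random((11, 11, 11))   # synthetic ground truth
n_pred = n_true * (1.0 + 1e-4)                       # same pattern, wrong overall scale
n_resc = rescale_to_mean(n_pred, np.mean(n_true))
print(rmse(n_pred, n_true), rmse(n_resc, n_true))
```

In this constructed case the rescaling removes essentially all of the error, because the only discrepancy is the overall scale; errors in the field pattern itself are not affected by it.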


Chapter 4

Results and Discussion

In order to test the quality of the novel view synthesis with the heat shimmering effect and the

reconstruction of the field of refractive index, we examine the performance of our system with

the dataset generated by different kinds of temperature fields. Since any distribution can be

decomposed into sinusoidal wavelets, we select sinusoidal distributions with different wave

numbers, supplemented with proper boundary conditions profiled by a Gaussian distribution.

The resulting profile is also known as a Morlet wavelet (or Gabor wavelet).

As discussed in Sec. 3.3, we maintain a fixed number of grid points, N , along each

dimension of the cube-shaped field with length L to enhance computational efficiency.

Consequently, ray bending remains invariant to scaling of the field: extending L by a factor of α

decreases the gradient by the same factor while increasing the step size ds by α, thus maintaining the consistency of

bending regardless of the field size. By fixing N , the field effectively becomes dimensionless,

allowing for consistent modeling of ray behavior irrespective of absolute dimensions.
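The scale invariance can be checked numerically with a short sketch. It uses the NT sum for a ray travelling along the x-axis through a toy Gaussian index bump whose width and offset are fixed fractions of L; the bump shape, its amplitude, and the function name are assumptions chosen only for illustration.

```python
import numpy as np


def nt_deflection(L, N=101, amp=1e-4):
    """y-component of the total NT bending for a ray along x through a toy bump."""
    ds = L / (N - 1)                            # step equals the grid cell length
    x = np.linspace(-L / 2, L / 2, N)[:-1]      # N - 1 sample points spaced by ds
    y, w = 0.1 * L, 0.2 * L                     # ray offset and bump width scale with L
    bump = amp * np.exp(-(x**2 + y**2) / (2 * w**2))
    n = 1.0 + bump
    dn_dy = -bump * y / w**2                    # gradient shrinks as 1/L
    return float(np.sum(dn_dy / n * ds))        # step grows as L, product is unchanged


print(nt_deflection(1000.0), nt_deflection(5000.0))
```

The two calls return the same deflection up to floating-point rounding, since the factor 1/L in the gradient cancels the factor L in ds.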

4.1 Design of Gabor wavelets Field

First, we define a sinusoidal wavelet as

\[ F(x, n, \phi; \lambda) = \cos\left[ \frac{2\pi}{\lambda} (x \cdot n) + \phi \right], \tag{4.1} \]

where x = (x, y, z) denotes the location, n = (nx, ny, nz) refers to the wave number vector,

ϕ is an initial phase, and λ is the basic wavelength.

A Gaussian profile can be assigned at any region centered around c = (cx, cy, cz) to

highlight the surrounding wavelet. The general form can be written as the exponential of a


quadratic form of x. Here, for simplicity, we construct it from the basic Gaussian profile,

\[ G_0(x, \sigma) = \exp\left\{ -\frac{1}{2} \left[ \left( \frac{x}{\sigma_x} \right)^2 + \left( \frac{y}{\sigma_y} \right)^2 + \left( \frac{z}{\sigma_z} \right)^2 \right] \right\}. \tag{4.2} \]

Then we rotate it according to the Euler angles θ = (α, β, γ),

R(θ) = R(α, β, γ) = Rz(γ)Ry(β)Rz(α),

(4.3)

where Rz(α) is the three-dimensional rotation matrix around the z axis by angle α, etc., and

translate it to center at c. This gives a generic Gaussian profile as

\[ G(x, \sigma; c, \theta) = G_0\left( R^{-1}(\theta)(x - c), \sigma \right). \tag{4.4} \]

A general Gabor wavelet is then given by

G(x, σ; c, θ) · F (x, n, ϕ; λ).

(4.5)

Any realistic temperature field can be well approximated by a superposition of a set of

Gabor wavelets. As two examples, in Secs. 4.2 and 4.3, we will consider the superpositions

of two and three Gabor wavelets, respectively, and examine their shimmering effects and

reconstruction from our neural network model.
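Eqs. (4.1) to (4.5) translate directly into a few NumPy functions. The sketch below follows the z-y-z rotation order written in Eq. (4.3); the function names and the test point are hypothetical, and the code is meant as an illustration rather than the generator used for the datasets in this thesis.

```python
import numpy as np


def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_y(b):
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def euler_rotation(alpha, beta, gamma):
    """R(alpha, beta, gamma) = Rz(gamma) Ry(beta) Rz(alpha), Eq. (4.3)."""
    return rot_z(gamma) @ rot_y(beta) @ rot_z(alpha)


def sinusoid(x, n, phi, lam):
    """F(x, n, phi; lam) = cos(2 pi / lam * (x . n) + phi), Eq. (4.1)."""
    return np.cos(2.0 * np.pi / lam * (x @ np.asarray(n, dtype=float)) + phi)


def gaussian0(x, sigma):
    """Axis-aligned Gaussian profile centered at the origin, Eq. (4.2)."""
    return np.exp(-0.5 * np.sum((x / np.asarray(sigma, dtype=float)) ** 2, axis=-1))


def gaussian(x, sigma, c, theta):
    """Rotated and translated profile, Eq. (4.4)."""
    R = euler_rotation(*theta)
    local = (np.asarray(x, dtype=float) - c) @ R   # rows hold R^{-1}(x - c), as R^{-1} = R^T
    return gaussian0(local, sigma)


def gabor(x, sigma, c, theta, n, phi, lam):
    """General Gabor wavelet, Eq. (4.5)."""
    return gaussian(x, sigma, c, theta) * sinusoid(np.asarray(x, dtype=float), n, phi, lam)


# Rotating by pi/2 about z maps the profile's local x-axis onto the global y-axis.
value = gaussian(np.array([0.0, 1.0, 0.0]), sigma=(1.0, 2.0, 3.0),
                 c=np.zeros(3), theta=(np.pi / 2, 0.0, 0.0))
print(value)
```

All functions accept arrays of points with shape (..., 3), so a whole lattice can be evaluated in one call.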

4.2 Two Gabor waves

In this section, we consider the temperature field constructed by superposing two inde-

pendent Gabor wavelets. We will use this as the main example to study the neural network

performance and the impacts of different BCs and spherical environment backgrounds.


4.2.1 Ground truth

The lattice field is of length L = 1000 with a discretization of N = 101 in all dimensions,

so that the grid cell length is dx = L/(N − 1) = 10.

We design the temperature field by starting with

\[ f(x) = A_2 + A_1 \times G_0(x, \sigma_1) \times \left[ F(x, n_1, \phi_1; \lambda) + F(x, n_2, \phi_2; \lambda) \right], \tag{4.6} \]

which is a combination of two sinusoidal waves F (x, n1, ϕ1; λ) and F (x, n2, ϕ2; λ), enveloped

by the Gaussian G0(x, σ1) with an amplitude A1, and further raised by an amplitude A2.

We choose λ = 32 dx, n1 = (−3, 1, 2), and n2 = (1, 2, −4) in the two sinusoidal waves,

so that they cover a wide range of frequencies and maintain a good accuracy by having at

least 5 dx in one period. We also choose (ϕ1, ϕ2) = (0, π/2), (A1, A2) = (30, 80)◦C, and

σ1 = dx (18, 18, 18).

In order to remove the temperature variation around the boundaries, we multiply f (x)

with an additional Gaussian envelope G0(x, σ2), and set the surrounding temperature at T0,

g(x) = T0 + G0(x, σ2) × f (x),

(4.7)

where σ2 = dx (14, 14, 16) and T0 = 10◦C. The cross-sectional temperature fields at plane

x = 500, y = 500, and z = 500 are shown in Fig. 4.1 (a), (b), and (c), respectively. The

corresponding refractive index field n(x) converted from the temperature field g(x) using

Eq. (3.13) is displayed in Fig. 4.1 (d)–(f) at the same three planes.
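The temperature field of Eqs. (4.6) and (4.7) can be reproduced on the lattice with the parameters quoted above. One point is an assumption of this sketch: the coordinates are measured from the cube center, so that the Gaussian envelopes G0 sit in the middle of the field. The variable and function names are likewise illustrative.

```python
import numpy as np

L, N = 1000.0, 101
dx = L / (N - 1)
lam = 32 * dx
n1, n2 = np.array([-3.0, 1.0, 2.0]), np.array([1.0, 2.0, -4.0])
phi1, phi2 = 0.0, np.pi / 2
A1, A2, T0 = 30.0, 80.0, 10.0                 # degrees Celsius
sigma1 = dx * np.array([18.0, 18.0, 18.0])
sigma2 = dx * np.array([14.0, 14.0, 16.0])


def F(x, n, phi):
    """Sinusoidal wave, Eq. (4.1)."""
    return np.cos(2.0 * np.pi / lam * (x @ n) + phi)


def G0(x, sigma):
    """Basic Gaussian profile, Eq. (4.2)."""
    return np.exp(-0.5 * np.sum((x / sigma) ** 2, axis=-1))


def g(x):
    """Temperature field of Eqs. (4.6) and (4.7)."""
    f = A2 + A1 * G0(x, sigma1) * (F(x, n1, phi1) + F(x, n2, phi2))
    return T0 + G0(x, sigma2) * f


axis = np.linspace(-L / 2, L / 2, N)          # assumed: origin at the cube center
X = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)   # (N, N, N, 3)
T = g(X)
print(T[50, 50, 50], T[0, 0, 0])
```

Under this assumption the center value is T0 + A2 + A1 (cos 0 + cos π/2) = 120 °C, and the second envelope keeps every face of the cube within about 1 °C of the ambient temperature T0.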


Figure 4.1: Temperature field in Eq. (4.7) at the cross-sectional plane (a) x = 500, (b)
y = 500, and (c) z = 500; corresponding refractive index field at the same plane in (d)–(f)
respectively.

4.2.2 Results with different boundary conditions

To optimize the results, we investigate the reconstruction error of the refractive index field

under various boundary condition (BC) enforcement methods. The mean value rescaling as

defined in Eq. (3.24) is applied to both scenarios—with and without BCs—for comparative

analysis. As illustrated in Fig. 4.2 (a), the rescaling significantly reduces the RMSE of the

reconstructed refractive index field when no BCs are implemented. In contrast, enforcing BCs

on the lattice grid, as per Ω1 in Eq. (3.25), does not prove beneficial; the error still requires

a rescaling to decrease, as depicted in Fig. 4.2 (b). Therefore, we expand the coverage of

our BC to include all points on and outside the grid boundary as Ω2 in Eq. (3.26), which

efficiently improves the results, shown in Fig. 4.2 (c).

It is important to note that the

learning process with BCs is substantially slower compared to scenarios without them (The

RMSE reaches 6 × 10−5 after approximately 450 epochs without BC, compared to around


400 epochs when using BC Ω2). Thus, for rapid field reconstruction, learning without BCs

can be advantageous.

Figure 4.2: Reconstruction RMSE of the refractive index before and after rescaling under
(a) No BC, (b) Ω1 BC, and (c) Ω2 BC.

Fig. 4.3 (a)–(c) show the difference between the refractive index field generated by our

TempNN and the ground truth across the cross-sectional planes at x = 500, y = 500, and

z = 500 without any BCs and after rescaling. These results demonstrate that TempNN

accurately captures the correct pattern with minimal error following rescaling. However,

when a fixed BC is enforced only on the grid points of the grid boundary Ω1, the desired

pattern does not fully emerge even after rescaling, as depicted in Fig. 4.3 (d)–(f). Conversely,

as shown in Fig. 4.3 (g)–(i), extending the BC to cover all refractive index points on and

outside of the boundary Ω2 not only facilitates the development of the correct pattern but

also minimizes errors, with or without rescaling. This BC strategy will be adopted for


Figure 4.3: The difference of generated refractive index and its corresponding ground truth:
with no BC after rescaling at planes x = 500, y = 500, and z = 500 respectively (a)–(c);
with Ω1 BC (Eq. (3.25)) after rescaling at same planes respectively (d)–(f); with Ω2 BC
(Eq. (3.26)) at same planes respectively (g)–(i).


subsequent experiments in this section.


Figure 4.4: Generated novel views with no BC (a)–(c), with Ω1 BC in Eq. (3.25) (d)–(f),
and with Ω2 BC in Eq. (3.26) (g)–(i), compared with the ground truth views (j)–(l).


The generated novel views further demonstrate the effectiveness of the methods in ap-

plying the BCs, as shown in Fig. 4.4, with its Peak Signal-to-Noise Ratio (PSNR) given in

Table 4.1.

Table 4.1: PSNR of generated images using different BCs.

Image                 No BC    Ω1 BC    Ω2 BC
Building              51.264   40.549   51.466
People on a street    50.182   39.714   50.485
Trees on a street     49.27    39.480   50.361
Average               50.24    39.914   50.771

4.2.3 Results with different background

We use a background depicted in Fig. 4.5 (a), which features an urban street scene for

the BC experiments. As described with the spherical environment renderer in Sec. 3.1, the

direction of the ray d = (x, y, z) is mapped to a specific pixel of the image, represented by

the coordinates (θ, ϕ). This mapping allows us to analyze changes in the color gradients

in the θ- and ϕ-directions. The structural nuances of the image, captured through these

gradients, are illustrated in Fig. 4.5 (b) and Fig. 4.5 (c), providing insight into the visual

complexity of the scene.

To assess the impact of different backgrounds on our system’s performance, we introduce

two additional types: a smooth noisy image in Fig. 4.6 (a) and a smooth color-gradient image

in Fig. 4.7 (a). The color gradients for the θ- and ϕ-directions in these new backgrounds are

systematically analyzed, with detailed results presented in Table 4.2.

Table 4.2: Color gradient details in three background images.

Image                   Mean ϕ   Max ϕ   Min ϕ   Mean θ   Max θ   Min θ
Urban street            23.53    817     -812    29.34    831     -812
Smooth noise            57.34    435     -399    79.55    648     -665
Smooth color-gradient   0.226    4       -4      0.551    4       -4

The effect of different backgrounds on the RMSE of the reconstructed refractive index is


Figure 4.5: Urban street background image (a) and its color gradient in θ-direction (b) and
ϕ-direction (c).


Figure 4.6: Smooth noisy image (a) and its color gradients in θ-direction (b) and ϕ-direction
(c).


Figure 4.7: Smooth color-gradient image (a) and its color gradients in θ-direction (b) and
ϕ-direction (c).

depicted in Fig. 4.8. The experiments use the Ω2 BC without rescaling. A

comparison between the two decreasing curves reveals that using a smooth noisy background

(Fig. 4.6) in the rendering slows the convergence process, although it eventually achieves a

similar level of precision, as shown in the second column of Table 4.3. The smooth noisy

background provides a higher average color gradient, but lacks low-frequency contours, which

are crucial for structural details. This suggests that our TempNN is more sensitive to low-

frequency structural details, such as contours, than to high-frequency local color

changes.

Table 4.3: PSNR of generated images using different backgrounds with BC Ω2.

Image      Urban street   Smooth noisy   Smooth color-gradient
Viewer 1   51.264         47.411         58.104
Viewer 2   50.182         46.990         57.358
Viewer 3   49.27          47.710         56.897
Average    50.24          47.37          57.453


Figure 4.8: The effect of different backgrounds on the RMSE of the reconstructed refractive
index after rescaling under the best-performed boundary condition Ω2.

To probe the limits of our system’s capabilities, we also experimented with a smooth

color-gradient background (Fig. 4.7 (a)), characterized by minimal color gradient changes as

depicted in Fig. 4.7 (b) and (c). With the smooth color-gradient

background, Fig. 4.8 shows that the RMSE of the refractive index starts on the order of 10−5 and

stagnates at this level for another 200 epochs, showing none of the RMSE reduction

typically expected. This stagnation suggests that the system struggles to detect sufficient

changes in the image to find an effective convergence path. Therefore, the system

fails to reconstruct the field with the smooth color-gradient background, despite the high PSNR

value in Table 4.3.

The comparison of the generated refractive index field with their corresponding ground

truth across different backgrounds is illustrated in Fig. 4.9. In contrast to others, the color

scale using the smooth color-gradient background (Fig. 4.9 (g)–(i)) is not fixed, allowing for

a detailed observation of the patterns.


Figure 4.9: The difference between the generated refractive index field with Ω2 BC and its
corresponding ground truth at plane (a) x = 500, (b) y = 500, and (c) z = 500 with the
urban street background.
(d)–(f) are the same, but with the smooth noisy background.
(g)–(i) are the same, but with the smooth color-gradient background.


4.3 Three Gabor waves with more variation near the boundary

Through the example of superposing two Gabor wavelets, in Section 4.2, we demonstrated

that our model successfully reconstructs the refractive index field and generates correspond-

ing novel views. We conducted two ablation studies focusing on boundary conditions (BCs)

and background variations. The results indicated that both the Ω2 BC and no BC with

rescaling (using Eq. (3.24)) are effective, although the latter significantly accelerates the

process. Additionally, our model displays a preference for learning with an urban street

background over smooth noisy or smooth color-gradient backgrounds.

In this section, we enhance the complexity of the ground truth temperature by incorporat-

ing three Gabor wavelets and retaining the variation near the boundary. This more complex

input field challenges our model further. To optimize convergence, we proceeded with no

BC, applying rescaling and utilizing the urban street background. The lattice field maintains

the same configuration as previously described in Sec. 4.2. However, instead of using a single

circular-shaped temperature field, we apply different Gaussian profiles to different sinusoidal

waves and stack these Gabor waves as

\[ f'(x) = \sum_{i=1,2,3} A_i \times G(x, \sigma_i; c_i, \theta_i) \times F(x, n_i, \phi_i; \lambda_i), \tag{4.8} \]

where the parameters are given in Table 4.4.

This approach helps retain their distinct shapes. To maintain natural temperature vari-

ations around the boundaries, we do not apply a Gaussian envelope to the overall field of


Table 4.4: Parameters in Eq. (4.8): the amplitudes Ai; the standard deviations σi, centers
ci, and orientations θi of the Gaussian profiles; and the wave vectors ni, initial phases ϕi,
and wavelengths λi of the sinusoidal waves.

i Ai(◦C)

λi

ni

60

50

1

2

3

16dx

(1, 1, 1)

16dx (−2, 1, −1) π/2

100

64dx (−2, 2, −1) π/4

ϕi

0

ci
10, 0, 0(cid:1)
(cid:0) L
10, 0, 0(cid:1)
(cid:1)
8 , − L

(cid:0)− L
(cid:0)0, − L

8

(cid:1)

15
(cid:1)

σi
(cid:0) L
15, L
10, L
(cid:0) L
10, L
8 , L
(cid:0) L
20, L
8 , L

10

10

(cid:1)

θi
4 , 0, 0(cid:1)
(cid:0) π
(cid:0)− π
6 , π
4 , − π
(cid:0) π
6 , 0(cid:1)
6 , − π

2

(cid:1)

f ′(x) in Eq. (4.9),

g′(x) = T0 + f ′(x),

(4.9)

where the ambient temperature is consistently set to T0 = 25◦C. The cross-sectional tem-

perature fields of g′(x) at planes x = 500, y = 500, and z = 500 are sequentially shown

in Fig. 4.10 (a)–(c). The corresponding refractive index fields converted from them using

Eq. (3.13) are illustrated in Fig. 4.10 (d)–(f).


Figure 4.10: Temperature field in Eq. (4.9) at the cross-sectional plane (a) x = 500, (b)
y = 500, and (c) z = 500; corresponding refractive index field at the same plane in (d)–(f)
respectively.


Figure 4.11: Reconstruction RMSE of the refractive index field before (blue line) and after
(orange line) rescaling with no BC.

The evolution of the reconstruction RMSE is shown in Fig. 4.11, which demonstrates the

model’s effectiveness. The convergence is slower than in the setting with two Gabor

waves in Fig. 4.2 (a), as anticipated given the increased complexity of the field:

more complex fields pose greater challenges to the reconstruction process. Fig. 4.12

further showcases the post-rescaling reconstructed refractive index fields with no BCs ap-

plied, where one can see from (a)–(f) that the model effectively captures distinct shapes

and variations near the boundary. Notably, low-frequency waves converge more rapidly —

approximately within 200 epochs — compared to high-frequency waves, as illustrated in

Fig. 4.12 (g)–(i).

Table 4.5: PSNRs of the images generated by the temperature setting in Eqs. (4.8) and (4.9)
after rescaling.

Image    Bridge (a)   Tower (b)   Street (c)   Trees (d)
No BC    50.85        46.93       50.35        49.82

The generated novel views are shown in Fig. 4.13 in comparison with the ground truth

views, with the PSNRs given in Table 4.5. They exhibit a high degree of resemblance

and further demonstrate the effectiveness of our model. We, therefore, conclude that the

TempNN also works well for the temperature field constructed as a superposition of three


Figure 4.12: Reconstructed refractive index fields after rescaling using Eq. (3.24) with no
BC at plane (a) x = 500, (b) y = 500, and (c) z = 500. (d)–(f) are their differences from
the corresponding ground truth values after the training finishes. (g)–(i) are the differences
at 200 training epochs.


independent Gabor wavelets with variations near the boundaries.


Figure 4.13: Generated novel views without BC after rescaling in (a)–(d) compared with the
ground truth views in (e)–(h).


Chapter 5

Conclusion

This thesis has advanced the understanding of light refraction through spatially varying media

by introducing novel methodologies based on Neural Radiance Fields (NeRF) for simulat-

ing and analyzing refractive index fields. Our integrated approach, combining the Iterative

Bending (IB) and Non-Translating (NT) techniques, has successfully modeled complex re-

fractive behaviors necessary for synthesizing novel views of natural phenomena such as heat

shimmering effects.

The IB method has proven effective in generating highly accurate synthetic datasets that
replicate real-world optical phenomena, while the NT method has greatly enhanced
computational efficiency, enabling broader experiments and quicker iterations without
significantly compromising accuracy. Extensive testing across various backgrounds and
boundary conditions has revealed the model’s strengths and limitations. The model performed
robustly in structured environments such as urban landscapes, where detailed textures and
contours help define the behavior of light. However, it faced challenges with smooth
gradient backgrounds, where the absence of distinct features made it difficult to predict
refractive changes accurately. Furthermore, our exploration of various refractive fields has
shown that the model handles complex fields shaped by Gabor wavelets across a broad range
of frequencies, effectively reconstructing details near the boundary of the grid field. We
also observed that low-frequency waves converge more rapidly than high-frequency waves,
indicating slower convergence for more complex fields.


This work paves the way for future research into more sophisticated light transport

phenomena involving variable refractive index fields, promising further advancements in

optical simulation and analysis.


BIBLIOGRAPHY

Mojtaba Bemana, Karol Myszkowski, Jeppe Revall Frisvad, Hans-Peter Seidel, and Tobias
Ritschel. Eikonal fields for refractive novel-view synthesis. In ACM SIGGRAPH 2022
Conference Proceedings, pages 1–9, 2022.

Philip E Ciddor. Refractive index of air: new equations for the visible and near infrared.
Applied optics, 35(9):1566–1573, 1996.

Claudia Di Biagio, Paola Formenti, Yves Balkanski, Lorenzo Caponi, Mathieu Cazaunau,
Edouard Pangui, Emilie Journet, Sophie Nowak, Sandrine Caquineau, Meinrat O Andreae,
et al. Global scale variability of the mineral dust long-wave refractive index: a new dataset
of in situ measurements for climate modeling and remote sensing. Atmospheric Chemistry
and Physics, 17(3):1901–1929, 2017.

Taku Fujitomi, Ken Sakurada, Ryuhei Hamaguchi, Hidehiko Shishido, Masaki Onishi, and
Yoshinari Kameda. LB-NeRF: light bending neural radiance fields for transparent medium.
In 2022 IEEE International Conference on Image Processing (ICIP), pages 2142–2146.
IEEE, 2022.

William M Haynes. CRC handbook of chemistry and physics. CRC press, 2016.

Ivo Ihrke, Gernot Ziegler, Art Tevs, Christian Theobalt, Marcus Magnor, and Hans-Peter
Seidel. Eikonal rendering: Efficient light transport in refractive objects. ACM Transactions
on Graphics (TOG), 26(3):59–es, 2007.

Wooseok Kim, Taiki Fukiage, and Takeshi Oishi. Ref2-NeRF: Reflection and refraction aware
neural radiance field. arXiv preprint arXiv:2311.17116, 2023.

Aviad Levis, Andrew A Chael, Katherine L Bouman, Maciek Wielgus, and Pratul P
Srinivasan. Orbital polarimetric tomography of a flare near the Sagittarius A* supermassive
black hole. Nature Astronomy, pages 1–9, 2024.

Jin-gang Liu and Mitsuru Ueda. High refractive index polymers: fundamental research and
practical applications. Journal of Materials Chemistry, 19(47):8907–8919, 2009.

Patricia Yang Liu, Lip Ket Chin, Wee Ser, HF Chen, C-M Hsieh, C-H Lee, K-B Sung,
TC Ayi, PH Yap, Bo Liedberg, et al. Cell refractive index for cell biology and disease
diagnosis: past, present and future. Lab on a Chip, 16(4):634–644, 2016.

Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi
Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view
synthesis. Communications of the ACM, 65(1):99–106, 2021.

Hans Ottersten. Atmospheric structure and radar backscattering in clear air. Radio Science,
4(12):1179–1193, 1969.


Janak Prasad, Inga Zins, Robert Branscheid, Jan Becker, Amelie HR Koch, George Fytas,
Ute Kolb, and Carsten Sönnichsen. Plasmonic core–satellite assemblies as highly sensitive
refractive index sensors. The Journal of Physical Chemistry C, 119(10):5577–5582, 2015.

Mirjam Schürmann, Gheorghe Cojoc, Salvatore Girardo, Elke Ulbricht, Jochen Guck, and
Paul Müller. Three-dimensional correlative single-cell imaging utilizing fluorescence and
refractive index tomography. Journal of biophotonics, 11(3):e201700145, 2018.

Peter Shirley, Michael Ashikhmin, and Steve Marschner. Fundamentals of computer graphics.
AK Peters/CRC Press, 2009.

Matthew Tancik, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan,
Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, and Ren Ng. Fourier features let
networks learn high frequency functions in low dimensional domains. Advances in neural
information processing systems, 33:7537–7547, 2020.

Arjun Teh, Matthew O’Toole, and Ioannis Gkioulekas. Adjoint nonlinear ray tracing. ACM
Transactions on Graphics (TOG), 41(4):1–13, 2022.

T Töpfer, J Hein, J Philipps, D Ehrt, and R Sauerbrey. Tailoring the nonlinear refractive
index of fluoride-phosphate glasses for laser applications. Applied Physics B, 71:203–206,
2000.

Xing-Hao Ye and Qiang Lin. Gravitational lensing analysed by the graded refractive index
of a vacuum. Journal of Optics A: Pure and Applied Optics, 10(7):075001, 2008.

Brandon Zhao, Aviad Levis, Liam Connor, Pratul P Srinivasan, and Katherine L Bouman.
Single view refractive index tomography with neural fields. In Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25358–25367,
2024.
