NEURAL FIELD FOR HEAT SHIMMERING VISUALIZATION AND REFRACTIVE INDEX FIELD RECONSTRUCTION

By

Lijiang Xu

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Computer Science - Master of Science

2024

ABSTRACT

This thesis addresses the challenging task of synthesizing novel views of natural scenes influenced by heat-shimmering effects due to variations in refractive indices. We develop a novel computational model that employs Neural Radiance Fields (NeRF) for accurate simulation of light refraction and reconstruction of three-dimensional refractive index fields. Our approach integrates two ray marching techniques: Iterative Bending (IB) for high accuracy in dataset generation, and Non-Translating (NT) to enhance training efficiency by assuming nearly straight ray paths, facilitating faster computations while maintaining accuracy. This methodology adeptly captures the dynamic visual effects and physical scene details.

Validation involved creating diverse temperature fields with sinusoidal distributions and Gaussian enhancements, known as Gabor wavelets. These wavelets, with their Gaussian component, ensure a constant ambient temperature outside for stable testing conditions. Rigorous evaluations using various boundary conditions demonstrated the model’s robust performance in structured environments like urban scenes and smooth noisy backgrounds. However, smooth gradient backgrounds posed challenges due to a lack of distinct features necessary for accurate refraction predictions. This research highlights the potential of complex optical simulations and suggests applications in diverse fields, setting a foundation for further advancements in computer vision and realistic environmental rendering.

Dedicated to my husband Zhite, advisor Dr. Yiying Tong, parents Jinghui & Mu, and my maternal grandfather Fuxiang Jiang.
ACKNOWLEDGMENTS

It is hardly conceivable that I could have accomplished this thesis without the guidance and support from Dr. Yiying Tong. In retrospect, I am thrilled to see how I have turned from not even knowing how to create a skybox using an environment map, to having pursued the idea of generating novel views with the heat shimmering effect and successfully reconstructing the 3D field using NeRF. While it was particularly challenging for me as I changed my major from CE to CS, Dr. Tong has been consistently encouraging, supporting, and helping me since even before the idea of this project was formed. His patient and insightful guidance has allowed me to grow as a researcher throughout my master’s program. I feel very grateful that he provided me with continuous financial support via TA and RA opportunities, allowing me to also work remotely while I faced the two-body problem with my husband. His support also extended beyond academics and especially filled me with strength during some very difficult times.

I would also like to thank Dr. Junlin Yuan for the introductory course on Fluid Mechanics, which opened new perspectives for me. Her careful instruction and thoughtful guidance helped me develop foundational skills and encouraged me to pursue further learning. I am also grateful to Dr. Xiaoming Liu; through his Computer Vision course, I realized the vast potential for exploration and improvement in this field. His expertise and dedication have inspired me to delve deeper into these areas in my future studies.

Last but not least, I would like to express my gratitude to my husband for all the unwavering love he has given me from the very beginning of this journey. Thank you for believing in me, for accompanying me on countless adventures, for introducing me to new people, and for standing by my side through every challenge.
I am especially grateful for the time and space you provided so that I could confront my struggles and find our path forward in building a family together.

In addition, I am deeply grateful to my mother, Jinghui, and my maternal grandfather, Fuxiang, for their constant encouragement and faith in me. Their understanding and reassurance, especially during our phone calls, have been a tremendous source of comfort and strength. Finally, I am forever indebted to my father, whose unwavering support made this journey possible; I know he would be proud of this accomplishment.

TABLE OF CONTENTS

Chapter 1 Introduction
Chapter 2 Background
Chapter 3 Refraction Modeling Method
Chapter 4 Results and Discussion
Chapter 5 Conclusion
BIBLIOGRAPHY

Chapter 1 Introduction

While light typically travels in straight lines in nature, it bends when passing through media with varying refractive indices, creating visual distortions or rippling effects known as shimmering effects. Such distorted views can be widely observed in various natural scenes, ranging from those involving heat transfer and dynamic fluid flows to massive gravitational fields near black holes, where light curves due to the gradient of the refractive index field or spacetime curvature. There is thus a substantial need to synthesize views that convincingly capture these refraction phenomena and to reconstruct the underlying refractive index field for a wide range of applications.
For instance, in remote sensing and meteorology, accurate modeling of atmospheric refractive indices is crucial for enhancing radar data interpretation, which is fundamental to weather forecasting and earth observation [Ottersten, 1969]. Similarly, in environmental science, characterizing the interaction of light with airborne particles improves the accuracy of climate models [Di Biagio et al., 2017] and satellite data [Prasad et al., 2015]. Detailed knowledge of refractive indices also aids astronomers in understanding gravitational lensing [Ye and Lin, 2008], while advances in optical microscopy rely on precise refractive index measurements for cell imaging [Liu et al., 2016; Schürmann et al., 2018]. Furthermore, the development of innovative polymers [Liu and Ueda, 2009] and glasses [Töpfer et al., 2000] enhances the performance of optical devices and sensors, showcasing the broad implications of this research across multiple disciplines.

In this work, we address the challenge of synthesizing novel views of natural scenes with heat shimmering effects, while concurrently constructing a three-dimensional (3D) refractive index field. The key to this task is modeling the light refraction, detailed in Ch. 3. We consider two distinct ray marching methods. The first method calculates the bending direction at the current location and marches iteratively with a constant step size; we refer to it as the Iterative Bending (IB) method, detailed in Sec. 3.2. This method allows for a precise determination of the bending ray’s trajectory, continuously updating the path based on changes in the refractive index field and ensuring high accuracy. In contrast, the second method, described in Sec. 3.4, assumes that the bending trajectory can be approximated by a straight line. It involves predetermining evenly spaced points along each ray, estimating directional changes at these points, and combining them to determine the final ray direction.
We refer to the second method as the Non-Translating (NT) method because it omits the transverse translation of the sample points that the IB method accounts for. The NT method significantly enhances computational efficiency by facilitating parallel processing of these predetermined points, while largely retaining the accuracy of the IB method.

Owing to the absence of real-world datasets that precisely capture multi-view heat shimmering effects, this study is conducted using exclusively synthetic data. For accuracy, we employ the first, more precise IB method in our dataset generation, detailed in Sec. 3.3. Conversely, the NT method’s predetermined points are used to compute the directional changes in parallel, utilizing the estimated refractive index field from our neural network as described in Sec. 3.5. The same spherical environment renderer is used during both dataset generation and neural network training to ensure consistency and comparability in rendered images. Utilizing the differentiable nature of this process, we employ gradient descent to optimize our model, aiming to minimize the discrepancy between the images generated by our neural representation and their corresponding ground truth. In addition, the images generated by the IB method and the NT method are compared in Sec. 3.4.1 to show the effectiveness of using the NT method in our system.

To rigorously evaluate our approach, we have designed a variety of temperature fields to generate datasets that simulate the heat shimmering effect, leveraging the monotonic relationship between temperature and refractive index [Haynes, 2016]. To mimic natural conditions, a constant ambient temperature was set as the boundary condition surrounding the lattice field, and various methods of boundary implementation were explored in Ch. 4.
We utilize sinusoidal distributions of varying wave numbers, complemented by Gaussian-distributed boundary conditions, embodying the Gabor wavelet. This input field selection thoroughly tests our system’s robustness and adaptability, detailed in Sec. 4.2 and Sec. 4.3. Additionally, we conducted experiments using diverse background settings, ranging from normal street scenes to smooth color gradients and subtly noisy images, to assess the system’s performance under different visual contexts, detailed in Sec. 4.2.3. These tests demonstrate the effectiveness of our approach in accurately reconstructing refractive index fields and synthesizing visually coherent views. The results confirm that our system not only captures the intricate dynamics of light refraction but also robustly adapts to varying environmental conditions, making significant strides in both the precision of 3D field reconstruction and the enhancement of novel view synthesis.

Chapter 2 Background

Recently, a technique known as Neural Radiance Field (NeRF) has emerged as a scene representation method for novel-view synthesis [Mildenhall et al., 2021]. NeRF casts straight camera rays from multiple viewpoints into the scene, utilizing a multilayer perceptron (MLP) to estimate the density and color at sampled points in 3D space. The model is trained to be consistent with the input images through volume rendering. While NeRF cannot be directly applied to refraction because its volume rendering proceeds along straight rays, its architecture provides a robust platform for generating novel views and concurrently reconstructing underlying 3D fields. Leveraging NeRF’s capabilities, bending rays are modeled as offsets at sampled points along straight rays [Fujitomi et al., 2022; Kim et al., 2023].
This method is particularly useful for modeling transparent objects where refraction occurs a discrete number of times, such as in glass cases with a constant refractive index surrounded by air. Furthermore, for scenarios requiring a continuous refractive index, the Eikonal equation has been utilized to efficiently simulate light bending trajectories [Ihrke et al., 2007; Teh et al., 2022]. Inspired by these advancements, Eikonal ray-tracing combined with neural fields has been adapted for generating refractive fields in the context of novel-view synthesis involving glassware [Bemana et al., 2022]. While the reconstructed refractive index fields do not match the ground truth, this approach has yielded impressive results in generating novel views of scenes characterized by piecewise constant refractive indices.

The framework of NeRF has also been extended to complex challenges in astronomy, such as reconstructing the tomography of emission flares near black holes [Levis et al., 2024] and simulating refractive fields from single viewpoints to model dark matter distribution in the weak lensing regime [Zhao et al., 2024]. Reconstruction from a single viewpoint needs to leverage prior knowledge of the light sources scattered within the refractive medium, since many possible refractive index fields can produce the same image measurements. Additionally, this research employs an adaptive step size in iterative ray marching, enhancing the precision of ray-bending calculations across vast spatial domains.

These developments highlight NeRF’s adaptability in addressing refraction and tomography challenges across varied disciplines. Our work extends NeRF to heat shimmering effects, where refractive indices are continuously influenced by temperature changes. Due to the lack of available real-world datasets that accurately represent multi-view heat shimmering effects, this study exclusively utilizes synthetic data.
We leverage multiple viewpoints to reconstruct the refractive field, maximizing the benefits of broad accessibility while circumventing the challenges associated with acquiring adequate constraints for single-view tomography. Our findings also reveal that the discrepancy between trajectories computed through iterative ray marching and those estimated using predetermined points along the incident ray is minimal. This observation allows us to achieve greater computational efficiency while preserving high accuracy in our models. Ultimately, our approach not only addresses the continuous variations in the refractive field but also significantly improves the efficiency of its reconstruction.

Chapter 3 Refraction Modeling Method

In order to synthesize novel views with a heat-shimmering effect, we march rays from different viewpoints through the 3D refractive index field produced by our neural network. Through this ray marching process, the light refraction is simulated and a spherical environment renderer is adopted to generate images by mapping the directions of the refracted rays to their corresponding pixels on a panoramic image, described in Sec. 3.1. By minimizing the error of a set of rendered images with heat-induced visual distortions, we optimize the model to reconstruct the refractive index field. This allows the generation of novel views with an accurate depiction of heat-shimmering patterns.

To accurately replicate the refraction of light through media with varying refractive indices, we derive the bending-ray equation based on Snell’s law in Sec. 3.2. This equation is used throughout the ray marching procedure to determine the refracted direction from the incident ray direction at any location in the field. There are two ray marching methods: 1) calculate the ray bending direction at the current step and march iteratively with a constant step size, referred to as the Iterative Bending (IB) method in Sec.
3.2; and 2) calculate the ray bending directions in parallel at all the predetermined evenly spaced points along a straight line in the initial viewing direction, referred to as the Non-Translating (NT) method in Sec. 3.4. The NT method assumes the bending ray is almost aligned with the straight line along the viewing direction and thus neglects their relative translation. The IB method is more precise but more time-consuming than the NT method. As we will demonstrate in Sec. 3.4, however, the NT method can achieve sufficiently accurate image renderings at a much lower computational cost. It is therefore well-suited to the application in neural networks for reconstructing the 3D field and generating novel views.

Because of the absence of a real-world dataset that adequately captures multi-view heat-shimmering scenes, this study exclusively utilizes synthetic images. As described in Sec. 3.3, we first introduce a temperature field and convert it to a refractive index field using an empirical formula. We then use the IB method, chosen for its better precision, to determine the directions of the outgoing refracted rays and render the images as the ground truth. We then set up our Temperature Neural Network (TempNN) in Sec. 3.5 to reconstruct the 3D neural refractive index field for synthesizing novel views of the heat-shimmering effect.

3.1 Spherical environment renderer

A spherical environment renderer [Shirley et al., 2009], often used in computer graphics, relies on environment mapping techniques to simulate how an environment affects the appearance of a scene without explicitly rendering the surrounding geometry. As shown in Fig. 3.1, any direction (x, y, z) has a one-to-one correspondence to a pixel with the texture coordinates (θ, ϕ) on a panoramic image, where θ = arcsin (z) and ϕ = arctan (y/x) using the spherical coordinate system. These texture coordinates (θ, ϕ) resemble the latitude and longitude used to specify a location on a map.
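The mapping from a ray direction to a panorama pixel can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the use of the quadrant-aware `atan2` in place of arctan(y/x), and the equirectangular pixel layout (ϕ across the image width, θ down the image height, north pole in row 0) are assumptions, since the text does not specify how the panorama is stored.

```python
import math

def direction_to_texcoords(d):
    """Map a 3D direction to (theta, phi): latitude and longitude in radians."""
    x, y, z = d
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    theta = math.asin(max(-1.0, min(1.0, z)))  # theta = arcsin(z), in [-pi/2, pi/2]
    phi = math.atan2(y, x)                     # quadrant-aware arctan(y/x), in [-pi, pi]
    return theta, phi

def texcoords_to_pixel(theta, phi, width, height):
    """Assumed equirectangular layout: phi spans the columns, theta the rows."""
    u = (phi + math.pi) / (2.0 * math.pi)      # 0 at phi = -pi, 1 at phi = +pi
    v = (math.pi / 2.0 - theta) / math.pi      # 0 at the north pole, 1 at the south pole
    col = min(int(u * width), width - 1)
    row = min(int(v * height), height - 1)
    return row, col

# A ray leaving along +x lands in the middle of a 200 x 100 panorama.
center_pixel = texcoords_to_pixel(*direction_to_texcoords((1.0, 0.0, 0.0)), 200, 100)
```

Only the direction enters the lookup, which is why the viewing position never appears as an argument.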
The spherical environment renderer is thus the technique that produces an image using this spherical environment mapping.

Figure 3.1: Spherical environment mapping.

We chose to use this renderer because it typically assumes the environment (like the sky or distant mountains) is infinitely far away from the viewer, which aligns with heat-shimmering scenes, where a great distance is needed to observe the distorted view of the scene. Because of this infinite distance assumption, changes in the viewer’s location within a typical scene scale do not meaningfully change the direction in which the environmental light arrives. Thus, the renderer does not need to recalibrate for different viewing locations within the scene. This simplifies the input to only the viewing direction because the viewing location doesn’t significantly affect how rays intersect with the environment map. The output directions of the rays passing through the field are all that is needed to render images in the spherical environment renderer, while the bending ray trajectories are not necessary. The spherical environment renderer is consistently used to generate images throughout this work. This ensures that the rendered images match the ground truth once our neural network has been optimized to reproduce the underlying refractive index field.

3.2 Iterative Bending method

Physically, a light ray changes its direction according to Snell’s law when it passes through the interface of two media with different refractive indices n. In a varying temperature field, the refractive index of the air, n, also changes continuously according to the temperature [Haynes, 2016]. This variation causes an incoming ray to continuously change its direction as it travels, resulting in distorted images. For example, as illustrated in Fig.
3.2, the ray maintains its original direction when passing through a constant temperature field, producing an undistorted image, whereas it bends and generates a distorted image when the gradient of the temperature field is nonzero.

Figure 3.2: Light trajectories in media with constant (Left) or varying (Right) refraction indices, resulting in undistorted or distorted view.

To numerically simulate the continuous light refraction, we discretize the medium and divide it into many lattice cells. As the light enters a cell along the direction i (normalized), it travels inside the cell in the same straight line along i, as if the cell is a homogeneous medium. It is the difference in refractive indices among adjacent cells that causes the light to bend. This bending is proportional to the gradient of the index of refraction, ∇n, which can be discretized by taking differences of n among neighboring cells. As shown in Fig. 3.3, the incident light is along i and the refracted light is along i′. The light direction is changed from i to i′ (also normalized) when the light leaves the cell.

Figure 3.3: Refraction of light in discretized medium. The light ray enters the cell along i, travels for $d\boldsymbol{\ell} = \mathbf{i}\,d\ell$, and leaves at i′, being bent in the direction of ∇n.

The two vertices in Fig. 3.3 indicate the places where the light enters and leaves the discretized cell, in which it travels by $d\boldsymbol{\ell} = \mathbf{i}\,d\ell$. We treat $d\ell \equiv |d\boldsymbol{\ell}|$ as a small quantity (which can be chosen sufficiently small by making increasingly finer lattice grids) and work only up to its first order. To this accuracy, we can write i′ ∝ i + λ∇n, where the coefficient λ is proportional to dℓ. By normalizing it and keeping only terms up to first order in λ, we have

$$
\begin{aligned}
\mathbf{i}' &= \frac{\mathbf{i} + \lambda\nabla n}{\sqrt{(\mathbf{i} + \lambda\nabla n)^2}}
\simeq \frac{\mathbf{i} + \lambda\nabla n}{\sqrt{1 + 2\lambda\,(\mathbf{i}\cdot\nabla n)}}
\simeq (\mathbf{i} + \lambda\nabla n)\bigl(1 - \lambda\,(\mathbf{i}\cdot\nabla n)\bigr) \\
&\simeq \mathbf{i} + \lambda\nabla n - \lambda\,\mathbf{i}\,(\mathbf{i}\cdot\nabla n)
= \mathbf{i} + \lambda\bigl[\nabla n - \mathbf{i}\,(\mathbf{i}\cdot\nabla n)\bigr],
\end{aligned}
\tag{3.1}
$$

where at each step we consistently keep only the first-order terms of λ.
Now we use Snell’s law to solve for the value of λ. In the simplest case with two neighboring homogeneous media, we have n sin θ = n′ sin θ′, where θ and θ′ are the angles of the incident and refracted lights with respect to the normal. In the language here, we do not want to introduce more surfaces besides those defined by the cells. Therefore, we can express θ and θ′ more conveniently as the angles between the light directions i or i′ and the direction ∇n. Indeed, ∇n is the normal to the surface between two media. That is, we have

$$
\cos\theta = \frac{\mathbf{i}\cdot\nabla n}{|\nabla n|}, \qquad
\cos\theta' = \frac{\mathbf{i}'\cdot\nabla n}{|\nabla n|}. \tag{3.2}
$$

Correspondingly, the indices of refraction of the two media (of the neighboring cells the light passes) are, to first order in dℓ, respectively,

$$
n, \qquad n' = n + \nabla n\cdot d\boldsymbol{\ell}. \tag{3.3}
$$

Substituting Eq. (3.2) for the angles and Eq. (3.3) for the refraction indices into Snell’s law, we get

$$
n\sin\theta = n'\sin\theta', \quad\text{or}\quad n^2\,(1 - \cos^2\theta) = n'^2\,(1 - \cos^2\theta'), \tag{3.4}
$$

$$
n^2\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr]
= (n + \nabla n\cdot d\boldsymbol{\ell})^2\bigl[(\nabla n)^2 - (\mathbf{i}'\cdot\nabla n)^2\bigr]. \tag{3.5}
$$

The left-hand side of Eq. (3.5) is already at zeroth order in dℓ. Now we expand the right-hand side to the first order of dℓ, where the first factor is

$$
(n + \nabla n\cdot d\boldsymbol{\ell})^2 \simeq n^2 + 2n\,\nabla n\cdot d\boldsymbol{\ell}
= n^2 + 2n\,d\ell\,(\mathbf{i}\cdot\nabla n), \tag{3.6}
$$

and the second factor is, by using Eq. (3.1),

$$
\begin{aligned}
(\nabla n)^2 - (\mathbf{i}'\cdot\nabla n)^2
&\simeq (\nabla n)^2 - \bigl\{\mathbf{i}\cdot\nabla n + \lambda\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr]\bigr\}^2 \\
&\simeq (\nabla n)^2 - \bigl\{(\mathbf{i}\cdot\nabla n)^2 + 2\lambda\,(\mathbf{i}\cdot\nabla n)\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr]\bigr\} \\
&= \bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr] - 2\lambda\,(\mathbf{i}\cdot\nabla n)\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr].
\end{aligned}
\tag{3.7}
$$

Multiplying Eqs. (3.6) and (3.7) together, the leading-order term in dℓ or λ is the same as the left-hand side of Eq. (3.5). The terms linear in dℓ must therefore cancel,

$$
-n^2\,2\lambda\,(\mathbf{i}\cdot\nabla n)\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr]
+ 2n\,d\ell\,(\mathbf{i}\cdot\nabla n)\bigl[(\nabla n)^2 - (\mathbf{i}\cdot\nabla n)^2\bigr] = 0, \tag{3.8}
$$

which gives, after cancelling the common factors,

$$
\lambda = \frac{d\ell}{n}. \tag{3.9}
$$

Hence we obtain the ray-bending equation, returning to Eq. (3.1),

$$
\mathbf{i}' = \mathbf{i} + \frac{\nabla n - (\mathbf{i}\cdot\nabla n)\,\mathbf{i}}{n}\,d\ell. \tag{3.10}
$$

We note that in this formulation, dℓ represents the distance that light travels within a cell, extending from its entry to its exit, with each turning point located on the boundaries of the lattice cell. This is not convenient to use literally because one would need extra effort to determine from which sides of the cells the light enters and leaves. Therefore, instead of calculating the refracted direction on the boundaries using the varying step size, we march the ray with a fixed step size ds and calculate the refracted direction at each marching point in the field. This approach effectively mimics the behavior of ray bending at the boundaries of lattice cells, providing an equivalent outcome with simplified processing. The computational error is negligible in a fine-grid lattice.

The detail is illustrated in Fig. 3.4, where the ray with initial direction $\mathbf{i}_0$ proceeds step by step with a predetermined constant step size ds, which is chosen to be equal to the grid cell length dx for accuracy and efficiency. This yields a total of M sampling points $\{\mathbf{x}_0, \mathbf{x}_1, \cdots, \mathbf{x}_{M-1}\}$ along the ray.

Figure 3.4: Schematic of the Iterative Bending method in simulating the trajectory of a light ray passing through a medium with continuously varying refractive indices.

At each step, we calculate the refracted ray direction $\mathbf{i}_{k+1}$ from the current ray direction $\mathbf{i}_k$ (which is itself derived from $\mathbf{i}_{k-1}$ in the previous step) using Eq. (3.10), in which the refraction index n and its gradient ∇n at the current location $\mathbf{x}_k$ (labeled by k in Fig. 3.4) are computed by trilinear interpolation,

$$
\mathbf{i}_{k+1} = \mathbf{i}_k + \frac{\nabla n(\mathbf{x}_k) - [\mathbf{i}_k\cdot\nabla n(\mathbf{x}_k)]\,\mathbf{i}_k}{n(\mathbf{x}_k)}\,ds, \qquad (k = 0, 1, \cdots, M-1). \tag{3.11}
$$

This allows us to iteratively calculate the ray trajectory, by using

$$
\mathbf{x}_{k+1} = \mathbf{x}_k + \mathbf{i}_{k+1}\,ds, \qquad (k = 0, 1, \cdots, M-2). \tag{3.12}
$$

We refer to this method as the Iterative Bending (IB) method, which is described by Eqs. (3.11) and (3.12).
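The iteration of Eqs. (3.11) and (3.12) can be sketched in plain Python as follows. This is a minimal illustration rather than the implementation used in this work: the function names are hypothetical, and the refractive index is supplied as an analytic Gaussian bump (an assumed toy field) instead of a lattice sampled by trilinear interpolation. As in Eq. (3.11), the direction is not renormalized after each step, since it stays normalized to first order in ds.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def bend_step(i, n_val, grad_n, ds):
    """One application of Eq. (3.11): i + (grad_n - (i . grad_n) i) / n * ds."""
    proj = dot(i, grad_n)
    return tuple(ic + (gc - proj * ic) / n_val * ds for ic, gc in zip(i, grad_n))

def march_ib(x0, i0, n_fn, grad_fn, ds, num_points):
    """Iterative Bending: returns positions x_0..x_{M-1} and the final direction i_M."""
    x, i = tuple(x0), tuple(i0)
    xs = [x]
    for k in range(num_points):
        i = bend_step(i, n_fn(x), grad_fn(x), ds)            # Eq. (3.11), evaluated at x_k
        if k < num_points - 1:
            x = tuple(xc + ic * ds for xc, ic in zip(x, i))  # Eq. (3.12)
            xs.append(x)
    return xs, i

# Hypothetical toy field (not from the thesis): a Gaussian bump in n at the origin.
N0, AMP, SIGMA = 1.0003, 1e-4, 0.2

def n_bump(x):
    return N0 + AMP * math.exp(-dot(x, x) / (2.0 * SIGMA ** 2))

def grad_n_bump(x):
    w = -AMP / SIGMA ** 2 * math.exp(-dot(x, x) / (2.0 * SIGMA ** 2))
    return tuple(w * xc for xc in x)

# A ray travelling along +x, slightly above the bump, with M = 181 samples.
xs, i_out = march_ib((-0.9, 0.1, 0.0), (1.0, 0.0, 0.0), n_bump, grad_n_bump,
                     ds=0.01, num_points=181)
```

In this toy setup the ray passes above a region of higher refractive index, so the y-component of `i_out` comes out slightly negative: the ray bends toward increasing n, as Eq. (3.10) prescribes.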
The IB method starts from an initial direction $\mathbf{i}_0$ (entering the medium at $\mathbf{x}_0$) and ends with a final direction $\mathbf{i}_M$, which is then fed to the spherical environment map for rendering. The computation iteratively builds the ray trajectory array $\{\mathbf{x}_0, \mathbf{x}_1, \cdots, \mathbf{x}_{M-1}\}$, which is necessary for obtaining the ray direction array $\{\mathbf{i}_1, \cdots, \mathbf{i}_M\}$ but is not needed in the image rendering in the spherical environment map.

3.3 Synthetic data generation

Our method requires a dataset consisting of images of a static scene affected by heat shimmering. However, it is difficult to acquire a real-world dataset that adequately captures the multi-view heat-shimmering scene. Therefore, we only utilize synthetic data in this work. To do that, we first introduce a temperature field on a discretized domain and then employ the ray-bending equation Eq. (3.11) to simulate the light refraction in the field. The trajectories of these rays from different viewpoints are calculated using Eq. (3.12) in the IB method and the final output directions of the refracted rays are passed to the spherical environment renderer to produce the images that correspond to the input temperature field. This approach allows for an accurate depiction of heat-induced visual distortions in the rendered scene.

Specifically, the temperature field is constructed in a cube of dimension L × L × L, with N = 101 points sampled along each dimension, evenly spaced by dx = L/(N − 1). This will be taken as an input in our model. The refractive index field is determined on each lattice point by using the empirical formula for the air’s index of refraction [Haynes, 2016] based on the Ciddor Equation [Ciddor, 1996],

$$
n(T) = 1 + (n_{\text{air}} - 1)\,\frac{c_1 P\,\bigl[1.0 + P\,(60.1 - 0.972\,T)\cdot 10^{-10}\bigr]}{1.0 + c_2 T}, \tag{3.13}
$$

which depends on the temperature T (in Celsius) and pressure P of the air (in Pascals), and where $c_1 = 0.0000104$ and $c_2 = 0.00366$ are two constants, and $n_{\text{air}} = 1.000293$ is the air’s index of refraction at 0◦C and 1 atm.
We take P = 101325 Pascals. Although Eq. (3.13) only applies to the temperature range −40◦C to 100◦C, we simply extend it to arbitrary temperature values since our goal is to prove the effectiveness of our model instead of simulating the precise relationship between the refractive index and temperature.

Figure 3.5: The air’s refractive index n (a) and its derivative (b) as functions of the temperature T (in ◦C).

From Eq. (3.13), the gradient of the refractive index field can then be determined at each lattice point from that of the temperature field,

$$
\nabla n = \frac{dn}{dT}\,\nabla T. \tag{3.14}
$$

For a typical range of T, n(T) and its derivative dn/dT are given in Fig. 3.5. As one can immediately notice, the value of n is close to $n_{\text{air}}$ for a wide range of T, and thus dn/dT has a very small magnitude. This poses a severe challenge to our finite-precision simulation. Our purpose, however, is not to simulate a fully realistic temperature field but to examine how well our model can reconstruct the 3D refractive index field from multi-view images, so it is the light refraction effect itself that matters most. Therefore, in our image rendering, we manually multiply the gradient ∇n by 10 to make the heat-shimmering effect more noticeable throughout this work.

For the generation of our dataset, we render 516 images for the training set, 18 images for the validation set for parameter tuning, and 216 images for the testing set. Each of these images is captured by a camera that orbits the center of the field at a radius of L while remaining oriented toward it, as depicted in Fig. 3.6. It captures images from various positions, with its polar angle ranging from −90◦ to 90◦ and azimuthal angle ranging from 0◦ to 360◦.

Figure 3.6: Schematic of the ground-truth dataset generation.
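The conversion from temperature to refractive index in Eq. (3.13), the chain rule of Eq. (3.14), and the factor of 10 applied to ∇n can be sketched as below. The constants are the ones quoted in the text; the function names and the choice to differentiate Eq. (3.13) analytically with the quotient rule are illustrative assumptions, not a description of the actual implementation.

```python
N_AIR = 1.000293      # n_air quoted in the text
C1 = 0.0000104        # constant c1 quoted in the text
C2 = 0.00366          # constant c2 quoted in the text
P_PASCAL = 101325.0   # pressure used in this work
GRAD_SCALE = 10.0     # factor applied to grad n to make the shimmering visible

def _parts(t_celsius, pressure):
    """Prefactor, bracket, and denominator of Eq. (3.13)."""
    k = (N_AIR - 1.0) * C1 * pressure
    b = 1.0 + pressure * (60.1 - 0.972 * t_celsius) * 1e-10
    d = 1.0 + C2 * t_celsius
    return k, b, d

def refractive_index(t_celsius, pressure=P_PASCAL):
    """Eq. (3.13): refractive index of air at temperature T (Celsius)."""
    k, b, d = _parts(t_celsius, pressure)
    return 1.0 + k * b / d

def dn_dT(t_celsius, pressure=P_PASCAL):
    """Derivative of Eq. (3.13) with respect to T (quotient rule)."""
    k, b, d = _parts(t_celsius, pressure)
    db = -0.972e-10 * pressure          # d(bracket)/dT
    return k * (db * d - b * C2) / (d * d)

def grad_n(t_celsius, grad_t, scale=GRAD_SCALE):
    """Eq. (3.14), grad n = (dn/dT) grad T, times the exaggeration factor."""
    s = scale * dn_dT(t_celsius, P_PASCAL)
    return tuple(s * g for g in grad_t)

# Example: a temperature gradient along +x at 20 degrees Celsius.
g_example = grad_n(20.0, (1.0, 0.0, 0.0))
```

Because dn/dT is negative, ∇n points opposite to ∇T: rays bend away from hotter air and toward cooler, optically denser air.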
At each viewpoint, we march camera rays using the IB method through the refractive index field corresponding to the input temperature field. As described in Fig. 3.4, the step size ds is selected as the grid size dx for both accuracy and efficiency. This results in M = 181 points on each ray from the near plane of 0.1 × L to the far plane of 1.9 × L. The output directions $\mathbf{i}_M$ for all rays from this viewpoint are rendered by the spherical environment renderer and the produced image is taken as a ground truth image of our model.

One example of the rendered images in our dataset is shown in Fig. 3.7(a), which has been rendered with the gradient field ∇n multiplied by 10, as previously discussed, to make the heat shimmering effect more visible. Without it, the image in Fig. 3.7(b) rendered with the original ∇n field is not distinguishable from the undistorted image within our finite numerical precision.

Figure 3.7: An example of the synthetic image rendered using the IB method, with the gradient ∇n of the refractive index field multiplied by (a) 10 or (b) 1.

3.4 Non-Translating method

We can gain more insights into the rendering of shimmering effects by visualizing the light ray trajectories in the IB method. This is shown in Fig. 3.8, where we find that, strikingly, the points sampled along the curved light rays nearly align with those on straight lines. This suggests that while the bending effects are sufficient to render the shimmering view in Fig. 3.7, the deviations of the rays from straight lines are minimal. This observation motivates a simplified treatment for the ray bending simulation, which we refer to as the Non-Translating (NT) method, where we disregard the translations of the points on the bending rays with respect to the ones on straight rays.

Physically, the light ray bends according to the IB method, as described in Eqs. (3.11) and (3.12).
At each iterative step k, the ray direction changes from $\mathbf{i}_k$ to $\mathbf{i}_{k+1}$ by

$$
\boldsymbol{\delta}_k = \mathbf{i}_{k+1} - \mathbf{i}_k
= \frac{\nabla n(\mathbf{x}_k) - [\mathbf{i}_k\cdot\nabla n(\mathbf{x}_k)]\,\mathbf{i}_k}{n(\mathbf{x}_k)}\,ds, \qquad (k = 0, 1, \cdots, M-1), \tag{3.15}
$$

where ds is the marching step size.

Figure 3.8: Points sampled in the light ray trajectories simulated by the IB method (with the gradient ∇n of the refractive index field multiplied by 10). Different trajectories are labeled by different colors, with bigger colorful scatter points representing the bent rays and smaller black ones the straight ray counterparts.

As emphasized in Eq. (3.15), in the IB method, n and ∇n at step k are evaluated at the current position $\mathbf{x}_k$, which is again iteratively calculated together with $\mathbf{i}_k$, cf. Eq. (3.12),

$$
\mathbf{x}_k = \mathbf{x}_{k-1} + \mathbf{i}_k\,ds = \mathbf{x}_0 + ds\sum_{j=1}^{k}\mathbf{i}_j. \tag{3.16}
$$

Since we have found that the light ray $\mathbf{x}_0 \to \mathbf{x}_1 \to \cdots \to \mathbf{x}_{M-1}$ does not appreciably differ from a straight line, we can neglect this difference when evaluating n and ∇n. That is, we evaluate n and ∇n at evenly spaced points along the straight line determined by $(\mathbf{x}_0, \mathbf{i}_0)$,

$$
n(\mathbf{x}_k) \simeq n(\mathbf{x}_0 + k\,\mathbf{i}_0\,ds), \qquad
\nabla n(\mathbf{x}_k) \simeq \nabla n(\mathbf{x}_0 + k\,\mathbf{i}_0\,ds). \tag{3.17}
$$

Furthermore, we can neglect the difference of $\mathbf{i}_k$ from $\mathbf{i}_0$ in Eq. (3.15),

$$
\boldsymbol{\delta}_k \simeq \frac{\nabla n(\mathbf{x}_k) - [\mathbf{i}_0\cdot\nabla n(\mathbf{x}_k)]\,\mathbf{i}_0}{n(\mathbf{x}_k)}\,ds
\simeq \frac{\nabla n(\mathbf{x}_0 + k\,\mathbf{i}_0\,ds) - [\mathbf{i}_0\cdot\nabla n(\mathbf{x}_0 + k\,\mathbf{i}_0\,ds)]\,\mathbf{i}_0}{n(\mathbf{x}_0 + k\,\mathbf{i}_0\,ds)}\,ds. \tag{3.18}
$$

The errors caused by Eqs. (3.17) and (3.18) are only of second order in ds.

Eqs. (3.17) and (3.18) constitute the NT method. It has the advantage that the set of locations $\{\mathbf{x}_k\}$ at which to evaluate n and ∇n can be predetermined, thereby allowing a parallel computation of the set $\{\boldsymbol{\delta}_k \mid k = 0, 1, \cdots, M-1\}$. Therefore, the final outgoing ray direction $\mathbf{i}_M$ can be calculated in a single (vectorized) step,

$$
\mathbf{i}_M = \mathbf{i}_0 + \sum_{k=0}^{M-1}\boldsymbol{\delta}_k. \tag{3.19}
$$

While this is all one needs in the spherical environment renderer as described in Sec. 3.1, one does not lose the information of ray trajectories, which are simply given by Eq.
(3.16), just with the $\mathbf{i}_j$ replaced by those calculated in the NT method,

$$\begin{aligned} \mathbf{x}_k &= \mathbf{x}_0 + ds \sum_{j=1}^{k} \left(\mathbf{i}_0 + \boldsymbol{\delta}_0 + \cdots + \boldsymbol{\delta}_{j-1}\right) \\ &= \mathbf{x}_0 + k\, \mathbf{i}_0\, ds + ds\, (k, k-1, \cdots, 1) \cdot (\boldsymbol{\delta}_0, \boldsymbol{\delta}_1, \cdots, \boldsymbol{\delta}_{k-1}), \end{aligned} \tag{3.20}$$

where in the second line, the first two terms just represent the straight line, and the rest is the deviation, which has been written in a manifestly vectorized way. In fact, Eq. (3.20) applies to both the IB and NT methods; the only difference is that in the NT method, the set $\{\boldsymbol{\delta}_j\}$ is calculated using Eq. (3.18) as a predetermined set, whereas the IB method relies on an iterative use of Eq. (3.15), so that Eq. (3.20) is not particularly useful there.

Figure 3.9: Schematic of simulating the trajectory of a light ray passing through a medium with continuously varying refractive indices using both the IB (red circular points labeled) and the NT (yellow circular points labeled) methods.

To illustrate the difference in trajectories between the IB method and the NT method, we provide a schematic representation in Fig. 3.9, where the bending effect, and hence the trajectory difference, have been exaggerated for clarity. The $\boldsymbol{\delta}_k$'s indicated in Fig. 3.9 refer to those calculated in the NT method at the evenly spaced points (labeled as blue solid dots) on the straight line along $\mathbf{i}_0$, as given in Eq. (3.18). They are used to construct the NT ray trajectory using Eq. (3.20). In Fig. 3.9, $\boldsymbol{\delta}_4$, $\boldsymbol{\delta}_5$, and $\boldsymbol{\delta}_6$ are all zero because the corresponding points on the straight line are located outside the varying temperature field. As a result, the ray trajectory of the NT method remains straight after $\mathbf{x}_3$. In contrast, the trajectory of the IB method is constructed iteratively, with $\boldsymbol{\delta}_k$ evaluated locally at the actual point $\mathbf{x}_k$ along the trajectory. This gives a nonzero $\boldsymbol{\delta}_4$ in Fig. 3.9 because $\mathbf{x}_4$ is situated inside the temperature field, and the resultant trajectory becomes straight only after $\mathbf{x}_4$. Usually, the situation in this example does not arise in heat-induced shimmering.
First, if the temperature field varies so abruptly that the sampled points can fall inside the field in one method but not in the other, we should refine the discretization lattice and sample more points. Second, since we have found that the actual IB trajectories almost align with straight lines, there is no reason for the NT trajectories to differ significantly from them. We therefore expect the NT method to work for both the output ray directions and the constructed ray trajectories.

3.4.1 Comparative analysis of the IB and NT methods

The accuracy of images generated by the IB and the NT methods is compared in Fig. 3.10 using a specific example, where images (a)–(c) use the output ray directions from the IB method, and images (d)–(f) use those from the NT method. All other conditions, such as the input temperature field, the number of sampled points on each ray (or the incremental distance $ds$), the rendering factor applied to $\nabla n$, and the spherical environment renderer, are kept the same for a fair comparison.

Figure 3.10: Images generated by the spherical environment renderer using the output ray directions from (a)–(c) the IB method and (d)–(f) the NT method, respectively, both with a rendering factor of 10.

As is evident from Fig. 3.10, the visual difference between the images generated by the two methods is negligible. Their quantitative differences, as measured using the MSE, PSNR, MAE, LPIPS, and SSIM metrics, are detailed in Table 3.1. These underscore the accuracy of the NT method, which obtains the final output ray directions in a single step based on predetermined bending vectors $\{\boldsymbol{\delta}_k\}$ along a straight line. As discussed in Sec. 3.3, the small gradient of the refractive index field ensures tiny deviations of the bending rays. The NT approach not only simplifies the computations but also preserves accuracy. It is therefore ideal for applications in the neural network.
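To make the comparison between Eq. (3.15) and Eqs. (3.17)–(3.19) concrete, the following is a minimal pure-Python sketch, not the implementation used for the dataset. It marches one ray through a toy Gaussian refractive index field with both update rules. The field `n_field`, its amplitude and width, the start point, and the helper names are all hypothetical choices made for illustration; only the update formulas follow the text.

```python
import math

def n_field(p, n0=1.0003, amp=-1e-4, sigma=100.0):
    """Toy refractive index: ambient n0 plus a Gaussian dip (hypothetical values)."""
    r2 = sum(c * c for c in p)
    return n0 + amp * math.exp(-r2 / (2.0 * sigma ** 2))

def grad_n(p, h=10.0):
    """Central-difference gradient of n with spacing h along each axis."""
    g = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += h
        lo[axis] -= h
        g.append((n_field(hi) - n_field(lo)) / (2.0 * h))
    return g

def delta(p, d, ds):
    """Direction change (grad n - (d . grad n) d) / n * ds, the form of Eq. (3.15)."""
    g = grad_n(p)
    dot = sum(di * gi for di, gi in zip(d, g))
    n = n_field(p)
    return [(gi - dot * di) / n * ds for gi, di in zip(g, d)]

def march_ib(x0, i0, ds, M):
    """Iterative bending: position and direction are updated step by step."""
    x, d = list(x0), list(i0)
    for _ in range(M):
        dk = delta(x, d, ds)
        d = [a + b for a, b in zip(d, dk)]        # i_{k+1} = i_k + delta_k
        x = [a + b * ds for a, b in zip(x, d)]    # x_{k+1} = x_k + i_{k+1} ds
    return d

def march_nt(x0, i0, ds, M):
    """Non-translating: every delta_k is evaluated on the straight line (x0, i0)."""
    total = [0.0, 0.0, 0.0]
    for k in range(M):
        p = [a + k * ds * b for a, b in zip(x0, i0)]
        total = [a + b for a, b in zip(total, delta(p, i0, ds))]
    return [a + b for a, b in zip(i0, total)]     # Eq. (3.19)

x0 = (-900.0, 30.0, 0.0)   # hypothetical start point; the toy field is centred at the origin
i0 = (1.0, 0.0, 0.0)
i_ib = march_ib(x0, i0, ds=10.0, M=181)
i_nt = march_nt(x0, i0, ds=10.0, M=181)
gap = max(abs(a - b) for a, b in zip(i_ib, i_nt))
print(i_ib, i_nt, gap)
```

Because the NT loop never reads the result of a previous step, its iterations are independent and could be evaluated in parallel, which is the property the NT method exploits.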
Table 3.1: Performance metrics comparing the IB and NT methods across different images.

| Image | MSE | PSNR | MAE | LPIPS | SSIM |
|---|---|---|---|---|---|
| Building | $1.67 \times 10^{-6}$ | 57.76 | 0.00056 | 0.000133 | 0.9999 |
| People on a street | $3.12 \times 10^{-6}$ | 55.05 | 0.00070 | 0.000235 | 0.9998 |
| Trees on a street | $3.52 \times 10^{-6}$ | 54.54 | 0.00079 | 0.000408 | 0.9997 |
| Mean of validation set | $1.15 \times 10^{-6}$ | 60.31 | 0.00050 | 0.00010 | 0.9999 |

3.5 Neural Refractive index field representation

We aim to reconstruct the 3D neural field of refractive index using the dataset generated from multiple views. From each viewing direction, we march the rays through the field as shown in Fig. 3.11(a). As discussed in Sec. 3.4, the IB method is considerably slower than the NT method, and the ray trajectories bend only minimally. We therefore use as inputs to our Temperature Neural Network (TempNN) the predetermined points along the camera's (straight) viewing rays, evenly spaced by the grid cell length $ds = dx$. This network is specifically designed to estimate the temperature $T$ at each of these points and then convert it to the refractive index $n$ using Eq. (3.13), as shown in Fig. 3.11(a). The refractive index $n$ at each point and its neighboring points generated by TempNN are then used to calculate the gradient $\nabla n$ by Eq. (3.14).

Figure 3.11: An overview of our neural refractive index field representation and differentiable rendering procedure.

As shown in Fig. 3.11(b), the directional changes $\boldsymbol{\delta}$ at all points along each ray are calculated in parallel, following Eq. (3.18), the NT form of Eq. (3.15). This enhances the computational efficiency by avoiding the step-by-step iterative calculation of the directional changes. The output ray direction is calculated using Eq. (3.19), which adds the sum of the directional changes $\boldsymbol{\delta}$ at every point along the path to the initial ray direction $\mathbf{i}_0$. In Fig. 3.11(c), these output ray directions are fed to the spherical environment renderer to generate an image.
By minimizing the discrepancy between the rendered image and its corresponding ground truth, the underlying refractive index field can be optimized through the process illustrated in Fig. 3.11(d), which thereby validates the effectiveness of the NT method.

3.5.1 Temperature Neural Network - TempNN

In contrast to NeRF, which requires complex input data including both locations and directions, we only input a continuous 3D location $\mathbf{x} = (x, y, z)$ to our TempNN, because the viewing direction does not influence the temperature. The TempNN uses a multilayer perceptron (MLP) architecture to estimate a single scalar output, the temperature $T$, as shown in Fig. 3.12. This temperature value is then transformed into the refractive index $n$, which is subsequently used in the ray bending calculations. The choice to model the temperature field directly, rather than the refractive index field, ensures consistency with the generation of ground truth images, thereby facilitating a more intuitive understanding and alignment with the input data. It is important to note that the underlying model for predicting the temperature or the refractive index remains the same; the key is to fine-tune the learning rate for optimal performance. In this work, we use a compact MLP with 6 fully connected layers, each with 128 units and ReLU activations.

Figure 3.12: Schematic of the architecture of the fully connected network.

It has been shown that encoding coordinates instead of directly using them as inputs enhances the model's ability to capture continuous fields more accurately [Tancik et al., 2020]. Therefore, we employ a positional encoding that projects each coordinate onto a series of sinusoids with exponentially increasing frequencies:

$$\gamma(x) = \left[\sin(x), \cos(x), \ldots, \sin\left(2^{n-1} x\right), \cos\left(2^{n-1} x\right)\right]^{T}. \tag{3.21}$$

The positional encoding significantly influences the interpolation kernel of the MLP, with the parameter $n$ setting the kernel's bandwidth.
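As a small illustration of Eq. (3.21), the sketch below encodes a 3D location in plain Python. The function name `positional_encoding`, the default of ten frequencies, and the choice to prepend the raw coordinates to the encoded features are assumptions made for this example rather than details confirmed by the implementation.

```python
import math

def positional_encoding(x, n_freqs=10, include_input=True):
    """Apply Eq. (3.21) to each coordinate of x and concatenate the results.

    x             : sequence of coordinates, e.g. (x, y, z)
    n_freqs       : the parameter n in Eq. (3.21) (assumed default)
    include_input : if True, prepend the raw coordinates (assumed convention)
    """
    features = list(x) if include_input else []
    for c in x:
        for k in range(n_freqs):
            freq = 2.0 ** k                      # 1, 2, 4, ..., 2^(n-1)
            features.append(math.sin(freq * c))
            features.append(math.cos(freq * c))
    return features

features = positional_encoding((0.1, -0.2, 0.3))
print(len(features))
```

Under these assumptions, each coordinate contributes $2n$ sinusoidal features, so the network input has $3 + 3 \times 2n$ entries for a 3D location.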
The input dimension in Fig. 3.12 is derived from both the location $\mathbf{x}$ and the positional encoding of $\mathbf{x}$ with $n = 10$. This specific setting helps TempNN capture and adapt to higher frequency variations within the data more effectively, enhancing its overall performance in modeling the temperature field.

An important aspect to highlight is the calculation of the temperature gradient $\nabla T$, which is subsequently used to determine $\nabla n$ using Eq. (3.14). The temperature gradient at a point $\mathbf{x}$ is computed by central differencing between its two neighboring points, as shown in Fig. 3.11(a),

$$(\nabla T)_i = \frac{T(\mathbf{x} + a_i \mathbf{e}_i) - T(\mathbf{x} - a_i \mathbf{e}_i)}{2 a_i}, \quad (i = x, y, z), \tag{3.22}$$

where $\mathbf{e}_i$ represents the unit vector along the $i$-axis, and $a_i$ is selected as the cell length along the $i$-axis for consistency. This approach allows us to accurately compute the gradient at any location in the field.

3.5.2 Optimization

As outlined in Section 3.5.1, TempNN employs an MLP architecture to produce a continuous temperature field, denoted as $\widehat{T}_\theta$. This field and its gradient $\nabla_{\mathbf{x}} \widehat{T}_\theta$, which correspond to the refractive index $\widehat{n}_\theta$ and its gradient $\nabla_{\mathbf{x}} \widehat{n}_\theta$ (as per Eqs. (3.13) and (3.14)), are parameterized by weights $\theta$. These parameters directly influence the directional changes $\boldsymbol{\delta}$ in the NT method and, consequently, the rendered image $\hat{I}_\theta$. The workflow, governed by $\theta$, is illustrated in Figure 3.11.

Our training objective is to minimize the mean square error between the images produced by TempNN using the NT method and the ground truth images obtained by the IB method. The loss function is defined as

$$\mathcal{L}(\theta) = \frac{1}{N} \sum_{i,j} \left\| \hat{I}_\theta(i, j) - I(i, j) \right\|_2^2 + \lambda R(\theta), \tag{3.23}$$

which amounts to taking an average of the squared difference between the images $\hat{I}_\theta$ generated by TempNN and the ground truth images $I$ at each pixel $(i, j)$. $N$ is the total number of
The regulation term R(θ) imposes a boundary condition (BC) on specific regions of the temperature field produced by TempNN, with λ regulating the extent of regularization applied. In optimizing TempNN, we prioritize the implementation and evaluation of various BCs to boost both the performance and stability. As the ray bending is mainly influenced by the refractive index gradient ∇n, shifts across the entire domain do not alter the visual output. Thus, we can implement specific BCs, such as setting the ambient temperature, to resolve ambiguities related to the refractive index effectively. • No BC (λ = 0). Employing an ill-posed BC can hinder the learning process, potentially leading to suboptimal or distorted results. To test the ability of TempNN to capture the field pattern of the refractive index, we use the loss function L(θ) without any BC as a reference. Furthermore, the refractive index ratio, a measurable physical quantity, allows for the rescaling of the TempNN-generated refractive index to match the ground truth average: ˆns(i, j, k) = ˆn(i, j, k) × ⟨n⟩ ⟨ˆn⟩ , (3.24) where ˆns and ˆn represent the rescaled and original refractive indices at each grid point (i, j, k). The factor ⟨n⟩ / ⟨ˆn⟩ is calculated as the mean ⟨n⟩ of the ground truth refractive index divided by the mean ⟨ˆn⟩ of the generated refractive index. This adjustment is feasible because the simulation typically assumes a stable ambient temperature, as usually encountered when observing heat shimmering effects from a distance. • Ambient temperature on the grid points of the grid boundary. We apply a fixed ambient temperature to the grid points on all six sides of the lattice 26 cube-shaped field, enhancing stability and ensuring consistency with physical bound- aries: R(i,j,k)∈Ω1(θ) = 1 N (cid:88) i,j,k ∥ (cid:98)Tθ(i, j, k) − T (i, j, k)∥2 2 , (3.25) where Ω1 is the set of grid points of the grid boundary. 
$T$ and $\widehat{T}_\theta$ are the temperature fields in the ground truth and TempNN, respectively.

• Ambient temperature on all the points on and outside of the grid. For points on and outside the cube's boundaries, we enforce the ambient boundary temperature using a mean square loss to simulate an extended stable environment:

$$R_{(i,j,k) \in \Omega_2}(\theta) = \frac{1}{N} \sum_{i,j,k} \left\| \widehat{T}_\theta(i, j, k) - T(i, j, k) \right\|_2^2, \tag{3.26}$$

where $\Omega_2$ is the set of all points on and outside the grid boundary.

The entire optimization process is differentiable, allowing for efficient use of gradient descent methods. The neural network is implemented using PyTorch and trained with the Adam optimizer, with an exponential learning rate decay from $3 \times 10^{-4}$ to $3 \times 10^{-6}$ over 600 to 800 epochs. This setup gives stable training dynamics and helps the model both synthesize novel views and converge to a solution that faithfully reproduces the refractive index fields.

3.5.3 Evaluation of reconstructed fields

While the primary focus of training TempNN is minimizing image discrepancies as previously discussed, we additionally assess the accuracy of the physical field reconstruction after training. The reconstructed refractive index field is evaluated at the lattice grid points using the root mean square error (RMSE), which is defined as:

$$\mathcal{L}_n(\theta) = \sqrt{\frac{1}{N} \sum_{i,j,k} \left\| \hat{n}_\theta(i, j, k) - n(i, j, k) \right\|_2^2}, \tag{3.27}$$

where $\hat{n}_\theta$ denotes the refractive index predicted by TempNN, $n$ is the ground truth refractive index, and $N$ represents the total number of grid points on the lattice cube.

This metric measures how accurately the model recovers the refractive index field itself, offering a view of TempNN's capabilities beyond image synthesis. Such an evaluation is crucial for validating the physical accuracy of the fields produced by TempNN.
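The interplay between the mean rescaling of Eq. (3.24) and the RMSE of Eq. (3.27) can be illustrated with a few lines of plain Python. This is only a sketch: the helper names and the four sample values are hypothetical, and fields are flattened into simple lists rather than 3D grids.

```python
import math

def mean(values):
    return sum(values) / len(values)

def rescale_to_mean(n_pred, n_true):
    """Eq. (3.24): multiply the prediction by <n> / <n_hat>."""
    factor = mean(n_true) / mean(n_pred)
    return [v * factor for v in n_pred]

def rmse(n_pred, n_true):
    """Eq. (3.27): root mean square error over all grid points."""
    total = sum((p - t) ** 2 for p, t in zip(n_pred, n_true))
    return math.sqrt(total / len(n_true))

# Hypothetical ground truth values and a prediction with the right
# pattern but a slightly wrong overall scale.
n_true = [1.00027, 1.00025, 1.00022, 1.00026]
n_pred = [v * 1.0001 for v in n_true]

rmse_before = rmse(n_pred, n_true)
rmse_after = rmse(rescale_to_mean(n_pred, n_true), n_true)
print(rmse_before, rmse_after)
```

In this toy case the prediction differs from the ground truth only by a global factor, so the rescaling removes essentially all of the error; a prediction with a wrong spatial pattern would not be repaired in this way.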
Chapter 4 Results and Discussion

In order to test the quality of the novel view synthesis with the heat shimmering effect and the reconstruction of the refractive index field, we examine the performance of our system with datasets generated from different kinds of temperature fields. Since any distribution can be decomposed into sinusoidal wavelets, we select sinusoidal distributions with different wave numbers, supplemented with proper boundary conditions profiled by a Gaussian distribution; the resulting function is known as a Morlet wavelet (or Gabor wavelet).

As discussed in Sec. 3.3, we maintain a fixed number of grid points, $N$, along each dimension of the cube-shaped field with length $L$ to enhance computational efficiency. Consequently, ray bending remains invariant to scaling of the field: extending $L$ by a factor of $\alpha$ proportionally decreases the gradient by the same factor, thus maintaining the consistency of bending regardless of the field size. By fixing $N$, the field effectively becomes dimensionless, allowing for consistent modeling of ray behavior irrespective of absolute dimensions.

4.1 Design of Gabor wavelets Field

First, we define a sinusoidal wavelet as

$$F(\mathbf{x}, \mathbf{n}, \phi; \lambda) = \cos\left[\frac{2\pi}{\lambda} (\mathbf{x} \cdot \mathbf{n}) + \phi\right], \tag{4.1}$$

where $\mathbf{x} = (x, y, z)$ denotes the location, $\mathbf{n} = (n_x, n_y, n_z)$ refers to the wave number vector, $\phi$ is an initial phase, and $\lambda$ is the basic wavelength.

A Gaussian profile can be assigned at any region centered around $\mathbf{c} = (c_x, c_y, c_z)$ to highlight the surrounding wavelet. The general form can be written as the exponential of a quadratic form of $\mathbf{x}$. Here, for simplicity, we construct it from the basic Gaussian profile,

$$G_0(\mathbf{x}, \boldsymbol{\sigma}) = \exp\left\{-\frac{1}{2}\left[\left(\frac{x}{\sigma_x}\right)^2 + \left(\frac{y}{\sigma_y}\right)^2 + \left(\frac{z}{\sigma_z}\right)^2\right]\right\}. \tag{4.2}$$

Then we rotate it according to the Euler angles $\boldsymbol{\theta} = (\alpha, \beta, \gamma)$,

$$R(\boldsymbol{\theta}) = R(\alpha, \beta, \gamma) = R_z(\gamma) R_y(\beta) R_z(\alpha), \tag{4.3}$$

where $R_z(\alpha)$ is the three-dimensional rotation matrix around the $z$ axis by angle $\alpha$, etc., and translate it to center at $\mathbf{c}$. This gives a generic Gaussian profile as

$$G(\mathbf{x}, \boldsymbol{\sigma}; \mathbf{c}, \boldsymbol{\theta}) = G_0\left(R^{-1}(\boldsymbol{\theta})(\mathbf{x} - \mathbf{c}), \boldsymbol{\sigma}\right). \tag{4.4}$$

A general Gabor wavelet is then given by

$$G(\mathbf{x}, \boldsymbol{\sigma}; \mathbf{c}, \boldsymbol{\theta}) \cdot F(\mathbf{x}, \mathbf{n}, \phi; \lambda). \tag{4.5}$$

Any realistic temperature field can be well approximated by a superposition of a set of Gabor wavelets. As two examples, in Secs. 4.2 and 4.3, we will consider the superpositions of two and three Gabor wavelets, respectively, and examine their shimmering effects and their reconstruction by our neural network model.

4.2 Two Gabor waves

In this section, we consider the temperature field constructed by superposing two independent Gabor wavelets. We will use this as the main example to study the neural network performance and the impacts of different BCs and spherical environmental backgrounds.

4.2.1 Ground truth

The lattice field is of length $L = 1000$ with a discretization of $N = 101$ in all dimensions, so the grid cell is of length $dx = L/(N-1) = 10$. We design the temperature field by starting with

$$f(\mathbf{x}) = A_2 + A_1 \times G_0(\mathbf{x}, \boldsymbol{\sigma}_1) \times \left[F(\mathbf{x}, \mathbf{n}_1, \phi_1; \lambda) + F(\mathbf{x}, \mathbf{n}_2, \phi_2; \lambda)\right], \tag{4.6}$$

which is a combination of two sinusoidal waves $F(\mathbf{x}, \mathbf{n}_1, \phi_1; \lambda)$ and $F(\mathbf{x}, \mathbf{n}_2, \phi_2; \lambda)$, enveloped by the Gaussian $G_0(\mathbf{x}, \boldsymbol{\sigma}_1)$ with an amplitude $A_1$, and further raised by an amplitude $A_2$. We choose $\lambda = 32\, dx$, $\mathbf{n}_1 = (-3, 1, 2)$, and $\mathbf{n}_2 = (1, 2, -4)$ in the two sinusoidal waves, so that they cover a wide range of frequencies and maintain a good accuracy by having at least $5\, dx$ in one period. We also choose $(\phi_1, \phi_2) = (0, \pi/2)$, $(A_1, A_2) = (30, 80)\,^\circ\mathrm{C}$, and $\boldsymbol{\sigma}_1 = dx\, (18, 18, 18)$.
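For readers who wish to experiment with the field of Eq. (4.6), the following plain-Python sketch evaluates Eqs. (4.1), (4.2), and (4.6) with the parameters quoted above. It is an illustration under one explicit assumption: the coordinates passed to `f` are measured from the center of the cube, so that the Gaussian envelope $G_0$ peaks at the origin. The function names are hypothetical.

```python
import math

DX = 10.0            # grid cell length dx from Sec. 4.2.1
LAMBDA = 32 * DX     # basic wavelength

def sinusoid(x, n, phi, lam=LAMBDA):
    """Eq. (4.1): cos(2*pi/lam * (x . n) + phi)."""
    dot = sum(a * b for a, b in zip(x, n))
    return math.cos(2.0 * math.pi / lam * dot + phi)

def gaussian0(x, sigma):
    """Eq. (4.2): axis-aligned Gaussian profile centred at the origin."""
    return math.exp(-0.5 * sum((a / s) ** 2 for a, s in zip(x, sigma)))

def f(x):
    """Eq. (4.6) with the parameters of Sec. 4.2.1 (x relative to the cube centre, assumed)."""
    a1, a2 = 30.0, 80.0
    sigma1 = (18 * DX, 18 * DX, 18 * DX)
    waves = sinusoid(x, (-3, 1, 2), 0.0) + sinusoid(x, (1, 2, -4), math.pi / 2)
    return a2 + a1 * gaussian0(x, sigma1) * waves

print(f((0.0, 0.0, 0.0)), f((2000.0, 0.0, 0.0)))
```

At the center the first cosine equals one and the second vanishes, so the field takes the value $A_2 + A_1$; far from the center the envelope suppresses the oscillation and the field approaches $A_2$.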
In order to remove the temperature variation around the boundaries, we multiply $f(\mathbf{x})$ by an additional Gaussian envelope $G_0(\mathbf{x}, \boldsymbol{\sigma}_2)$, and set the surrounding temperature at $T_0$,

$$g(\mathbf{x}) = T_0 + G_0(\mathbf{x}, \boldsymbol{\sigma}_2) \times f(\mathbf{x}), \tag{4.7}$$

where $\boldsymbol{\sigma}_2 = dx\, (14, 14, 16)$ and $T_0 = 10\,^\circ\mathrm{C}$. The cross-sectional temperature fields at the planes $x = 500$, $y = 500$, and $z = 500$ are shown in Fig. 4.1(a), (b), and (c), respectively. The corresponding refractive index field $n(\mathbf{x})$, converted from the temperature field $g(\mathbf{x})$ using Eq. (3.13), is displayed in Fig. 4.1(d)–(f) at the same three planes.

Figure 4.1: Temperature field in Eq. (4.7) at the cross-sectional plane (a) $x = 500$, (b) $y = 500$, and (c) $z = 500$; corresponding refractive index field at the same planes in (d)–(f), respectively.

4.2.2 Results with different boundary conditions

To optimize the results, we investigate the reconstruction error of the refractive index field under various boundary condition (BC) enforcement methods. The mean value rescaling defined in Eq. (3.24) is applied to both scenarios, with and without BCs, for comparative analysis. As illustrated in Fig. 4.2(a), the rescaling significantly reduces the RMSE of the reconstructed refractive index field when no BCs are implemented. In contrast, enforcing BCs on the lattice grid boundary, as per $\Omega_1$ in Eq. (3.25), does not prove beneficial; the error still requires a rescaling to decrease, as depicted in Fig. 4.2(b). Therefore, we expand the coverage of our BC to include all points on and outside the grid boundary, as $\Omega_2$ in Eq. (3.26), which effectively improves the results, as shown in Fig. 4.2(c). It is important to note that the learning process with BCs is substantially slower compared to scenarios without them (the RMSE reaches $6 \times 10^{-5}$ after approximately 450 epochs without BC, compared to around 400 epochs when using BC $\Omega_2$). Thus, for rapid field reconstruction, learning without BCs can be advantageous.
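The $\Omega_1$ regularizer of Eq. (3.25) discussed above can be sketched in a few lines of plain Python. This is a hedged illustration only: the helper names are hypothetical, the predicted temperature is passed as an arbitrary callable standing in for the network, the ground truth on the boundary is taken to be a single ambient value, and the penalty is normalized by the number of boundary points, which is an assumption about the meaning of $N$ in Eq. (3.25).

```python
def boundary_indices(n_grid):
    """Grid indices (i, j, k) lying on any of the six faces of an n_grid^3 lattice."""
    last = n_grid - 1
    return [(i, j, k)
            for i in range(n_grid)
            for j in range(n_grid)
            for k in range(n_grid)
            if i in (0, last) or j in (0, last) or k in (0, last)]

def boundary_penalty(t_pred, t_ambient, n_grid):
    """Mean squared deviation from the ambient temperature over the boundary points."""
    points = boundary_indices(n_grid)
    total = sum((t_pred(i, j, k) - t_ambient) ** 2 for i, j, k in points)
    return total / len(points)   # normalization choice is an assumption

# Toy check on a small lattice: a prediction that is 2 degrees too warm everywhere.
n_grid = 5
count = len(boundary_indices(n_grid))
penalty = boundary_penalty(lambda i, j, k: 12.0, 10.0, n_grid)
print(count, penalty)
```

The same pattern would extend to $\Omega_2$ by adding sample points outside the cube to the index set; how those exterior points are chosen is not specified here.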
Figure 4.2: Reconstruction RMSE of the refractive index before and after rescaling under (a) No BC, (b) $\Omega_1$ BC, and (c) $\Omega_2$ BC. [Plots: RMSE on a logarithmic axis from $10^{-7}$ to $10^{-4}$ versus training epochs for panels (a)–(c).]

Fig. 4.3(a)–(c) show the difference between the refractive index field generated by our TempNN and the ground truth across the cross-sectional planes at $x = 500$, $y = 500$, and $z = 500$ without any BCs and after rescaling. These results demonstrate that TempNN accurately captures the correct pattern with minimal error following rescaling. However, when a fixed BC is enforced only on the grid points of the grid boundary $\Omega_1$, the desired pattern does not fully emerge even after rescaling, as depicted in Fig. 4.3(d)–(f). Conversely, as shown in Fig. 4.3(g)–(i), extending the BC to cover all refractive index points on and outside of the boundary $\Omega_2$ not only facilitates the development of the correct pattern but also minimizes errors, with or without rescaling. This BC strategy will be adopted for subsequent experiments in this section.

Figure 4.3: The difference between the generated refractive index and its corresponding ground truth: with no BC after rescaling at planes $x = 500$, $y = 500$, and $z = 500$, respectively (a)–(c); with $\Omega_1$ BC (Eq. (3.25)) after rescaling at the same planes, respectively (d)–(f); with $\Omega_2$ BC (Eq. (3.26)) at the same planes, respectively (g)–(i).

Figure 4.4: Generated novel views with no BC (a)–(c), with $\Omega_1$ BC in Eq. (3.25) (d)–(f), and with $\Omega_2$ BC in Eq. (3.26) (g)–(i), compared with the ground truth views (j)–(l).

The generated novel views further demonstrate the effectiveness of the methods in applying the BCs, as shown in Fig. 4.4, with their Peak Signal-to-Noise Ratio (PSNR) given in Table 4.1.

Table 4.1: PSNR of generated images using different BCs.
| Image | No BC | $\Omega_1$ BC | $\Omega_2$ BC |
|---|---|---|---|
| Building | 51.264 | 40.549 | 51.466 |
| People on a street | 50.182 | 39.714 | 50.485 |
| Trees on a street | 49.27 | 39.480 | 50.361 |
| Average | 50.24 | 39.914 | 50.771 |

4.2.3 Results with different background

We use the background depicted in Fig. 4.5(a), which features an urban street scene, for the BC experiments. As described with the spherical environment renderer in Sec. 3.1, the direction of the ray $\mathbf{d} = (x, y, z)$ is mapped to a specific pixel of the image, represented by the coordinates $(\theta, \phi)$. This mapping allows us to analyze changes in the color gradients in the $\theta$- and $\phi$-directions. The structural nuances of the image, captured through these gradients, are illustrated in Fig. 4.5(b) and Fig. 4.5(c), providing insight into the visual complexity of the scene.

Figure 4.5: Urban street background image (a) and its color gradient in the $\theta$-direction (b) and $\phi$-direction (c).

To assess the impact of different backgrounds on our system's performance, we introduce two additional types: a smooth noisy image in Fig. 4.6(a) and a smooth color-gradient image in Fig. 4.7(a). The color gradients in the $\theta$- and $\phi$-directions of these new backgrounds are systematically analyzed, with detailed results presented in Table 4.2.

Figure 4.6: Smooth noisy image (a) and its color gradients in the $\theta$-direction (b) and $\phi$-direction (c).

Figure 4.7: Smooth color-gradient image (a) and its color gradients in the $\theta$-direction (b) and $\phi$-direction (c).

Table 4.2: Color gradient details in three background images.

| Image | Mean $\phi$ | Max $\phi$ | Min $\phi$ | Mean $\theta$ | Max $\theta$ | Min $\theta$ |
|---|---|---|---|---|---|---|
| Urban street | 23.53 | 817 | -812 | 29.34 | 831 | -812 |
| Smooth noise | 57.34 | 435 | -399 | 79.55 | 648 | -665 |
| Smooth color-gradient | 0.226 | 4 | -4 | 0.551 | 4 | -4 |

The effect of different backgrounds on the RMSE of the reconstructed refractive index is depicted in Fig. 4.8. The experiments are under the condition of $\Omega_2$ BC without rescaling. A comparison between the two decreasing curves reveals that using a smooth noisy background (Fig.
4.6) in the rendering slows the convergence process, although it eventually achieves a similar level of precision, as shown in the second column of Table 4.3. The smooth noisy background provides a higher average color gradient, but lacks low-frequency contours, which are crucial for structural details. This suggests that our TempNN relies more on low-frequency structural details, such as contours, than on high-frequency local color changes.

Table 4.3: PSNR of generated images using different backgrounds with BC $\Omega_2$.

| Image | Urban street | Smooth noisy | Smooth color-gradient |
|---|---|---|---|
| Viewer 1 | 51.264 | 47.411 | 58.104 |
| Viewer 2 | 50.182 | 46.990 | 57.358 |
| Viewer 3 | 49.27 | 47.710 | 56.897 |
| Average | 50.24 | 47.37 | 57.453 |

Figure 4.8: The effect of different backgrounds on the RMSE of the reconstructed refractive index after rescaling under the best-performing boundary condition $\Omega_2$.

To probe the limits of our system's capabilities, we also experimented with a smooth color-gradient background (Fig. 4.7(a)), characterized by minimal color gradient changes, as depicted in Fig. 4.7(b) and (c). According to the result for the smooth color-gradient background in Fig. 4.8, the initial RMSE of the refractive index is on the order of $10^{-5}$ and stagnates at this level for another 200 epochs, showing none of the RMSE reduction seen with the other backgrounds. This stagnation suggests that the system cannot detect sufficient changes in the image to find a path toward convergence. The system therefore fails to reconstruct the field with the smooth color-gradient background, despite the high PSNR values in Table 4.3.

The comparison of the generated refractive index fields with their corresponding ground truth across different backgrounds is illustrated in Fig. 4.9. In contrast to the others, the color scale for the smooth color-gradient background (Fig. 4.9(g)–(i)) is not fixed, allowing for a detailed observation of the patterns.
[Plot for Figure 4.8: RMSE on a logarithmic axis versus training epochs; legend: Urban street, Smooth noisy, Smooth color-gradient.]

Figure 4.9: The difference between the generated refractive index field with $\Omega_2$ BC and its corresponding ground truth at plane (a) $x = 500$, (b) $y = 500$, and (c) $z = 500$ with the urban street background. (d)–(f) are the same, but with the smooth noisy background. (g)–(i) are the same, but with the smooth color-gradient background.

4.3 Three Gabor waves with more variation near the boundary

Through the example of superposing two Gabor wavelets in Section 4.2, we demonstrated that our model successfully reconstructs the refractive index field and generates the corresponding novel views. We conducted two ablation studies focusing on boundary conditions (BCs) and background variations. The results indicated that both the $\Omega_2$ BC and no BC with rescaling (using Eq. (3.24)) are effective, with the latter converging significantly faster. Additionally, our model learns better with an urban street background than with smooth noisy or smooth color-gradient backgrounds.

In this section, we increase the complexity of the ground truth temperature by incorporating three Gabor wavelets and retaining the variation near the boundary. This more complex input field challenges our model further. To optimize convergence, we proceeded with no BC, applying rescaling and using the urban street background.

The lattice field maintains the same configuration as previously described in Sec. 4.2. However, instead of using a single circular-shaped temperature field, we apply different Gaussian profiles to different sinusoidal waves and stack these Gabor waves as

$$f'(\mathbf{x}) = \sum_{i=1,2,3} A_i \times G(\mathbf{x}, \boldsymbol{\sigma}_i; \mathbf{c}_i, \boldsymbol{\theta}_i) \times F(\mathbf{x}, \mathbf{n}_i, \phi_i; \lambda_i), \tag{4.8}$$

where the parameters are given in Table 4.4. This approach helps retain their distinct shapes.
To maintain natural temperature vari- ations around the boundaries, we do not apply a Gaussian envelope to the overall field of 41 Table 4.4: Parameters in Eq. (4.8): the amplitudes Ai; the standard deviations σi, centers ci, and orientations θi of the Gaussian profiles; and the wave vectors ni, initial phases ϕi, and wavelengths λi of the sinusoidal waves. i Ai(◦C) λi ni 60 50 1 2 3 16dx (1, 1, 1) 16dx (−2, 1, −1) π/2 100 64dx (−2, 2, −1) π/4 ϕi 0 ci 10, 0, 0(cid:1) (cid:0) L 10, 0, 0(cid:1) (cid:1) 8 , − L (cid:0)− L (cid:0)0, − L 8 (cid:1) 15 (cid:1) σi (cid:0) L 15, L 10, L (cid:0) L 10, L 8 , L (cid:0) L 20, L 8 , L 10 10 (cid:1) θi 4 , 0, 0(cid:1) (cid:0) π (cid:0)− π 6 , π 4 , − π (cid:0) π 6 , 0(cid:1) 6 , − π 2 (cid:1) f ′(x) in Eq. (4.9), g′(x) = T0 + f ′(x), (4.9) where the ambient temperature is consistently set to T0 = 25◦C. The cross-sectional tem- perature fields of g′(x) at planes x = 500, y = 500, and z = 500 are sequentially shown in Fig. 4.10 (a)–(c). The corresponding refractive index fields converted from them using Eq. (3.13) are illustrated in Fig. 4.10 (d)–(f). (a) (b) (c) (d) (e) (f) Figure 4.10: Temperature field in Eq. (4.9) at the cross-sectional plane (a) x = 500, (b) y = 500, and (c) z = 500; corresponding refractive index field at the same plane in (d)–(f) respectively. 42 Figure 4.11: Reconstruction RMSE of the refractive index field before (blue line) and after (orange line) rescaling with no BC. The reconstruction process of RMSE is shown in Fig. 4.11, which demonstrates the model’s effectiveness. The convergence is slower compared to the settings with 2 Garbor waves in Fig. 4.2 (a) as anticipated due to the increased complexity of the field. Consequently, these more complex fields cause greater challenges to the reconstruction process. Fig. 
4.12 further showcases the post-rescaling reconstructed refractive index fields with no BCs applied, where one can see from (a)–(f) that the model effectively captures distinct shapes and variations near the boundary. Notably, low-frequency waves converge more rapidly (within approximately 200 epochs) than high-frequency waves, as illustrated in Fig. 4.12(g)–(i).

[Plot for Figure 4.11: RMSE on a logarithmic axis from $10^{-7}$ to $10^{-4}$ versus training epochs.]

Table 4.5: PSNRs of the images generated by the temperature setting in Eqs. (4.8) and (4.9) after rescaling.

| Image | Bridge (a) | Tower (b) | Street (c) | Trees (d) |
|---|---|---|---|---|
| No BC | 50.85 | 46.93 | 50.35 | 49.82 |

The generated novel views are shown in Fig. 4.13 in comparison with the ground truth views, with the PSNRs given in Table 4.5. They exhibit a high degree of resemblance and further demonstrate the effectiveness of our model. We therefore conclude that the TempNN also works well for the temperature field constructed as a superposition of three independent Gabor wavelets with variations near the boundaries.

Figure 4.12: Reconstructed refractive index fields after rescaling using Eq. (3.24) with no BC at plane (a) $x = 500$, (b) $y = 500$, and (c) $z = 500$. (d)–(f) are their differences from the corresponding ground truth values after the training finishes. (g)–(i) are the differences at 200 training epochs.

Figure 4.13: Generated novel views without BC after rescaling in (a)–(d) compared with the ground truth views in (e)–(h).

Chapter 5 Conclusion

This thesis has advanced the understanding of light refraction through variational media by introducing novel methodologies based on Neural Radiance Fields (NeRF) for simulating and analyzing refractive index fields.
Our integrated approach, combining the Iterative Bending (IB) and Non-Translating (NT) techniques, has successfully modeled the complex refractive behaviors necessary for synthesizing novel views of natural phenomena such as heat shimmering effects.

The IB method has proven effective in generating highly accurate synthetic datasets that replicate real-world optical phenomena, while the NT method has greatly enhanced computational efficiency, facilitating broader experimental applications and quicker iterations without significantly compromising accuracy. Extensive testing across various backgrounds and boundary conditions has illuminated the model's strengths and adaptability. The model exhibited robust performance in environments with structured elements, such as urban landscapes, where detailed textures and contours help define the behavior of light. However, it faced challenges in environments with smooth gradients, where the absence of distinct features made it difficult for the model to accurately predict refractive changes. Furthermore, our exploration of various refractive fields has demonstrated the model's robust capabilities. It handles complex fields shaped by Gabor wavelets across a broad spectrum of frequency variations, effectively reconstructing details near the boundary of the grid field. We also observed that low-frequency waves converge more rapidly than high-frequency waves, which implies a slower convergence rate in more complex fields.

This work paves the way for future research into more sophisticated light transport phenomena involving variable refractive index fields, promising further advancements in optical simulation and analysis.

BIBLIOGRAPHY

Mojtaba Bemana, Karol Myszkowski, Jeppe Revall Frisvad, Hans-Peter Seidel, and Tobias Ritschel. Eikonal fields for refractive novel-view synthesis. In ACM SIGGRAPH 2022
Conference Proceedings, pages 1–9, 2022. Philip E Ciddor. Refractive index of air: new equations for the visible and near infrared. Applied optics, 35(9):1566–1573, 1996. Claudia Di Biagio, Paola Formenti, Yves Balkanski, Lorenzo Caponi, Mathieu Cazaunau, Edouard Pangui, Emilie Journet, Sophie Nowak, Sandrine Caquineau, Meinrat O Andreae, et al. Global scale variability of the mineral dust long-wave refractive index: a new dataset of in situ measurements for climate modeling and remote sensing. Atmospheric Chemistry and Physics, 17(3):1901–1929, 2017. Taku Fujitomi, Ken Sakurada, Ryuhei Hamaguchi, Hidehiko Shishido, Masaki Onishi, and Yoshinari Kameda. Lb-nerf: light bending neural radiance fields for transparent medium. In 2022 IEEE International Conference on Image Processing (ICIP), pages 2142–2146. IEEE, 2022. William M Haynes. CRC handbook of chemistry and physics. CRC press, 2016. Ivo Ihrke, Gernot Ziegler, Art Tevs, Christian Theobalt, Marcus Magnor, and Hans-Peter Seidel. Eikonal rendering: Efficient light transport in refractive objects. ACM Transactions on Graphics (TOG), 26(3):59–es, 2007. Wooseok Kim, Taiki Fukiage, and Takeshi Oishi. Ref2-nerf: Reflection and refraction aware neural radiance field. arXiv preprint arXiv:2311.17116, 2023. Aviad Levis, Andrew A Chael, Katherine L Bouman, Maciek Wielgus, and Pratul P Srini- vasan. Orbital polarimetric tomography of a flare near the sagittarius a* supermassive black hole. Nature Astronomy, pages 1–9, 2024. Jin-gang Liu and Mitsuru Ueda. High refractive index polymers: fundamental research and practical applications. Journal of Materials Chemistry, 19(47):8907–8919, 2009. Patricia Yang Liu, Lip Ket Chin, Wee Ser, HF Chen, C-M Hsieh, C-H Lee, K-B Sung, TC Ayi, PH Yap, Bo Liedberg, et al. Cell refractive index for cell biology and disease diagnosis: past, present and future. Lab on a Chip, 16(4):634–644, 2016. 

Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021.

Hans Ottersten. Atmospheric structure and radar backscattering in clear air. Radio Science, 4(12):1179–1193, 1969.

Janak Prasad, Inga Zins, Robert Branscheid, Jan Becker, Amelie HR Koch, George Fytas, Ute Kolb, and Carsten Sönnichsen. Plasmonic core–satellite assemblies as highly sensitive refractive index sensors. The Journal of Physical Chemistry C, 119(10):5577–5582, 2015.

Mirjam Schürmann, Gheorghe Cojoc, Salvatore Girardo, Elke Ulbricht, Jochen Guck, and Paul Müller. Three-dimensional correlative single-cell imaging utilizing fluorescence and refractive index tomography. Journal of Biophotonics, 11(3):e201700145, 2018.

Peter Shirley, Michael Ashikhmin, and Steve Marschner. Fundamentals of computer graphics. AK Peters/CRC Press, 2009.

Matthew Tancik, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, and Ren Ng. Fourier features let networks learn high frequency functions in low dimensional domains. Advances in Neural Information Processing Systems, 33:7537–7547, 2020.

Arjun Teh, Matthew O’Toole, and Ioannis Gkioulekas. Adjoint nonlinear ray tracing. ACM Transactions on Graphics (TOG), 41(4):1–13, 2022.

T Töpfer, J Hein, J Philipps, D Ehrt, and R Sauerbrey. Tailoring the nonlinear refractive index of fluoride-phosphate glasses for laser applications. Applied Physics B, 71:203–206, 2000.

Xing-Hao Ye and Qiang Lin. Gravitational lensing analysed by the graded refractive index of a vacuum. Journal of Optics A: Pure and Applied Optics, 10(7):075001, 2008.

Brandon Zhao, Aviad Levis, Liam Connor, Pratul P Srinivasan, and Katherine L Bouman. Single view refractive index tomography with neural fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25358–25367, 2024.