CALDER ´ON PRECONDITIONERS AND WIDEBAND DECOUPLED INTEGRAL EQUATION FORMULATIONS By J. A. Hawkins A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Electrical and Computer Engineering – Doctor of Philosophy 2023 ABSTRACT Integral equations are used to analyze scattering from electromagnetic fields incident upon a perfect electrically conducting (PEC) object. Some common formulations are the electric field integral equation (EFIE), magnetic field integral equation (MFIE), and combined field integral equation (CFIE). Each of these formulations has challenges. The operator in the EFIE is ill-conditioned, and the formulation is non-unique. The operator in the MFIE is well-conditioned, but the formulation is also non-unique. The CFIE (a weighted sum of the EFIE and MFIE) is also ill-conditioned, but the formulation is unique. Due to provable uniqueness, the CFIE is often used in scattering analysis for closed, PEC objects. One approach to improve conditioning for the CFIE is to use well-known Calder´on iden- tities and precondition the EFIE with the EFIE. These identities prove the EFIE operator acting on the EFIE operator is equal to a sequence of second-kind MFIE type operators. The Calder´on preconditioner is often constructed with a lossy wavenumber to preserve the unique- ness of the CFIE formulation. The EFIE acting on the EFIE is analytically well-behaved but fraught with difficulties once the equations are discretized using the Method-of-Moments technique. The crux of the problem is the EFIE operator maps a div-conforming function to a curl-conforming function. Quasi-curl-conforming-divergence-conforming basis sets such as Buffa-Christiansen functions are needed to properly discretize the formulation, and these functions require significant, additional computation and infrastructure compared to the divergence-conforming RWG functions often used to discretize the CFIE. This thesis takes a different starting point to solve the scattering problem for PEC ob- jects. Instead of the CFIE, the decoupled field integral equation (DFIE) and decoupled po- tential integral equation (DPIE) are used to avoid low-frequency and dense-mesh breakdown, topology breakdown, and resonances (all of which contribute to ill-conditioning) for PECs. Also, the operators in the DPIE and DFIE map curl-conforming functions to curl-conforming functions and divergence-conforming functions to divergence-conforming functions. However, these formulations are not generally well-conditioned at high frequencies. The primary contribution of this thesis is a new set of Calder´on identities which may be used to construct O(N) preconditioners for a unique and wideband well-conditioned formula- tion of the DPIE or DFIE constrained to PEC objects. The new formulations are accelerable with fast methods like the multi-level fast multipole method (MLFMM) and open the door to quick and accurate computation of scattered fields from multi-scale and electrically large PEC objects using only RWG functions. Copyright by J. A. HAWKINS 2023 For Amelia v ACKNOWLEDGEMENTS This thesis is the outcome of many conversations, derivations, and much debugging with numerous academics of the highest caliber. Chief among them was Shanker, my advisor, a firm believer in the school of hard knocks and the power of creativity. Another was Luke Baumann. We derived the new Calder´on identities, probably the most enduring contribution of this work and presented herein, on a call during COVID. A special mention should be extended to Michael Lingg and Metin Aktulga in the Com- puter Science Department who both financially and intellectually supported this work, one way or another. They are patient. Of course, there is my committee who agreed to evaluate this thesis and judge its con- tribution to the scientific enterprise. I would also like to thank my friend Jorge who suggested this journey while I was an AFRL project engineer. Indeed, teamwork and talent. My lovely wife, Amelia, our son, and my steadfast parents who always emphasize being a well-rounded person . . . , ‘Τὸ λοιπόν, ἀδελφοί, ὅσα ἐστὶν ἀληθῆ, ὅσα σεμνά, ὅσα δίκαια, ὅσα ἁγνά, ὅσα προσφιλῆ, ὅσα εὔφημα, εἴ τις ἀρετὴ καὶ εἴ τις ἔπαινος, ταῦτα λογίζεσθε.’ vi TABLE OF CONTENTS CHAPTER 1 INTRODUCTION TO DECOUPLED INTEGRAL EQUATIONS . . 1.1 Maxwell’s Equations in Free Space . . . . . . . . . . . . . . . . . . . . . . . 1.2 Derivation of Electric, Magnetic, and Combined Field Integral Equations . . 1.3 Derivation of Frequency Domain Decoupled Potential and Field Integral Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4 Derivation of the Time Domain SPIE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.5 Method-of-Moments CHAPTER 2 A PROCESS FOR DERIVING CALDER ´ON IDENTITIES . . . . . . 2.1 T and K Calder´on Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 SPIE Calder´on Identities for Frequency and Time Domain . . . . . . . . . . 2.3 Novel Calder´on Identities for Frequency and Time Domain . . . . . . . . . . CHAPTER 3 NOVEL FREQUENCY DOMAIN FORMULATIONS . . . . . . . . . 3.1 DPIE and DFIE for PECs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Local Calder´on Combined DPIE and DFIE Formulations . . . . . . . . . . . 3.2.1 Combined SPIE, VPIE, DPIE, and DFIE Formulations . . . . . . . . 3.2.2 Left and Right Preconditioners for Scaling . . . . . . . . . . . . . . . 3.2.3 Framework for Spectral Analysis . . . . . . . . . . . . . . . . . . . . Spectral Analysis of Combined DFIE and DPIE . . . . . . . . . . . . 3.2.4 3.2.5 Analytic Preconditioners for the Combined DPIE and DFIE . . . . . 3.2.6 Mapping Properties of DPIE and DFIE Operators . . . . . . . . . . . Spectral Properties of LC-CDPIE and LC-CDFIE . . . . . . . . . . . 3.2.7 CHAPTER 4 DISCRETIZATION OF LOCAL CALDER ´ON COMBINED DPIE AND DFIE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1 Method-of-Moments for LC-CDPIE and LC-CDFIE . . . . . . . . . . . . . . 4.2 Fine-Grain Localization for Piece-Wise Tesselations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Singularity Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . Incident Potentials and Field . . . . . . . . . . . . . . . . . . . . . . 4.3 LC-CDPIE using Hat and RWG Functions . . . . . . . . . . . . . . . . . . . 4.4 MLFMM Accelerated LC-CDFIE using Pulse and RWG Functions . . . . . . 4.2.1 Zero-Mean Constraint 4.2.2 4.2.3 1 1 6 11 21 25 27 27 29 32 36 37 39 39 42 43 47 47 52 54 64 64 66 67 67 67 68 75 BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 vii CHAPTER 1 INTRODUCTION TO DECOUPLED INTEGRAL EQUATIONS 1.1 Maxwell’s Equations in Free Space The phenomenology of the electromagnetic scattered fields is governed by Maxwell’s equations. This set of equations is like an axiomatization of all electromagnetic formulations in terms of free charges and currents for isotropic, homogeneous constitutive parameters ∇ × E = − ∂B ∂t ∇ × H = ∂D ∂t + J ∇ · D = ρt ∇ · B = 0 (1.1a) (1.1b) (1.1c) (1.1d) where B = µH and D = ϵE. Electromagnetic fields are unique, and the tangential compo- nents of the fields over a source region are sufficient to specify all aspects of the fields interior or exterior to the source region. This thesis denotes the traces of the field as the tangential and normal field (or potential) quantities over the surface of the source region. A sketch of the uniqueness proof is as follows. If arbitrary sources produce two or more fields then the difference between these fields is non-zero. If the difference between all fields produced by arbitrary sources is zero, then there exists a one-to-one and onto map between fields and sources. Indeed, if the tangential components of the field are specified over the source region of interest, then the difference between all fields produced by the arbitrary sources is zero, and the field is unique. The continuity of charge relation is found by taking the divergence of (1.1b) and inter- changing the spatial divergence and time derivative ∇ · J = − ∂ρt ∂t . 1 (1.2) Boundary conditions are found by computing an surface integrals which conforms to a boundary. For example, consider the following integration of (1.1a) (cid:73) s ∇ × E · ds = − (cid:73) s ∂B ∂t · ds = 0. (1.3) Using Stokes Theorem and noting the line integrals contained in the interior and exterior regions exist (while the line integrals over the two regions vanish), the following boundary condition is the result ˆn × (E1 − E2) = 0. As another example, consider the following integration of (1.1d) (cid:73) v ∇ · B dv = 0 ˆn · (B1 − B2) = 0. (1.4) (1.5) Applying the same procedure to (1.1b) and (1.1c), the following is the complete set of boundary conditions for electromagnetic phenomena ˆn × (E1 − E2) = 0 ˆn × (H1 − H2) = J ˆn · (B1 − B2) = 0 ˆn · (D1 − D2) = ρt. (1.6a) (1.6b) (1.6c) (1.6d) The above boundary conditions and uniqueness are sufficient to postulate the equivalence principle. Consider (1.6b). First, notice the units of the H-field traces are equivalent to those of current density. Second, imagine a closed, PEC scatterer is embedded in free-space and a electromagnetic field is incident on the scatterer. The field interior to the PEC scatter is zero according to experiments, and the field exterior to the PEC scatter is non-zero due to induced sources responding to impressed sources according to boundary conditions. 2 According to (1.6b), the trace of the exterior H-field is equivalent to the current density on the PEC boundary. Therefore, the PEC scatterer may be removed and replaced with a current density (existing on the mathematical surface of the scatter) specified to the trace of the exterior H-field. The exterior H-field is exactly equal to the original problem according to uniqueness. Furthermore, the interior of the mathematical surface may also be filled with a non-zero field while the exterior field remains fixed. The current density to support the fixed, exterior field is not identical to the current density supporting the fixed exterior field with a zero interior field, but the exterior field is identical (through the boundary conditions). This is the equivalence principle, different sources support the same phenomenology in a region of interest due to boundary conditions. This thesis assumes the fields are time-harmonic E = E(r) E(t) = E(r) ejωt H = H(r) H(t) = H(r) ejωt J = J(r) J(t) = J(r) ejωt ρ = ρ(r) ρ(t) = ρ(r) ejωt. (1.7a) (1.7b) (1.7c) (1.7d) The utility of the time-harmonic assumption is Maxwell’s equations are solvable in the frequency domain after a Fourier Transform F{∇ × E} = F{− ∂B ∂t } ∂t + J } F{∇ × H} = F{ ∂D −→ ∇ × E = −jωB ∇ × H = jωD + J F{∇ · D} = F{ρt} F{∇ · B} = 0 ∇ · D = ρ ∇ · B = 0. The inverse Fourier Transform recovers the time-harmonic fields. Decoupling the time-harmonic Maxwell’s equations for free-space ∇ × ∇ × E = −jωµo∇ × H = −jωµo{jωD + J} = κ2E − jωµoJ √ where κ = κˆκ is the wave vector, and κ = ω ϵµ is the wave number. Using a vector identity ∇2E + κ2E = − 1 jωϵo 3 ∇∇ · J + jωµoJ. (1.10) (1.8) (1.9) Repeating the process ∇2H + κ2H = −∇ × J. (1.11) The Helmholtz operator ∇2 + κ2 is on the left-hand-side (LHS) is common in physics, and the solution is discuss below. Another approach is using potentials as an intermediary between electromagnetic sources and fields. The magnetic vector potential follows from (1.1d) and the solenoidal property of the curl operator B = ∇ × A. Using (1.1a) and the irrational property of gradient operator ∇ × E = − jω∇ × A = ∇ × (−jωA − ∇ϕ). A mixed potential representation of the E-field is then (1.12) (1.13) E = −jωA − ∇ϕ. (1.14) The curl of A is defined, but the divergence of A is not yet defined. These two components are independent, and we are free to define ∇ · A. Consider the divergence of the mixed potential (1.14) and Gauss’s law (1.1c) ∇ · E = −jω∇ · A − ∇ · ∇ϕ = ρ ϵo . We choose the Lorenz gauge ∇ · A = −jωϵoµoϕ which yields the nice result ∇2ϕ + κ2ϕ = − ρ ϵo . Using the Lorenz gauge, the magnetic vector potential representation of the E-field is E = − jωA + = (cid:18) ∇∇ · +κ2 jωϵoµo ∇∇ · A jωϵoµo (cid:19) A. 4 (1.15) (1.16) (1.17) Using the magnetic vector potential, (1.1b), and (1.17) ∇ × ∇ × A = (∇∇ · +κ2)A + µoJ which yields another nice result ∇2A + κ2A = −µoJ. (1.18) (1.19) Again, the equations for the potentials have a Helmholtz operator on the LHS, but there are no derivative operators acting on the sources on the RHS. These equations may be solved with guess-and-check. Consider the following equation whose solution is the Green’s function ∇2G(r, r′) + κ2G(r, r′) = −δ(r − r′). (1.20) Convolving both sides of (1.20) yields (∇2 + κ2)G(r, r′) ∗ ρ(r′) = −ρ(r). (1.21) The utility of the approach is now immediately clear. The solution to the Helmholtz equation is the convolution of the RHS with the Green’s function G. The radial symmetry of the Green’s function is enforced by the radial symmetry of the delta function. Expanding (1.20) into polar form 1 r2 ∂ ∂r r2 ∂ ∂r G(r, r′) + κ2G(r, r′) = −δ(r − r′) (1.22) a good guess is a Green’s function of the form G = C e−jκ|r−r′| |r − r′| . (1.23) Consider the guess as R = |r − r’| −→ 0 by taking a closed spherical volume integral of (1.20) { lim R→0 (cid:73) v ∇ · ∇Gdv + κ2 (cid:73) v Gdv} = − lim R→0 (cid:73) v δ(R)dv. (1.24) 5 Using polar coordinates, the divergence theorem, the definition of dirac delta functions, and ˆn = − ˆR (cid:73) S lim R→0 ∇G · ˆRR2 sin θdθdϕ = 1. Substituting the guess into the above and evaluating for C (cid:32) −jκ (cid:73) S e−jkR R − e−jκR R2 (cid:33) C lim r→0 R2 sin θdθdϕ = 1 C (cid:90) 2π (cid:90) π 0 0 sinθdθdϕ = 1 =⇒ C = 1 4π . And the free space Green’s function is G = e−jκ|r−r′| 4π|r − r′| . (1.25) (1.26) (1.27) (1.28) To summarize this section, the solutions to the Helmholtz equations (1.10), (1.11), (1.16), and (1.19) are E = (cid:18) 1 jωϵo ∇∇ · J − jωµoJ ∗ G (cid:19) H = (∇ × J) ∗ G A = µoJ ∗ G ρ ϵo ϕ = ∗ G (1.29a) (1.29b) (1.29c) (1.29d) where * denotes spacial convolution. 1.2 Derivation of Electric, Magnetic, and Combined Field Integral Equations Consider a PEC closed object of a simply connected region D− immersed in an isotropic, homogeneous background D+. The object’s boundary is a two-dimensional smooth manifold S = ∂D embedded in R3 with unique normal ˆn(r) pointing from ∂Ω into D+. It is assumed that a plane wave characterized by (cid:8)κ, Ei(r), Hi(r)(cid:9) is incident on the object. Our objective is to determine scattered fields in D+. 6 Three standard equations for solving the scattering problem of an incident wave upon a PEC scatterer are the EFIE, MFIE, and CFIE. These equations are derived through the boundary conditions discussed above. To derive the EFIE, consider the boundary condition for E-fields (1.6a), where the interior field is set to zero ˆn × ˆn × E = 0. (1.30) Note the exterior E-field is a superposition of the incident E-field and the scattered E-field (again, the incident E-field impresses a source which induces another source according to boundary conditions and this induced source radiates into the exterior region). Therefore, ˆn × ˆn × Ei = −ˆn × ˆn × Es. (1.31) Invoking equivalence, the scattered E-field is radiated by the equivalent, induced sources suspended in free space over the mathematical surface of the scatter. Substituting (1.29a) for Es, the EFIE is ˆn × ˆn × Ei = −ˆn × ˆn × (cid:18) 1 jωϵo ∇∇ · J − jωµoJ ∗ G. (1.32) (cid:19) The EFIE is ill-conditioned at low and high frequencies. At low-frequencies, the scalar potential dominates, and the information in the vector potential is lost. At high-frequencies, the vector potential dominates, and the information in the scalar potential is lost. However, the E-field requires the information in both the scalar and vector potential to uniquely specify the E-field (which is, indeed, physically unique) from the sources, and ill-conditioning is synonymous with the physical problem verging on non-uniqueness. One way to mitigate this problem is to replace the divergence of the current density with a charge quantity (which cancels the jω term) in the so-called Current and Charge Integral Equation. To derive the MFIE, consider the boundary condition for H-fields (1.29b), where the interior field is set to zero once again ˆn × H = J. (1.33) 7 Figure 1.1: Schematic of an ellipsoid of minor length α. Note the exterior H-field is a superposition of the incident H-field and the scattered H-field, and the current density on the surface is equivalent to the trace of the exterior H-field on the mathematical surface. Therefore, ˆn × Hi = J − ˆn × Hs. (1.34) Again, the physical situation is the incident field impresses sources and the scatter responds to the impressed sources with induced sources such that boundary conditions are satisfied. Invoking equivalence once again and substituting (1.29b) yields the MFIE ˆn × Hi = J − ˆn × (∇ × J) ∗ G. (1.35) Unlike the EFIE, the MFIE is well-conditioned because the operator is second-kind (a sum- mation of a compact operator and an identity or idempotent operator). The MFIE is invalid for open surfaces like a plate. Consider the ellipsoid in Fig. 1.1. The MFIE equals ˆn × Hi = J − = J − (cid:90) s (cid:90) s ˆn × J′ × ∇G dS′ ˆn × ˆR × J′ (cid:18) 1 + jκR 4πR (cid:19) e−jκR R dS′. (1.36) 8 The ellipsoid approximates a thin object in the limit α → 0. For thin and planar objects, ˆn × ˆR × J′ ≈ 0 due to ˆR and J nearly occupying the same plane (cid:32) lim α→0 J − (cid:90) s ˆn × ˆR × J′ (cid:18) 1 + jκR 4πR (cid:19) e−jκR R (cid:33) dS′ = 0. (1.37) For thin and non-planar objects, first consider the singularity at R = 0. In Fig. 1.1, the source approaches the observer. Near the singular region, ˆR is always in the plane of J′ and the integral is zero. However, the source may approach the observer from directly above the observer. Enclosing the singularity within a sphere of radius δ, sifting the sphere with δ (cid:0)θ − π 2 (cid:1) to extract the surface, and taking the limit δ → 0 equals (cid:18) J − (cid:90) s lim δ→0 ˆn × J′ × ∇G dS′ (cid:19) (cid:32) = lim δ→0 J − (cid:90) s ˆn × ˆR × J′ (cid:18) 1 + jκR 4πR (cid:19) e−jκR R (cid:33) dS′ ˆn × ˆn × J′ dϕ′ 1 4π (cid:90) s = J + = J 2 . (1.38) In the limit α → 0 in Fig. 1.1, the surface below is added to the surface above, but the field traces of the boundary condition are defined at an infinitesimal distance from the boundary. They are not defined on the boundary. Therefore, the limit α → 0 cannot superimpose the limit of the source approaching the observer from above and below at the boundary; the MFIE is invalid for open geometries. What about the EFIE? The EFIE is valid for open geometries. The boundary condition is defined on the bound- ary rather than an infinitismal distance from the boundary. The field traces in two regions separated by a boundary are identical an infinitesimal from the boundary. Therefore, the fields at the boundary are defined and identical as well. Neither the MFIE nor the EFIE are unique formulations. Each of these formulations includes a null-space where the incident field matches a resonant frequency for the scatterer. This is undesirable because the physical fields are unique (the incident field induces sources which radiate a unique field). Maxwell’s equations completely describe the macroscopic 9 phenomenology, and therefore there exists a unique formulation or a one-to-one and onto map between induced sources and scattered fields. The CFIE is one of these formulations. To derive the CFIE, weight the EFIE and MFIE with a constant 0 < α < 1 and 1 − α. Adding these two weighted equations yields the CFIE α ˆn × ˆn × Ei + (1 − α)ηoˆn × Hi = − α ˆn × ˆn × (cid:18) 1 jωϵo ∇∇ · J − jωµoJ ∗ G (cid:19) (1.39) + (1 − α)ηoJ − (1 − α)ηoˆn × (∇ × J) ∗ G where ηo is the intrinsic impedance of free space. The EFIE and MFIE are functions, and therefore the non-uniqueness is caused by multiple distributions of sources mapping to a single field. This is synonymous to the problem of a distribution of non-zero sources mapping to zero. The uniqueness of the CFIE will be demonstrated by showing only zero sources map to zero scattered fields. The following proof is from [Harrington and Mautz, 1978]. Consider a current density which supports zero exterior fields (E1, H1) and an unknown interior field (E2, H2) ˆn × E2 =0 ˆn × H2 =J. (1.40) This problem is equivalent to setting the incident fields in the CFIE to zero 0 = (1 − α)ηo(ˆn × H2) + α(ˆn × ˆn × E2). (1.41) Taking the complex conjugate of the above equation and projecting onto the above equation yields −2Re (cid:18)(cid:90) s ˆn · ((ˆn × E2) × (ˆn × H∗ 2)) dS (cid:19) = (cid:82) s (1 − α)2 η2 o |ˆn × H2|2 + α2 |ˆn × ˆn × E2|2 dS (1 − α) αηo . (1.42) The LHS of the above equation is the real power flowing inside the interior region and equal to zero. However, the RHS is always zero or positive. Therefore, the traces of the interior fields must be zero as well. Using boundary conditions, the only sources in the CFIE which support zero exterior fields are zero sources in this case, and the CFIE is unique. 10 The CFIE formulation inherits the limitations of the MFIE formulation. The CFIE is only applicable to closed surfaces. 1.3 Derivation of Frequency Domain Decoupled Potential and Field Integral Equations Consider the following Green’s Theorem for vectors in terms of source coordinates (cid:90) v ∇′ · (P × ∇′ × Q) dV ′ = (cid:90) s (P × ∇′ × Q) · ˆn′ dS′. (1.43) Using the vector identity within the volume integral (cid:90) v (∇′ × P · ∇′ × Q − P · ∇′ × ∇′ × Q) dV ′ = (cid:90) s (P × ∇′ × Q) · ˆn′ dS′. (1.44) Interchanging P and Q and subtracting yields Green’s second identity for vectors (cid:90) v (Q · ∇′ × ∇′ × P − P · ∇′ × ∇′ × Q) dV ′ = (cid:90) s (P × ∇′ × Q − Q × ∇′ × P) · ˆn′ dS′. (1.45) Define P = A and Q = G a where a is an arbitrary unit vector called a pilot vector and G is the unscaled free-space Green’s function e−jκR R . Note the following identities ∇′ × Q = ∇′G × a ∇′ × ∇′ × Q = aκ2G + ∇′(a · ∇′G) ∇′ × ∇′ × P = ∇′∇′ · A′ + κ2A′ + µoJ′. (1.46a) (1.46b) (1.46c) After substituting the above equations into the volume integral in (1.45), the volume integral is now (cid:90) v (a · G(∇′∇′ · A′ + κ2A′ + µoJ′) − A′ · (aκ2G + ∇′(a · ∇′G)) dV ′. (1.47) Note the following identities A′ · (∇′(a · ∇′G)) = ∇′ · (a · ∇′G A′) − a · ∇′G(∇′ · A′) a · G∇′∇′ · A′ = a · ∇′(G ∇′ · A′) − a · ∇′G(∇′ · A′). (1.48a) (1.48b) 11 Substituting the above identities into the volume integral yields a · (cid:90) v ∇′(G ∇′ · A′) + µo G J′ dV ′ − a · (cid:90) s (A′ · ˆn′)(∇′G) dS′. (1.49) Using the identity (cid:90) v ∇′ϕ′ dV ′ = (cid:90) s ˆn′ ϕ′ dS′ (1.50) the volume integral of Green’s second identity for the magnetic vector potential equals, (cid:90) v µo G J′ dV ′ + (cid:90) s ˆn′ (G ∇′ · A′) dS′ − (cid:90) s ∇′G (ˆn′ · A′) dS′ (1.51) where the pilot vector has been removed (its only purpose was to simplify the vector opera- tions). To simplify the RHS of (1.45), consider the following identities A′ × (∇′G × a) · ˆn′ = a · ∇′G × (A′ × ˆn′) (a G × ∇′ × A′) · ˆn′ = a · (∇′ × A′) × ˆn′ G. (1.52a) (1.52b) Substituting the above identities into the surface integrals of Green’s second identity for vectors and removing the pilot vector yields (cid:90) s ∇′G × (A′ × ˆn′) dS′ − (cid:90) s G ((∇′ × A′) × ˆn′) dS′. (1.53) Equating the reduced LHS and RHS of (1.45) yields (cid:90) v µo G J′ dV ′ = − (cid:90) G (ˆn′ ∇′ · A′) dS′ + (cid:90) s ∇′G (ˆn′ · A′) dS′ + (cid:90) s ∇′G × (A′ × ˆn′) dS′ s − (cid:90) s G ((∇′ × A′) × ˆn′) dS′. (1.54) To further reduce the integral above, consider the infinite volume with incisions around an observer and sources as depicted in Fig. 1.2. The surface integrals include surfaces S∞ extending to infinity, S1 enclosing all sources, and Sδ enclosing the observer marked by r. The volume integral includes the volumes V1, Vδ, and V − V1 − Vδ. The singularity at the 12 Figure 1.2: Schematic of infinite volume with incisions around sources and an observer. observer needs to be excluded. A sphere of radius δ is circumscribed about the singularity, and its normal is directed out of the volume region and toward the singularity. Each of the volumes and surfaces are separately evaluated as δ vanishes (cid:90) v µo G J′ d(V − V1 − Vδ)′ + µo G J′ dV (cid:90) v (cid:90) µo G J′ dV v (cid:90) ′ δ = (cid:90) ′ 1 + lim δ→0 ′ ∞ + s ∇′G (ˆn′ · A′) dS ∇′G × (A′ × ˆn′) dS ′ ∞ (cid:90) − s − (cid:90) − G (ˆn′ ∇′ · A′) dS ′ ∞ + (cid:90) s G ((∇′ × A′) × ˆn′) dS ′ ∞ s G (ˆn′ ∇′ · A′) dS (cid:90) ′ 1 + s − (cid:90) s ′ G ((∇′ × A′) × ˆn′) dS 1 s (cid:90) − lim δ→0 s (cid:90) s − lim δ→0 G (ˆn′ ∇′ · A′) dS (cid:90) ′ δ + lim δ→0 ′ δ. s G ((∇′ × A′) × ˆn′) dS ∇′G (ˆn′ · A′) dS ′ 1 + (cid:90) s ′ ∇′G × (A′ × ˆn′) dS 1 ∇′G (ˆn′ · A′) dS ′ δ + lim δ→0 (cid:90) s ′ ∇′G × (A′ × ˆn′) dS δ (1.55) 13 The limiting integrals evaluate as follows (cid:90) v lim δ→0 µo G J′ dV ′ δ = lim δ→0 (cid:90) v = 0 µo G J′δ2 sin θ′ dΩ′ (cid:90) s lim δ→0 G (ˆn′ ∇′ · A′) dS ′ δ = lim δ→0 (cid:90) s G (ˆn′ ∇′ · A′)δ2 sin θ′ dΩ′ = 0 (1.56) (1.57) (cid:90) s lim δ→0 ∇′G × (A′ × ˆn′) dS ′ δ = lim δ→0 (cid:90) s (cid:0)∇′G · ˆn′ A′(cid:1) dS (cid:90) ′ δ − lim δ→0 ′ A′ × (cid:0)∇′G × ˆn′(cid:1) dS δ (cid:90) ′ δ − lim δ→0 ′ ∇′G × (A′ × ˆn′) dS δ s s (cid:0)∇′G · ˆn′ A′(cid:1) dS + lim δ→0 (cid:90) = lim δ→0 s − lim δ→0 (cid:90) s (cid:90) s ′ ∇′G (cid:0)ˆn′ · A′(cid:1) dS δ ′ ∇′G (cid:0)ˆn′ · A′(cid:1) dS δ (cid:90) s lim δ→0 (cid:0)∇′G · ˆn′ A′(cid:1) dS ′ δ = lim δ→0 (cid:90) s = − lim δ→0 = − As (cid:0)∇′G · ˆn′ A′(cid:1) δ2 sin θ′ dΩ′ (cid:90) (cid:18) 1 δ (cid:90) π s (cid:90) 2π + jκ (cid:19) e−jκδ δ A′δ2 sin θ′dΩ sin θ′dθ′dϕ′ 0 = − 4πAs. 0 Substituting the above integrals into (1.55) yields (1.58) (1.59) 4πAs = − (cid:90) v − µo G J′ d(V − V1 − Vδ)′ − (cid:90) (cid:90) G (ˆn′ ∇′ · A′) dS ′ ∞ + (cid:90) s G ((∇′ × A′) × ˆn′) dS s G (ˆn′ ∇′ · A′) dS (cid:90) ′ 1 + s G ((∇′ × A′) × ˆn′) dS s − (cid:90) − s − (cid:90) s ′ ∞ ′ 1. (cid:90) µo G J′ dV ′ 1 v ∇′G (ˆn′ · A′) dS ′ ∞ + (cid:90) s ∇′G × (A′ × ˆn′) dS ′ ∞ ∇′G (ˆn′ · A′) dS ′ 1 + (cid:90) s ′ ∇′G × (A′ × ˆn′) dS 1 (1.60) 14 The equivalence principle is used to replace the source terms in V1 with null fields, and V − V1 − Vδ is without sources. The volume integrals of these regions are zero. Evaluating S∞ is equivalent to evaluating the following limit (cid:90) s (. . . ) dS∞ = lim r′→∞ (cid:90) s (. . . ) r′2 sin θ′ dθ′dϕ′. (1.61) On S∞ in Fig. 1.2, ˆR = ˆn and ˆR = −ˆr′ and R = r′. Evaluating the surface integrals on S∞ in (1.64) yields (cid:90) s (. . . ) dS∞ = lim r′→∞ (cid:18) 1 r′ + jκ (cid:19) e−jκr′ r′ ˆr′ (cid:0)ˆr′ · A′(cid:1) (cid:34) (cid:90) e−jκr′ r′ ˆr′∇′ · A′ + (cid:19) e−jκr′ + s (cid:18) 1 r′ + jκ (cid:34) (cid:90) r′ ˆr′ × (cid:0)A′ × ˆr(cid:1) + (cid:104) (cid:0)ˆr′ · A′(cid:1) ˆr′ + ˆr′ × (cid:0)A′ × ˆr′(cid:1) (cid:105) (cid:18) 1 + jκr′ (cid:19) e−jκr′ r′ r′ = lim r′→∞ s (cid:0)(cid:0)∇′ × A′(cid:1) × ˆr′(cid:1) (cid:35) r′2 sin θ′dθ′dϕ′ (cid:35) ˆr′∇′ · A′ + (cid:0)∇′ × A′(cid:1) × ˆr′(cid:105) (cid:104) + r′e−jkr′ sin θ′dθ′dϕ′ A′ + (cid:104) jκA′ + ˆr′∇′ · A′ + (cid:0)∇′ × A′(cid:1) × ˆr′(cid:105) ˆr′ (cid:35) e−jkr′ sin θ′dθ′dϕ′ (cid:34) (cid:90) s = lim r′→∞ = 0 (1.62) where the following radiation conditions are true due to the free-space Green’s function |r′A′| < ∞ (cid:0)jκA′ + ˆr′∇′ · A′ + (cid:0)∇′ × A′(cid:1) × ˆr′(cid:1) = 0. lim r′→∞ lim r′→∞ (1.63a) (1.63b) Substituting these evaluations into (1.64) leads to (cid:90) As = − G (ˆn′ ∇′ · A′) dS ∇′G (ˆn′ · A′) dS (cid:90) ′ 1 + ′ 1 + (cid:90) s ′ ∇′G × (A′ × ˆn′) dS 1 (1.64) s − (cid:90) s s ′ G ((∇′ × A′) × ˆn′) dS 1 15 where 4π is absorbed into the Green’s function such that G 4π → G in (1.64). Using ˆn × A = −ˆn × ˆn × ˆn × A As = − (cid:90) s G (cid:0)ˆn′ × ˆn′ × ˆn′ × ∇′ × A′(cid:1) dS′ − − Using ∇′ = −∇, the final result is ∇′G × (cid:0)ˆn′ × A′(cid:1) dS′ + (cid:90) s ∇′G (cid:0)ˆn′ · A′(cid:1) G (cid:0)ˆn′∇′ · A′(cid:1) dS′. (cid:90) s (cid:90) s (1.65) As = −S[ˆn × ˆn × ˆn × ∇ × A] + ∇ × S[ˆn × A] − ∇S[ˆn · A] − S[ˆn ∇ · A] (1.66) where S[x] = (cid:82) s G x′ dS′. The following notation is used with respect to the source terms a = ˆn × ˆn × ∇ × A b = ˆn × A γ = ˆn · A σ = ˆn ∇ · A. (1.67a) (1.67b) (1.67c) (1.67d) Substituting the potential form the E-field and H-field into the boundary conditions without sources yields ˆn × (−A1 − ∇ϕ1) = ˆn × (−A2 − ∇ϕ2) 1 µ1 ˆn × (∇ × A1) = 1 µ2 ˆn × (∇ × A2) ϵ1ˆn · (−A1 − ∇ϕ1) = ϵ2ˆn · (−A1 − ∇ϕ1) ˆn · (∇ × A1) = ˆn · (∇ × A2) . (1.68a) (1.68b) (1.68c) (1.68d) Using the non-unique relation between fields and potentials, these boundary conditions are constrained and decoupled by associating A terms with A terms and ϕ terms with ϕ terms. Remembering the electromagnetic scattering problem is charge neutral, the new boundary 16 conditions are 1 µ1 ˆn × ˆn × ∇ × A1 = 1 µ2 ˆn × ˆn × ∇ × A2 ˆn × A1 = ˆn × A2 ϵ1ˆn · A1 = ϵ2ˆn · A2 ∇ · A1 = ∇ · A2 ϕ1 = ϕ2 ϵ1ˆn · ∇ϕ1 = ϵ2ˆn · ∇ϕ2 (cid:90) (cid:90) s′ dS′ ˆn · A = 0 s′ dS′ ˆn · ∇ϕ = 0. (1.69a) (1.69b) (1.69c) (1.69d) (1.69e) (1.69f) (1.69g) (1.69h) Note, ˆn · (∇ × A) = −∇ · (ˆn × A) and this condition is already enforced. The last boundary condition comes through the Lorenz gauge or the divergence of the E-field ∇ · A1 = ∇ · A2. (1.70) The overall point is scalar and vector potentials may be decoupled and related through the Lorenz gauge. The traces of the scattered field are found by applying the appropriate operator to both sides of (1.66) as = ˆn × ˆn × ∇ × As bs = ˆn × As γs = ˆn · As σs = ∇ · As where (cid:82) γsdS = 0. For the exterior problem, the total magnetic vector potential equals A1 = Ai + As. 17 (1.71a) (1.71b) (1.71c) (1.71d) (1.72) The scattered traces may be compactly represented by                   as bs γs σs = Zvpie                   a b γ σ (1.73) where Zvpie =          −ˆn × ˆn × ∇ × S[ˆn × ∗] ˆn × ˆn × ∇ × ∇ × S[∗] 0 −ˆn × ˆn × ∇ × S[ˆn∗] −ˆn × S[ˆn × ∗] ˆn × ∇ × S[∗] −ˆn × ∇S[∗] −ˆn × S[ˆn∗] −ˆn · S[ˆn × ∗] ˆn · ∇ × S[∗] −ˆn · ∇S[∗] −ˆn · S[ˆn∗] −∇ · S[ˆn × ∗] 0 −κ2S[∗] −∇ · S[ˆn∗]          . Substituting the above representation of the scattered field into the total field yields                   a b γ σ 1 = =                                     ai bi γi σi ai bi γi σi                   as bs γs σs + + Zvpie                   a b γ σ . 1 (1.74) (1.75) Re-arranging the above equation and dropping the subscript yields the VPIE for the exterior region                   ai bi γi σi                   a b γ σ = (Ivpie − Zvpie) 18 (1.76) where Ivpie is a 4x4 diagonal matrix of idempotent operators (or the identity of an operator). Deriving the scalar potential integral equation (SPIE) follows a process analogous to the VPIE. Beginning with Green’s second identity for scalars and (1.16) using Fig. 1.2, where G = e−jκR R , the volume integrals once again evaluate to zero due to equivalence and the absence of sources. The S∞ surface integral equals (cid:90) s (. . . ) dS∞ = lim r′→∞ (cid:90) (cid:0)G ˆn′ · ∇′ϕ′ − ϕ′ ˆn′ · ∇′G(cid:1) dS′ ∞ e−jκr′ r′ ˆr′ · ∇′ϕ′ + ϕ′ˆr′ · ˆr′ (cid:18) 1 + jκr′ r′ (cid:19) e−jκr′ r′ (cid:35) r′2 sin θ′ dθ′dϕ′ (cid:0)∇′ · (cid:0)ϕ′ˆr′(cid:1) + jκϕ′(cid:1) r′ + ϕ′ (cid:35) e−jκr′ sin θ′ dθ′dϕ′ s (cid:90) (cid:34) s (cid:34) (cid:90) s = − lim r′→∞ = − lim r′→∞ = 0 where the following radiation conditions are true due to the free-space Green’s function (1.77) (cid:0)∇′ · (cid:0)ϕ′ˆr′(cid:1) + jκϕ′(cid:1) = 0 lim r′→∞ r′ϕ′ < ∞. lim r′→∞ (1.78a) (1.78b) The Sδ surface integral equals (cid:90) s lim δ→0 (. . . ) dSδ = − lim δ→0 = − lim δ→0 = 4πϕs. (cid:90) s (cid:90) s ϕ′ˆn′ · ∇′G dSδ ϕ′ (1 + jκδ) e−jκδ sin θ′ dθ′dϕ′ (1.79) Adding the surface integrals together and rewriting 1 4π G → G generates the following equa- tion for the scattered scalar potential ϕs = −S [ˆn · ∇ϕ] − ∇ · S [ˆnϕ] . (1.80) Deriving the SPIE follows the same process as the VPIE. If ϕ1 = ϕi + ϕs, the SPIE 19 formulation in the exterior region is       αi βi   = (Ispie − Zspie)   α β   (1.81) where Ispie is a 2x2 diagonal matrix of idempotent operators. The variables are defined as, α = ϕ β = ˆn · ∇ϕ   −∇ · S[ˆn∗] −S[∗] −ˆn · ∇∇ · S[ˆn∗] −ˆn · ∇S[∗]   Zspie =   where (cid:82) s β dS = 0. (1.82a) (1.82b) (1.82c) The decoupled potential integral equation (DPIE) is a block diagonal matrix of the VPIE (1.75) and SPIE (1.81)                 ai bi γi σi αi βi                 =    Zvpie 0 0 Zspie                                    a b γ σ α β . (1.83) Finally, the decouple field integral equation (DFIE) is derived in a manner analogous to the VPIE as well. Replacing A with E in (1.66), the scattered electric field equals Es = −S[ˆn × ˆn × ˆn × ∇ × E] + ∇ × S[ˆn × E] − ∇S[ˆn · E] − S[ˆn ∇ · E]. (1.84) Using the following physical relations for free-space ∇ × E = − jωµoH ∇ · E = 0 (1.85a) (1.85b) (1.84) is reducible to the Stratton-Chu formulation, Es = − (cid:90) s jωµ0 G (cid:0)ˆn′ × H′(cid:1) + (cid:0)ˆn′ × E′(cid:1) × ∇′G + (cid:0)ˆn′ · E′(cid:1) ∇′G dS′. (1.86) 20 If Et = Ei + Es, then the DFIE for the exterior region is            ai E bi E γi E σi E         = (Idf ie − Zdf ie)         where Zdf ie = Zvpie and Idf ie = Ivpie, and aE = ˆn × ˆn × ∇ × E bE = ˆn × E γE = ˆn · E σE = ∇ · E aE bE γE σE          (1.87) (1.88a) (1.88b) (1.88c) (1.88d) where (cid:82) s γE dS′ = 0. The VPIE and DFIE have identical forward matrices for the exterior problem in free-space. 1.4 Derivation of the Time Domain SPIE The derivation of the time domain decouple integral equation is similar to the frequency domain decoupled integral equation. The derivation is recapitulated in this section because the new time domain Calder´on identities will be derived as well in Chapter 2. This section serves as a preliminary. The time-domain mixed potential representation of the E-field is where E = − ∂A ∂t − ∇ψ dS′ G ∗ µ0J (cid:0)r′, t′(cid:1) ρ (cid:0)r′, t′(cid:1) ϵ0 (cid:90) s′ (cid:90) A = ψ = δ G = dS′ G ∗ s′ (cid:16) t − R c (cid:17) 4πR 21 (1.89) (1.90) and * denotes temporal convolution. Deriving the TD-SPIE requires Green’s Second Identity for scalars and the wave equation ∇′2ψ − ∂2 1 ∂t2 ψ′ = − c2 ∂2 1 ∂t2 G = 0 c2 (cid:90) (cid:17) dV ′ = G∇′2ψ′ − ψ′∇′2G ∇′2G − ρ′ ϵo (cid:0)G ˆn′ · ∇′ψ′ − ψ′ ˆn′ · ∇′G(cid:1) dS′. (1.91a) (1.91b) (1.91c) s , the LHS of Green’s Second Identity may be rewritten in the following (cid:90) (cid:16) v (cid:17) (cid:16) δ t−t′− R c R Using G = way (cid:16) (cid:17) G∇′2ψ′ − ψ′∇′2G dV ′ = (cid:90) v = (cid:90) (cid:90) v t (cid:16) (cid:17) G∇′2ψ′ − ψ′∇′2G dt′ dV ′ = (cid:90) v (cid:90) v (cid:90) v G (cid:18) 1 c2 ∂ ∂t′ (cid:18) 1 c2 (cid:18) G 1 c2 = − (cid:90) v G ∗ dV ′ (cid:18) 1 c2 (cid:19) − G (cid:19) ∂2 ∂t′2 G (cid:19) ρ′ ϵo (cid:90) (cid:90) dV ′ − t v dV ′ (cid:19) G ρ′ ϵo ∂2 ∂t′2 ψ′ − − ψ′ (cid:18) ∂ψ′ ∂t′ − ψ′ ∂G (cid:19)(cid:12) (cid:12) (cid:12) (cid:12) (cid:12) ∂ψ′ ∂t′ − ψ′ ∂G ρ′ ϵo dV ′ ∂t ∞ ∂t′ −∞ G ρ′ ϵo dt′ dV ′ (1.92) where G and ∂G ∂t evaluates to zero in the limit of infinity due to the properties of the delta- function. Like Section 1.3, all of space is enclosed in a volume integral with incisions around the sources and an observer as depicted in Fig. 1.2. All volume integrals may be set to zero due to the equivalence principle and the absence of sources elsewhere. The surface integral over 22 S∞ is evaluated in the following way (cid:90) s lim r′→∞ (. . . ) dS∞ = lim r′→∞ (cid:90) s (cid:0)G ˆn′ · ∇′ϕ′ − ψ′ˆn′ · ∇′G(cid:1) dS∞ (cid:17) (cid:16) ˆr′ · ∇′ψ+ (cid:34) δ (cid:90) = − lim r′→∞ s t − t′ − r′ c r′ (cid:19) δ 1 ∂ ∂t′ c t − t′ − r′ c r′ (cid:16) t − t′ − r′ c r′ (cid:17) (cid:35) r′2 sin θ′ dθ′dϕ′ (cid:17) ˆr′ · ∇′ψ′ + (cid:18) t − t′ − ψ′ r′2 δ (cid:19) r′ c ψ′ (cid:18) 1 r′ + (cid:16) (cid:34) δ (cid:90) + s ψ′ cr′ ∂ ∂t′ δ (cid:34) (cid:18) = − lim r′→∞ = − lim r′→∞ = − lim r′→∞ (cid:90) s (cid:90) s (cid:18) t − t′ − r′ c (cid:19) (cid:35) r′2 sin θ′ dθ′dϕ′ ˆr′ · ∇′ψ′ + (cid:19) ψ′ c ∂ ∂t′ (cid:35) r′ + ψ′ δ (cid:18) t − t′ − (cid:19) r′ c sin θ′ dθ′dϕ′ (cid:34) (cid:18) ∇′ · (cid:0)ψ′ˆr′(cid:1) − 1 c (cid:19) r′ sin θ′ dθ′dϕ′ ∂ ∂t′ ψ′ (cid:19) r′ c (cid:18) t − t′ − (cid:35) δ + ψ′ = 0 (cid:90) (cid:90) s t lim r′→∞ (. . . ) dt′dS∞ = 0 where ∇′G = ˆR (cid:16) 1 R + 1 c ∂ ∂t′ (cid:17) G, integration by parts, and the radiation conditions are (1.93) lim r′→∞ lim r′→∞ r′ψ′ < ∞ (cid:18) ∇′ · (cid:0)ψ′ˆr′(cid:1) − (cid:19) 1 c ∂ ∂t′ ψ′ = 0. (1.94a) (1.94b) The Sδ and Vδ integrals remain to be evaluated. As δ → 0, the Vδ integral evaluates to zero 23 once again and the Sδ integral equals − lim δ→0 (cid:90) s ψ′ˆn′ · ∇′G dSδ = − lim δ→0 (cid:90) (cid:90) ψ′ˆn′ · ∇′G r′2 sin θ′ dθ′dϕ′ s ψ′ (cid:19) (cid:18) (cid:18) 1 δ + 1 c ∂ ∂t′ ψ′δ (cid:0)t − t′(cid:1) sin θ′ dθ′dϕ′ δ t − t′ − (cid:19) δ c δ sin θ′ dθ′dϕ′ (1.95) = lim δ→0 = lim δ→0 s (cid:90) s = 4πψsδ (cid:0)t − t′(cid:1) − lim δ→0 (cid:90) (cid:90) s t ψ′ˆn′ · ∇′G dt′dSδ = 4πψs. Adding the Sδ, S∞, S1, and absorbing the 4π such that 1 4π G → G, the time domain scattered scalar potential is ψs = −Stime [ˆn · ∇ψ] − ∇ · Stime [ˆnψ] (1.96) where Stime [x(r)] = (cid:82) s G ∗ x(r′) dS′ and ∗ denotes temporal convolution. After decoupling the time-domain boundary conditions (1.6) with E = − ∂A ∂t − ∇ψ and using (1.96), the time-domain SPIE is    αi time βi time   = (Itime  spie − Ztime spie )    αtime βtime    where αtime = ψ βtime = ˆn · ∇ψ    Ztime spie = −∇ · Stime[ˆn∗] −Stime[∗] −ˆn · ∇∇ · Stime[ˆn∗] −ˆn · ∇Stime[∗]    . (1.97) (1.98a) (1.98b) (1.98c) The process for deriving the time domain scattered magnetic vector potential is analogous to the time domain scattered scalar potential, and the structure of the time domain magnetic potential is analogous to (1.66) As = −Stime[ˆn × ˆn × ˆn ×∇×A]+∇×Stime[ˆn ×A]−∇Stime[ˆn ·A]−Stime[ˆn ∇·A]. (1.99) 24 The time domain VPIE is                   ai time bi time γi time σi time = (Itime vpie − Ztime vpie )          atime btime γtime σtime          atime = ˆn × ˆn × ∇ × A btime = ˆn × A γtime = ˆn · A σtime = ∇ · A (1.100) (1.101a) (1.101b) (1.101c) (1.101d) −ˆn × ˆn × ∇ × Stime[ˆn × ∗] ˆn × ˆn × ∇ × ∇ × Stime[∗] 0 −ˆn × ˆn × ∇ × Stime[ˆn∗] −ˆn × Stime[ˆn × ∗] ˆn × ∇ × Stime[∗] −ˆn × ∇Stime[∗] −ˆn × Stime[ˆn∗] −ˆn · Stime[ˆn × ∗] ˆn · ∇ × Stime[∗] −∇ · Stime[ˆn × ∗] 0 −ˆn · ∇Stime[∗] −κ2Stime[∗] −ˆn · Stime[ˆn∗] −∇ · Stime[ˆn∗]          . (1.102) where, and, Ztime vpie =          1.5 Method-of-Moments This thesis uses the Method-of-Moments (MOM) technique to analyze and discretize the electromagnetic integral equations. The idea of MOM is to project the RHS and the LHS of the integral equations from one basis set onto another. Consider a function such that f (x) = N (cid:88) n=0 αn g(x). (1.103) And the linear, integral operator L such that L ◦ f (x) = h (x) . (1.104) 25 and the functions {tm (x)} spanning the range of L. Defining the action of testing for scalars and vectors as ⟨t (x) , g (x)⟩ = ⟨t (x) , g (x)⟩ = (cid:90) (cid:90) dx t (x) g (x) dx t (x) · g (x) (1.105) (1.104) may be tested such that       ⟨t0 (x) , Lg0 (x)⟩ ... ⟨tN (x) , Lg0 (x)⟩ ... . . . ... ⟨t0 (x) , LgN (x)⟩ ... ⟨tN (x) , LgN (x)⟩                         = ⟨t0 (x) , h (x)⟩ ... ⟨tN (x) , h (x)⟩       . (1.106) a0 ... aN (1.106) is solved iteratively or directly with standard routines like QMR, LGMRES, LU- factorization, etc. 26 CHAPTER 2 A PROCESS FOR DERIVING CALDER ´ON IDENTITIES The standard Calder´on identities are derived through the Stratton-Chu representation of the E-field and H-field in the frequency domain. The process used to derive these identities is generic and may be applied to any integral formulation derived through Green’s Second Identity and boundary conditions in both time and frequency domains. For completeness, the process is demonstrated in the original application for electromagnetics and then applied to the SPIE and VPIE in both frequency and time domains. The results are new Calder´on identities which will be deployed later to precondition the DPIE and DFIE formulations. 2.1 T and K Calder´on Identities Consider once again the free-space Stratton-Chu representation for E-fields and H-fields (cid:90) Es = − s Hs = (cid:90) s jωµ0 G (cid:0)ˆn′ × H′(cid:1) + (cid:0)ˆn′ × E′(cid:1) × ∇′G + (cid:0)ˆn′ · E′(cid:1) ∇′G dS′ jωϵ0 (cid:0)ˆn′ × E′(cid:1) G − (cid:0)ˆn′ × H′(cid:1) × ∇′G − (cid:0)ˆn′ · H′(cid:1) ∇′G dS′. Taking the curl of (2.1) results in the following equations ˆn × Es = ˆn × ∇ × S[ˆn × E] + ˆn × Hs = ˆn × ∇ × S[ˆn × H] − 1 jωϵ0 1 jωµ0 ˆn × ∇ × ∇ × S[ˆn × H] ˆn × ∇ × ∇ × S[ˆn × E]. The above equations are expressible in the compact form    ˆn × Es ηˆn × Hs     =   K T −T K       ˆn × E ηˆn × H    where K = ˆn × ∇ × S[∗] T = − jκ ˆn × ∇ × ∇ × S[∗]. 27 (2.1) (2.2) (2.3) (2.4) If we assume the total field is the following    ˆn × Et ηˆn × Ht     =   ˆn × Et ηˆn × Ht     =   ˆn × Ei ηˆn × Hi     +   ˆn × Es ηˆn × Hs. Then one possible formulation for the exterior scattering problem is    ˆn × Ei −ηˆn × Hi     =   I − K T −T I − K  = (I − ZM¨uller)         ˆn × E −ηˆn × H       ˆn × E −ηˆn × H.    (2.5) (2.6) The forward matrix in (2.6) is a Calder´on projector [Hsiao and Kleinman, 1997]. To further explain the projection property of the forward matrix in (2.6), consider the total electromagnetic field in the exterior region of the scatter once again (cid:16) Et, Ht(cid:17) (cid:16) Ei, Hi(cid:17) → = (Es, Hs) + (cid:16) Et, Ht(cid:17) = (cid:16) Ei, Hi(cid:17) (cid:16) − ZM¨uller Et, Ht(cid:17) (2.7) where (Es, Hs) = ZM¨uller (Es, Hs) in the absence of an incident field. Therefore, (cid:0)Et, Ht(cid:1). Note, ZM¨uller is also a projector and (cid:0)Et, Ht(cid:1) = (0, 0)i = (cid:16) Et, Ht(cid:17) − ZM¨uller (cid:16) Et, Ht(cid:17) = (Es, Hs) − (Es, Hs) (2.8) = (0, 0) . The ZM¨uller projector maps arbitrary scattered traces to arbitrary scattered traces such that the RHS is a null-field in the absence of an incident field. The complement projection I − ZM¨uller maps total traces to incident traces which equal the RHS in (2.6). Furthermore, both projectors satisfy P2 = P. 28 Using these projection properties, the following is true for arbitary sources ZM¨uller    ˆn × Ei ηˆn × Hi      = ZM¨uller    I − K T −T I − K           ˆn × E ηˆn × H    0 0   = ZM¨uller (I − ZM¨uller)   ˆn × E ηˆn × H        0 0   = (I − ZM¨uller) ZM¨uller   (2.9)    ˆn × E ηˆn × H     0 0 0 0    =    → I − K T −T I − K       K −T T K   . And the Calder´on identities well-known to the computational electromagnetic community follow (I − K) + T T = 0 T K − (I − K) T = 0. (2.10a) (2.10b) The above equations may be in an unfamiliar form because the principal value remains within the integral operators. Substituting K = I 2 + Kp.v. into the above immediately recovers the relations in their more common notation [Hsiao and Kleinman, 1997]. The more common form is unused in this thesis because the principle value clutters the equations. 2.2 SPIE Calder´on Identities for Frequency and Time Domain Calder´on projectors are generated by the construction of the scattering problem using Green’s Second Identity and boundary conditions. The process of deriving the Calder´on identities above generalizes to the SPIE, VPIE, and DFIE (or any other IE) where boundary conditions and Green’s Second Identity are used to formulate the scattering problem. The 29 operators to be used through these derivations are defined as D = −∇ · S[ˆn ∗] N = −ˆn · ∇∇ · S[ˆn ∗] D′ = ˆn · ∇S[∗] K′ = −∇ × S[ˆn × ∗] J 2 = ˆn × S[ˆn × ∗] J 3 = ˆn · S[ˆn × ∗] J 4 = ∇ · S[ˆn × ∗] L = 1 κ2 ∇ × ∇ × S[∗] K = ˆn × ∇ × S[∗] M3 = ˆn · ∇ × S[∗] P2 = ˆn × ∇S[∗] Q1 = ˆn × ˆn × ∇ × S[ˆn ∗] Q2 = ˆn × S[ˆn ∗] Q3 = ˆn · S[ˆn ∗]. (2.11a) (2.11b) (2.11c) (2.11d) (2.11e) (2.11f) (2.11g) (2.11h) (2.11i) (2.11j) (2.11k) (2.11l) (2.11m) (2.11n) Consider the frequency domain SPIE (1.81) and recall Zspie projects out incident scalar potential traces and Ispie −Zspie projects out scattered scalar potential traces from the total traces. Then, Zspie      = Zspie(Ispie − Zspie)   α β       α β   (2.12)    αi βi       →  0   = (Ispie − Zspie)Zspie 0  0 0 0 0   = (Ispie − Zspie)Zspie. 30 This results in the well-known frequency domain Calder´on SPIE identities [N´ed´elec, 2001] (I − D)D + SN = 0 −(I − D)S − SD′ = 0 −N D + (I + D′)N = 0 N S − (I + D′)D′ = 0. (2.13a) (2.13b) (2.13c) (2.13d) Deriving the time-domain Calder´on SPIE identities requires the time-domain SPIE in (1.97). The forward matrices in (1.97) are projectors as well. They generate the following time-domain Calder´on SPIE identities (Itime − Dtime)Dtime + StimeNtime = 0 −(Itime − Dtime)Stime − StimeD ′ time = 0 −NtimeDtime + (Itime + D NtimeStime − (Itime + D ′ time)Ntime = 0 ′ ′ time = 0 time)D (2.14a) (2.14b) (2.14c) (2.14d) where OO′[∗] = O (cid:0)O′[∗](cid:1) and O and O′ are the time domain variants of the frequency domain operators. To accommodate acoustic problems where the boundary condition includes a tempo- ral derivative on the scalar potential, the following additional time-domain Calder´on SPIE identities are true as well (cid:16) ˙Itime − ˙Dtime (cid:16) ˙Itime − ˙Dtime − − ˙Ntime ˙Dtime + ˙Ntime ˙Stime − (cid:17) ˙Dtime + ˙Stime (cid:17) ˙Stime − ˙Stime (cid:16) ˙Itime + ˙D (cid:16) ˙Itime + ˙D ′ time ′ time ˙Ntime = 0 ′ ˙D time = 0 (cid:17) ˙Ntime = 0 (cid:17) ˙D ′ time = 0 (2.15a) (2.15b) (2.15c) (2.15d) where ˙O denotes an operator O with a temporal derivative. 31 2.3 Novel Calder´on Identities for Frequency and Time Domain The Calderon´on identities for the VPIE are likewise derived with projection operators Zvpie ai bi γi          σi          0 0 0 0          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 →                            = Zvpie(Ivpie − Zvpie) = (Ivpie − Zvpie)Zvpie                   (2.16)                   a b γ σ a b γ σ = (Ivpie − Zvpie)Zvpie. 32 And the following, novel Calder´on identities are the result J (2)K J (3)K ′t)K (I − K ′t + κ2LtJ (2) − Q(1)J (4) = 0 ′t − (I − K)J (2) − P(2)J (3) − Q(2)J (4) = 0 ′t + M(3)J (2) − (I + D′)J (3) − Q(3)J (4) = 0 ′t + κ2SJ (3) − (I − D)J (4) = 0 ′t)Lt − LtK = 0 J (4)K (I − K J (2)κ2Lt + (I − K)K + P(2)M(3) = 0 J (3)κ2Lt − M(3)K + (I + D′)M(3) = 0 J (4)Lt − SM(3) = 0 LtP(2) + Q(1)S = 0 −(I − K)P(2) − P(2)D′ + Q(2)κ2S = 0 M(3)P(2) − (I + D′)D′ + Q(3)κ2S = 0 SD′ + (I − D)S = 0 ′t)Q(1) + κ2LtQ(2) + Q(1)D = 0 −(I − K −J (2)Q(1) − (I − K)Q(2) − P(2)Q(3) + Q(2)D = 0 −J (3)Q(1) + M(3)Q(2) − (I + D′)Q(3) + Q(3)D = 0 −J (4)Q(1) + κ2SQ(3) + (I − D)D = 0. (2.17a) (2.17b) (2.17c) (2.17d) (2.17e) (2.17f) (2.17g) (2.17h) (2.17i) (2.17j) (2.17k) (2.17l) (2.17m) (2.17n) (2.17o) (2.17p) Likewise, deriving the time-domain Calder´on VPIE identities requires the time-domain VPIE in (1.99). The forward matrices in (1.99) are projectors, and they generate the fol- lowing time-domain Calder´on VPIE identities in the same way as the time-domain Calder´on 33 SPIE identities (2.14) J ′t time)K (Itime − K ′t (2) time − (Itime − Ktime)J timeK ′t (3) time + M timeK (4) timeK (3) timeJ ′t time + κ2StimeJ ′t time + κ2Lt (2) time − P (2) time − (Itime + D′ (3) time − (Itime − Dtime)J (2) time − Q (3) time − Q (3) time − Q (1) timeJ (2) timeJ (3) timeJ timeJ (2) timeJ time)J J (4) time = 0 (4) time = 0 (4) time = 0 (4) time = 0 J (Itime − K ′t time)Lt (2) timeκ2Lt J (3) timeκ2Lt J time − M time + (Itime − Ktime)Ktime + P (3) timeKtime + (Itime + D′ (4) timeLt J Lt time)M time − Lt timeKtime = 0 (3) (2) time = 0 timeM (3) time = 0 (3) time = 0 time − StimeM (2) (1) time + Q timeStime = 0 (2) timeκ2Stime = 0 (3) timeκ2Stime = 0 time + Q time + Q timeP (2) timeD′ time)D′ time + (Itime − Dtime)Stime = 0 −(Itime − Ktime)P M (3) timeP (2) time − P (2) time − (Itime + D′ StimeD′ (1) time + κ2Lt ′t time)Q −(Itime − K −J −J (1) (2) time − (Itime − Ktime)Q timeQ (1) (3) (3) timeQ time + M timeQ (1) (4) time + κ2StimeQ timeQ (2) time − P (2) time − (Itime + D′ (3) time + (Itime − Dtime)Dtime = 0. time)Q −J timeQ (2) timeQ (2) time + Q (3) time + Q (3) time + Q (1) timeDtime = 0 (2) timeDtime = 0 (3) timeDtime = 0 (2.18a) (2.18b) (2.18c) (2.18d) (2.18e) (2.18f) (2.18g) (2.18h) (2.18i) (2.18j) (2.18k) (2.18l) (2.18m) (2.18n) (2.18o) (2.18p) 34 And finally, the time-derivative version is ˙J (2) time ˙K ˙J (3) time ˙K time ′t time) ˙K ( ˙Itime − ˙K ′t time + κ2 ˙Lt ′t (2) time − ( ˙Itime − ˙Ktime) ˙J time − ˙P ′t (2) (3) time − ( ˙Itime + ˙D′ time + ˙M ˙J time (4) (3) time − ( ˙Itime − ˙Dtime) ˙J ˙J time (3) time ′t time + κ2 ˙Stime (2) ˙J time − ˙Q (3) time − ˙Q ˙J (3) time − ˙Q (2) time time) ˙J (1) time (2) time ˙K ˙J ˙J ˙J (4) time = 0 (4) time = 0 (4) time = 0 (4) time = 0 ˙Ktime = 0 ˙J (2) timeκ2 ˙Lt ˙J (3) timeκ2 ˙Lt ˙J ′t time) ˙Lt time ( ˙Itime − ˙K time − ˙Lt (2) time + ( ˙Itime − ˙Ktime) ˙Ktime + ˙P time ˙Ktime + ( ˙Itime + ˙D′ time) ˙M time − ˙M ˙M (3) time ˙J time ˙M (4) time ˙Lt (3) time = 0 (3) time = 0 (3) ˙Lt time − ˙Stime time = 0 (1) (2) time + ˙Q ˙P ˙Stime = 0 time (2) (2) timeκ2 ˙Stime = 0 time (3) time) ˙D′ timeκ2 ˙Stime = 0 time + ( ˙Itime − ˙Dtime) ˙Stime = 0 ˙Dtime = 0 time + ˙Q time + ˙Q ˙D′ (2) time − ˙P −( ˙Itime − ˙Ktime) ˙P (2) time − ( ˙Itime + ˙D′ ˙P ˙D′ (3) time ˙M −( ˙Itime − ˙K ′t time) ˙Q ˙Stime (1) time + κ2 ˙Lt (2) time − ˙P (2) time − ( ˙Itime + ˙D′ (3) time + ( ˙Itime − ˙Dtime) ˙Dtime = 0. (2) time + ˙Q (3) time + ˙Q (3) time + ˙Q (2) time time) ˙Q ˙Dtime = 0 ˙Dtime = 0 (1) time − ( ˙Itime − ˙Ktime) ˙Q (1) ˙Q time + ˙M (4) − ˙J time (1) time + κ2 ˙Stime (1) time (3) time (2) time (3) time time ˙Q ˙Q ˙Q ˙Q − ˙J (2) time ˙Q − ˙J (3) time ˙Q (2.19a) (2.19b) (2.19c) (2.19d) (2.19e) (2.19f) (2.19g) (2.19h) (2.19i) (2.19j) (2.19k) (2.19l) (2.19m) (2.19n) (2.19o) (2.19p) This completes the derivation of various Calder´on identities, some of which are new, and the demonstration of a general, simple process to generate them. There may exist other Calder´on identities yet to be tabulated, but the above process is sufficient to recover them. 35 CHAPTER 3 NOVEL FREQUENCY DOMAIN FORMULATIONS As discussed in Chapter 1, various boundary integral equations (MFIE, EFIE, CFIE, etc.) predict scattered magnetic and electric fields from PEC objects of arbitrary shape using MOM. The problem is important because many real-world scatterers and communication devices are multi-scale and include wideband antennas. This approach has fewer unknowns than the finite element method (FEM) but may be integrated with FEM approaches to trun- cate the inhomogenous region (modeled with FEM) interfacing with an open, homogeneous region (modeled with BEM) [Jin, 2011]. The EFIE and MFIE components of the CFIE formulation suffer from a variety of is- sues like low-frequency breakdown [Yan et al., 2010][Zhao and Chew, 2000][Zhang et al., 2003][Qian and Chew, 2008], catastrophic cancellation [Kress, 1981], dense mesh breakdown [Valdes et al., 2011], static nullspaces or topology breakdown [Cools et al., 2009a], and a poor approximation of the identity operator in the MFIE with RWG testing and basis functions [Yan et al., 2011]. These issues have been addressed in various ways. Buffa-Christiansen testing sets better approximate the identity operator of the MFIE when the basis set is composed of RWG functions [Cools et al., 2009b]. These functions have also been used in conjunction with the Calder´on identities to precondition the ill-conditioned EFIE operator [Cools et al., 2009c]. Other suggested basis sets to alleviate breakdown are the so-called loop-star and loop-tree functions [Wilton and Glisson, 1981][Wu et al., 1995] or the related basis-free quasi-Helmholtz projection matrices [Andriulli et al., 2013], subdivision surfaces [Fu et al., 2017], and manifold harmonics [Alsnayyan and Shanker, 2022]. Yet another option is solving for current and charge densities in the current and charge integral equation (CCIE) [Taskinen and Yl¨a-Oijala, 2006]. In contradistinction, the DPIE and DFIE formulations have several niceties: no low- frequency breakdown, no dense mesh breakdown, no topological low-frequency breakdown, and well-conditioned dielectric and PEC formulations at low to medium range frequencies 36 [Vico et al., 2016][Li et al., 2019]. For PEC objects analyzed in the frequency domain, [Li et al., 2019] suggested various combinations of the potential integral equations to construct well-conditioned formulations at low frequency. Reference [Eris et al., 2022] suggested a unique combined potential formulation for dense discretizations. The low frequency behavior of the VPIE was further analyzed for PEC in [Chen et al., 2022]. For dielectric objects analyzed in the frequency domain, [Li et al., 2019][Vico et al., 2016] have suggested well- conditioned potential formulations at low frequency. A potential integral formulation for solving lossy conductors is detailed in [Sharma and Triverio, 2022] as well. The time-domain variant of these integral equations has been analyzed in [Roth and Chew, 2021]. Specifically, the decoupled potential integral equation (DPIE) approach has several niceties: no low- frequency breakdown, no dense mesh breakdown, no topological low-frequency breakdown, and well-conditioned dielectric and PEC formulations at low to medium range frequencies [Vico et al., 2016][Li et al., 2019]. Recently, the DPIE has been implemented on arbitrary dielectric objects [Baumann et al., 2022] with pulse and RWG functions, and the results therein demonstrate low singular value conditioning and a low iteration count to converge. However, fast convergence with iterative solvers has yet to be demonstrated in the high frequency region for arbitrary objects using the DPIE. This chapter extends the DPIE and DFIE property of well-conditioned to the high fre- quency region for arbitrary PEC objects by constructing new formulations using the Calder´on identities derived in Chapter 2. The spectral properties of these formulations are analyzed on a unit sphere. 3.1 DPIE and DFIE for PECs The boundary conditions force the tangential component of the total electric field to be zero in the case of a PEC. In the DPIE, this is implies α(r) = 0, σ(r) = 0, and b(r) = 0. 37 Equation (1.75) and (1.81) reduce in the PEC case to     =      αi βi S I + D′    β                   ai bi γi σi =          I − K ′t          0 P(2) I + D′ −κ2S     a γ   . J (2) J (3) J (4) (3.1a) (3.1b) These equations are over-determined. Any choice of two rows of operators in (3.1b) or one row in (3.1a) will construct a system of equations relating unknown potential quantities to the incident field. The DFIE is very similar because the DFIE and VPIE have an identical forward matrix of operators. Again, the tangential component of the electric field is zero at the PEC boundary which implies αE(r) = 0, σE(r) = 0, and bE(r) = 0. Equation (1.75) and (1.81) reduce in the PEC case to          =                   ai E bi E γi E σi E ′t I − K J (2) J (3) J (4)          0 P(2) I + D′ −κ2S     aE γE   . (3.2) As an aside, the Stratton-Chu formula assumes the free-space region is source free and sets ∇ · E = 0 in Green’s Second Identity. This reduction of Green’s Second Identity would remove the fourth row of the above equation. Again, (3.2) is over-determined. Any choice of two rows of operators will construct a system of equations relating unknown, field quantities to the incident field. 38 3.2 Local Calder´on Combined DPIE and DFIE Formulations 3.2.1 Combined SPIE, VPIE, DPIE, and DFIE Formulations Constructing a unique SPIE formulation from requires adding the two possible equations in (3.1a). These two possible equations have null-spaces at irregular frequencies. The null- spaces of the S operator correspond to a solution of a cavity with Dirichlet boundary con- ditions whereas those of I + D′ correspond to those of an interior cavity with Neumann boundary conditions. Combing these two equations with a weighting coefficient δ and 1 − δ, where 0 ≤ δ ≤ 1 yields a unique SPIE formulation. The Combined SPIE (CSPIE) is written as where δαi + (1 − δ)βi = δZSP IE 1 β + (1 − δ)ZSP IE 2 β ZSP IE 1 = S ZSP IE 2 = I + D′ (3.3) (3.4) Constructing a unique VPIE formulation follows a similar procedure. Rows 1 and 3 of (3.1b) are spectrally akin to those of an MFIE (denoted by ZV P IE 1 ) whereas rows 2 and 4 are similar to those of an EFIE (denoted by ZV P IE 2 ) ZV P IE 1 = ZV P IE 2 =       ′t I − K 0 J (3) I + D′  J (2) P(2) J (4) −κ2S   .    (3.5) However, the null-spaces of these matrices of operators are more complex than those of either the MFIE or EFIE operators. Take ZV P IE 1 of the matrix, the null-spaces fall into two categories; (a) the null-spaces of I − K for instance. Given the lower triangular nature ′t (which is equivalent to the MFIE) and (b) null-spaces of I + D′ for rotational a. 39 A linear combination of the two systems in (3.1b) does not have a null-space. As a result, the Combined VPIE (CVPIE) is prescribed as follows   i δ   a γ   + (1 − δ)   i     b σ   = δ ZV P IE 1   a γ   + (1 − δ) ZV P IE 2     a γ   . (3.6) Of course, a unique DPIE is simply a block diagonal concatenation of (3.5) and (3.6). A Combined DFIE formulation is identical to the Combined VPIE formulation, RHS aside, δP1 l  i      aE γE + (1 − δ)P2 l  i      bE σE = δ P1 l ZDF IE 1       aE γE  + (1 − δ) P2 l ZDF IE 2   aE γE (3.7)       ZDF IE 1 = ZDF IE 2 =       ′t I − K 0 J (3) I + D′  J (2) P(2) J (4) −κ2S   (3.8) P1 l = diag(1, −jκ) P2 l = diag(−jκ, 1) where the scaling matrices P1 l and P2 l are prematurely introduced. To prove the uniqueness of the Combined DFIE for an arbitrary and closed object, as is done in [Harrington and Mautz, 1978], consider the exterior and interior problems where the equivalence theorem reconstructs the scattering problem. Select the interior problem where the fields (E1, H1) in the exterior region is set to the null-field and the interior fields 40 (E2, H2) are supported by equivalent sources. The interior problem defined over the inner surface for the Combined DFIE is     0   = (1 − δ) 0       1 0 ˆn × ˆn × ∇ × E2(c, µ)          + δ 0 −jκ  −jκ 0      1 0 ˆn · E2(c, µ)  ˆn × E2(aE, γE) ∇ · E2(aE, γE) Using Faraday’s law of induction for time-harmonic fields   .         0 0   = (1 − δ)   −jκη0ˆn × ˆn × H2(aE, γE)    + δ −jκˆn · E2(aE, γE)  −jκˆn × E2(aE, γE)   . ∇ · Es 2(aE, γE) (3.9) (3.10) Selecting the first row of Eq. (3.10) and operating on both sides with ˆn× leads to 0 = (δ − 1) ηoˆn × H2(aE, γE) + δˆn × ˆn × E2(aE, γE). (3.11) Taking the complex conjugate of the above equation and projecting onto the above equation yields (cid:18) 2Re (1 − δ) δ ηo (cid:90) s ˆn · ((ˆn × E2(aE, γE)) . . . (cid:19) = (3.12) × (ˆn × H∗ 2(aE, γE))) dS (cid:90) (δ − 1)2 η2 o |ˆn × H2(aE, γE)|2 s + δ2 |ˆn × ˆn × E2(aE, γE)|2 dS. The LHS is the real power flowing in the interior. If the interior’s media is without loss, then the LHS is zero. Therefore ˆn × H2(aE, γE) = 0 ˆn × ˆn × E2(aE, γE) = 0. 41 (3.13) Now consider the following calculations, ∇ · (ˆn × H2(c, µ)) = − ˆn · (∇ × H2(c, µ)) = jωϵoˆn · E2(c, µ). In light of Eq. (3.13), we conclude ˆn · E2(c, µ) = 0. Further, ∇ · E2(c, µ) = ∇ · (ˆn × ˆn × E2(c, µ) + (ˆn · E2(c, µ)) ˆn) = 0. (3.14) (3.15) Recall the exterior fields are the null-field. Using the boundary condition ˆn × H1 − ˆn × H2 = J, we conclude J = 0 and then (aE, γE) = (0, 0) in Eq. (3.13). In other words, the total traces of the interior problem must be zero, and the Combined DFIE does not support a cavity mode for any external field. The Combined DFIE is unique. Also, the Combined VPIE forward matrix in [Hawkins et al., 2023] is identical to the Combined DFIE, and the Combined VPIE is unique as well. 3.2.2 Left and Right Preconditioners for Scaling Left and right preconditioners are used to scale the various operators in the DPIE and DFIE PEC formulations such that all unknowns are of the same units [Li et al., 2019]. The left and right scaling matrices for the PEC case are 42 P1 l = diag (1, −jκ) S1 l = − jκ P2 l = diag (−jκ, 1) S2 l = 1 (cid:16) P1 r = (cid:16) (cid:16) (cid:16) S1 r = P2 r = S2 r = (cid:17)−1 P1 l (cid:17)−1 S1 l (cid:17)−1 P2 l (cid:17)−1 . S2 l (3.16a) (3.16b) (3.16c) (3.16d) (3.16e) (3.16f) (3.16g) (3.16h) After scaling (3.5) and (3.6) with (3.16), the combined formulations are   i δP1 l   a γ     i δP1 l   a γ     i + (1 − δ)P2 l   b σ     i + (1 − δ)P2 l   b σ   =δP1 l ZDF IE 1 P1 r =δP1 l ZV P IE 1 P1 r   + (1 − δ) P2  l ZDF IE 2    ˜a ˜γ   + (1 − δ) P2  l ZV P IE 2    ˜a ˜γ   P1 r   ˜a ˜γ     P1 r   ˜a ˜γ   (3.17) δS1 l αi + (1 − δ)S2 l βi =δS1 l ZSP IE 1 S2 r [ ˜β] + (1 − δ) S2 l ZSP IE 2 S2 r [ ˜β]. 3.2.3 Framework for Spectral Analysis To analyze the spectrum of the formulations, we use spherical and vector harmonic basis and testing functions for the unit sphere to compute eigenvalues and singular values. The spherical harmonic basis functions are denoted by Bs n where n ∈ [0, Ns) and the vector harmonic basis functions are denoted by Bv in the SPIE for PEC is Ns, and the span is denoted by τ SPIE = (cid:80)Ns−1 n where n ∈ [0, Nv). The number of basis functions where ys n nySPIE n n=0 Bs are basis function coefficients. The number of basis functions in DFIE or VPIE for PEC is 43 Ns + Nv. Likewise, the span of basis functions is denoted by τ VPIE = (cid:80)Ns+Nv−1 where yv n is a list of basis function coefficients and F V = diag (Bs n, Bv n). n=0 The sets of harmonics are denoted by BY = BΨ = BΦ = (cid:18) Y 0 0 (cid:18) Ψ0 0 (cid:18) Φ0 0 . . . Y m n . . . Y Nh Nh . . . Ψm n . . . Ψ . . . Φm n . . . Φ Nh Nh (cid:19) Nh Nh (cid:19) (cid:19) where n ≥ 0 and |m| ≤ 0. The scalar and vector basis functions are then Bs = BY (cid:18) Bv = BΦ BΨ (cid:19) F V n yVPIE n (3.18a) (3.18b) (3.18c) (3.19a) (3.19b) where Ns = (Nh + 1)2 and Nv = 2 (Nh + 1)2 and Nh = κa + 2 where a is the radius of the sphere and equal to 1. The harmonic functions are defined by (cid:115) Y m n (r) = 2n + 1 4π (n − m)! (n + m)! n (cos θ) ejmϕ P m Ψm n (ˆr) = − ˆr × Φm n (ˆr) = cnr∇Y m n (ˆr) n (ˆr) = cnˆr × ∇Y m n (ˆr) Φm n (ˆr) =ˆr × Ψm   1 cn =  1√ n(n+1) n = 0 . n ̸= 0 As a demonstration, consider the following LHS l ZDF IE P1 1 P1 r       n Ψm am n n Φm bm n n Y m cm n       . 44 (3.20a) (3.20b) (3.20c) (3.20d) (3.20e) (3.21) Testing with (Ψm n Φm n Y m n )T and using the orthonormal nature of the harmonic functions as defined above       ⟨Ψm ⟨Φm n , I + ˆn × ˆn × ∇ × S[ˆn × Ψm n ]⟩ n , I + ˆn × ˆn × ∇ × S[ˆn × Ψm n ]⟩ n , jκˆn · S[ˆn × Ψm n ]⟩ ⟨Y m ⟨Ψm ⟨Φm n , I + ˆn × ˆn × ∇ × S[ˆn × Φm n ]⟩ n , I + ˆn × ˆn × ∇ × S[ˆn × Φm n ]⟩ n , jκˆn · S[ˆn × Φm n ]⟩ ⟨Y m 0 0 ⟨Y m n , I + ˆn · ∇S[Y m n ]⟩              .      am n bm n cm n (3.22) Each entry in the above matrix may be computed analytically. The following notation is used for spherical Bessel functions and spherical Hankel functions (i) nm (k, r) =cn bi ϕ (1) n =jn b n (κr) Y m n and for vector spherical harmonics (2) (1) n =h n b (i) nm =∇ϕ (i) nm L M (i) nm = 1 κ ∇ × N (i) nm = − r × ∇ϕ (i) nm N (i) nm = 1 κ ∇ × M (i) nm. and OXY or OXY is short hand for (cid:90) (cid:90) OXY = OXY = X Y ∗ dS X · Y∗ dS. 45 (3.23a) (3.23b) (3.23c) (3.24a) (3.24b) (3.24c) (3.24d) (3.24e) (3.25a) (3.25b) (3.25c) As a demonstration, consider the following computation ⟨Ψnm, Ψnm + ˆn × ˆn × ∇ × S[ˆn × Ψnm]⟩ = OΨΨ + (cid:90) Ψnm · ˆn × ˆn × ∇ × (cid:90) G ˆn × Ψ′ nmdS′dS (cid:90) = OΨΨ + Ψnm × ˆr × ˆr · ∇ × (cid:90) (cid:90) G Φ′ nmdS′dS G Φ′ nmdS′dS (cid:90) Ψnm · ∇ × (cid:90) Ψnm · ∇ × G · Φ′ nmdS′dS. = OΨΨ − (cid:90) = OΨΨ − Using the identity ∇ × G = −jk2 (cid:88) q,p 1 gp c2 p (M (1) pq (k, r) N (4)∗ pq (k, r′) + N (1) pq (k, r) M (4)∗ pq (k, r′)) and orthogonality conditions for vector spherical harmonics ⟨Ψnm, Ψnm + ˆn × ˆn × ∇ × S[ˆn × Ψnm]⟩ = OΨΨ + jk2 gn c2 n O N(1)Ψ O ′∗ M(4)Φ However, (3.26) (3.27) (3.28) ⟨Φnm, Ψnm + ˆn × ˆn × ∇ × S[ˆn × Ψnm]⟩ = 0. (3.29) Also, consider I − K ′t tested with Φnm ⟨Φnm, Φnm + ˆn × ˆn × ∇ × S[ˆn × Φnm]⟩ = OΦΦ + (cid:90) Φnm · ˆn × ˆn × ∇ × (cid:90) = OΦΦ + Φnm × ˆr × ˆr · ∇ × (cid:90) (cid:90) = OΦΦ + Φnm · (cid:90) (cid:90) G ˆn × Φ′ nmdS′dS G ˆn × Φ′ nmdS′dS ∇ × G · Φ′ nmdS′dS (3.30) = OΦΦ − jk2 gn c2 n O M(1)Φ O ′∗ N(4)Ψ . 46 However, ⟨Ψnm, Φnm + ˆn × ˆn × ∇ × S[ˆn × Φnm]⟩ = 0. (3.31) Proceeding in this manner for all entries of (3.22), matrix (3.22) reduces to        OΨΨ + jk2 gn c2 n O N(1)Ψ O ′∗ M(4)Φ 0 0 (cid:18) (jκ)2 0 O OΦΦ − jk2 gn c2 n ′∗ O N(j)Ψ N(1)Y O M(1)Φ + 1 κ2 c2 n O ′∗ N(4)Ψ 1 gn c2 n O L(1)Y O ′∗ L(4)Ψ 0 0 (cid:19) OY Y − jκ c2 n O L(1)Y O ′∗ ϕ(4)Y               .      am n bm n cm n (3.32) The eigenvalues have an analytic expression through the determinant of the above matrix. More generally, the eigenvalues of any formulation equal the concatenation of eigenvalues of each block matrix (3.32) corresponding to harmonic (m, n). 3.2.4 Spectral Analysis of Combined DFIE and DPIE The Combined DFIE and DPIE formulations are not wideband well-conditioned as shown in Fig. 3.1. The eigenvalue of the lowest order harmonic for the Combined SPIE nears (but never exactly intersects because the formulation is unique) the origin of the complex plane which results in poor conditioning in the high frequency region. The eigenvalues associated with the lowest order harmonics for the Combined VPIE and DFIE do not near the origin for any frequency. However, in the case of the Combined VPIE and DFIE, eigenvalues near the origin of the complex plane at larger frequencies like 30 GHz as shown in Fig. 3.4. 3.2.5 Analytic Preconditioners for the Combined DPIE and DFIE To improve the wideband condition of the Combined DPIE and DFIE, the identities pre- sented in (2.13) and (2.17) are used to construct analytic preconditioners. In the combined formulations, eigenvalues are collecting near the origin of the complex plane at higher fre- quencies. By preconditioning one of the matrices, the idea is to interlace the eigenmodes such that no eigenmode nears the origin in the combined system, and the spectrum shifts to a bounded circle off the origin. 47 Figure 3.1: Condition number vs frequency. Figure 3.2: Eigenvalues for Combined SPIE for harmonic Y 1 1 . 48 Figure 3.3: Eigenvalues for Combined VPIE and DFIE for harmonics (cid:0)Y 1 1 , Φ1 1, Ψ1 1 (cid:1). Figure 3.4: Eigenvalues for Combined VPIE and DFIE at 30 GHz. 49 Starting with the Combined SPIE, the N operator acting on S results in an operator that is second kind. However, N S and I + D′ share resonances due to identity (2.14d). A complexification of the wavenumber included in N interlaces the null-spaces of N S and I + D′. Specifically, we let (cid:101)PSP IE = − (cid:101)N such that instead of κ we use ˜κ = κ − j0.4H 2 3 κα [Antoine et al., 2006][Boubendir and Turc, 2014] where α is to-be-determined. The resulting Local Calder´on Combined SPIE (LC-CSPIE) formulation is resonance free and second-kind δS2 l ˜PSP IES1 r S1 l αi + (1 − δ)S2 l βi =δS2 l ˜PSP IES1 r S1 l ZSP IE 1 r [ ˜β] + (1 − δ)S2 S2 l ZSP IE 2 S2 r [ ˜β]. (3.33) Further, ˜κ provably enforces uniqueness on spheres for the LC-CSPIE, and the asymptotic analysis will be published in a future paper. Note, the localization doesn’t interlace the wavenumbers at which a null-space exists. Operators ˜N S and I + D′ share wavenumbers at which there exists a null-space. Rather, the localization interlaces the eigenmodes such that ˜N S and I + D′ do not share a vanishing eigenmode for any wavenumber. Proceeding in a similar manner for the Combined VPIE, ZV P IE 1 is second-kind and well-conditioned while ZV P IE 2 is ill-conditioned [Li et al., 2019]. As before, (2.17) provides the necessary relations for developing a preconditioner. Using (2.17), consider the following preconditioning operator and its action on ZV P IE 2    κ2Lt −Q(1) M(3) −Q(3)       J (2) P(2) J (4) −κ2S    . Using (2.19a), (2.19c), (2.19i), and (2.19k), equation (3.34) reduces to    −(I − K ′t ′t)K ′t + (I + D′)J (3) −J (3)K    . 0 (I + D′)D′ (3.34) (3.35) The above matrix includes an identity plus compact operator along the diagonal. The off- diagonal operators are compact operators acting on compact operators or compact operators acting on bounded operators, the result of both being compact [Rudin, 1991]. As a result, 50 the entire system of operators can be partitioned into identity plus compact operators, and the system is well-conditioned. Operator (3.35) and ZV P IE 1 operators dictate that the null-spaces fall into two categories; (a) those of (I − K share null-spaces. The lower triangular nature of the ′t)K ′t and (b) for any rotational a(r) unknown (for this type of a, J 3(a) = 0 which implies ′t(a) + (I + D′)J (3)(a) = 0), those of (I + D′)D′. These null-spaces are shared −J (3)K with ZV P IE 1 nullspaces . Furthermore, the below factorization demonstrates ZV P IE 1 and (3.35) share    −(I − K ′t) 0 −J (3) I + D′       ′t K 0 J (3) D′    . (3.36) Furthermore, an alignment of these operators’ nullspaces is numerically depicted in Fig. 4. Uniqueness at all frequencies is enforced by complexifying the wavenumber of the pre- conditioner (3.35). Specifically, ˜κ is once again of the form ˜κ = κ − j0.4H 2 3 κα. The new Local Calder´on-Type Combined VPIE (LC-CVPIE) formulation is   i δP1 l   a γ   + (1 − δ)P1 l ˜PV P IEP2 r P2 l   i   b σ   = δP1 l ZV P IE 1 P1 r + (1 − δ) P1 l where ˜PV P IEP2 r P2 l ZV P IE 2   P1 r   ˜a ˜γ       ˜a ˜γ   (3.37) ˜PV P IE = −   =P1  l    ˜a ˜γ    ˜κ2 ˜Lt − ˜Q(1) ˜M(3) − ˜Q(3)      (3.38)   a γ   . 51 Likewise, the new Local Calder´on Combined DFIE formulation is   i   aE γE   + (1 − δ)P1 l ˜PDF IEP2 r P2 l     i   = δP1 l ZDF IE 1 P1 r       ˜aE ˜γE + (1 − δ) P1 l ˜PDF IEP2 r P2 l ZDF IE 2 P1 r   δP1 l where bE σE     ˜aE ˜γE    ˜aE ˜γE   = P1  l       . aE γE (3.39) (3.40) The spectral analysis framework is used to select α for the localization parameter as well as the weighting parameter δ. Sweeping frequencies, the LC-CVPIE is wideband well- conditioned α = 0.5 and δ = 0.5 according to Fig. 3.5 and 3.6, respectively. Of course, the LC-CDFIE is wideband well-conditioned for the same parameters. 3.2.6 Mapping Properties of DPIE and DFIE Operators The DPIE and DFIE operators are a map between function spaces. These formulations take functions as an input and then output functions. On a sphere, harmonics are both the input and the output functions, and the following table summarizes the input-output mappings for the DPIE and DFIE 52 Figure 3.5: LC-CVPIE and LC-CDFIE conditioning for different ˜κ Figure 3.6: LC-CVPIE and LC-CDFIE conditioning with different δ weights using a = 0.5 in ˜κ. 53 ZV P IE/DF IE : ZV P IE/DF IE : ZV P IE/DF IE :                         −→             −→             −→       Ψ 0 0 0 Φ 0 0 0 Y Ψ or 0 0 0 0 Φ or Y 0       0 0 Φ ZSP IE : Y −→ Y.             (3.41a) (3.41b) (3.41c) (3.41d) These mapping properties demonstrate that div-conforming and curl-conforming func- tions map to div-conforming and curl-conforming functions, respectively, in the DPIE and DFIE. 3.2.7 Spectral Properties of LC-CDPIE and LC-CDFIE We now analyze the eigenvalues and conditioning of the LC-CDPIE and LC-CDFIE. Con- sidering the following three operators for the VPIE and DFIE G−1P1 G−1P1 l PV P IE/DF IEP2 V P IE/DF IE P1 r 1 l Z r G−1P2 l Z V P IE/DF IE 2 P1 r δG−1P1 l Z V P IE/DF IE 1 r + (1 − δ)G−1P1 P1 l PV P IE/DF IEP2 r G−1P2 l Z (3.42a) (3.42b) V P IE/DF IE 2 P1 r . (3.42c) Figure 3.7 depicts the conditioning for frequencies between 1e9 Hz and 1.1e9 Hz for formu- lations (3.42b) and (3.42c). The sharp increases in condition number mark a null-space, and the alignment indicates the two formulations share the same resonances. Therefore, Fig. 54 Figure 3.7: Condition numbers of (3.42a) and (3.42b). 3.7 numerically verifies the properties of new Calder´on-type identities developed in Chapter 2. Also, in Fig. 3.8 the condition numbers of formulations (3.42a) and (3.42b), which are second-kind, increase ∝ ω in the high-frequency region. The more common Calder´on pre- conditioning approach for the CFIE (without localization) and the dielectric implementation of the DPIE show similar results [Cools et al., 2009c][Hsiao and Kleinman, 1997][Baumann et al., 2022]. The spectra of (3.42a), (3.42b), and (3.42c) at 20 GHz are shown in Fig. 3.9, Fig. 3.10, and Fig. 3.11, respectively. All spectra have similar features: they are bounded and eigenvalues collect near the origin. The localization of the VPIE and DFIE Calder´on operator interleaves the eigenmodes of the (3.42a) and (3.42b) nearing zero for any frequency such that (3.42c) shifts the spectra off the origin. The spectrum of the LC-CVPIE and LC-CDFIE is shown in Fig. 3.12. The spectrum is now both bounded and shifted away from the origin. The conditioning of the LC- CDFIE and LC-CVPIE is shown in Fig. 3.14 where the formulations are clearly wideband well-conditioned. In Fig. 3.13, the eigenvalues of the lowest order harmonic do not collect near the origin. 55 Figure 3.8: Condition numbers of (3.42a) and (3.42b). Figure 3.9: The spectrum of (3.42a) at 20 GHz. Eigenvalues are denoted by n. The analysis for the LC-CSPIE proceeds in a similar manner. Consider the following 56 Figure 3.10: The spectrum of (3.42b) at 20 GHz. Eigenvalues are denoted by n. Figure 3.11: The spectrum of (3.42c) at 20 GHz. Eigenvalues are denoted by n. three operators for the SPIE r G−1S1 l ZSP IE 1 S2 r G−1S2 G−1S2 l PSP IES1 l ZSP IE S2 r 2 l PSP IES1 G−1δS2 r G−1S1 l ZSP IE 1 r + (1 − δ)G−1S2 S2 l ZSP IE 2 S2 r . (3.43a) (3.43b) (3.43c) 57 Figure 3.12: The spectrum of LC-CVPIE and LC-CDFIE at 20 GHz. Eigenvalues are denoted by n. Figure 3.13: Eigenvalues for LC-CVPIE and LC-CDFIE for harmonics (cid:0)Y 1 1 , Φ1 1, Ψ1 1 (cid:1). Figure 3.15 depicts the conditioning for frequencies between 1e9 Hz and 1.1e9 Hz for for- mulations (3.43a) and (3.43b) and numerically verifies the properties of well-known SPIE Calder´on identities as derived in Chapter 2. Also, in Fig. 3.8 the condition numbers of formulations (3.42a) and (3.42b), which are second-kind, increase ∝ ω in the high-frequency 58 Figure 3.14: Condition numbers for LC-CVPIE and LC-CDFIE . Figure 3.15: Condition numbers of (3.43a) and (3.42b). region as well. The spectra of (3.43a), (3.43b), and (3.43c) at 20 GHz are shown in Fig. 3.17, Fig. 3.18, and Fig. 3.19, respectively. All spectra have similar features: they are bounded and eigenvalues collect near the origin. The localization of the SPIE Calder´on operator also interleaves the eigenmodes of the 59 Figure 3.16: Condition numbers of (3.42a) and (3.42b). Figure 3.17: The spectrum of (3.43a) at 20 GHz. Eigenvalues are denoted by n. (3.42a) and (3.42b) nearing zero for any frequency such that (3.42c) shifts the spectra off the origin. The spectrum of the LC-CSPIE is shown in Fig. 3.20. The spectrum is now both bounded and shifted away from the origin. The conditioning of the LC-CSPIE is shown in Fig. 3.22 where the formulation is wideband well-conditioned. Furthermore, in Fig. 3.21, the eigenvalues of the lowest order harmonic do not intersect the origin. 60 Figure 3.18: The spectrum of (3.43b) at 20 GHz. Eigenvalues are denoted by n. Figure 3.19: The spectrum of (3.42c) at 20 GHz. Eigenvalues are denoted by n. 61 Figure 3.20: The spectrum of LC-CSPIE at 20 GHz. Eigenvalues are denoted by n. Figure 3.21: Eigenvalues for LC-CSPIE for harmonic Y 1 1 . 62 Figure 3.22: Condition numbers for LC-CSPIE . 63 CHAPTER 4 DISCRETIZATION OF LOCAL CALDER ´ON COMBINED DPIE AND DFIE 4.1 Method-of-Moments for LC-CDPIE and LC-CDFIE Analytic basis sets are useful for computing the physics of spheres, but they are not useful for computing the electromagnetic fields scattered by non-canonical objects. The MOM pro- cedure detailed in Chapter 1 is used to discretize the LC-CDPIE and LC-CDFIE formulations and model the scattered electromagnetic fields of real-world objects. The following notation is used: Bs denotes the set of scalar basis functions defined over mesh patches or nodes, Bv denotes the set of vector basis functions defined over mesh edges, and F V = diag (Bs, Bv) denotes the union of scalar and vector basis sets. In other words, Bs = Bv = (cid:18) h0 (cid:18) (cid:19) . . . hn . . . hNs (cid:19) f0 . . . fn . . . fNv (4.1a) (4.1b) where the vector functions are f0 and the scalar functions are hn. The LC-CDPIE is composed of the independent LC-CVPIE and LC-CSPIE formulations. For the LC-CVPIE, the MOM system of equations is ZVPIE ˜yVPIE = bVPIE (4.2) 64 where the elements are defined by kn = ⟨F V ZVPIE k , δG−1P1 + (1 − δ) G−1P1 l l ZV P IE 1 P1 r F V n ˜PV P IEP2 r G−1P2 l ZV P IE 2 P1 r F V n ⟩ ˜yVPIE n = P1 l yVPIE n bVPIE k = ⟨F V k , δG−1P1 l   i   a γ   + (1 − δ)G−1P1 l ˜PV P IEP2 r G−1P2 l Gkn = ⟨F V k , F V n ⟩. For the LC-CSPIE, the MOM system of equations is ZSPIEySPIE = bSPIE where the elements are defined by   i   b σ   ⟩ kn = ⟨Bs ZSPIE l PSP IES1 r G−1S1 l ZSP IE 1 r Bs S2 n k, δG−1S2 + (1 − δ)G−1S2 l ZSP IE 2 l PSP IES1 l βi⟩ k, δG−1S2 + (1 − δ)G−1S2 r Bs S2 n⟩ l αi r G−1S1 bSPIE k = ⟨Bs Gkn = ⟨Bs k, Bs n⟩. The LC-CDPIE MOM system of equations is ZDFIE ˜yDFIE = bDFIE 65 (4.3a) (4.3b) (4.3c) (4.3d) (4.3e) (4.3f) (4.4) (4.5a) (4.5b) (4.5c) (4.5d) (4.5e) (4.6) where the elements are defined by kn = ⟨F V ZDFIE k , δG−1P1 + (1 − δ) G−1P1 l l ZDF IE 1 P1 r F V n ˜PDF IEP2 r G−1P2 l ZDF IE 2 P1 r F V n ⟩ ˜yDFIE n = P1 l yDFIE n bDFIE k = ⟨F V k , δG−1P1 l  i      aE γE + (1 − δ)G−1P1 l ˜PDF IEP2 r G−1P2 l  i   ⟩    bE σE Gkn = ⟨F V k , F V n ⟩. (4.7a) (4.7b) (4.7c) (4.7d) (4.7e) (4.7f) These three MOM systems are further specified by specific basis functions in subsequent sections. 4.2 Fine-Grain Localization for Piece-Wise Tesselations Localizing the Calder´on preconditioners with a single complex wavenumber, where the parameter H is determined by the geometry’s global max-mean curvature, is not necessary for piece-wise tesselations and, in some cases, sub-optimal. A generalization of this localization approach is computing the max-mean curvature of each edge and defining the max- mean curvature of a patch as the maximum of the mean curvatures of the patch’s edges. Note, when computing the discrete preconditioner, the scalar-function and vector-function interactions are constituted by the interactions of the patches composing the scalar and vector basis and testing functions. Then, for each source-observer patch pair, the localizing complex wavenumber is generated from the maximum of the max-mean curvatures of the sources and observers. This is referred to as fine-grain localization, and it will be used to localized the Calder´on preconditioners unless otherwise stated. 66 4.2.1 Zero-Mean Constraint Both γ = ˆn · A(r), γE = ˆn · E(r), and β = ˆn · ∇ϕ(r) have a zero-mean constraint (ZMC) in the LC-CDPIE and LC-CDFIE [Li et al., 2019]. A Lagrange multiplier is used to enforce the constraint [Baumann et al., 2022] in the following way. A Lagrange multiplier is added as an unknown. An additional row and column is added to Galerkin tested forward matrix as well. The entries in the added column and row corresponding to ZMC unknowns are filled with the area of their respective scalar basis functions. Corresponding with the Lagrange multiplier, an additional entry is added to the RHS and set to zero. 4.2.2 Singularity Subtraction The singularities of the LC-CDPIE and LC-CDFIE are handled with singularity subtraction when the test and basis function are within 0.15λ. For integrating over the source, the free- space Green’s function is Taylor series expanded, the singular terms are subtracted with the free-space Green’s function, analytically evaluated, and added back. The integration over the observer is performed with a Gauss-Legendre quadrature rule. 4.2.3 Incident Potentials and Field The RHS is always a planewave in this work. In the case of the DFIE, the incident field is Ei (r) = Ee−jκ·r (4.8) where E is the polarization. For the DPIE, the decomposition suggested in [Vico et al., 2016] results in the following incident potentials which proved numerically beneficial ϕi = − (r · E) e−jκ·r Ai = − κ ω (r · E) e−jκ·r. 67 (4.9a) (4.9b) 4.3 LC-CDPIE using Hat and RWG Functions The LC-CDPIE discrete scalar and vector basis functions are hat and RWG functions on a mesh with Nf faces and Ne edges and Nn nodes. The hat functions are defined by hnn (r) =    lnn 2Ann ˆunn · (cid:0)enn − ρ+ nn (cid:1) r ∈ Tnn 0 else and the RWG functions are defined by where fne (r) =   lne 2A± ne ρ± ne  0 r ∈ T ± ne else ρ± (r) = ±(r − p± n )    0 r ∈ T ± n else (4.10a) (4.11a) (4.12a) and Ann is the area of a face connected to node nn; A± ne is the area of the faces sharing edge ne; ˆunn is the planar normal of the edge opposite of node nn on face Tnn and pointing away from node nn; lnn is the length of the edge opposite of node nn and on face Tnn; lne is the length of the edge ne; ρ± coordinates of either node on the edge opposite of node nn; T± ne; T± selected edge on the appropriate face sharing the selected edge; and p± nn originates in node nn and terminates in face Tnn; enn are the ne are the faces sharing edge ne are the coordinates for the nodes opposite of the n are the appropriate n are either nodal or edge faces; ρ± nodal coordinates for the RWG or hat basis. The scalar and vector basis functions are then Bs = Bv = (cid:18) h0 (cid:18) . . . hnn . . . hNn (cid:19) f0 . . . fne . . . fNe (cid:19) . (4.13a) (4.13b) The RWG functions are divergence-conforming. They are selected for two reasons: one, the LC-CDPIE and LC-CDFIE are well-tested with RWG functions according to their map- ping properties, and the RWG-RWG Gram matrix is well-conditioned; two, these formula- tions include divergence operators acting on vector basis functions and the free-space Green’s 68 function, together. For example, using RWG functions, the following integral arises in the novel formulations, (cid:90) s ∇s · (fn G) dS′ = (cid:90) ∂S ˆu · (fn G) ∂S (4.14) where ˆu is the outward facing unit normal vector of each edge in the ± patch of the RWG function (lying in the surface of the ± patch). The RWG functions are continuous across the normal component of an edge, zero across the normal component of the other two edges in the patch. Therefore, these integrals are equal to zero with RWG functions. Without RWGs or another divergence-conforming function, the discretization will insert fake line charges which can cause numerical anomalies [Rao et al., 1982]. The hat functions are chosen for the sake of interest and experimentation with a higher- order, scalar interpolant. The LC-CDPIE with hats and RWGs was benchmarked against the CFIE for several, increasingly challenging geometries. To begin, a PEC sphere of radius 0.5 meters was excited by a 1500 MHz frequency planewave with ˆz propogation axis and ˆx polarization. The sphere discretization is λ/10, and the electrical length is 5λ. The MOM system was solved using QMR with a tolerance of 1e-12. The LC-CVPIE converged in 32 iterations while the LC- CSPIE converged in 24 iterations. The RCS predicted by the LC-CDPIE agrees with that of the Mie solution in Fig. 4.1. To assess low-frequency performance, Fig. 4.3 shows the RCS of a PEC sphere with radius 1m illuminated at 10µ Hz with a planewave of ˆz propagation axis, ˆx polarization, and the sphere is discretized with a mean edge length of λ/1e14. The LC-CVPIE and LC- CSPIE converged in 4 and 3 iterations, respectively, using QMR with a tolerance of 1e-5. The RCS of the LC-CDPIE agrees with the RCS of the Mie solution. A multi-scale sphere was the next test. The sphere in Fig. 4.4 was illuminated with a 400 MHz planewave that propagates along the ˆz axis and polarized along ˆx polarization. The electrical length of the mesh edges ranges from λ/11 to λ/406. The LC-VPIE and LC-SPIE formulations converge more quickly than the CFIE formulation using QMR, TFQMR, and 69 Figure 4.1: RCS of a PEC sphere at 750 MHz. Figure 4.2: RCS of a PEC sphere at 10µ Hz. GMRES. Converging to a residue of 1e-12 requires less than 100 iterations for the LC-CSPIE (Group 1 in Fig. 4.5) and less than 210 iterations for the suggested LC-CVPIE (Group 2 in Fig. 4.5) while the CFIE requires over 24,000 iterations for TFQMR as well as QMR and will not converge using GMRES (Group 3 in Fig. 4.5). The next set of benchmarking tests uses non-canonical geometries. A bumpy cube and NASA Geographos asteroid as shown in Fig. 4.6 and a sharp pencil as shown in Fig. 4.11 are selected. The NASA Geographos asteroid is illuminated by a 240 MHz planewave that propagates along ˆy and polarized along ˆx. The electrical lengths of the mesh edges range 70 Figure 4.3: Number of iterations for unit spheres of various electrical lengths to converge to tol=1e-12 using QMR. Figure 4.4: Plot of multi-scale sphere. 71 Figure 4.5: Residue plot for multi-scale sphere. Figure 4.6: Plot of bumpy cube and NASA Geographos asteroid geometries. from λ/13 to λ/64, and the electrical length of the object is 4λ. The MOM systems are solved using QMR with tolerance 1e-12, and the predicted RCS of the LC-CDPIE and CFIE agree in Fig. 4.7. The LC-CVPIE and LC-CSPIE formulations converge in 258 and 112 iterations, respectively, while the CFIE converges in 2027 iterations. The plot of residues and iteration counts in Fig. 4.8 shows the LC-CVPIE and LC-CSPIE formulations converge more quickly to an arbitrary residue than the CFIE for the asteroid. The bumpy cube is illuminated by a 400 MHz planewave propagating along ˆy and polar- ized along ˆx. The electrical lengths of the mesh edges for the bumpy cube range from λ/12 to λ/50. Furthermore, the object fits in a cube of size 1.67λ. Again, the MOM systems are solved using QMR with a tolerance of 1e-12, and the predicted RCS of the LC-CDPIE and CFIE agree in Fig. 4.9. The LC-CVPIE and LC-CSPIE formulations converge in 305 and 72 Figure 4.7: RCS plot of the NASA Geographos asteroid. Figure 4.8: Residue plot for the NASA Geographos asteroid. 120 iterations, respectively, while the CFIE converges in 3761 iterations. Also, the plot of residues and iteration counts in Fig. 4.10 shows the LC-CVPIE and LC-CSPIE formulations converge more quickly to an arbitrary residue than the CFIE for the bumpy cube as well as the asteroid. The last and most challenging test is a sharp pencil as shown in Fig. 4.11. This geometry’s curvature max-mean curvature is 2585 due to the sharp, narrow tip. The sharp pencil is illuminated by a 269 MHz planewave with ˆy propagation axis and ˆx polarization. The 73 Figure 4.9: RCS plot of a bumpy cube. Figure 4.10: Residue plot for a bumpy cube. electrical lengths of the mesh edges range from λ/13 to λ/5648. This range is greater than the multi-scale sphere, bumpy cube, and asteroid. Also, the pencil’s electrical length is 9λ. The MOM systems are solved using QMR with a tolerance of 1e-12. The LC-CDPIE, CFIE, and MFIE RCS plots agree in Fig. 4.12, but the LC-CDPIE and MFIE agree most closely while the CFIE departs from both near ϕ = −π/2 and π/2. The LC-CVPIE and LC-CSPIE formulations converge in 159 and 90 iterations, respectively, while the CFIE converges in 47045 iterations and the MFIE in 1761 iterations. If the fine-grain localization approach is 74 Figure 4.11: Plot of sharp pencil geometry. Figure 4.12: RCS plot for a sharp pencil. not used but rather the global max-mean curvature localization approach is used, the LC- CVPIE and LC-CSPIE converge in 34764 and 91 iterations, respectively. Again, the plot of residues and iteration counts in Fig. 4.13 show the LC-CVPIE and LC-CSPIE formulations converge more quickly to any residue than both the CFIE and MFIE. 4.4 MLFMM Accelerated LC-CDFIE using Pulse and RWG Func- tions The main drawback to the LC-CDPIE is the sheer number of unknowns and computa- tional cost. For example, the number of unknowns in the CFIE is equal to the number of mesh edges while the number of unknowns for a Combined DPIE constrained to PEC objects is at least the number of mesh edges plus twice the number of mesh patches. The LC-CDFIE has less unknowns than the LC-CDPIE (less by the number of patches) 75 Figure 4.13: Residue plot for a sharp pencil. while retaining the same mathematical properties as the LC-CVPIE. Furthermore, the com- putational cost of the LC-CDFIE may be ameliorated with the multi-level fast multipole method (MLFMM) detailed in the Appendix and pulse functions suffice for scalar basis functions, defined as Pnf (r) =    1 r ∈ Tnf 0 else (4.15) where Tnf is the nf triangle. To accelerate the LC-CDFIE with the MLFMM, the salient idea is only the scattered electric field need to traverse the trees. The Combined DFIE may be recovered in the local- to-observer (L2O) step by applying each operator in the Green’s theorem to the unknown quantities locally expanded to the leaves. The Local Calder´on preconditioner is contained within the nearfield due to the localization ˜κ, and therefore the MLFMM accelerated LC- CDFIE amounts to an MLFMM matvec plus an additional nearfield matvec. More specifically, consider the scattered electric field reduced to the PEC case where the integration includes the principle value Es(r) = (cid:90) (cid:20) S − G(r, r′) ˆn′ × ˜aE(r′) − ∇G(r, r′) ˜γE(r′) (cid:21) dS′. (4.16) 76 The source trace quantities are represented by ˆn × ˜aE(r) = ˜γE(r) = Ne (cid:88) ne Np (cid:88) np fne (r) Pnf (r) . (4.17) Using a planewave expansion, the free-space Green’s function [Rokhlin, 1993] may be written as (cid:16) r′, r (cid:17) G = − jκ (4π)2 (cid:90) Ss e (cid:16) −jκ· s−r′(cid:17) rc (cid:16) T κ, rc Ωo (cid:16) − rs Ωs (cid:17) r−rc Ωo (cid:17) . . . d2ˆκ −jκ· e (cid:16) T κ, rc Ωo − rs Ωs ∞ (cid:88) (cid:17) = 0 (2) (−j)n (2n + 1) h n (cid:16) κ ||rc Ωo − rc Ωs (cid:17) || . . . (4.18) (cid:32) Pn ˆκ · rc Ωo ||rc Ωo − rc − rc Ωs Ωs (cid:33) || where rc Ωs is the center of a box containing Ωs, rc Ωo is the center of a box containing Ωs, κ = κˆκ, and S2 denotes a unit sphere parameterized by (θ, ϕ) ∈ [0, π]×[0, 2π]. Substituting (4.18) into (4.16), MLFMM is now able to quickly compute the scattered electric field at a point on the boundary of the scatter with a CPU and memory complexity of O(N log(N )). The field is sent up and down the oct-tree according to the multipole-to-multipole (M2M), multipole- to-local (M2L), and local-to-local (L2L) operations of MLFMM with global interpolants as described in [Hughey et al., 2019]. However, the charge-to-multipole (C2M) and local-to- observer (L2O) operations are formulation specific unlike M2M, M2L, and L2L. As a demonstration, consider a closed region Ωs sufficiently spaced from another closed region Ωo. Both contain a portion of a mesh from a tessellated geometry with Ne edges and Np patches. Each patch and edge is associated with a unit pulse and RWG basis function, respectively, the collection of which in Ωs is {fne : ne ∈ Ωs} and {Pnp : np ∈ Ωs}. The 77 traces quantities from the emitting Ωs are ˆn × ˆn × ∇ × Es (r) ˆn × Es (r) ˆn · Es (r) ∇ · Es (r) . Collect the emitting sources in a multipole expansion about location rc s using Vs (κ) = Np (cid:88) Q N p (cid:88) nq=1 i=1 γnp wnp,i jκ e (cid:18) −jκ· (cid:19) ′ rc s−r np,i Pnp (cid:17) (cid:16) r′ np,i Q N e (cid:88) Ne (cid:88) ne i=1 − (cid:18) −jκ· ane wne,i e (cid:19) ′ rc s−r ne,i (cid:16) ˆnne,i × fne (cid:16) r′ ne,i (cid:17)(cid:17) (4.19) (4.20) where N Q p and N Q e are the number of quadrature samples for integrating over RWG and pulse basis functions with a Gauss-Legendre rule. Collect the emissions from {Ωs : s = 1, 2, · · · , NΩ} (where NΩ is the number of source regions) about location rc o in Ωo using a multipole-to-local expansion, Uo (κ) = (cid:88) s T (κ, X) Vs (κ) . (4.21) And finally use a local expansion to compute the observed radiation quantities at location r, (cid:90) jκ − S2 EF F (r) = o) Uo (κ) d2ˆκ ∇ × EF F (r) = (4π)2 e−jκ·(r−rc (4π)2 e−jκ·(r−rc (4π)2 e−jκ·(r−rc where EF F are the farfield emissions observed in Ωo. The local expansions are weighted o) jκ × Uo (κ) d2ˆκ o) jκ · Uo (κ) d2ˆκ. ∇ · EF F (r) = (4.22) S2 S2 jκ jκ − − (cid:90) (cid:90) with either δ or (1 − δ), as determined by the combined system, and added to the observed trace quantities from sources emitting in the nearfield. The collection approximates the total observed emissions from sources. 78 Figure 4.14: Error convergence vs oversampling parameter of the CDFIE on a sphere. The MLFMM accelerated LC-CDFIE is tested for a variety of geometries and bench- marked against the CFIE with a diagonal preconditioner. All results were generated with a shared memory, dual-socketed architecture of AMD EPYC 7H12 64-core processors with 2 threads per core for a total of 256 CPUs. The nearfield, C2M, M2M, M2L, L2L, and L20 operations were fine-grain parallelized using OpenMP. The controllable error and convergence of the MLFMM accelerated CDFIE is verified in Fig. 4.14 and Fig. 4.15. The first test in Fig. 4.14 uses 35,840 charges and 107,520 dipoles on a 8λ unit sphere. The tree is 7 levels with 3 buffer-boxes for the farfield. The second test in Fig. 4.15 uses 60,928 charges and 182,784 dipoles on a 8λ × 4λ thin box. The tree is 7 levels with 3 buffer-boxes for the farfield. Both tests demonstrate the MLFMM CDFIE monotonically converges to the direct computation of the CDFIE (viz., sources are in the nearfield of all observers) according to an oversampling parameter χ which controls the number of global interpolants used in the MLFMM. The new formulation is verified by exciting a unit PEC sphere using a 3.6 GHz planewave with a ˆz propogation axis and ˆx polarization axis. The sphere is 24λ and 1, 966, 080 edges and 1, 310, 720 patches. The LC-CDFIE converged to tol=1e-5 in 1 outer iteration of LGMRES using 30 inner iterations. The radar cross section (RCS) is plotted in Fig. 4.16, and there is 79 Figure 4.15: Error convergence vs oversampling parameter of the CDFIE on a thin box. Figure 4.16: RCS(dB) of a PEC unit sphere at 3.6 GHz from θ = [π, −π] with ϕ = 0. good agreement with the Mie series solution. The next test benchmarks the scaling of matvec timings of the MLFMM accelerated CDFIE against the direct implementation of the CDFIE. In Fig. 4.17, the matvec times are reported using a collection of increasingly large unit spheres discretized with a mean-edge length of λ/19. The x-axis reports and increasing number of unknowns. The matvec cost scales quadratically at N 1.96 in the direct implementation while the FMM scales linearly at N 1.02. Beyond 8,326 unknowns, the MLFMM accelerated CDFIE is more efficient than the direct approach for iterative solvers. 80 Figure 4.17: Timings of one matvec of the CDFIE. The asteroid and bumpy cube geometries are used to benchmark the LC-CDFIE against a diagonally preconditioned CFIE for non-canonical geometries. The bumby cube is shown in Fig. 4.6. The first test uses a bumby cube mesh with 122,607 edges and 81,738 patches excited at 850 MHz (or a bumby cube contained within a 3.67λ × 3.67λ × 3.67λ box) using a planewave with a propagation axis of ˆz and polarization axis ˆx. The edge lengths range from λ/15 to λ/71. The RCS agrees between the CFIE and LC-CDFIE as shown in Fig. 4.18. The CFIE has 122,607 unknowns and converged in 45 matvecs, and the LC-CDFIE has 204,345 unknowns but converged in 23 matvecs using TFQMR with tol=1e-5. The second test uses a bumby cube mesh with 1, 949, 904 edges and 1, 299, 936 patches excited at 3 GHz (or a 81 Figure 4.18: RCS(dB) of bumby cube at 850 MHz from θ = [π, −π] with ϕ = 0. Figure 4.19: RCS(dB) of bumby cube at 3 GHz from θ = [π, −π] with ϕ = 0. bumby cube contained within a 13λ × 13λ × 13λ box) using a planewave with a propagation axis of ˆz and polarization axis ˆx. The lengths of the mesh edges range from λ/14 to λ/131. There were 3, 249, 840 unknowns, and the LC-CDFIE converged in 32 iterations of TFQMR to tol=1e-5. The RCS is shown in Fig. 4.19. The diagonally preconditioned CFIE did not converge with TFQMR but converged with GMRES in 5 outer iterations (where one outer iteration is equal to 30 inner iterations) to tol=1e-5. The final benchmark uses a complicated plane geometry shown in the Appendix. The plane mesh has 1,114,944 edges and 743,296 patches excited at 127 MHz using a planewave 82 Figure 4.20: RCS(dB) of plane at 127 MHz from θ = [π, −π] with ϕ = 0. with a propagation axis of ˆz and polarization axis ˆx. The edge lengths range from λ/13 to λ/245, and the plane’s electrical length is 32λ. The LC-CDFIE converged in 10 outer iterations of LGMRES (60 inner iterations) to a tolerance of 1e-5. The RCS plot is shown in Fig. 4.20 and compared to the CFIE. 83 BIBLIOGRAPHY 84 BIBLIOGRAPHY [Alsnayyan and Shanker, 2022] Alsnayyan, A. and Shanker, B. (2022). Iso-geometric inte- gral equation solvers and their compression via manifold harmonics. IEEE Transactions on Antennas and Propagation, 70(8):6893–6904. [Andriulli et al., 2013] Andriulli, F. P., Cools, K., Bogaert, I., and Michielssen, E. (2013). On a well-conditioned electric field integral operator for multiply connected geometries. IEEE Transactions on Antennas and Propagation, 61(4):2077–2087. [Antoine et al., 2006] Antoine, X., Darbas, M., and Lu, Y. Y. (2006). Computer methods in applied mechanics and engineering. Arch. Elektr. Ubertrag, 195(33-36):4060–4074. [Baumann et al., 2022] Baumann, L., Aktulga, H. M., Macon, C., and Shanker, B. (2022). integral equation for electromagnetic scattering from arbitrarily Decoupled potential shaped dielectric objects. Antennas and Propagation, 168:73–86. [Boubendir and Turc, 2014] Boubendir, Y. and Turc, C. (2014). Well-conditioned boundary integral equation formulations for the solution of high-frequency electromagnetic scattering problems. Computers and mathematics with applications, 67(10):1772–1805. [Chen et al., 2022] Chen, R., Ulku, H. A., Andriulli, F. P., and Bagci, H. (2022). On the low- frequency behavior of vector potential integral equations for perfect electrically conducting scatterers. IEEE Transactions on Antennas and Propagation, 70(12):12411–12416. [Cheng et al., 2006] Cheng, H., Crutchfield, W., Gimbutas, Z., Greengard, L., Ethridge, J., Huang, J., Rokhlin, V., Yarvin, N., and Zhao, J. (2006). A wideband fast multipole method for the helmholtz equation in three dimensions. Journal of Computational Physics, 216:300–325. [Coifman et al., 1993] Coifman, R., Rokhlin, V., and Wandzura, S. (1993). The fast mul- IEEE Antennas and tipole method for the wave equation: a pedestrian prescription. Propagation Magazine, 35(3):7–12. [Cools et al., 2009a] Cools, K., Andriulli, F., Olyslager, F., and Michielssen, E. (2009a). Nullspaces of mfie and calder´on preconditioned efie operators applied to toroidal surfaces. IEEE Trans. Antennas and Propagation, 57(10):3205–3215. [Cools et al., 2009b] Cools, K., Andriulli, F. P., Olyslager, F., and Michielssen, E. (2009b). Improving the mfie’s accuracy by using a mixed discretization. IEEE Antennas and Prop- agation Society International Symposium, pages 1–4. [Cools et al., 2009c] Cools, K., Andriulli, F. P., Olyslager, F., and Michielssen, E. (2009c). Time domain calder´on identities and their application to the integral equation analysis of scattering by pec objects part i: Preconditioning. IEEE Transactions on Antennas and Propagation, 57(8):2352–2364. 85 [Epton and Dembart, 1995] Epton, M. and Dembart, B. (1995). Multipole translation the- ory for three-dimensional laplace and helmholtz equations. SIAM Journal on Scientific Computing, 16:865–897. [Ergul and Gurel, 2008] Ergul, O. and Gurel, L. (2008). Efficient parallelization of the mul- tilevel fast multipole algorithm for the solution of large-scale scattering problems. IEEE Transactions on Antennas and Propagation, 56(8):2335–2345. [Eris et al., 2022] Eris, O., Karaova, G., and Erg¨ul, O. (2022). A novel combined potential- field formulation for densely discretized perfectly conducting objects. IEEE Transactions on Antennas and Propagation, 70(6):4645–4654. [Fu et al., 2017] Fu, X., Li, J., Jiang, L. J., and Shanker, B. (2017). Generalized debye sources-based efie solver on subdivision surfaces. IEEE Transactions on Antennas and Propagation, 65(10):5376–5386. [Guo et al., 2018] Guo, H., Jun, Y., and Michielssen, E. (2018). A butterfly-based direct solver using hierarchical lu factorization for poggio-miller-chang-harrington-wu-tsai equa- tions. Microw Opt Technol Lett., 60:1381– 1387. [Harrington and Mautz, 1978] Harrington, R. and Mautz, J. (1978). H-field, e-field and com- bined field solution for conducting bodies of revolution. Arch. Elektr. Ubertrag, 32(4):157– 164. [Harrington, 1968] Harrington, R. F. (1968). Field Computation by Moment Methods. New York: Macmillan. [Hawkins et al., 2023] Hawkins, J. A., Baumann, L., Aktulga, H. M., Dault, D., and Shanker, B. (2023). Analytic preconditioners for decoupled potential integral equations and wideband analysis of scattering from pec objects. IEEE Transactions on Antennas and Propagation, 71(8):6753–6765. [He et al., 2022] He, W. J., Huang, X. W., Wang, W., Yang, M. L., and Sheng, X. Q. (2022). Solving electromagnetic scattering problems with tens of billions of unknowns using gpu accelerated massively parallel mlfma. IEEE Antennas and Wireless Propagation Letters, 70(7):5672–5682. [Hsiao and Kleinman, 1997] Hsiao, G. C. and Kleinman, R. E. (1997). Mathematical founda- tions for error estimation in numerical solutions of integral equations in electromagnetics. IEEE Transactions on Antennas and Propagation, 45(3):316–328. [Hughey et al., 2019] Hughey, S., Aktulga, H. M., Vikram, M., Lu, M., Shanker, B., and Michielssen, E. (2019). Parallel wideband mlfma for analysis of electrically large, nonuni- form, multiscale structures. IEEE Transactions on Antennas and Propagation, 67(2):1094– 1107. [Jakob-Chien and Alpert, 1997] Jakob-Chien, R. and Alpert, B. (1997). A fast spherical filter with uniform resolution. Journal of Computational Physics, 136:580–584. 86 [Jin, 2011] Jin, J. M. (2011). Theory and computation of electromagnetic fields. John Wiley Sons. [Kress, 1981] Kress, R. (1981). A singular perturbation problem for linear operators with an application to the limiting behavior of stationary electromagnetic wave fields for small frequencies. Meth. Verf. Math. Phys., pages 5–30. [Li et al., 2019] Li, J., Fu, X., and Shanker, B. (2019). Decoupled potential integral equa- tions for electromagnetic scattering from dielectric objects. IEEE Transactions on Anten- nas and Propagation, 67(3):1729–1739. [Lingg et al., 2022] Lingg, M. P., Hughey, S. M., Shanker, B., and Aktulga, H. M. (2022). High performance evaluation of helmholtz potentials using the multi-level fast multipole algorithm. IEEE Transactions on Parallel and Distributed Systems, 33(12):3651–3666. [Lu and Michielssen, 2004] Lu, M. and Michielssen, E. (2004). A local filtering scheme for fmm/pwtd algorithms. IEEE Antennas and Propagation Society Symposium, 4:4523–4526. [Michiels et al., ] Michiels, B., Fostier, J., Bogaert, I., and De Zutter, D. Full-wave simula- tions of electromagnetic scattering problems with billions of unknowns. IEEE Transactions on Antennas and Propagation, 63(2):796–799. [N´ed´elec, 2001] N´ed´elec, J. C. (2001). Acoustic and Electromagnetic Equations. Springer- Verlag. [Qian and Chew, 2008] Qian, Z. G. and Chew, W. C. (2008). A quantitative study on the low frequency breakdown of efie. Microwave and Optical Technology Letters, 50(5):1159–1162. [Rao et al., 1982] Rao, S., Wilton, D., and Glisson, A. (1982). Electromagnetic scattering by surfaces of arbitrary shape. IEEE Transactions on Antennas and Propagation, 30(3):409– 418. [Rokhlin, 1993] Rokhlin, V. (1993). Diagonal forms of translation operators for the helmholtz equation in three dimensions, applied and computational harmonic analysis. Applied and Computational Harmonic Analysis, 1:82–93. [Roth and Chew, 2021] Roth, T. E. and Chew, W. C. (2021). Potential-based time domain integral equations free from interior resonances. IEEE Journal on Multiscale and Multi- physics Computational Techniques, 6:81–91. [Rudin, 1991] Rudin, W. (1991). Functional Analysis. McGraw-Hill. [Sarvas, 2003] Sarvas, J. (2003). Performing interpolation and anterpolation entirely by fast fourier transform in the 3-d multilevel fast multipole algorithm. SIAM Journal on Numerical Analysis, 41:2180–2196. [Sharma and Triverio, 2022] Sharma, S. and Triverio, P. (2022). Electromagnetic modeling of lossy interconnects from dc to high frequencies with a potential-based boundary element formulation. IEEE Transactions on Microwave Theory and Techniques, 70(8):3847–3861. 87 [Song and Chew, 1995] Song, J. M. and Chew, W. C. (1995). Multilevel fast-multipole algo- rithm for solving combined field integral equations of electromagnetic scattering. Microw. Opt. Technol. Lett., 10:14–19. [Sundar et al., 2008] Sundar, H., Sampath, S., and Biros, G. (2008). Bottom-up construction and 2: 1 balance refinement of linear octrees in parallel. SIAM Journal on Scientific Computing, 30:2675–2708. [Taskinen and Yl¨a-Oijala, 2006] Taskinen, M. and Yl¨a-Oijala, P. (2006). Current and charge integral equation formulation. Applied Computational Electromagnetics Society Journal, 54(1):58–67. [Valdes et al., 2011] Valdes, F., Andriulli, F., Bagci, H., and Michelssen, E. (2011). A calder´on-preconditioned single source combined field integral equation for analyz- ing scattering from homogeneous penetrable objects. IEEE Trans. Antennas Propag., 59(26):2315–2328. [Vico et al., 2016] Vico, F., Ferrando, M., Greengard, L., and Gimbutas, Z. (2016). The decoupled potential integral equation for time-harmonic electromagnetic scattering. Com- munications on Pure and Applied Mathematics, 69(4):771–812. [Vikram et al., 2009] Vikram, M., Huang, H., Shanker, B., and Van, T. (2009). A novel wideband fmm for fast integral equation solution of multiscale problems in electromag- netics. IEEE Transactions on Antennas and Propagation, 57(7):2094–2104. [Wilton and Glisson, 1981] Wilton, D. R. and Glisson, A. W. (1981). On improving the electric field integral equation at low frequencies. Proc. URSI Radio Sci. Meet. Digest. [Wu et al., 1995] Wu, W.-L., Glisson, A. W., and Kajfez, D. (1995). A study of two numer- ical solution procedures for the electric field integral equation at low frequency. Applied Computational Electromagnetics Society Journal, 10(3):69–80. [Yan et al., 2010] Yan, S., Jin, J., and Nie, Z. (2010). Efie analysis of low-frequency problems with loop-star decomposition and calder´on multiplicative preconditioner. IEEE Transac- tions on Antennas and Propagation, 58(3):857–867. [Yan et al., 2011] Yan, S., Jin, J. M., and Nie, Z. (2011). Improving the accuracy of the second-kind fredholm integral equations by using the buffa-christiansen functions. IEEE Transactions on Antennas and Propagation, 59:1299–1310. [Yang et al., 2019] Yang, M. L., Wu, B. Y., Gao, H. W., and Sheng, X. Q. (2019). A ternary parallelization approach of mlfma for solving electromagnetic scattering problems with over 10 billion unknowns. IEEE Transactions on Antennas and Propagation, 67(11):6965– 6978. [Zhang et al., 2003] Zhang, Y., Cui, T. J., Chew, W. C., and Zhao, J.-S. (2003). Magnetic field integral equation at very low frequencies. IEEE Trans. Antennas and Propagation, 51(8):1864–1871. 88 [Zhao and Chew, 2000] Zhao, J.-S. and Chew, W. C. (2000). Integral equation solution of maxwell’s equations from zero frequency to microwave frequencies. IEEE Trans. Antennas and Propagation, 48(10):1635–1645. 89 APPENDIX 90 SCALABILITY ANALYSIS OF MLFMM ACCELERATED CFIE Accurately computing the currents induced by an incident field upon the surface of an elec- trically large and PEC object using full-wave boundary element method (BEM) formulations like the combined field integral equation (CFIE) has been an ongoing problem in the com- putational electromagnetics (CEM) community. There are several challenges for the fast and accurate computation of such real-world physics with the CFIE: first, when the CFIE is discretized using the Method-of-Moments (MOM) [Harrington, 1968][Jin, 2011], the forward matrix and number of unknowns is very large for electrically large problems; two, electri- cally large, real-world problems often have multi-scale physics such as an antenna embedded in a larger object; three, the CFIE is ill-conditioned for low-frequency and high-frequency excitations. Focusing upon the fast computation of problems with large CFIE forward matrices, direct methods have been developed with CPU complexity O(N 1.5 log N ) and memory complexity O(N log2 N ) [Guo et al., 2018]. Iterative methods have been developed as well because the matvec may be accelerated with the multi-level fast multipole method (MLFMM) to a CPU and memory complexity of O(N log(N )) [Song and Chew, 1995]. For electrically large prob- lems where the number of unknowns is very large, iterative methods are significantly more efficient than direct methods and therefore most publications on electrically large problems include an iterative approach [Hughey et al., 2019][Ergul and Gurel, 2008][Michiels et al., ][Yang et al., 2019][Vikram et al., 2009][He et al., 2022]. There are two main instances of the MLFMM accelerated CFIE. The first approach uses local interpolants to manage the upsampling and downsampling required of the full-wave MLFMM algorithm. Some examples of this approach are [Ergul and Gurel, 2008] and [Yang et al., 2019]. The anterpolation step of this approach introduces error due to aliasing from the imperfect filtering of local interpolants. Bandlimited local interpolants like approximate spherical prolates (APS) functions can demonstrably control the anterpolation error [Lu and Michielssen, 2004], but the downside is the significant oversampling cost [Hughey et al., 91 2019]. The second approach uses spherical harmonics or global interpolants to manage the interpolation and anterpolation of MLFMM. An example is found in [Hughey et al., 2019]. This approach has demonstrable, analytically provable, and monotonic error-control through a single oversampling parameter determing the number of harmonics. The filtering is exact, and the error is due to the number of harmonics not spanning the entire function space [Rokhlin, 1993] rather than an error propagation due to aliasing. Also, the sampling is optimal. This analysis proceeds with the global interpolant MLFMM to accelerate the CFIE be- cause the method is exact, error-controlable, and the sampling is optimal. However, the computation of electrically large geometries cannot be stored in a single node of a computer architecture. This necessitates the use of a multi-core processor and MPI library to manage communications between many nodes. Several difficulties then follow concerning scalabil- ity such as parallel efficiency and memory usage. This appendix analyzes the scalability of a hybrid parallalization of the CFIE accelerated with an error-controllable MLFMM (viz., global/harmonic interpolants) with tasks using Intel OpenMP and MPI libraries. MLFMM Accelerated CFIE Using the CFIE, the scattered fields may be obtained from the unknown surface current density J (r) where r ∈ ∂D αˆn × ˆn × Ei(r) + (1 − α)ˆn × Hi(r) = −αL{J}(r) + (1 − α)K{J}(r) where 0 < α < 1 and L{J}(r) = ˆn × ˆn × (cid:90) S (cid:90) G (cid:0)r, r′(cid:1) · J(r′) dS′ G (cid:0)r, r′(cid:1) · J(r′) dS′ S ∇∇ κ2 (cid:21) g (cid:0)r, r′(cid:1) 1 jkη (cid:20) K{J}(r) = ˆn × G (cid:0)r, r′(cid:1) = −jκη I + 92 (23) (24) and the Green’s function is g (cid:0)r, r′(cid:1) = e−jκ|r−r′| 4π|r − r′| . (25) These equations are discretized with MOM using RWG testing and basis functions defined on a triangular mesh such that the surface current on ∂D is where and J (r) = Ns (cid:88) n Infn (r) fn (r) =    ln 2A± n ρ± n r ∈ T ± n 0 else ρ± (r) =    ±(r − p± n ) r ∈ T ± n 0 else (26) (27) (28) and T ± n denotes the two triangular patches associated with every edge, ln denotes the length of every edge, and p± n denotes the coordinates of the nodes opposite each edge for each patch sharing an edge. The resulting, discretized system is where and ZI = R R = (R0 . . . RN )T I = (I0 . . . IN )T       Z = Z(0,0) ... . . . Z(0,N ) . . . ... Z(N,0) . . . Z(N,N )       Rm = ⟨fm, αˆn × ˆn × Ei(r) + (1 − α)ˆn × Hi(r)⟩ Zm,n = ⟨fm, −αL{fn}(r) + (1 − α) K{fn}(r)⟩. 93 (29) (30a) (30b) (30c) (31a) (31b) The above system is usually iteratively solved with LGRMES or GMRES. Furthermore, (29) may be partitioned into nearfield and farfield matrices ZN F and ZF F matrices where ZF F is accelerated with MLFMM. Equation (25) may be rewritten as [Rokhlin, 1993][Coifman et al., 1993], g(X + d) = − jκ (4π)2 (cid:90) S2 e−jκ(ds+do)T (κ, X)d2ˆκ ∞ (cid:88) T (κ, X) = (−j)n(2n + 1)h (2) n (kX)Pn(ˆκ · ˆX) n=0 (32a) (32b) where (r − r′) = do + ds + X as shown in Fig. A.21. Substituting (32a) into Zm,n for α = 1 yields κ2η (4π2) (cid:90) S dS ˆn × fm · d2ˆκ dS′ (cid:90) (cid:90) S′ Sˆκ (cid:21) (cid:20) I + ˆn × ∇∇ κ2 e−jκdsT (κ, X)e−jκdo · f (33) ′ n. The above integration (where d2ˆκ corresponds to the unit sphere in momentum-space called the Ewald sphere) may be computed with a Gauss-Legendre quadrature rule depending on |ds| and |do| as discussed in [Hughey et al., 2019][Cheng et al., 2006][Epton and Dembart, 1995] provided |X| is greater than |do| + |ds|. Furthermore, different collections of sources in the set of sources (f0 . . . fN ) may share the same X or do, and the divide-and-conqueror algorithm exploits this fact to accelerate the computation of (33). This is effectively the MLFMM algorithm. As a demonstration, in Fig. A.21, the blue points denote sources corresponding to the terminus of r′ and red points denote observers corresponding to the terminus of r. Clearly, the spatial convolutions in (23) require every source to interact with every observer which corresponds to 16 computations in Fig. A.21. However, considering (33), e−jκds may be used to shift the five groups of blue points to five centers of a circumscribing sphere provided the Ewald sphere is sampled with 2 ∗ Nh + 1 samples of ϕ and Nh + 1 samples of θ at the Legendre polynomial zeros of order 2 ∗ Nh + 1 and Nh + 1, respectively. The number of 94 Figure A.21: Diagram of the MLFMM scheme. harmonics is determined by the planewave expansion theorem where a planewave function on a sphere may be expanded in terms of spherical bessel functions of the first kind which evaluates to zero for orders greater than twice the special function argument of κr where r is radius of the sphere. Therefore, Nh = ⌈χκD⌉ + 1 where D is the diameter of the circumscribing spheres in Fig. A.21. This step is called charge-to-multipole (C2M). Next, four of the five groups of sources in Fig. A.21 maybe be shifted once again to the center of a bigger circle which circumscribes the four groups of sources using e−jκds once again. Each ds shift requires the samples of the Ewald sphere to be interpolated to the number of ϕ and θ points determined by Nh = ⌈χκD⌉ + 1 where D is now the diameter of the larger circle. For hybrid trees with harmonic interpolants, the bottom 4 levels are intepolated/anterpolated using the spherical harmonic transform (SHT) [Jakob-Chien and Alpert, 1997] and every other level is interpolated/anterpolated with the FFT [Sarvas, 2003]. The spherical region is where the SHT is used, and the uniform region is where the FFT is used. Each shifted group is added after the interpolation. This step is called multipole-to- multipole (M2M). There are two segments from groups of sources to the group of observers representing the translation operation or multipole-to-local (M2L). These two segments correspond to 95 two different T (κ, X) operators where each X differs in origin but shares a terminus in the center of the circle circumscribing the observers. The number of samples of the translation operators also differ because the two groups of sources are circumscribed by two different circles. The translation operators and M2M samples are multiplied element-by-element, and this multiplication is a filtration in momentum-space due to the bandlimited nature of planewaves on a sphere and the non-bandlimited nature of the translation operator [Hughey et al., 2019][Coifman et al., 1993]. After M2L, the samples of the Ewald sphere corresponding to the larger group of sources are anterpolated to the number of samples corresponding to the smaller group of sources using the SHT or FFT using the same rules for M2M. The two sets of samples are then added, and the sum is shifted using e−jκdo where each do is from the center of the big circle circumscribing the two observer groups to the center of a circle circumscribing one of the observer groups in Fig. A.21. This step is called local-to-local (L2L). The final step is shifting the samples of each Ewald sphere associated with each group of observers using e−jκdo where each do has an origin in the center of the circle circumscribing each group of observers and a terminus at an observer within the group. This final step is local-to-observer (L2O). The implementation in this paper repeats the above process for each Cartesian component of the magnetic vector and the scalar potential in (23). The graph in Fig. A.21 is clearly a tree, and thus our implementation traverses the tree four times. Upon the completion of all four traversals, the samples of the Ewald sphere shifted to each observer are summed for each tree and the outer-most integral over the observer is numerically computed using a Gauss-Legendre quadrature rule once again. Notice, 16 interactions have been reduced to 9 interactions in Fig. A.21. In sum, the MLFMM algorithm for the CFIE exchanges O(N 2) for O(N log(N )). The filtration of the translation operator and optimal sampling of the functions on the Ewald spheres using spherical harmonic functions guarantees controllable error throughout the tree 96 Figure A.22: Four trees traversed for MLFMM accelerated CFIE and the assignment of nodes to processes. traversal with the χ parameter as demonstrated in [Hughey et al., 2019]. This is an important feature for guaranteeing the convergence of an iterative solver to arbitrary tolerances. Next we discuss parallelization of the MLFMM accelerated CFIE with Intel’s OpenMP and MPI libraries for FORTRAN. CFIE with Task OpenMP and MPI MLFMM We proceed by detailing a novel implementation of OpenMP and MPI parallelism for the CFIE with MLFMM at a high level and then discuss the implementation details of each MLFMM operator discussed in Section 4.4. Task+MPI CFIE with MLFMM Our algorithm uses four trees to compute the CFIE. The parallization of the traversal of these four trees includes dividing each tree into processes and threading the process by generating and asynchronously executing tasks within each process. The assignment of processes to nodes within the tree is depicted in Fig. A.22. The same node-to-process assignment is used for each tree. The dependencies of C2M, M2M, M2L, L2L, and L2O operations for traversing a tree are depicted in Fig. A.23. Our implementation creates separate nearfield and farfield parent tasks. Farfield parent tasks include C2M, M2M, M2L, L2L, and L2O operations as 97 Figure A.23: Dependencies of tree traversal operations in MLFMM. child tasks. Nearfield parent tasks include the interaction of leaf nodes with neighboring leaf nodes as child tasks after an all-to-all communication of neighboring leaf nodes on separate processes to processes owning the observer leaf node during initialization. The algorithm interleaves child tasks of the farfield and nearfield parent tasks over the four tree traversals. The cores of a process waiting for communications from other processes within the farfield parent task execute nearfield tasks during any iteration of the tree loop. We do not interleave C2M, M2M, M2L, L2L, and L20 because the Intel OpenMP library for FORTRAN does not allow a variable list of dependencies. For a 3-D manifold, a single node may depend on as may as 189 other nodes and therefore the Intel OpenMP library requires every dependency to include an explicit compiler directive. This makes the asynchronous interleaving of the tree traversal operations in FORTRAN impractical. Furthermore, we do not interleave common operators across the four trees. For example, even though the node-to-process assignment is identical over all four trees, we do not generate tasks of all M2M or any other traversal operation over all trees and asynchronously execute these tasks. Memory would become a substantial issue because all the data from sampling Ewald spheres would be stored for all four trees at the same time rather than one tree at time. This fourfold increase in allocations is prohibitively expensive for modeling electrically large geometries. 98 Algorithm 1 M2M traversal 1: !$OMP PARALLEL SHARED 2: !$OMP SINGLE 3: for tree = 1, 4 do !$OMP TASK 4: CALL C2M traversal(tree) 5: !$OMP END TASK 6: 7: 8: 9: 10: 11: 12: 13: 14: !$OMP TASK CALL M2M traversal() !$OMP END TASK !$OMP TASK CALL M2L traversal !$OMP END TASK !$OMP TASK CALL L20 traversal(tree) !$OMP END TASK !$OMP TASK CALL L2L traversal !$OMP END TASK 15: 16: 17: 18: 19: 20: 21: 22: 23: end for 24: !$OMP TASK 25: CALL nearfield 26: !$OMP END TASK 27: !$OMP END SINGLE 28: !$OMP END PARALLEL Task+MPI C2M and M2M All leaf nodes are divided among processes as depicted in Fig. A.22. The C2M operation necessarily occurs within a single process. The algorithm makes the C2M operation for each leaf node a task. Computatation within C2M is not parallelized due to the low cost of ex- panding the Green’s function into the momentum-space integral in (32a) and approximating the integral with a quadrature rule for leaf boxes as well as the need for locks to avoid write-write errors while aggregating the C2M operations for each source. After assigning leaf nodes to processes, parent nodes are distributed among processes according to the following three rules: parent nodes with child nodes belonging to the same process also belong to the same process; parent nodes with children belonging to two dif- ferent processes are denoted as plural nodes and the (ϕ, θ) samples of the parent nodes are partitioned between the two processes such that the union of the samples stored in each process equals the total samples of the parent node; parent nodes whose children are plural nodes are also plural nodes, and the samples of the parent nodes are evenly partitioned across the union of processes in the child nodes. The partitioning of a plural parent node whose children are also plural nodes is depicted in Fig. A.24. A process has at most two plural nodes per level per process. The hybridization of task OpenMP and MPI is described in the pseudocode of Algorithm 2 and expounded here. In lines 1-27 each non-plural and spherical node is interpolated, 99 Figure A.24: Even partition of node samples across processes for child and parent nodes. shifted, and aggregated to the parent node. Plural spherical nodes are duplicated across processes and handled like non-plural nodes whether uniform or spherical. The interpolation, shifting, and aggregation of these nodes is a task. The task has the input dependency of a finished child node and an output dependency of all child nodes interpolated, shifted, and aggregated in the parent node. The plural nodes within the spherical region require a reduce and scatter to complete M2M. This is done in lines 14-18. The reduce and scatter is confined to the partition of data owned by a process. This scheme of interpolating whole plural node data duplicated across processes and reducing and scattering the partition is nearly-optimal in the spherical region where the nodes are no more than 4 hops from the leaves, and the amount of data is small. A task yield within Line 16 stops the task from proceeding with the reduce and scatter until a process receives all communications from other processes. The core may execute other tasks meanwhile. In lines 28-55 of Algorithm 2, each plural node in the uniform region is interpolated, shifted, and aggregated in a process’s partition of the node as depicted in Fig. A.24. If the tree is composed of spherical and uniform regions, care must be taken transitioning between these regions. Line 10 aggregates top level spherical nodes into the parent residing in the first level of the uniform region. If there are plural nodes at this first uniform region, lines 100 else cycle end if end if end for !$OMP TASK CALL aggregate interpolated, shifted data from remote processes. !$OMP END TASK !$OMP TASK CALL serial interpolation of child node CALL shift interpolated child to parent node CALL aggregate shifted node to parent node !$OMP END TASK if child node is plural then CALL get child nodes of node for each child node do end for if node is plural and in spherical region then !$OMP TASK CALL reduce and scatter node partitions !$OMP END TASK Algorithm 2 M2M traversal 1: for each node in process do 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: end for 20: if there are spherical and uniform tree regions then for each plural node in transition level do 21: 22: 23: 24: 25: 26: 27: end if 28: for each uniform level above first level do 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: 39: 40: 41: 42: 43: 44: 45: 46: 47: 48: 49: 50: 51: 52: 53: 54: 55: 56: end for !$OMP TASK aggregate non-plural children to plural CALL node !$OMP END TASK end for if no child node is plural then for each plural child of plural node do if one child is plural then for each plural node do !$OMP TASK CALL aggregate non-plural children and both plural children !$OMP END TASK !$OMP TASK CALL aggregate non-plural and single plural to plural node children !$OMP END TASK !$OMP TASK CALL parallel FFT interpolation of plural child CALL M2M shift of plural child to parent !$OMP END TASK end for end if end if else else 101 20-27 aggregate the data from all processes sharing the node due to line 10 as a task. Lines 28-55 handle the interpolation, shifting, and aggregation of plural nodes in the uniform region. There are three cases of aggregation which are broken into three separate tasks where the task dependencies vary. In lines 37-39, the plural node has no plural children, and therefore the task input dependency is the interpolation and shifting of each non-plural child. In lines 42-45, there is one plural child, and therefore the input dependency of the task is the interpolation and shifting of the plural child and the non-plural children. Finally, the last case is the plural parent has two plural children, and the task input dependency is the interpolation and shifting of all non-plural children and the two plural children. Hybridizing task OpenMP and MPI allows the parallel interpolation and aggregation steps to avoid communications between cores sharing memory on a multi-core processor. Also, tasks reduce duplicated, temporary memory allocations for interpolating plural parent nodes as well as interlace aggregations and interpolations for plural nodes in the uniform region. Task+MPI M2L After M2M is completed, M2L commences according to Algorithm 3. M2L sends, receives, and translates the intersection of observer node data and source nodes in the farfield on local and remote processes. In M2M, the number of tasks was determined by the tree, but the number of tasks in M2L depends on the tree and the buffer size used to send messages between nodes on a multi-process architecture. Therefore, a parent task is created in line 1 to spawn and manage the child tasks. M2L continues to run until there are no more messages to send, no more messages to receive, and no more source node data to translate to the observer node. There are two options for handling the communications. The first option is communicate each source and observer pair for M2L as a separate communication. This is not practically feasible for modeling the scattering from electrically large objects because there are many such pairs. The second option is communicate all source data required by 102 Figure A.25: Diagram of M2L send communication using OpenMP and MPI. observers on another process at the same time under the constraint of a message buffer size. This is more efficient, but the communication may be very large. Our algorithm uses this second option but divides large communications into several block communications using a message buffer to send and receive data. The MPI communications in M2L are therefore optimal [Lingg et al., 2022]. More specifically, lines 3-17 create the tasks for filling and then sending a message buffer to a remote process as shown in Fig. A.25 where colors denote node data. Lines 18-39 receive and translate the source node data to the appropriate observer node as shown in Fig. A.26 and Fig. A.27. If another task is writing to the same observer, the observer node is locked and the thread yields the task and executes other tasks until the observer node is unlocked. In lines 40-55 the source nodes owned by the process are translated to the appropriate observe nodes on the process in a task. Again, the thread yields the task while another thread is writing to the observer node and executes another task until the observer node is unlocked. There are several benefits to this novel hybrid parallelization of M2L. The send, re- ceive, and local translation tasks are independent and interlaced with OpenMP. Also, the 103 Figure A.26: Diagram of M2L receive communication using OpenMP and MPI. Figure A.27: Diagram of M2L memory efficient translation using OpenMP and MPI. pre-computed translation operators are duplicated across less processes which leads to con- siderable memory savings. And finally, there are far less MPI communications between processes for the send and receive operations which is a significant problem at the upper levels of trees with many levels. Task+MPI L2L and L2O The L2L traversal is similar to the M2M traversal, and the pseudocode is shown in Algorithm 4. First, plural nodes are shifted and anterpolated in the uniform region in lines 1-36. The data from a plural parent with two children on the process is copied into two temporary memory arrays for anterpolation in lines 5-6 as a task. The data from a plural parent with 104 else end if end if end for send messages complete if message from remote process needed then if not sending message to remote process then if process must send message to remote process then if messages need sending then for each remote process do if messages need receiving then for each remote process do end if if not receiving message from remote process then !$OMP TASK call fill send buffer with message until buffer filled call send buffer to remote process !$OMP END TASK !$OMP TASK call create receive buffer for message call fill receive buffer with message call perform translation operation on message if observer node unlocked then call add translated message to observer node data Algorithm 3 M2L translation 1: !$OMP TASK 2: while (send messages, receive messages, and local translations incomplete) do 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: 39: 40: 41: 42: 43: 44: 45: 46: 47: 48: 49: 50: 51: 52: 53: 54: 55: 56: 57: end if 58: end while 59: !$OMP END TASK !$OMP TASK call perform translation operation on source node data if observer node unlocked then for each observer needing translated sources do for each source of observer not translated do end if if local translations remain then call add translated source node call to observer node data end if !$OMP END TASK end if !$OMP END TASK local translations complete !$OMP TASK YIELD !$OMP TASK YIELD receive messages complete end for end for end for end if end if else else else else 105 one child on the process is copied to a single temporary array in line 10 as a task. If the level doesn’t have all plural children, then the whole parent node is stored on the process (rather than a partition). These nodes are shifted, anterpolated, and reduced in lines 17-20 as a task. If a level has all plural nodes, only the processes’s partition of the plural node is stored on the process. Each plural parent is shifted and anterpolated as a task. If the child node is at a level without all plural nodes then the anterpolation is reduced from all processes owning the parent to the children. The task yield stops the task and allows the task to execute other tasks until the child receives all messages from remote processes. If the child node resides in an all plural node level, then the process sends and receives with point-to-point communications in lines 31-32. The task yield yields the thread to execute other tasks until all messages are sent to the children and received at the child level. Second, whole nodes are shifted, anterpolated, and stored in lines 38-50. This is a single task, and the anterpolation uses a FFT or SHT depending on the spherical region of the node. The L2O operation necessarily occurs within a single process and after L2L. The algo- rithm makes the L2O operation for each leaf node a task and computation within L2O is not parallelized due to the low computational cost. Locks are used to avoid write-write errors to each observer. Like M2M, hybridizing task OpenMP and MPI for L2L reduces communications and duplicated memory across processes and interlace anterpolations for plural nodes in the uniform region. Scalability Analysis on 375λ Plane This section presents the results of the novel implementation of OpenMP and MPI in the parallelization of the MLFMM accelerated CFIE and analyzes the scalability of the algorithm. The implementation was verified by comparing the radar cross section (RCS) of a 40λ PEC sphere discretized at λ/10 using GRMES and tolerance 1e-3 with the Mie series solution. The Task+MPI CFIE agrees with analytic result as shown in Fig. A.28. 106 if parent node has two children on process then !$OMP TASK CALL copy parent data into temporary memory for each child for anterpolation !$OMP END TASK !$OMP TASK CALL copy parent data into temporary memory for anterpolation !$OMP END TASK end for if level doesn’t have all plural nodes then !$OMP TASK CALL shift parent node data to child node CALL parallel anterpolation routine CALL reduce anterpolation to each process residing on plural child !$OMP END TASK !$OMP TASK CALL shift parent node data to child node CALL parallel anterpolation routine if child node is at level w/o all plural nodes then CALL reduce anterpolation to each process residing on plural child !$OMP TASK YIELD CALL point-to-point send anterpolated partition to other processes CALL point-to-point receive anterpolated partition from other processes !$OMP TASK YIELD else else end if end for for each plural node do for each plural node do Algorithm 4 L2L traversal 1: for each level in uniform region do 2: for each plural node do 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: end for 38: for each node in tree do 39: 40: 41: 42: 43: 44: 45: 46: 47: 48: 49: 50: end for for each child of node do if node is plural then end for end for end if end if end if cycle else else !$OMP TASK CALL shift node to child CALL anterpolate node CALL store anterpolated data in child !$OMP END TASK 107 Figure A.28: RCS of unit sphere. Figure A.29: Plane with 71 million edges. To test the OpenMP and MPI implementation, the plane geometry with 71 million edges shown in Fig. A.29, is excited at 1.5 GHz such that the geometry is discretized at λ/11. At this frequency, the electrical length of the plane is 375λ. For the MLFMM acceleration, all tree leaves are at the same level and the number of spherical level is set to 4. For the 1.5 GHz excitation of the plane, the tree is 12 levels with 4 spherical levels and 8 uniform levels. The computation uses NERSC’s Cori which has a dual-socket 16-core Haswell architecture 108 Figure A.30: Memory usage of OpenMP and MPI CFIE and pure MPI CFIE with MLFMM scaling MPI processes. with 32 cores per node. More specifically, the test uses 64, 128, 256, and 512 processes with 4, 8, and 16 threads where one thread corresponds to one Haswell core. In what follows, the results are reported in terms of the number of cores to easy the comparison between the two implementations. The Task+MPI CFIE data is simply the best result for any combination of threads and processes where threads × processes = cores for a fixed number of cores. Also, the specific load balancing scheme is described in [Hughey et al., 2019][Sundar et al., 2008] and adapted to the case of plural nodes partitioned among processes in the uniform tree region for the Task+MPI CFIE. The OpenMP and MPI implementation is more memory efficient than the pure MPI implmentation of the CFIE as is shown in Fig. A.32. The difference in maximum amount of allocated memory during the run-time is caused by OpenMP using the shared memory of the Haswell node. OpenMP eliminates the MPI communications between these cores with shared memory, reduces the number of communications between these cores and cores on other nodes, reduces the duplication of translation operators between cores within a node, and the duplication of temporary memory allocations in general. The OpenMP and MPI CFIE is also more parallel efficient than MPI CFIE. In Fig. A.35, 109 Figure A.31: Memory usage of OpenMP and MPI CFIE and pure MPI CFIE with MLFMM scaling MPI threads. Figure A.32: Memory usage of OpenMP and MPI CFIE and pure MPI CFIE with MLFMM scaling cores. 110 Figure A.33: Parallel efficiency of OpenMP and MPIE CFIE and pure MPI CFIE with MLFMM scaling MPI processes. the OpenMP implementation is more parallel efficient for all numbers of cores, but neither implementation scales for large numbers of cores. Again, OpenMP leverages shared memory to reduce the MPI communications and interleave computations, but the MPI communica- tions still exist and become a bottleneck for the M2L operation, especially between nodes at the higher tree levels, as reported for the pure MPI implementation of MLFMM for the oscillatory free-space Green’s function (25) in [Lingg et al., 2022]. The OpenMP reduces but does not resolve the M2L issue. Furthermore, the OpenMP task scheduler doesn’t op- timally prioritize alternative tasks. For example, the priority of a task is documented in the OpenMP API specification as a recommendation rather than a rule. In other words, the error-controllable MLFMM accelerated CFIE (or any other full-wave integral equation formulation accelerated with MLFMM) is not arbitrarily scalable, and the improvements of a hybridization of OpenMP and MPI are limited by the OpenMP library. The scalability limit for this instance of MLFMM is fundamental. The tree structure causes an explosion in the number of M2L MPI communications between nodes at the upper tree levels for an increasing number of processes [Lingg et al., 2022]. Therefore, for sufficiently large problems, APS local interpolants are necessary at upper-most tree levels. 111 Figure A.34: Parallel efficiency of OpenMP and MPIE CFIE and pure MPI CFIE with MLFMM scaling threads. Figure A.35: Parallel efficiency of OpenMP and MPIE CFIE and pure MPI CFIE with MLFMM scaling cores. 112 Figure A.36: Time to complete a matvec for OpenMP and MPIE CFIE and pure MPI CFIE with MLFMM scaling MPI processes. Figure A.37: Time to complete a matvec for OpenMP and MPIE CFIE and pure MPI CFIE with MLFMM scaling threads. The parallel efficiency issues manifests in matvec times for the Task+MPI CFIE and pure MPI CFIE as shown in Fig. A.38. The OpenMP and MPI implementation is faster than the pure MPI implementation, but both formulations converge to similar times for a large number of cores. The cost of one process sending and receiving MPI messages to all other processes outweighs the benefit of adding another process. We also note Task+MPI matvec times behave more regularly than MPI matvec times on NERSC Cori HPCC. 113 Figure A.38: Time to complete a matvec for OpenMP and MPIE CFIE and pure MPI CFIE with MLFMM scaling cores. Figure A.39: Load balancing of pure MPI implementation. A OpenMPI and MPI implementation also improves load balancing. Comparing Fig. A.39 and Fig. A.40, the each Task+MPI rank (or MPI process) completes at nearly the same time while the ranks of the pure MPI do not. The better balancing is caused by a thread completing other tasks using the shared memory while the core waits to receive MPI communications necessary to complete a task and, again, reduce MPI communications. 114 Figure A.40: Load balancing of OpenMP and MPI implementation. Final Remarks on MLFMM Parallelization In this paper we have analyzed the scalablity of the global interpolant MLFMM acceler- ated CFIE using the Intel OpenMP and MPI FORTRAN libraries. The parallel hybridization yields notable improvements over the pure MPI implementation in terms of load balancing, matvec timings, parallel efficiency, and especially memory usage. However, the scaling of a hybrid or pure MPI parallelization is limited at large numbers of processes due to the bottleneck of MPI communications between nodes during M2L at upper tree levels. This issue is fundamental to any global, error-controllable MLFMM used in a full-wave solver run on a multi-core architecture. The hybridization of OpenMP and MPI alleviates the issue, but the practical limitations of the Intel OpenMP library, such as the task scheduler and sub-optimal routing of communications from threads to other nodes through a root thread, mitigate the performance improvement of interleaving the algorithm. For sufficiently large problems, M2L communications necessitate APS local interpolants for sampling the Ewald spheres at the upper-most levels. 115