EFFICIENT CHARGE CONSERVING UNCONDITIONALLY STABLE FINITE ELEMENT FORMULATIONS FOR PARTICLE-IN-CELL SIMULATIONS By Omkar Harishankar Ramachandran A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Electrical Engineering – Doctor of Philosophy 2023 ABSTRACT The simulation of systems involving charged particles moving in the presence of electromagnetic fields is of great interest in a number of domains in physics, with applications including the characterization of pulsed power devices and accelerators, design of high precision etching and sterilization implements. As a result, several methods have been proposed to accurately simulate such systems. One such method is the particle-in- cell (PIC) technique, which characterizes the distribution of a plasma in phase space through a collection of statistically significant macroparticles. While contemporary implementations of electromagnetic PIC (EM-PIC) have typically relied on a finite-difference time-domain (FDTD) stencil to evaluate the fields, there has been a push for the adoption finite element methods that allow for the use of better geometry representations and more robust function spaces. In particular, recent developments in the field have focused on developing implicit finite element solvers that are free of mesh dependent stability constraints while natively conserving fundamental quantities such as charge and energy. The goals of this dissertation are to develop efficient, charge-conserving, implicit finite element particle-in-cell (EM-FEMPIC) methods. First, (i) we construct a formulation of PIC built around expontential predictor-corrector particle integrators. We demonstrate that this approach has significantly better error convergence than equivalent polynomial methods, thus allowing for accurate evaluation of particle trajectories even at the large step-sizes afforded by implicit EM solvers. Next, (ii) for devices of a narrowband tendency, we construct a novel EM-FEMPIC method based on envelope tracking. This allows us to accurately simulate the EM response of such a device while sampling at the narrow bandwidth, rather than at the highest absolute frequency of interest. Furthermore, we explore the consequences on charge-conservation for such a method and propose a rubric to ensure exact satisfaction of Gauss’ laws. We then consider (iii) the matter of energy conservation in an implicit EM-FEMPIC scheme and propose a set of guidelines that ensure the conservation of average energy over the course of a simulation. Finally, (iv) we reformulate a parameter extraction method originally proposed for efficient device-agnostic simulation of EM systems attached to lumped nonlinear devices to make it applicable to a system of moving particles. We couple this approach with a domain-decomposition framework to construct an efficient, ’particle-agnostic’ extraction framework. Taken together these contributions address several open problems in the field and extend the applicability of EM-FEMPIC methods to larger, more relevant problems. Copyright by OMKAR HARISHANKAR RAMACHANDRAN 2023 To my parents for their unconditional love and support v ACKNOWLEDGMENTS Reserach is very fundamentally a team sport and while there is only one author associated with this work, there are many people whose contributions were indispensable to its creation. I first wish to extend my gratitude to my advisor Dr. Balasubramaniam Shanker for his patience and guidance. Without the many meetings, sometimes at odd hours in the evenings debugging code and bouncing ideas, none of this work would’ve been done. I would also like to extend my gratitude to members of my guidance committee Dr. Leo Kempel, and Dr. John Luginsland, Dr. Prem Chahal and Dr. Michael Murillo for their guidance and continued support. I have had many interesting conversations with them during my time as a graduate student and almost every chapter of this work involved ideas that directly stemmed from these talks. Next, I want to sincerely thank the other graduate students in the computational electromagnetics group for their friendship and support. Unlike a lot of other groups in academia, the CEML was extremely collaborative, with students often ringing each other to get help or advise on solving research problems. First and foremost, I want to thank Dr. Zane Crawford and Dr. Scott O’Connor whose work laid the foundations of the material presented in this thesis. I had the immense fortune of joining the group when Scott and Zane were both senior students who understood computational EM far better than I did. I remember asking for documentation of our 200K+ line fortran code when I first joined the group only for Zane to laugh and say "I AM the documentation!". I would like to thank Dr. Abdel Alsnayyan for being the ultimate partner-in-crime throughout my time in grad school. We never published together, but a lot of my ideas would never have come to fruition without the many discussions I had with Abdel. Next, I want to acknowledge the other members of the finite element subgroup, Chad Moorman and Sean DePalma for the many fun discussions and meetings we’ve had over the past couple of years. Sean, in particular, got up to speed with my early port extraction results and extended it to solve complex optimization problems in his short stint as a Master’s student. Chad, being the vi sole surviver from the group’s time at MSU, is now set to inherit this mess and do great things with it. Additionally, I’d like to acknowledge the other students in my group, Jacob Hawkins, Luke Baumann and Elliot Lu for many interesting conversations and interactions over the years. I’d next like to thank the many students of the wider electromagnetics research group (EMRG) for making the work environment lively, fun and intellectually stimulating. In no particular order, I’d like to thank Dr. Stavros Vakalis, Dr. Serge Mghabhab, Anton Schlegel, Cory Hilton, Jacob Randall, Jason Merlo, Daniel Chen, Jorge Colon-Berrios, William Torres, Ahona Bhattacharrya, Niam Shandi, Dr. Saranraj Karuppuswami, Dr. Saikat Mondal, Dr. Michael Craton, and Amer Abu Arisheh. Having interests and hobbies outside research was critical to my success in grad school. To that end, I’d like to thank everyone involved with the MSU Fencing Club for more or less being the entirity of my social life, as well as the coaches and staff at the MSU tennis center for many fun weekly lessons. Finally, this section would be grossly incomplete without mention of my family, whose unconditional love and support was absolutely critical to my productivity and success. In particular, I’d like to thank my parents Dr. V. Bharathi Harishankar and Dr. Hari Ramachandran. My almost daily conversations with them has been the source of many great ideas that made this work possible. I’d also like to thank my aunt Dr. Vijaya Ramachandran and my uncle Dr. Ramesh Narayan for giving me a home in the US and making my time here comfortable and fun. vii PREFACE In reference to IEEE copyrighted material which is used with permission in this thesis, the IEEE does not endorse any of Michigan State University’s products or ser- vices. Internal or personal use of this material is permitted. If interested in reprint- ing/republishing IEEE copyrighted material for advertising or promotional purposes or for creating new collective works for resale or redistribution, please go to http: //www.ieee.org/publications_standards/publications/rights/rights_link.html to learn how to obtain a License from RightsLink. viii TABLE OF CONTENTS CHAPTER 1 INTRODUCTION AND BACKGROUND . . . . . . . . . . . . . . . . 1 CHAPTER 2 AN EXPONENTIAL PREDICTOR CORRECTOR FRAMEWORK FOR RELATIVISTIC EM-FEMPIC . . . . . . . . . . . . . . . . . . . . . 16 CHAPTER 3 AN ENVELOPE TRACKING APPROACH FOR PARTICLE IN CELL SIMULATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 CHAPTER 4 ANALYSIS AND BOUNDS FOR ENERGY CONSERVATION IN EM-FEMPIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 . CHAPTER 5 A TRANSIENT PARAMETER EXTRACTION TECHNIQUE FOR FINITE ELEMENT PARTICLE IN CELL SOLVERS . . . . . . . . . . . 85 CHAPTER 6 CONCLUSION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 ix CHAPTER 1 INTRODUCTION AND BACKGROUND The simulation of moving charged particles in an electromagnetic field distribution is of great interest in a number of engineering applications. These include fundamental plasma physics, such as the design of particle accelerators, plasma processing applications such as sterilization of medical implements and etching of high precision integrated chips [1, 2, 3], among many others. Likewise, high fidelity simulation of migrating electrons and positive holes are frequently required in the design and analysis of cutting edge semiconductor devices [4]. Due to the varied subject areas where these systems show up, a number of methods have been proposed over the last few decades to accurately and efficiently solve for the physics of these systems. These methods include fluid based methods, which involve directly solving the kinetic equations to obtain phase space distributions of the charged species [5, 6], and particle based methods, wherein a finite number of samples of the distribution function are used in conjunction with an appropriate electromagnetic solver to characterize the physics of the system [7]. While many interesting problems and applications exist with fluid based solvers, we restrict our analysis in this work purely to particle based methods and their application to electromagnetic problems. For the remainder of this chapter, we will introduce and describe the general setup of the problems considered in the remainder of the thesis (Section 1.1) and a brief overview of the many implementations of the Particle-in-Cell method (Sections 1.2 and 1.3), before providing an overview of the current state of the art in electromagnetic finite element particle in cell (EM-FEMPIC) in the subsequent sections. Finally, we will conclude this chapter with a brief overview of the chapters to follow. 1 Figure 1.1: A general schematic of the simulation domain. 1.1 Problem Statement Consider a region Ω ∈ R3 bounded by a surface 𝜕Ω containing a single charged species. This region is subjected to an external field due to which the charged species accelerate, and in turn, produce spatially and temporally varying electric and magnetic fields denoted by E(r, 𝑡) and B(r, 𝑡), respectively, with r ∈ Ω and 𝑡 ∈ [0, ∞). The dynamics of the particles in phase space can be represented by a Phase Space Distribution Function (PSDF) 𝑓 (𝑡, r, v) that follows the Vlasov equation: 𝜕𝑡 𝑓 (𝑡, r, v) + v · ∇ 𝑓 (𝑡, r, v)+ 𝑞 𝑚 [E(r, 𝑡) + v × B(r, 𝑡)] · ∇𝑣 𝑓 (𝑡, r, v) =0. (1.1) In what follows, we assume that the background media in Ω is free space. As a result, we denote the permittivity and permeability of free space by 𝜖0 and 𝜇0, respectively, and the speed of light by 𝑐. Finally, we will also assume that the system is quiescent for 𝑡 ≤ 0. 1.2 The Particle-in-Cell (PIC) Technique The hallmark of PIC techniques to solve for systems of moving charges is that the simulator elects to not solve (1.1) directly, but rather, represent the charge and current distributions in Ω using a finite number of samples – or moments – of the PSDF through 𝜌(r, 𝑡) = ∫ 𝑞v 𝑓 (𝑡, r, v)𝑑v. These samples usually manifest as 𝑞 𝑓 (𝑡, r, v)𝑑v and J(r, 𝑡) = ∫ Ω Ω 2 Particle Push Map Fields to Particles Map particle in current and charge Update Maxwell’s Equations Figure 1.2: The simulation cycle of a standard electromagnetic PIC scheme. using 𝑁𝑝 macroparticles – with charges and mass corresponding to a clump of multiple particles. One can evolve the positions and velocity of these particles together with Maxwell’s equations through the imposition of a spatial shape function 𝑆(r), 𝜌(r, 𝑡) = 𝑁𝑝 (cid:213) 𝑝=1 𝑞𝑝𝑆(r − r𝑝(𝑡)), J(r, 𝑡) = 𝑁𝑝 (cid:213) 𝑝=1 𝑞𝑝v𝑝(𝑡)𝑆(r − r𝑝(𝑡)). (1.2a) (1.2b) where r𝑝(𝑡) and v𝑝(𝑡) refer to the positions and velocities as functions of time of the 𝑝th macroparticle. The evolution of the fields E(r, 𝑡) and B(r, 𝑡) over space and time within Ω follow Maxwell’s equations, given by ∇ × E(r, 𝑡) = −𝜕𝑡B(r, 𝑡), ∇ × 𝜇−1 0 B(r, 𝑡) = J𝑖(r, 𝑡) + J(r, 𝑡) + 𝜖0𝜕𝑡E(r, 𝑡), (1.3a) (1.3b) Here J𝑖(r, 𝑡) describes impressed currents within Ω. Furthermore, the solutions to the curl equations in (1.3) also need to satisfy Gauss’ Laws, ∇ · 𝜖0E(r, 𝑡) = 𝜌𝑖(r, 𝑡) + 𝜌(r, 𝑡), ∇ · B(r, 𝑡) = 0. (1.4a) (1.4b) wherein 𝜌𝑖(r, 𝑡) are impressed charges. In what follows, we will assume that both the impressed current and the corresponding charge densities are zero. If they are not, it is trivial to include then in the analysis framework rubric described in the next section. 3 As usual, boundary conditions need to be imposed on E(r, 𝑡) and B(r, 𝑡) on sections of the outer boundary 𝜕Ω, to ensure unique solutions. These are assumed to either Dirichlet, Neumann or impedance boundary conditions on non-overlapping surfaces 𝜕Ω𝐷, 𝜕Ω𝑁 and 𝜕Ω𝐼, with 𝜕Ω = 𝜕Ω𝐷 + 𝜕Ω𝑁 + 𝜕Ω𝐼, and are defined as follows: ˆ𝑛 × E(r, 𝑡) = Ψ𝐷(r, 𝑡) on Ω𝐷 , ˆ𝑛 × 𝜇−1 B(r, 𝑡) = Ψ𝑁 (r, 𝑡) on Ω𝑁 , ˆ𝑛 × 𝜇−1 B(r, 𝑡) − 𝑌 ˆ𝑛 × ˆ𝑛 × E(r, 𝑡) = Ψ𝐼(r, 𝑡) on Ω𝐼 . (1.5a) (1.5b) (1.5c) The evolution of the macroparticles in space and time is determined by solving for the relativistic equations of motion with the acceleration determined by the Lorentz force. This yields the following coupled system of equations for ordinary differential equations (ODEs) for v𝑝(𝑡) and r𝑝(𝑡): 𝑑𝛾𝑝v𝑝(𝑡) 𝑑𝑡 = 𝑞𝑝 𝑚𝑝 (cid:2)E(r𝑝(𝑡), 𝑡) + v𝑝(𝑡) × B(r𝑝(𝑡), 𝑡)(cid:3) 𝑑r𝑝(𝑡) 𝑑𝑡 = v𝑝(𝑡) (1.6a) (1.6b) 1.3 Contemporary EM-PIC Solution Methods In EM-PIC, the solution cycle typically consists of two parts: (1) a full-wave electromag- netic field solver to update fields within the simulation domain as a function of space and time used in conjunction with (2) a Newton solver that evolves the particle trajectories in response to these fields. These two steps need to be done self-consistently in order to simulate the physics of the moving charges. Due to the simplicity of this scheme as well as its ability to generate accurate solutions, many different combinations of field and particle solvers have been developed. The remainder of this section will provide a brief overview of these methods. 4 Finite Difference Time Domain The oldest and by far the most well analyzed EM-PIC method involves using a finite difference time domain (FD-TD) stencil to update the fields along with a modified leapfrog stencil developed by Boris [8] to evolve the particles. The simplest FD-TD methods work primarily by breaking up the simulation domain into a Yee grid [9] and placing electric and magnetic field quantities on the primal and dual meshes respectively. The history of FD-TD based PIC methods are vast and well studied; we refer the reader to these reviews – and the references therein – for a deeper explanation of the method [7, 10, 11]. One of the fundamental bottlenecks with using an FDTD based method is the relative difficulty involved with representing curved surfaces, leading to well known ‘staircasing’ errors, as documented in [12, 13, 14, 15, 16]. To mitigate this, several ‘conformal’ methods have been proposed and analysed. While the earliest implementations were limited to straight edged domains [17, 18, 19, 20], these were quickly extended to curved/curvilinear boundaries [21, 22, 23] with several updates proposed to resolve problems with stability [24] and better incorporation of dielectric near a curved interface [25, 26]. Currently, popular implementations of conformal FDTD include methods that modify boundary cell with an appropriate polygon and enforce the correct boundary condition on the fields. The best exponent of this is the Dey-Mittra scheme [27, 28, 29, 30, 31, 32], with subsequent contributions resolving issues with the maximum allowable step-size [33, 34, 35, 36], allowing the method to be used without a prohibitively small stability constraint. Likewise, irregular interfaces separating dielectrics have also been analysed using various weighting tricks and constraints [25]. We note at this point that this is only a small sampling of the methods that exist to deal with applying FDTD to systems with curved interfaces. We refer the reader to this review for a deeper explanation [37]. 5 Finite Volume Methods Similar to FDTD, Finite Volume Time Domain (FVTD) methods have also been used extensively to solve general electromagnetic problems, with the earliest implementations as in [38, 39, 40]. Improvements in higher order representation and extension to higher order grids [41]; and stencils that exactly preserve charge [42, 43] and energy [44] were achieved in recent years. FVTD based solvers have been implemented in conjunction with particle pushers in a PIC scheme, notably including charge conserving schemes with higher order particle evolution in time [45, 46], drift diffusion based methods for simulating glow discharges [47], along with a number of updates for charge correction [48, 49, 50] and modelling stochastic collisions [51, 52, 53]. Discontinuous Galerkin Methods In a similar manner, there exists a significant body of work on using discontinuous Galerkin time domain methods (DGTD) for PIC [54, 55, 56, 57]. DGTD-PIC methods have been used with success to simulate a number of particle systems [58, 59, 60] in addition to Vlasov-Poisson [61, 62, 63] and Vlasov-Ampere systems [64, 65]. 1.4 Electromagnetic Finite Element PIC (EM-FEMPIC) While the aforementioned methods and have been widely used, we note that that over the past three decades, research has allowed finite element methods in electromagnetics to mature [66]. Today, it is at a state where it has become the de-facto modeling algorithm used by commercial software companies. With this background, it follows that FEM could potentially be a robust tool for PIC, and examination of the bottlenecks and methods developed to overcome them constitutes the rest of the chapter. The viability of the finite element field solver was proven through a series of seminal papers [67, 68, 69, 70, 71] demonstrated both the constraints on charge mapping, as well as basis function representation that enabled an FEM solver to be integrated with a PIC scheme without 6 breaking important conservation properties. These results in totality are often denoted as structure-preserving methods. These results have been augmented with the development of symplectic formulations methods to better conserve charge and energy [72, 73]. By and large, the methods developed rely on explicit field and particle updates, as these naturally conserving charge. Unfortunately, explicit methods are conditionally stable and the stability criterion is related to the finest feature in the mesh. In other words, the time step size that can be used is governed by the smallest mesh element. This implies that analysis of geometrically complex systems requires significant computational resources. Indeed, this feature is shared by all other methods discussed thus far. An obvious remedy is an implicit scheme that is unconditionally stable. Such schemes for FEM are well known [74]. The main question is, how can one adapt such methods to satisfy the conservation of quantities necessary for a PIC scheme? This question was partially addressed in a collection of recent papers [75, 76, 77], wherein a collection of fundamental rules were prescribed. These need to be satisfied by the EM solver and particle evolution scheme for charge to be innately conserved without needing expensive post-processing measures like divergence cleaning. The framework established through the aforementioned papers have been expanded further to more efficiently solve systems with narrowband field responses by implementing a time-marching scheme built around envelope tracking [78], higher order basis sets [79], domain decomposition to glean efficiency [80], and relativistic motion [81]. Fundamentally, [75, 78] identified the failure to conserve charge as being the result of inconsistent choices of representation and testing functions (in space and time) in the discretization of Ampere’s and Gauss’ Laws. This understanding led to the proposal of a slight modification to the current term in Ampere’s Law, which was shown to ensure charge conservation to machine precision [76]. But implicit time stepping schemes come with their own challenge; they admit null spaces that correspond to a DC field (in the case of Maxwell solvers) or a time growing DC field (in case of wave equation field solvers). The existence of these is clearly evident in [77]. Overcoming this bottleneck involved a 7 fundamental change in spatial representation, and through the use of a quasi-Helmholtz decomposition[77]. Matrix definitions involved in EM-FEMPIC discretization As a precursor to the description of the EM-FEMPIC scheme, we first detail a number of commonly used matrix definitions. This section will be referred to when the appropriate quantities are used. To begin, the sets 𝒩, ℰ, ℱ and 𝒯 are defined as the set of nodes, edges, faces and tets respectively having 𝑁𝑛, 𝑁𝑒, 𝑁 𝑓 and 𝑁𝑡 elements. The various submatrices used are as follows: [★𝜖]𝑖,𝑗 = ⟨W (1) 𝑖 (r), 𝜀 · W (1) 𝑗 (r)⟩; 𝑖, 𝑗 ∈ ℰ [★𝜇−1]𝑖,𝑗 = ⟨W (2) 𝑖 (r), 𝜇−1 · W (2) 𝑗 (r)⟩; 𝑖, 𝑗 ∈ ℱ [★𝜌]𝑖,𝑗 = ⟨𝑊 (3) 𝑖 (r), 𝑊 (3) 𝑗 (r)⟩; 𝑖, 𝑗 ∈ 𝒯 (1.7a) (1.7b) (1.7c) where W (1) 𝑖 , W (2) 𝑖 and 𝑊 (3) 𝑖 are the Whitney edge, face and volume basis functions respec- tively. Further, we define the following matrices: [M𝑔]𝑖,𝑗 = ⟨W (1) 𝑖 (r), ∇𝑊 (0) 𝑗 (r)⟩; 𝑖 ∈ ℰ, 𝑗 ∈ 𝒩 [M𝑐]𝑖,𝑗 = ⟨W (2) 𝑖 (r), [∇×]W (1) 𝑗 (r)⟩; 𝑖 ∈ ℱ , 𝑗 ∈ ℰ [M𝑑]𝑖,𝑗 = ⟨𝑊 (3) 𝑖 (r), [∇·]W (2) 𝑗 (r)⟩; 𝑖 ∈ 𝒯 , 𝑗 ∈ ℱ [∇] = 𝜀[★𝜖]−1[M𝑔] [∇×] = 𝜇−1[★𝜇−1]−1[M𝑐] [∇·] = [★𝜌]−1[M𝑑] Likewise the submatrices involved in (3.44) are as follows: [Z]11 = [C 𝑐 ]𝑇[P]Λ 𝑏 𝑏 [C 𝑏 𝑐 ] [Z]12 = [C 𝑐 ]𝑇[∇×][★𝜀]−1[P]Λ 𝑏 𝑒 [★𝜀][C 𝑒 𝑐] 8 (1.8a) (1.8b) (1.8c) (1.8d) (1.8e) (1.8f) (1.9a) (1.9b) 0 - form 1 - form 2 - form 3 - form Primal Dual 𝜙 𝜌 ∇ E ∇× B ∇· [★𝜖] [★𝜇] ∇· D H ∇× ∇ 0 0 3 - form 2 - form 1 - form 0 - form Figure 1.3: Overview of the de-Rham complex and the discrete operations needed to map functions from one space to another. [Z]13 = [C 𝑐 ]𝑇[∇×][★𝜀]−1Σ[C 𝑏 𝑒 𝑧] [Z]21 = [C 𝑒 𝑐]𝑇[P]Λ 𝑒 [★𝜀][C 𝑒 𝑐] [Z]22 = [C 𝑐]𝑇[∇×]𝑇[★𝜇−1][P]Λ 𝑒 𝑏 [C 𝑏 𝑐 ] [Z]23 = [C 𝑐]𝑇Σ[C 𝑒 𝑒 𝑧] (1.9c) (1.9d) (1.9e) (1.9f) where the [𝐶]𝑏 matrices are mappings that identify unknowns that reside on the cotree. Constructing this mapping is trivial for simply connected structures, but is trickier for multiply connected geometries. Discretization in Space FEM discretization of Eq. (1.4) requires choosing appropriate spatial basis functions that respect the unique continuity conditions demanded by Maxwell’s equations: Namely, the tangential continuity of E(r, 𝑡) and the normal continuity of B(r, 𝑡). Traditionally, these conditions have been met by choosing Whitney edge and face basis functions that live on and appropriately transform following the de-Rham complex (depicted in Fig. 1.3). Assuming a tetrahedral discretization of Ω with 𝑁𝑒 and 𝑁 𝑓 faces, E(r, 𝑡) and B(r, 𝑡) are 9 interpolated in space as E(r, 𝑡) = B(r, 𝑡) = 𝑁𝑒(cid:213) 𝑖=1 𝑁 𝑓 (cid:213) 𝑖=1 1 𝑖 (r), 𝑒𝑖(𝑡)W 2 𝑖 (r). 𝑏𝑖(𝑡)W (1.10) 1 Here, W 𝑖 (r) ∈ 𝐻(𝑐𝑢𝑟𝑙; Ω) and W 2 𝑡 (r) ∈ 𝐻(𝑑𝑖𝑣; Ω) represent the Whitney edge function defined on the 𝑖th edge and Whitney face function defined on the 𝑡th face, respectively. The function spaces 𝐻(𝑐𝑢𝑟𝑙; Ω) and 𝐻(𝑑𝑖𝑣; Ω) are defined as 𝐻(𝑐𝑢𝑟𝑙; Ω) = {u ∈ L 2(Ω); ∇ × u ∈ L 2(Ω)}, 𝐻(𝑑𝑖𝑣; Ω) = {u ∈ L 2(Ω); ∇ · u ∈ L 2(Ω)}, (1.11) where L2(Ω) refers to the space of square integrable functions on Ω. Further details on mixed finite elements can be found in [82, 83, 84, 85, 86] and in the references therein. One can obtain a discrete system of equations by Galerkin testing, resulting in the following matrix ODE to solve for the vector of field coefficients ¯𝐵(𝑡) and ¯𝐸(𝑡) at a given instance of time: (cid:105) (cid:104) ¯¯𝑆 · (cid:105) (cid:104) ¯¯𝑀 · + ¯𝐵(𝑡)       ¯𝐸(𝑡)       𝜕𝑡 𝜕𝑡       ¯¯𝐹 = ¯𝐵(𝑡)       ¯𝐸(𝑡) where the various matrix definitions are as follows: ,  [∇×]      0 , 0     [★𝜖]   (cid:104) ¯¯𝑆 (cid:104) ¯¯𝑀 (cid:105) (cid:105) 0 =     −[∇×]𝑇    [★𝜇−1]      ¯¯𝐹 = − = 0 0 .             ¯𝐽(𝑡) 𝜖0 (1.12) (1.13) Furthermore, ¯𝐵(𝑡) = (cid:2)𝑏1(𝑡), · · · , 𝑏𝑁 𝑓 where 𝑗𝑗(𝑡) = , ˜J(r𝑠 , 𝑡) W (cid:68) (cid:69) (1) 𝑗 (𝑡)(cid:3) 𝑇, ¯𝐸(𝑡) = [𝑒1(𝑡), · · · , 𝑒𝑁𝑒 (𝑡)]𝑇 and ¯𝐽(𝑡) = [𝑗1(𝑡), · · · , 𝑗𝑁𝑒 (𝑡)]𝑇 . One can likewise trivially formulate a wave-equation solver 10 for just the electric or magnetic fields, as done in [76]. The matrices in (1.13) are defined in 1.4. Charge conservation and Temporal Discretization To convert (1.12) into a discrete update stencil in time, one has to choose representation and testing functions in time as done in the spatial setup. In general, the vector of coefficients ¯𝐵(𝑡) and ¯𝐸(𝑡) can be interpolated by a set of temporal basis functions 𝑁𝑖(𝑡), such that at any given time 𝑡, ¯𝐵(𝑡) ¯𝐸(𝑡) (cid:169) (cid:173) (cid:171) = (cid:170) (cid:174) (cid:172) 𝑁𝑡(cid:213) 𝑘=0 ¯𝐵(𝑡𝑘) ¯𝐸(𝑡𝑘) (cid:170) (cid:174) (cid:172) 𝑁𝑘(𝑡) (cid:169) (cid:173) (cid:171) (1.14a) and tested by an appropriate function 𝑊𝑛(𝑡) to obtain a marching scheme. Both 𝑁𝑖(𝑡) and 𝑊𝑛(𝑡) have local support and are determined by the order of these functions. While the traditional approach to PIC largely relies on explicit updates, i.e, where the testing scheme for (1.14) is chosen such that the fields at the next timestep depend only on information from previous timesteps, a lot of work has been done on extending the applicability of PIC to implicit methods. The choice of temporal discretization has significant consequences on conservation of charge and satisfaction of Gauss’ Laws, with potential methods to mitigate these presented in [76, 77]. Fundamentally, charge conserving implicit EM-FEMPIC methods are highly desirable due to the availability of unconditionally stable EM solvers in time [83]. A deeper analysis of temporal representation and its consequences on charge conserva- tion will be presented in Chapter 3. 1.5 Nullspaces in implicit EM-FEMPIC and Quasi-Helmholtz Decompo- sition Nullspaces always exist in implicit EM-FEMPIC, owing to the solutions to the curl equations not directly satisfying Gauss’ Laws. Because of this, taking a discrete divergence 11 as is needed to populate the space charges, leads to corruption of charge conservation. This corruption is of the form ∇𝜙(r) in a implicit mixed EM-FEMPIC solver and of the form 𝑡∇𝜙(r) in implicit wave equation EM-FEMPIC. This problem can be mitigated to some extent by solving for the fields at very high tolerance, with the level of spurious excitation dependant upon the tolerance used for computation. Significantly for EM-FEMPIC, the nullspaces will corrupt the satisfaction of Gauss’ Laws despite using careful testing as proposed in [76]. A robust means to mitigate the effect of nullspaces in implicit EM-FEMPIC solvers was presented in [77]. We note that fields E(r, 𝑡) and B(r, 𝑡) can be decomposed into solenoidal and non-solenoidal components. As a primer on notation, ¯𝐸𝑛𝑠(𝑡) refers to the non-solenoidal coefficients of the electric field. Similarly, ¯𝐸𝑠(𝑡) and ¯𝐵𝑠(𝑡) refer to the solenoidal coefficients of the electric field and magnetic flux density, respectively. All relevent sub-matrices involved are defined entirely in Section 1.4. Projectors Projectors to separate out the non-solenoidal components from the Whitney basis functions used in a traditional FEM solve are defined as [ ¯𝑃]Σ 𝑒 = Σ(Σ𝑇Σ)†Σ𝑇 [ ¯𝑃]Λ 𝑒 = ℐ − [ ¯𝑃]Σ 𝑒 (cid:2) ¯𝑃(cid:3) Λ 𝑏 = ℐ − Σ𝑚 (cid:16) Σ𝑇 𝑚Σ𝑚 (cid:17) † Σ𝑇 𝑚 (1.15a) (1.15b) (1.15c) where † represents a Moore-Penrose pseudoinverse, [Σ] = 𝜖0[ ¯𝑀𝑔] and [Σ]𝑚 = [∇·]𝑇. Numerically this is done by separating the field unknowns using a minimum spanning tree and it’s associated cotree [66]. Applying these projectors to the field coefficients will have the effect of separating out the solenoidal components as ¯𝐷𝑛 = Σ ¯𝐸𝑛 𝑛𝑠 + (cid:2) ¯𝑃(cid:3) Λ 𝑒 ¯𝐷𝑛 (1.16) 12 for the electric field and 𝑠 = (cid:2) ¯𝑃(cid:3) Λ ¯𝐵𝑛 𝑏 ¯𝐵𝑛 (1.17) for the the magnetic flux density. By definition, the discrete divergence of the projectors are zero and as a result, the magnetic flux density ¯𝐵𝑛 will have an identically zero divergence. Note that the use of these projectors is tantamount to the imposition of the discrete Coulomb gauge. Therefore, by design, these will satisfy both Gauss’ laws. Discrete System To construct a stencil from these projectors, one can apply a discrete divergence operator to (1.16) to obtain [𝐶 𝑒 𝑧]𝑇 [∇]𝑇 [★𝑒] [∇] [𝐶 𝑒 𝑧] ¯𝐸𝑛𝑠(𝑡) = − [𝐶 𝑒 𝑧]𝑇 [∇]𝑇 ¯𝐺(𝑡). (1.18) Therefore, the non-solenoidal components of the electric field can be related exactly to the charge density. The operation of (1.18) is the numerical analogue of strongly enforcing the Coulomb Gauge. Upon using a mapping to find only the ’cotree’ unknowns in the mesh (which house the solenoidal) components, these can be solved for through (cid:2) ¯𝑍(cid:3) 11 𝜕𝑡 ¯𝐵𝑠(𝑡) + (cid:2) ¯𝑍(cid:3) ¯𝐸𝑠(𝑡) = − (cid:2) ¯𝑍(cid:3) ¯𝐸𝑛𝑠(𝑡) 13 12 (cid:2) ¯𝑍(cid:3) 21 𝜕𝑡 ¯𝐸𝑠(𝑡) − (cid:2) ¯𝑍(cid:3) 22 ¯𝐵𝑠(𝑡) = ¯𝐺(𝑡) − (cid:2) ¯𝑍(cid:3) 𝜕𝑡 ¯𝐸𝑛𝑠(𝑡) 23 (1.19) The time derivatives can be evaluated by using an implicit Newmark-𝛽 operator. Note that the solutions to ¯𝐸𝑠(𝑡) and ¯𝐵𝑠(𝑡) will still contain nullspace excitations. However, when used to obtain the fields through (1.16) and (1.17), these excitations will have no influence on charge conservation. 1.6 Goals and Outline The overarching goals of this work are to take the advances outlined in this chapter and construct improvements to make them more applicable to complex, real-world devices. To 13 that end, the chapters presented in the document address the following problems: • Chapter 2 extends the EM-FEMPIC method, complete with quasi-Helmholtz, to systems where the particle motion is relativistic. While some analysis exists in the literature for the use of non-standard particle-integartion schemes, this has not been extended to the use of predictor-corrector schemes or integrators using non- polynomial representation. This is particularly critical in implicit systems involving relativistic particles as the standard means to push particles in PIC schemes are explicit and suffer from instability due to the stiff nature of the particle update. We present a rigorous overview of these methods from the standpoint of charge and energy conservation and demonstrate the inclusion of a predictor-corrector scheme in an EM-FEMPIC scheme using an implicit electromagnetic solver. • Next, we explore possible optimizations to the electromagnetic solver in Chapter 3. Specifically, we expore the potential use of envelope-tracking methods in a Particle- in-Cell scheme to make the method far more efficient for devices characterized by narrowband high-frequency field responses (as is typical in high power microwave applications). We also analyse the consequences of temporal representation and testing on charge conservation in great detail. Some of the insight presented in this introductory paper found its origin in the work that eventually made up the second chapter. • Chapter 4 extends the analysis done in Chapters 2 and 3 and uses them to construct a method that exactly conserves energy. Similar to the technique proposed in [76], we propose a rubric of constraints that, when followed, will guarantee energy conserving when using the unconditionally stable Newmark-𝛽 method to evolve Maxwell’s equations. We further show that this method satisfies Gauss’ Laws to machine precision, following the framework laid out in [87]. 14 • Finally, in Chapter 5, we explore a technique to extract port information at a discrete number of edges of a finite element mesh to decouple a linear electromagnetic solver from arbitary constraints on said edges enforced by nonlinear ODEs. While the initial extraction method was originally proposed for electromagnetic systems attached to lumped circuit ports, we propose an approach that combines extraction with a domain decomposition framework to improve computational complexity and this make the implicit EM-FEMPIC scheme applicable to larger, more complex problems. 15 CHAPTER 2 AN EXPONENTIAL PREDICTOR CORRECTOR FRAMEWORK FOR RELATIVISTIC EM-FEMPIC One of the principal challenges in particle-in-cell methods is to ensure self-consistency at every time step; i.e the fields, positions and velocities need to be self consistent. What is typically done is to reduce both Maxwell and Newton equations to a set of coupled first order partial differential equations. These are then collectively evolved using an appropriate integrator. Alternately, one offsets the evolution of Maxwell’s equations from that of Newton’s by one time step, i.e., the evolution is explicit. While the former is self-consistent, the latter is not. Delving into the latter for the sake of exposition, we note the following: (a) evolution of Newton’s laws using Boris push (a very robust and popular technique) and leap-frog for Maxwell systems is one of the most popular approaches in setting up a PIC scheme [67]; (b) Boris time stepping is either momentum or energy conserving, preserves the volume of phase space in either case, and has low computational overhead [88, 89]; (c) phase errors are known to exist as one increases the timestep size [90]; (d) leap-frog evolution for Maxwell systems is conditionally stable with time step sizes depending on the smallest feature. There is significant interest in constructing methods that are self-consistent [91]. Existing implementations of such methods typically follow a so-called ‘kinetic enslavement’ approach, with Maxwell and particle systems combined together into a large coupled solver, which can then be evaluated through an appropriate nonlinear solver [92]. Such an approach comes with the obvious downside of being very computationally expensive, since the size of the system is in the order of the number of particles. As a result, many implementations of kinetic enslavement often rely on matrix-free Jacobian free nonlinear solvers to avoid the enormous storage footprint of the combined system matrix [93]. As is evident, relying on a specific nonlinear solver comes with downsides. Further, time step sizes used for kinetic enslavement is also subject 16 to stability constraints, and much like the Boris push, position and velocity updates are polynomial based with convergence rates following the order of approximation. The primary goal of this work is to investigate another approach for self-consistent evolution of particle and Maxwell equations with the following requirements; (a) the evolution of Maxwell systems should be unconditionally stable, i.e., the time step sizes should really be only governed by physics and not feature size used to describe the geometry; (b) it should naturally satisfy Gauss’ Laws, thereby conserving charge and (c) it should show superior error convergence. The objectives of this work, therefore, are twofold: (a) First, we analyse the accuracy and error convergence for a novel predictor-corrector integrator based on exponential basis sets [94] over a number of particle-only numerical experiments. (b) Using this integrator, we construct a charge conserving EM-FEMPIC scheme that functions under a predictor corrector update. We validate both the proposed integrator and the overall PIC scheme through a number of analytical and numerical results. The outline of the paper will be as follows: Sec. 2.1 and Sec. 2.1 will outline the construction of the quasi-Helmholtz based EM solver. Sec. 2.1 and following will detail and analyze energy conservation for a number of different particle evolution schemes. Sec. 2.2 will then validate both the proposed particle evolution scheme and the overall PIC method through a number of analytical and numerical tests, before concluding remarks are presented in Sec. 2.3. 2.1 Formulation Problem Statement Consider a region Ω ∈ R3 bounded by a surface 𝜕Ω containing a single charged species. This region is subjected to an external field due to which the charged species accelerate, and in turn produce spatially and temporally varying electric and magnetic fields denoted by E(r, 𝑡) and B(r, 𝑡), respectively, with r ∈ Ω and 𝑡 ∈ [0, ∞). The dynamics of the particles 17 in phase space can be represented by a distribution function (PSDF) 𝑓 (𝑡, r, v) that follows the Vlasov equation: 𝜕𝑡 𝑓 (𝑡, r, v) + v · ∇ 𝑓 (𝑡, r, v)+ 𝑞 𝑚 [E(r, 𝑡) + v × B(r, 𝑡)] · ∇𝑣 𝑓 (𝑡, r, v) =0. (2.1) 𝑞 and 𝑚 here refer to the particle charge and mass respectively. In what follows, we assume that the background media in Ω is free space. As a result, we denote the permittivity and permeability of free space by 𝜖0 and 𝜇0, respectively, and the speed of light by 𝑐. Discrete Solution Solutions to (2.1) is found through a Particle-in-Cell formulation, wherein the charge and current distributions within the domain are represented in terms of a finite number of moments of the PSDF 𝑓 (𝑡, v, r) 𝜌(r, 𝑡) = 𝑁𝑝 (cid:213) 𝑝=1 𝑞𝑝𝑆(r − r𝑝(𝑡)), J(r, 𝑡) = 𝑁𝑝 (cid:213) 𝑝=1 𝑞𝑝v𝑝(𝑡)𝑆(r − r𝑝(𝑡)). (2.2a) (2.2b) Here, 𝑁𝑝 refers to the total number of such samples or macroparticles in the simulation. And likewise 𝑞𝑝 and 𝑚𝑝 refer to the charge and mass of the 𝑝th particle respectively. The dynamics of the fields within the simulation domain Ω obey Maxwell’s Equations which can be written as ∇ × E(r, 𝑡) = −𝜕𝑡B(r, 𝑡), ∇ × 𝜇−1 0 B(r, 𝑡) = 𝜕𝑡G(r, 𝑡) + 𝜕𝑡 𝜖0E(r, 𝑡), (2.3a) (2.3b) where G(r, 𝑡) = ∫ 𝑡 magnetic laws 0 J(r, 𝜏)𝑑𝜏. Further, the fields are constrained by Gauss’ electric and ∇ · 𝜖0E(r, 𝑡) = 𝜌(r, 𝑡), (2.4a) 18 The curl equations in (2.3) are discretized in space using a finite element formulation over ∇ · B(r, 𝑡) = 0. (2.4b) a tetrahedral mesh with 𝑁𝑒 edges and 𝑁 𝑓 faces, wherein the fields are represented in space using Whitney basis forms, E(r, 𝑡) = (cid:205)𝑁𝑒 𝑖=1 some time-varying coefficient 𝑒𝑖(𝑡) and 𝑏𝑖(𝑡) respectively and subsequently Galerkin tested. (r), B(r, 𝑡) = (cid:205)𝑁 𝑓 𝑖=1 𝑏𝑖(𝑡)W (r), for 𝑒𝑖(𝑡)W (1) 𝑖 (2) 𝑖 Temporally, this system is discretized using a Newmark-𝛽 [74] formulation with 𝛾 = 0.5 and (𝑡)(cid:3). We 𝛽 = 0.25. In what follows, ¯𝐸 = [𝑒1(𝑡), 𝑒2(𝑡), · · · , 𝑒𝑁𝑒 (𝑡)], ¯𝐵 = (cid:2)𝑏1(𝑡), 𝑏2(𝑡), · · · , 𝑏𝑁 𝑓 denote evaluations of quantities at 𝑡 𝑛 = 𝑛Δ𝑡 where Δ𝑡 is the time step size, via a superscript 𝑛 such as ¯𝐸𝑛 = ¯𝐸(𝑡 𝑛). Complete details on the mixed FEM discretization can be found in [82, 83, 84, 85, 86] and the references therein. Quasi Helmholtz Using the formulation described in Sec. 2.1 with an implicit time-marching scheme is well known to not satisfy Gauss’ Laws due to null spaces for both Maxwell solver or the wave-equation formulation [95, 96], both with implicit unconditionally stable time stepping. The most comprehensive method to mitigate this issue is to use a so-called Quasi-Helmholtz decomposition to partition the field components into solenoidal and non-solenoidal components. The non-solenoidal components can then be explicitly solved for, thereby ensuring exact satisfaction of the Coulomb gauge. What follows is a brief overview of the quasi-Helmholtz setup described in great detail in [77]. In what follows, variables with the subscript ‘𝑛𝑠’ will refer to non-solenoidal quantities and ‘𝑠’ will likewise denote solenoidal quantities. Relevant Matrix Definitions For ease of reading, the relevent matrix definitions from Chapter 1 are repeated here. Let the sets 𝒩, ℰ, ℱ and 𝒯 be defined as the set of nodes, edges, faces and tets within the mesh of the simulation domain respectively having 𝑁𝑛, 𝑁 𝑓 , 𝑁𝑒 and 𝑁𝑡 elements. The 19 various submatrices that will be used in describing the Quasi-Helmholtz framework are as follows: [★𝜖]𝑖,𝑗 = ⟨W (1) 𝑖 (r), 𝜀 · W (1) 𝑗 (r)⟩; 𝑖, 𝑗 ∈ ℰ [★𝜇−1]𝑖,𝑗 = ⟨W (2) 𝑖 (r), 𝜇−1 · W (2) 𝑗 (r)⟩; 𝑖, 𝑗 ∈ ℱ [★𝜌]𝑖,𝑗 = ⟨𝑊 (3) 𝑖 (r), 𝑊 (3) 𝑗 (r)⟩; 𝑖, 𝑗 ∈ 𝒯 (2.5a) (2.5b) (2.5c) where W (1) 𝑖 , W (2) 𝑖 and 𝑊 (3) 𝑖 are the Whitney edge, face and volume basis functions respec- tively. Further, we define the following matrices: [M𝑔]𝑖,𝑗 = ⟨W (1) 𝑖 (r), ∇𝑊 (0) 𝑗 (r)⟩; 𝑖 ∈ ℰ, 𝑗 ∈ 𝒩 [M𝑐]𝑖,𝑗 = ⟨W (2) 𝑖 (r), [∇×]W (1) 𝑗 (r)⟩; 𝑖 ∈ ℱ , 𝑗 ∈ ℰ [M𝑑]𝑖,𝑗 = ⟨𝑊 (3) 𝑖 (r), [∇·]W (2) 𝑗 (r)⟩; 𝑖 ∈ 𝒯 , 𝑗 ∈ ℱ [∇] = 𝜀[★𝜖]−1[M𝑔] [∇×] = 𝜇−1[★𝜇−1]−1[M𝑐] [∇·] = [★𝜌]−1[M𝑑] (2.6a) (2.6b) (2.6c) (2.6d) (2.6e) (2.6f) Projector Definitions To separate the non-solenoidal components from the basis forms used for representing the electric field, we define projectors [ ¯𝑃]Σ 𝑒 and [ ¯𝑃]Λ 𝑒 to break up the electric field as follows (where † represents a Moore-Penrose pseudoinverse): [ ¯𝑃]Σ 𝑒 = Σ(Σ𝑇Σ)†Σ𝑇 [ ¯𝑃]Λ 𝑒 = ℐ − [ ¯𝑃]Σ 𝑒 (cid:2) ¯𝑃(cid:3) Λ 𝑏 = ℐ − Σ𝑚 (cid:16) Σ𝑇 𝑚Σ𝑚 (cid:17) † Σ𝑇 𝑚 (2.7a) (2.7b) (2.7c) where [Σ] = 𝜖0[M𝑔] and [Σ]𝑚 = [∇·]𝑇. Using these projectors, we can now define a decomposition for the electric flux density as ¯𝐷(𝑡) = Σ ¯𝐸𝑛𝑠(𝑡) + (cid:2) ¯𝑃(cid:3) Λ 𝑒 ¯𝐷(𝑡) (2.8) 20 and the magnetic flux density as ¯𝐵𝑠(𝑡) = (cid:2) ¯𝑃(cid:3) Λ 𝑏 ¯𝐵(𝑡) (2.9) Put simply, these projectors allow us to separate out ¯𝐵(𝑡) and ¯𝐷(𝑡) into terms that have exactly zero divergence (i.e, the ’solenoidal components’ ¯𝐵𝑠(𝑡) and (cid:2) ¯𝑃(cid:3) Λ 𝑒 with non-zero divergence (Σ ¯𝐸𝑛𝑠(𝑡)). Since the application of a discrete divergence operator ¯𝐷(𝑡)) and terms on either projector is 0 (as demonstrated in [77]), one can see that the formulation forces the divergence of ¯𝐵𝑠(𝑡) to zero. Discrete System As stated earlier, our ultimate goal is to solve all of Maxwell’s equations. First given the decomposition, the application of the discrete divergence on (3.42) and (3.41) results in an identically zero matrix and [𝐶 𝑒 𝑧]𝑇 [∇]𝑇 [★𝑒] [∇] [𝐶 𝑒 𝑧] ¯𝐸𝑛𝑠(𝑡) = − [𝐶 𝑒 𝑧]𝑇 [∇]𝑇 ¯𝐺(𝑡). (2.10) Upon rewriting Maxwell’s equations to make use of the decomposition in (3.41) and (3.42); and with ¯𝐸𝑛𝑠(𝑡) in (2.10) explicitly solved for, we obtain (cid:2) ¯𝑍(cid:3) (cid:2) ¯𝑍(cid:3) 11 21 𝜕𝑡 ¯𝐵𝑠(𝑡)+ (cid:2) ¯𝑍(cid:3) 𝜕𝑡 ¯𝐸𝑠(𝑡)− (cid:2) ¯𝑍(cid:3) 12 22 ¯𝐸𝑠(𝑡) = − (cid:2) ¯𝑍(cid:3) ¯𝐸𝑛𝑠(𝑡) 13 ¯𝐵𝑠(𝑡) = −𝜕𝑡 ¯𝐺(𝑡) − (cid:2) ¯𝑍(cid:3) 𝜕𝑡 ¯𝐸𝑛𝑠(𝑡) 23 (2.11) where ¯𝐸𝑠 and ¯𝐵𝑠 are extracted from the simply connected mesh using tree-cotree maps (cid:3). Constructing this mapping is trivial for simply connected structures, but is [𝐶 𝑒 𝑐 ] and (cid:2)𝐶𝑏 𝑐 trickier for multiply connected geometries. The various submatrices involved in (3.44) are defined as: [Z]11 = [C 𝑏 𝑐 ]𝑇[P]Λ 𝑏 [C 𝑏 𝑐 ] [Z]12 = [C 𝑐 ]𝑇[∇×][★𝜀]−1[P]Λ 𝑏 𝑒 [★𝜀][C 𝑒 𝑐] [Z]13 = [C 𝑐 ]𝑇[∇×][★𝜀]−1Σ[C 𝑏 𝑒 𝑧] 21 (2.12a) (2.12b) (2.12c) [Z]21 = [C 𝑐]𝑇[P]Λ 𝑒 𝑒 [★𝜀][C 𝑒 𝑐] [Z]22 = [C 𝑐]𝑇[∇×]𝑇[★𝜇−1][P]Λ 𝑒 𝑏 [C 𝑏 𝑐 ] [Z]23 = [C 𝑒 𝑐]𝑇Σ[C 𝑒 𝑧] (2.12d) (2.12e) (2.12f) As alluded to earlier, the solution to (3.44) is obtained using a Newmark-𝛽 time stepping scheme. The solution to ¯𝐸𝑠(𝑡) and ¯𝐵𝑠(𝑡) still has a 𝐷𝐶 null space, but their divergence is exactly zero. As a result, Gauss’ laws are still exactly satisfied. Particle Evolution A particle evolution scheme is primarily involved in obtaining solutions in space and time of the position and velocity of a particle in response to a prescribed set of electromagnetic fields. Specifically, the particle motion can be obtained by 𝜕𝛾[v𝑝]v𝑝(𝑡) 𝜕𝑡 = a𝑝(𝑡) = 𝑞 𝑚 (cid:0)E(r𝑝 , 𝑡) + v𝑝 × B(r𝑝 , 𝑡)(cid:1) 𝜕r𝑝(𝑡) 𝜕𝑡 = v𝑝(𝑡) (2.13a) (2.13b) where v𝑝(𝑡) and r𝑝(𝑡) are each vectors in R3 and refer to a given particle’s velocity and position respectively at time 𝑡, and 𝛾[v] = (1 − |v|2 /𝑐2) is the relativistic time dialation functional. In what follows, we analyze different approaches to solving these equations as well as their stability properties. Boris Push The Boris method is fundamentally a modified leap-frog solver for (2.13a) and (2.13b) that breaks the update for the relativistic velocity u𝑝 into three fundamental steps. Defining 𝑝 and u𝑛+1 u𝑛 as particle 3-velocities at timestep 𝑛, 𝑝 1. First, half of the electric force is added to a dummy variable u− 𝑝 . − 𝑝 = u 𝑛 𝑝 + 𝜖Δ𝑡 u 22 (2.14) where 𝜖 = (𝑞𝑝/2𝑚𝑝)E(r 𝑛+1/2 𝑝 , 𝑡 𝑛+1/2) and Δ𝑡 is the timestep size. 2. Next, the magnetic contribution is added, + 𝑝 = u u − 𝑝 + Δ𝑡 (cid:16) 𝑞𝑝 𝑚𝑝 𝑛+1/2 𝑝 v × B(r 𝑛+1/2 𝑝 , 𝑡 𝑛+1/2) (cid:17) 3. Finally, the other half of the electric force is added to obtain u𝑛+1 𝑝 . 𝑛+1 𝑝 = u + 𝑝 + 𝜖Δ𝑡 u (2.15) (2.16) While different ways of evaluating (2.15) have been proposed over the years, with some introducing phase errors [90], they all share the fundamental property that they execute a perfect rotation of u− 𝑝 and as a result, an applied magnetic force will never do work on a particle. Hence, the method as a whole is energy conserving under the action of an arbitrary magnetic field. While many different methods exist to prove energy conservation, we considered the 𝑧-transform of the single step update for the implicit leap-frog representation with the forcing terms evaluated at half steps. Proving that this stencil conserves energy is sufficient to show that (2.14)-(2.16) broken up into substeps will conserve energy. We can write the velocity update stencil for a particle moving due to a general magnetic field as follows: (I − Ω) v 𝑛+1 𝑝 = (I + Ω) v 𝑛 𝑝 . (2.17) Here Ω can be written as Ω = 𝜔𝑧 −𝜔𝑦 0 −𝜔𝑧 (cid:169) (cid:173) (cid:173) (cid:173) (cid:173) (cid:173) (cid:171) , 𝑡 𝑛+1/2) 𝜔𝑦 −𝜔𝑥 𝜔𝑥 0 (cid:170) (cid:174) (cid:174) (cid:174) (cid:174) (cid:174) (cid:172) where 𝜔𝑘 = Δ𝑡(𝑞/𝑚𝛾𝑛+1/2 matrix in R3×3. Defining M = (I + Ω)−1 (I − Ω), the 𝑧-transform of (2.17) becomes for 𝑘 ∈ 𝑥, 𝑦, 𝑧 and I is the regular identity 𝑛+1/2 𝑝 )B𝑘(r 𝑝 0 , (2.18) M𝑧 = 𝜆𝑧 = 1 23 (2.19) where 𝜆 refers to the eigenvalues of M. Under the spectral theorem, for a system to be stable, |𝑧| <= 1 for any combination of eigenvalues chosen. However, we can define a stronger condition |𝑧| = 1 wherein the update stencil strictly defines a perfect rotation on the complex plane under the action of a magnetic field. This approach has been used in the past to appraise numerical methods for energy conservation; see [74, 83] and the references therein. As a result, which implies that the particle’s kinetic energy is exactly conserved regardless of the form of the magnetic field. For the Boris update, the (cid:12) (cid:12) (cid:12) (cid:12) (cid:16) v𝑛+1 𝑝 (cid:17) 2(cid:12) (cid:12) (cid:12) (cid:12) (cid:16) v𝑛 𝑝 = (cid:12) (cid:12) (cid:12) (cid:12) (cid:17) 2(cid:12) (cid:12) (cid:12) (cid:12) (2.20) eigenvalues 𝜆 are 𝜆 = 1 √ 1−(𝜔2 𝑥+𝜔2 1−(𝜔2 𝑥+𝜔2 𝑥+𝜔2 1+𝜔2 𝑥+𝜔2 1+𝜔2 𝑥)−2𝑗 𝑥+𝜔2 𝑥)+2𝑗 𝑥+𝜔2 𝜔2 𝑦+𝜔2 √ 𝑧 𝜔2 𝑦+𝜔2 𝑧 𝑥+𝜔2 𝑥+𝜔2 𝑥 𝑥+𝜔2 𝑥+𝜔2 𝑥       It can be shown through simple algebraic manipulation that that each of the three eigenvalues in (2.20) has exactly unit magnitude for any 𝜔𝑥 , 𝜔𝑦 , 𝜔𝑧 ∈ R. Therefore, when combined with (2.19), we see that the roots of the 𝑧-transform has to lie exactly on the unit circle, thereby conserving energy. Polynomial Predictor Corrector Next, we describe the Adams predictor-corrector update. These methods are generally derived by representing the forcing terms a𝑝(𝑡) = 𝑞𝑝/𝑚𝑝 (cid:0)E(r𝑝 , 𝑡) + v𝑝(𝑡) × B(r𝑝 , 𝑡)(cid:1) and v𝑝(𝑡) in terms of polynomial functions for a prescribed order and testing them with a pulse function defined over the length of one timestep. Specifically, the explicit predictor step of the process (typically referred to as an Adams-Bashforth update), involves writing the acceleration a𝑝 and relativistic velocity u𝑝 as a𝑝(𝑡) = v𝑝(𝑡) = 𝑁𝑡(cid:213) 𝑖=1 𝑁𝑡(cid:213) 𝑖=1 𝑝𝐿𝑘(𝑡 − 𝑡 𝑖) 𝑖 a 𝑝𝐿𝑘(𝑡 − 𝑡 𝑖) 𝑖 v 24 (2.21a) (2.21b) where 𝐿𝑘(𝑡 − 𝑡𝑖) refers to a 𝑘th order Lagrange polynomial involving the sample at 𝑡𝑖 and the 𝑘 − 1 samples before it in time. Next, a stencil is obtained from the differential equation by taking an inner product with a pulse function 𝑃𝑛(𝑡) = 1 if 𝑡 ∈ (cid:2)𝑡 𝑛 , 𝑡 𝑛+1(cid:3) , 0 otherwise    (2.22) giving us the following update stencil upon evaluation of the requisite integrals for 𝑘 = 4: 𝑛+1 𝑝 = u 𝑛 𝑝 + u Δ𝑡 24 (55a 𝑛 𝑝 − 59a 𝑛−1 𝑝 + 37a 𝑛−2 𝑝 − 9a 𝑛−3 𝑝 ), 𝑛+1 𝑝 = r 𝑛 𝑝 + r Δ𝑡 24 (55v 𝑛 𝑝 − 59v 𝑛−1 𝑝 + 37v 𝑛−2 𝑝 − 9v 𝑛−3 𝑝 ). (2.23a) (2.23b) One can similarly device an implicit scheme by modifying (2.21) such that the Lagrange polynomials center on 𝑛 + 1 and go back 𝑘 − 1 steps. Upon using the same testing function, we obtain a so-called Adams-Moulton update, which for 𝑘 = 4 looks as follows: 𝑛+1 𝑝 = u 𝑛 𝑝 + u Δ𝑡 24 (9a 𝑛+1 𝑝 + 19a 𝑛 𝑝 − 5a 𝑛−1 𝑝 + a 𝑛−2 𝑝 ) 𝑛+1 𝑝 = r 𝑛 𝑝 + r Δ𝑡 24 (9v 𝑛+1 𝑝 + 19v 𝑛 𝑝 − 9v 𝑛−1 𝑝 + v 𝑛−2 𝑝 ). (2.24a) (2.24b) To construct a predictor-corrector scheme using these stencils, at each timestep, a guess velocity ˜v𝑝 and position ˜r𝑝 are computed following (2.23). These are then inserted into (2.24) in place of v𝑛+1 . Going forward, we shall refer to this predictor corrector and r𝑛+1 𝑝 𝑝 setup with fourth order polynomials as ’Adams4’, and likewise refer to a similar method using third order polynomials as ’Adams3’. Analysis of Energy Conservation We apply the same means of analysing energy conservation here as we did for the Boris update. Consider a simple system wherein the magnetic field is constant and only pointing along the ˆ𝑧 direction. To give the method the best possible chance of conserving energy, let us assume that u𝑛 𝑝 , u𝑛−1 𝑝 and u𝑛−2 𝑝 have identical magnitude. Therefore, 𝛾𝑛 𝑝 = 𝛾𝑛−1 𝑝 = 𝛾𝑛−2 𝑝 . 25 If under these very favorable conditions, the method fails to conserve energy, then it is sufficient to assume that it will fail to do so under a general field. Under our prescriptions, the corrector step of the Adams4 velocity update can be written as follows (I + 9Ω) u 𝑛+1 𝑝 − (I − 19Ω) u 𝑛 𝑝 + 5Ωu 𝑛−1 𝑝 − Ωu 𝑛−2 𝑝 = 0 (2.25) where Ω has the same form as (2.18), but with 𝜔𝑥 = 𝜔𝑦 = 0 and 𝜔𝑧 = (Δ𝑡 𝑞𝑝/24𝑚𝑝)𝛾𝑛 𝑝 B(r𝑛 𝑝 , 𝑡 𝑛). Replacing each matrix product with its eigenvalues and taking a 𝑧-transform gives us (cid:16) 𝜆1𝑧3 − 𝜆2𝑧2 + 5𝜆3𝑧 + 𝜆4 (cid:17) u𝑝(𝑧) = 0 For our chosen system, the eigenvalues can take the following values: 𝜆1 = {1, 1 − 𝑗9𝜔𝑧 , 1 + 𝑗9𝜔𝑧 } 𝜆2 = {1, 1 + 𝑗19𝜔𝑧 , 1 − 𝑗19𝜔𝑧 } 𝜆3 = 𝜆4 = {0, 𝑗𝜔𝑧 , −𝑗𝜔𝑧 } (2.26) (2.27a) (2.27b) (2.27c) By inspection, it is fairly clear that only one combination of eigenvalue choices yields a root for 𝑧 that is on the unit circle, i.e when 𝜆1 = 𝜆2 = 1 and 𝜆3 = 𝜆4 = 0, giving us, 𝑧3 − 𝑧2 = 𝑧2(𝑧 − 1) = 0 (2.28) which admits a root of 𝑧 = 1. Every other combination of eigenvalues leads to roots with magnitudes larger or smaller than 1 in the general case. While one might be able to find specific values of 𝜔𝑧 for which multiple combinations of eigenvalues yield roots on the unit sphere, this would make the method as a whole extremely sensitive to step-size and generally impractical as an energy-conserving method to advance particles. Exponential Predictor-Corrector So far, we have considered update schemes that use polynomials to interpolate the particle parameter samples. However, it is possible to use exponential functions for this 26 purpose. One method to construct a 𝑃𝐸(𝐶𝐸)𝑚 method for ODEs was proposed in [94], wherein the update for the relativistic velocity and position expressions can be written as 𝑛+1 𝑝 = u 𝑘 (cid:213) (cid:104) 𝑖=1 𝑝𝑖v 𝑛+1−𝑖 𝑝 + 𝑝𝑘+𝑖a 𝑛+1−𝑖 𝑝 (cid:105) 𝑛+1 𝑝 = r 𝑘 (cid:213) (cid:104) 𝑖=1 𝑝𝑖r 𝑛+1−𝑖 𝑝 + 𝑝𝑘+𝑖v 𝑛+1−𝑖 𝑝 (cid:105) (2.29a) (2.29b) for an appropriate set of coefficients 𝑝𝑖, the specifics of constructing which can be found in [94]. This choice of coefficients encodes an exponential interpolation for the unknowns v𝑝(𝑡) and r𝑝(𝑡) in the interval [𝑡 𝑛 , 𝑡 𝑛+𝑘], which can be written as 𝑛+𝑘 (cid:213) 𝑖=𝑛 𝑛+𝑘 (cid:213) u𝑝(𝑡) = r𝑝(𝑡) = 𝛼𝑢 𝑖 u 𝑝 𝑒𝜆𝑖 𝑡 𝑖 𝛼𝑟 𝑖 r 𝑝 𝑒𝜆𝑖 𝑡 𝑖 (2.30a) (2.30b) 𝑖=𝑛 where 𝛼𝑘 , 𝜆𝑘 ∈ C refer to appropriately chosen coefficients in the complex plane for the parameters in the setup. Given the number of prior timesteps involved at any given update, it is not possible to prove stability/energy conservation for an arbitrary applied field. However, unlike the polynomial predictor corrector method, we show through numerical examples that the method does conserve energy for both single particle systems and when incorporated into a full EM-FEMPIC simulation. Computational Complexity Since the coefficients for the exponential PC method are not known analytically, there is computational overhead in computing them. However, solving for 𝑝𝑖 only needs to be done once after which it can be tabulated and used to solve different ODEs discretized using the same number of basis sets. The costs, therefore, of using this scheme within a PIC method is identical to any 𝑚-step predictor corrector method. Suppose one uses an iterative solver for the fields, the complexity of evolving Maxwell’s equations at any given timestep can 27 be written as 𝒪(𝑁iter𝑁dof) where 𝑁iter denotes the number of iterations the solver takes to converge and 𝑁dof the total number of unknowns in the system. In practice, this cost is significantly higher than the particle evolution. Furthermore, the unconditionally stable nature of the field solve allows for taking much larger timesteps than would otherwise be possible using an explicit scheme. As a result, even though the Boris and Adams particle update require fewer multiplications than the PC1 scheme used in the paper, the added complexity is negligible in relation to the cost associated with solving for the fields. 2.2 Results The numerical examples in this section are broken up into three parts. (1) First, we consider systems forced by uniform electric and magnetic fields; (2) We then consider systems with nonuniform fields; (3) Finally, the entire EM-FEMPIC scheme is validated through an expanding particle beam simulation. Uniform fields In this section, we will consider the motion of particles in response to a combination of electric and magnetic fields that are assumed to be spatially and temporally constant. Further, for the sake of simplicity, we consider a coordinate system where 𝑞𝑝 = 𝑚𝑝 = 𝑐 = 1. Since all of our experiments have analytical solutions, the initial conditions of both the multi step polynomial and exponential methods are appropriately populated. Finally, in each numerical example, the exponential PC method used is the ’PC1’ stencil described in [94]. Specifically, we effected the SVD required to obtain the coefficients to a tolerance 10−12 (this serves as a lower bound for the error) as well as chose a semidisk radius 𝜌 = 3.15 and constructed a method that uses 22 steps of prior timestep information interpolated with 18 exponential basis sets. 28 Figure 2.1: Comparison of particle trajectory between exponential PC and an analytical solution. Figure 2.2: Convergence of error (defined as the relative norm between computed and analytical particle trajectories) for the linear acceleration example. Note that the curve for the exponential PC is truncated due to hitting the floor of the tolerance of the method. Quantity Natural 𝑞𝑝 𝑚𝑝 𝑐 E𝑝(r, 𝑡) 𝑣0 1 1 1 1 √ 1/ 2 MKS 1.6 × 10−19 C 9.1 × 10−31 kg 3 × 108 m/s 5.6 × 10−12 V/m √ 𝑐/ 2 m/s Table 2.1: Conversion between natural and MKS units for the direct E acceleration example 29 Quantity Natural 𝑞𝑝 𝑚𝑝 𝑐 B𝑝(r, 𝑡) 𝑟𝑐 𝑣0 1 1 1 1 1 √ 1/ 2 MKS 1.6 × 10−19 C 9.1 × 10−31 kg 3 × 108 m/s 1.2 × 10−3 T 1 m √ 2 m/s 𝑐/ Table 2.2: Conversion between natural and MKS units for the cyclotron motion example Linear Acceleration E(r, 𝑡) = (1, 0, 0), B(r, 𝑡) = ¯0 To begin, we consider the simple example of a particle being accelerated by an electric field of peak amplitude 1 oriented along the ˆ𝑥 axis. A mapping from natural to MKS units are provided in 2.1. The initial relativistic momentum of the particle was set such that p0 = 𝛾0 𝑝v0 𝑝 = (1, 0, 0). Fig. 2.1. shows a plot of the trajectories that the particles were predicted to follow by the exponential method, laid over the analytical result. All of the methods closely agreed with the predicted curve, though the error in the position was significantly lower in the exponential method. This is demonstrated in Fig. 2.2, where we see Boris and the Adams predictor corrector methods roughly follow the trends expected from the polynomial order. The coarsest and finest timestep size used are Δ𝑡 = 0.1 and Δ𝑡 = 0.005 respectively. The exponential method, on the other hand, performs much better, showing better than 8th order convergence before flattening out due to hitting the tolerance of the method. The analytical expression used is as follows, with the full derivation available at [97]: (cid:115) (cid:18) 1 + 1 + (cid:19) 2 𝑞|𝐸|𝑡 𝑚 r(𝑡) = 𝑚 𝑞|𝐸| (cid:169) (cid:173) (cid:171) . − 𝛾(0)(cid:170) (cid:174) (cid:172) (2.31) The factor |𝐸| refers to the magnitude of the electric field used, and 𝛾(0) refers to the initial relativistic factor. 30 Figure 2.3: Comparison of particle energy over multiple cyclotron cycles. We note that the polynomial PC schemes spuriously gain energy for the timestep size chosen. Figure 2.4: Comparison of particle trajectories obtained from the exponential method compared to Boris and the analytical solution. Note how the improved accuracy of the exponential method causes the trajectory to lie exactly on the analytical curve, whereas the Boris solution is slightly shifted for the same timestep size. 31 Figure 2.5: Convergence of position error (defined as the relative norm between computed and analytical particle trajectories) over a single cycle for different evolution schemes. Once again, we note that the polynomial methods show approximately the expected convergence for the order of the polynomial used, while the exponential method converges much faster (11th order) before it hits the set tolerance. Cyclotron motion E(r, 𝑡) = ¯0, B(r, 𝑡) = (0, 0, 1) Next, we considered the dynamics of a particle moving under a constant ˆ𝑧-directed magnetic field. As before, the initial momentum of the particle was set to p0 = 𝛾0 𝑝v0 𝑝 = (1, 0, 0). A mapping from natural to MKS units are provided in 2.2. The canonical solution for this problem predicts a shortening of the non-relativistic cyclotron frequency and an increase in the radius of the loop by exactly a factor of 𝛾. The cyclotron test stresses a number of different aspects of the Newton solver. First, since the particle is moving only under the influence of a magnetic field, this is the perfect setup to validate our predictions about energy conservation. We note from Fig. 2.3 that while Boris and the exponential method conserve the relativistic energy 𝑚𝑝 𝛾𝑝 𝑐 to machine precision, the polynomial methods exhibit spurious heating. Second, since the location of the gyrocenter is extremely sensitive to early time shifts, one can look at how well the different methods predict its location. As is evident from Fig. 2.4, despite conserving energy exactly, the Boris method has a slight shift of the orbit that is not seen in the more accurate exponential method. We can also look at error convergence in the position over a single loop, as shown in Fig. 32 Figure 2.6: Convergence of error (defined as the relative norm between computed and analytical particle trajectories) for the |E| = |B| setup. The exponential method still outperforms Boris and the polynomial PC methods, but by a smaller order. 2.5. The coarsest and finest stepsizes in the figure correspond to Δ𝑡 = 0.5 and Δ𝑡 = 0.05 respectively. Here, we note that the Boris push and polynomial predictor corrector schemes converge at a slope approximately corresponding to the order of representation. The exponential method on the other hand, shows much better convergence. To stress the system further, the orbits were made more relativistic by setting the initial particle velocity such that 𝛾 = 2. Looking once again at Fig. 2.5, we see that the convergence is even better with the method hitting its tolerance with as little as 10 steps per cycle. As in the previous case, the analytical solution is derived in [97], and looks as follows: r(𝑡) = 𝑣0𝛾0 |𝐵| (cid:20) sin (cid:19) (cid:18) 𝛾0𝑞|𝐵|𝑡 𝑚 (cid:18) ˆ𝑥 + −1 + cos (cid:18) 𝛾0𝑞|𝐵|𝑡 𝑚 (cid:19) (cid:19) (cid:21) ˆ𝑦 . (2.32) The term |𝐵| refers to the magnitude of the ˆ𝑧 directed applied magnetic field. |E| = |B| Finally, we considered the more complex case of relativistic motion forced both by mutually perpendicular electric and magnetic fields of the same magnitude. As before, a mapping from natural to MKS units are provided in 2.3. In this instance, we chose E(r, 𝑡) = (0, 1, 0) and B(r, 𝑡) = (0, 0, 1), with the particle starting at rest at the origin. While 33 Quantity Natural 𝑞𝑝 𝑚𝑝 𝑐 B𝑝(r, 𝑡) E𝑝(r, 𝑡) 𝑣0 1 1 1 1 1 √ 1/ 2 MKS 1.6 × 10−19 C 9.1 × 10−31 kg 3 × 108 m/s 1.2 × 10−3 T 5.6 × 10−12 V/m √ 𝑐/ 2 m/s Table 2.3: Conversion between natural and MKS units for the |E| = |B| case. The conversion ensures that the electric and magnetic forces acting on the positron at 𝑡 = 0 are equal. the dynamics of the particle are more complicated as one reaches relativistic speeds, it is possible to derive an analytical solution, as done in [97]. The final expression for our choice of parameters simplifies to where r(𝑡) = 𝑈 3 6 ˆ𝑥 + 𝑈 2 2 ˆ𝑦, (cid:18) (cid:16)√ 9𝑡2 + 8 + 3𝑡 (cid:17) 2/3 (cid:19) − 2 𝑈 = (cid:16)√ 9𝑡2 + 8 + 3𝑡 (cid:17) 1/3 . (2.33) (2.34) Upon running this setup with each of our solvers, we once again observe from Fig. 2.6 that the exponential method significantly outperforms the polynomial based methods, though by a smaller polynomial factor than the other examples presented thus far. The coarsest and finest stepsizes in Fig. 6 correspond to Δ𝑡 = 1.0 and Δ𝑡 = 0.02 respectively. Space Varying Fields Next, we tested the long-time stability of the exponential method by considering a system wherein the scalar potential and magnetic fields varied as follows, 𝜙(r, 𝑡) = 0.01(cid:112)𝑥2 + 𝑦2, B(r, 𝑡) = ˆ𝑧(cid:112)𝑥2 + 𝑦2. Once again employing units of 𝑚𝑝 = 𝑞𝑝 = 𝑐 = 1, the particle was 𝑝 = (0.1, 0, 0). Further, we chose a step size Δ𝑡 = 1.5 × 10−5 initialized at r0 𝑝 = (0.9, 0, 0) and u0 and ran the system for 50000 timesteps. We note from Fig. 2.7 that exponential PC method accurately tracks both the fast gyrations and the slow rotations without losing or gaining energy. 34 Figure 2.7: Plot of particle trajectory for the first and 80th orbits, showing good preservation of phase space volume Figure 2.8: Plot of particle trajectories for the magnetic bottle predicted by (left) Boris, (center) Exp PC and (right) 4th order Adams overlaid on a lineplot of the magnetic field along the 𝑦-𝑧 plane. We note that spurious energy gain in the Adams method causes the trapped particle to escape despite being initialized with a velocity outside the loss cone. Particle trapped in a Magnetic Bottle To stress energy conservation further, we considered a setup where an electron was initialized within a magnetic bottle defined using two magnetic dipole oriented along the 𝑧-direction, with field strength B(r) = 𝜇0 4𝜋 (cid:18) 3r(m · r) |r|5 − m |r|3 (cid:19) . (2.35) The two dipoles were separated by a distance of 20 m and the magnetic moment was set to 𝑚 = 108 ˆ𝑧. The mass, charge and speed of light were represented in MKS units. The particle initially begins at rest at position (0, 5, 0) and is accelerated for 10 ns by a 𝑦-directed electric field before it is switched off and moves only in the magnetic field from that point. The 35 Figure 2.9: Snapshot of the particle spread at 𝑡 = 10 ns. We note that the particle spead generally follows the expansion expected from the physics, and further agrees well with equivalent results from XOOPIC Figure 2.10: Satisfaction of Gauss’ Law over the run for the expanding particle beam. The figure shown is the relative norm between the ∇ · D(r, 𝑡) and ∇ · G(r, 𝑡). field lines of the dipoles, as well as the trajectory predicted by Boris, exponential PC and the 4th order Adams predictor corrector method are shown in Fig. 2.8. We note that the spurious energy gain of the Adams method permits the particle to escape the bottle, while the other methods predict trajectories that keep the particle contained. Expanding Particle Beam To test the validity of the exponential predictor-corrector scheme on a full EM-FEMPIC scheme, we analysed the physics of an expanding beam contained within a conducting 36 Figure 2.11: Time history of particle and field energy for the expanding beam. Due to the high initial particle momentum, the energy is largely dominated by the kinetic energy of the particles. However, we note from the data that both field and particle energy are conserved across the run. Figure 2.12: Snapshot of the particle spread at 𝑡 = 10 ns predicted by the proposed method compared to results from an EM-FEMPIC formulation run using a leapfrog integrator. Again, we note very good agreement between the two methods. 37 cylindrical cavity of length 10 cm and radius 2 cm oriented such that the axis of rotation aligned along ˆ𝑧. The walls of the cavity were assumed to be perfectly conducting and a particle beam composed of electrons was initialized at 𝑧 = 0. The beam was characterized by a driving voltage of 500 𝑘𝑉, a driving current of 1 𝐴 and an emitter radius of 0.8 cm. This yielded an initial setup wherein the electrons were accelerated at roughly 𝛾 = 2 or 𝑝 ∼ 2.6 × 108 m/s. The exponential predictor-corrector was used with a timestep size equal v0 to Δ𝑡 = 15 ps – while the comparison runs with XOOPIC[98] and an explicit EM-FEMPIC code were run at Δ𝑡 = 1 ps. The tetrahedral mesh used to discretize the system had an average cell size of 8.2 mm and the maximum number of particles within the system at any one time was approximately 8000 particles, resulting in approximately 3.5 particles per cell. At each step, one predictor step was taken followed by as many corrector steps required as to reach the desired tolerance of 10−9. In this specific run, achieving this threshold rarely required more than one corrector step. The particle beam also had a turn-on time of 1 ns so as to have a smooth increase in field strength. A snapshot of of the particle beam at 10 ns – compared with an equivalent simulation set-up on XOOPIC[98] – is shown in Fig. 2.9 and the satisfaction of Gauss’ Law over the course of the entire run is shown in Fig. 2.10. As is evident, the proposed method both agrees with the expected physics and satisfies Gauss’ Laws to machine precision. Likewise, we note from Fig. 2.11 that particle and field energies are conserved over the course of the simulation. Finally, we note from Fig. 2.12 that the proposed method compares well to an explicit leapfrog based EM-FEMPIC solver, albeit at a stepsize that was 15 times larger. Particle Beam in a Klystron To stress the proposed solver in a system with excited cavity modes, we analysed the behavior of a simple cylindrical klystron, with dimensions as in Fig. 2.13. Once again, the system was discretized by a tetrahedral mesh with average edge length of 2.93 mm. The solver was run with a step size of 10 ps and compared to XOOPIC running at 1 ps. 38 Figure 2.13: Snapshot of the position distribution at 𝑡 = 4 ns. The proposed method agrees well with XOOPIC and predicts the expected particle bunching behavior around the same region as XOOPIC (at approximately 𝑧 = 2.5 cm). Figure 2.14: Time history of particle and field energy for the klystron. Unlike in the previous run, most of the energy is in the fields – we note here that once again average energy is conserved across time for both fields and particles. A sinusoidal current source was defined on a cylindrical surface existing at the middle of the neck of the klystron, with an amplitude of 100 A and frequency of 4.1 GHz. These dimensions correspond to an average of around 2 particles per cell, though there are regions in the domain with higher density as shown by the bunching observed in Fig. 2.13. Once again, we note good agreement with XOOPIC, and observe from Fig. 2.14 that both field and particle energies are conserved on average over multiple cycles of the run. 39 2.3 Summary In this work, we have proposed and validated a fully relativistic charge conserving EM-FEMPIC scheme that makes use of a novel exponential predictor corrector framework. We have numerically benchmarked our proposed framework over both single particle full PIC runs, demonstrating superior accuracy in the former and good agreement with existing validated methods in the latter. The improvement of scalability of this scheme to make it more applicable to complex devices that are of interest in high power design is currently being analysed and will be reported on in the near future. 40 AN ENVELOPE TRACKING APPROACH FOR PARTICLE IN CELL SIMULATIONS CHAPTER 3 The advances in EM-FEMPIC outlined in the introductory chapter have made it possible to (1) simulate plasma phenomena while capturing the underlying geometry to very good precision and (2) analyse the system in time at the frequency of interest for capturing the desired physics as opposed to the satisfaction of mesh-based stability criteria. Despite the massive progress that has been accomplished in recent years, there are still classes of devices for which the current state-of-the-art is inefficient to simulate for large scale problems. One such example is the analysis of devices that have a narrowband high-frequency response. Similarly, devices with features smaller than the electromagnetic wavelength that are nonetheless important for the electrostatic physics are computationally untenable to solve with contemporary techniques. Despite having a relatively small window of significant frequency content, to simulate such a device, one would need to discretize the simulation in time at a rate determined by the maximal frequency of interest. Efficient ways to simulate these devices exist when one considers a purely electromag- netic analysis (i.e. without particles). Termed envelope tracking, these methods solve an altered form of Maxwell’s equations wherein the high frequency offset present in the fields is analytically prescribed. Performing this operation, effectively a Hilbert Transform of Faraday’s and Ampere’s laws, yields a new set of equations that still vary in time, but can be discretized at the bandwidth of the signal, which is significantly lower than the maxmial frequency in the unshifted system. Extending these results to a system with moving macroparticles is challenging for two primary reasons: (1) While the fields themselves oscillate within a narrowband high frequency window, the particle positions and velocities generally have significant DC components that force analysis of Newton’s equations at the highest frequency of interest. (2) While recent advances make it possible to conserve charge within an implicit time 41 marching scheme [76], these techniques break down when used with an analytically prescribed plane wave component (We discuss this in greater detail in Section 3.2). In the remainder of this chapter, we construct an EM-FEMPIC method wherein the fields are solved for using an envelope tracking method, henceforth referred to as ET-FEMPIC. We address both of the challenges listed above by (1) constructing an integration scheme for the particle system that allows Newton’s laws to be solved at the highest frequency of interest while being mapped self consistently with the Maxwell solver and (2) by constructing a Quasi-Helmholtz framework to explicitly satisfy the Coulomb gauge. We show analytically and through numerical experiments that this framework conserves charge to machine precision while producing results that match a traditional EM-FEMPIC implementation. 3.1 Problem Statement Consider a region Ω ∈ R3 bounded by a surface 𝜕Ω containing a single charged species. This region is subjected to an external field due to which the charged species accelerate, and in turn produce spatially and temporally varying electric and magnetic fields denoted by E(r, 𝑡) and B(r, 𝑡), respectively, with r ∈ Ω and 𝑡 ∈ [0, ∞). The dynamics of the particles in phase space can be represented by a distribution function (PSDF) 𝑓 (𝑡, r, v) that follows the Vlasov equation: 𝜕𝑡 𝑓 (𝑡, r, v) + v · ∇ 𝑓 (𝑡, r, v)+ 𝑞 𝑚 [E(r, 𝑡) + v × B(r, 𝑡)] · ∇𝑣 𝑓 (𝑡, r, v) =0. (3.1) In what follows, we assume that the background media in Ω is free space. As a result, we denote the permittivity and permeability of free space by 𝜖0 and 𝜇0, respectively, and the speed of light by 𝑐. In what follows, we will assume that either the external fields impressed on Ω or the source exciting fields is narrowband in that the center frequency of the excitation 𝑓0 ≫ 𝑓𝑏𝑤, where 𝑓𝑏𝑤 is the bandwidth. Likewise, we will assume that the system is quiescent for 𝑡 ≤ 0. 42 3.2 Formulation We follow the usual path of not solving (3.1) directly, instead representing the moments of the PSDF using a charge and current density as, 𝜌(𝑡, r) = 𝑞 ∫ 𝑞 ∫ 𝑓 (𝑡, r, v)𝑑v and J(𝑡, r) = Ω v 𝑓 (𝑡, r, v)𝑑v. We then use a particle approximation of these moments, and evolve their location and velocity together with Maxwell’s equations. As a result, assuming a shape Ω functions 𝑆(r), one obtains 𝜌(𝑡, r) = 𝑞 𝑁𝑝 (cid:213) 𝑝=1 𝑆(r − r𝑝(𝑡)), J(𝑡, r) = 𝑞 𝑁𝑝 (cid:213) 𝑝=1 v𝑝(𝑡)𝑆(r − r𝑝(𝑡)). (3.2a) (3.2b) where 𝑁𝑝 denotes the number of macroparticles. Furthermore, r𝑝(𝑡) and v𝑝(𝑡) refer to the positions and velocities as a function of time of the 𝑝th macroparticle. With no loss of generality, we will assume that there exist a source J𝑖(r𝑠 , 𝑡) = ˜J𝑖(r𝑠 , 𝑡)𝑒 𝑗𝜔0𝑡 at points r𝑠 that excites the system. Given this temporal dependence, we posit that the field and fluxes have a similar behavior, i.e., E(r, 𝑡) = ˜E(r, 𝑡)𝑒 𝑗𝜔0𝑡 and B(r, 𝑡) = ˜B(r, 𝑡)𝑒 𝑗𝜔0𝑡. In these expressions, the quantities with a tilde are slowly varying with respect to time. Using these expressions in Maxwell’s equations, we can write ∇ × ˜E(r, 𝑡) = −𝑗𝜔0 ˜B(r, 𝑡) − 𝜕𝑡 ˜B(r, 𝑡), ∇ × 𝜇−1 0 ˜B(r, 𝑡) = ˜J𝑖(r𝑠 , 𝑡) + J(r, 𝑡)𝑒−𝑗𝜔0𝑡 + 𝑗𝜔0𝜖0 ˜E(r, 𝑡) + 𝜖0𝜕𝑡 ˜E(r, 𝑡), Following [76], we replace J(r, 𝑡) with its time integral G(r, 𝑡) = 𝑡 ∫ 0 𝑡 𝑁𝑝 (cid:213) ∫ J(r, 𝜏)𝑑𝜏 = 𝑞 v𝑝(𝜏)𝛿(r − r𝑝(𝜏))𝑑𝜏 𝑝=1 𝑁𝑝 (cid:213) = 𝑞 0 r𝑝(𝑡) ∫ 𝑑˜r𝛿(r − ˜r) 𝑝=1 r𝑝(0) 43 (3.3a) (3.3b) (3.3c) in (3.3b), to obtain ∇ × 𝜇−1 0 ˜B(r, 𝑡) = ˜J𝑖(r𝑠 , 𝑡) + 𝑒−𝑗𝜔0𝑡 𝜕𝑡G(r, 𝑡) + 𝑗𝜔0𝜖0 ˜E(r, 𝑡) + 𝜖0𝜕𝑡 ˜E(r, 𝑡) and satisfy Gauss’ electric and magnetic laws: ∇ · 𝜖0 ˜E(r, 𝑡) = 𝜌𝑖(r𝑠 , 𝑡) + 𝜌(r, 𝑡) exp{−𝑗𝜔0𝑡}, ∇ · ˜B(r, 𝑡) = 0. (3.3d) (3.4a) (3.4b) It is important to note at this point that G(r, 𝑡) and 𝜌(r, 𝑡) are not necessarily narrowband. As a result, they cannot be decomposed into fast and slow varying components. As usual, boundary conditions need to be imposed on ˜E(r, 𝑡) and ˜B(r, 𝑡) on sections of the outer boundary 𝜕Ω. These are assumed to either Dirichlet, Neumann or impedance boundary conditions on non-overlapping surfaces 𝜕Ω𝐷, 𝜕Ω𝑁 and 𝜕Ω𝐼, and are defined as follows: ˆ𝑛 × ˜E(r, 𝑡) = Ψ𝐷(r, 𝑡) on Ω𝐷 , ˆ𝑛 × 𝜇−1 ˜B(r, 𝑡) = Ψ𝑁 (r, 𝑡) on Ω𝑁 , ˆ𝑛 × 𝜇−1 ˜B(r, 𝑡) − 𝑌 ˆ𝑛 × ˆ𝑛 × ˜E(r, 𝑡) = Ψ𝐼(r, 𝑡) on Ω𝐼 . (3.5a) (3.5b) (3.5c) Note, it is assumed that 𝜕Ω𝐷 + 𝜕Ω𝑁 + 𝜕Ω𝐼 = 𝜕Ω. The evolution of the macroparticles in space and time is determined by solving for the equations of motion with the acceleration determined by the Lorentz force. This yields the following coupled system of equations for ordinary differential equations (ODEs) for v𝑝(𝑡) and r𝑝(𝑡): 𝑑v𝑝(𝑡) 𝑑𝑡 𝑞 𝑚 = (cid:2) ˜E(r𝑝(𝑡), 𝑡) + v𝑝(𝑡) × ˜B(r𝑝(𝑡), 𝑡)(cid:3) exp{𝑗𝜔0𝑡} 𝑑r𝑝(𝑡) 𝑑𝑡 = v𝑝(𝑡) (3.6a) (3.6b) In what follows, we present a method to self-consistently solve both Maxwell’s equations and equations of motion, especially under the narrow band approximation. 44 Figure 3.1: Depiction of electric field ¯𝑒 𝑛 and magnetic flux density ¯𝑏𝑛 coefficients along a cell consisting of a single tetrahedron. The arrows denote the directions of the unit vector along the edges and unit normal along the faces. Discretization in Space and Time Solutions to Eq. (3.3) are obtained by spatially representing ˜E(r, 𝑡) and ˜B(r, 𝑡) in terms of Whitney edge and face basis functions respectively (shown for a single tet in Fig. 3.1): ˜E(r, 𝑡) = ˜B(r, 𝑡) = 𝑁𝑒(cid:213) 𝑖=1 𝑁 𝑓 (cid:213) 𝑖=1 1 𝑒𝑖(𝑡)W 𝑖 (r), 2 𝑖 (r). 𝑏𝑖(𝑡)W (3.7) 1 Here, W 𝑖 2 𝑡 (r) represent the lowest order Whitney edge function defined on the 𝑖th (r) and W edge and face function defined on the 𝑡th face, respectively. Further, 𝑁𝑒 and 𝑁 𝑓 denote the number of edges and faces in the tetrahedral mesh used to discretize Ω; details on mixed finite elements can be found [82, 83, 84, 85, 86] and reference therein. Using Gakerkin testing results in the following matrix ODE to solve for the vector of field coefficients ¯𝐵(𝑡) and ¯𝐸(𝑡) in time: (cid:105) (cid:104) ¯¯𝑀 · + (cid:105) (cid:104) ¯¯𝑆 · ¯𝐵(𝑡)       ¯𝐸(𝑡)       ¯¯𝐹 = 𝜕𝑡 𝜕𝑡       ¯𝐵(𝑡)       ¯𝐸(𝑡) (3.8) 45 where the various matrix definitions are as follows:       + 𝑗𝜔0  [★𝜇−1]      0 , 0     [★𝜖]   (3.9) (cid:104) ¯¯𝑆 (cid:104) ¯¯𝑀 (cid:105) (cid:105) =             ¯¯𝐹 = − = 0 [∇×] −[∇×]𝑇 0 , 0     [★𝜖]   −       𝜕𝑡 [★𝜇−1] 0 0 ¯𝐽𝑖(𝑡) 𝜖0             0 .       exp{−𝑗𝜔0𝑡} ¯𝐺(𝑡) 𝜖0 (𝑡)(cid:3) 𝑇, ¯𝐸(𝑡) = [𝑒1(𝑡), 𝑒2(𝑡), · · · , 𝑒𝑁𝑒 (𝑡)]𝑇, Furthermore, ¯𝐵(𝑡) = (cid:2)𝑏1(𝑡), 𝑏2(𝑡), · · · , 𝑏𝑁 𝑓 (cid:68) [𝑔1(𝑡), 𝑔2(𝑡), · · · , 𝑔𝑁𝑒 (𝑡)]𝑇 where 𝑔𝑖(𝑡) = (1) 𝑗 ¯𝐺(𝑡) = and ¯𝐽𝑖(𝑡) = [𝑗1(𝑡), 𝑗2(𝑡), · · · , 𝑗𝑁𝑒 (𝑡)]𝑇 . We preface at this point that the method reduces to where 𝑗𝑗(𝑡) = , ˜J𝑖(r𝑠 , 𝑡) , G(r, 𝑡) (1) 𝑖 W W (cid:69) (cid:68) (cid:69) a traditional EM-FEMPIC solve when 𝜔0 = 0, so the proposed method represents a generalization of the EM-FEMPIC formulation. To convert (3.8) into a discrete stencil at different intervals of time, we utilize a Newmark-𝛽 [74] scheme with 𝛾 = 0.5 and 𝛽 = 0.25. Furthermore, we define Δ𝑡,𝜔0 = (30𝜉 𝑓max)−1 and Δ𝑡,𝜔𝑏𝑤 = (30𝜉 𝑓bw)−1 as the timestep size appropriate for the fast varying and downshifted systems respectively, with 𝜉 being a real oversampling factor. This choice of parameters results in representing the field coefficients in terms of second order Lagrange polynomials in time and testing with an average acceleration condition (represented by 𝑊(𝑡)) such that for 𝑡 ∈ [𝑡𝑛−1, 𝑡𝑛+1] where 𝑡𝑛 = 𝑛Δ𝑡,𝜔bw ¯𝐵(𝑡) (cid:170) (cid:174) (cid:172) ¯𝐸(𝑡) (cid:169) (cid:173) (cid:171) = = 2 (cid:213) 𝑘=0 2 (cid:213) 𝑘=0 𝑁𝑛,𝑘(𝑡) (cid:169) (cid:173) (cid:171) ¯𝐵(𝑡𝑛+𝑘−1) (cid:170) (cid:174) ¯𝐸(𝑡𝑛+𝑘−1) (cid:172) 𝑁𝑛,𝑘(𝑡) (cid:169) (cid:173) (cid:171) ¯𝐵𝑛+𝑘−1 ¯𝐸𝑛+𝑘−1 , (cid:170) (cid:174) (cid:172) 𝐿𝑛,𝑘(𝑡) = 𝑡 − 𝑡𝑛+1−𝑗 𝑡𝑛+1−𝑘 − 𝑡𝑛+1−𝑗 2 (cid:214) 𝑗=0 𝑗≠𝑘 46 (3.10a) (3.10b) 𝐿𝑛,𝑘(𝑡) 𝑁𝑛,𝑘(𝑡) = 𝑡 ∈ [𝑡𝑛−1, 𝑡𝑛+1] otherwise. 𝑊𝑛(𝑡) = 𝑡𝑛−𝑡 Δ𝑡 𝑡−𝑡𝑛 Δ𝑡 𝑡 ∈ [𝑡𝑛−1, 𝑡𝑛] 𝑡 ∈ [𝑡𝑛 , 𝑡𝑛+1] . 0 otherwise 0       (3.10c) (3.11) Discretization and Exact Charge Conservation Next, we re-examine rubrics of a charge conserving PIC scheme from a slightly different perspective. The discrete divergence of the set of equations derived (3.8) should yield Gauss’ laws. But a word of caution, the discrete equations are a result of measuring their continuous counterparts in time. More precisely, we note that the discrete divergence of Ampere’s laws yields 𝜖0 [∇]𝑇 (cid:0)[★𝜖] 𝜕𝑡 ¯𝐸(𝑡) + 𝑗𝜔0 [★𝜖] ¯𝐸(𝑡)(cid:1) = − [∇]𝑇 (cid:18) ¯𝐽𝑖(𝑡) + 𝑒−𝑗𝜔0𝑡 𝜕𝑡 ¯𝐺(𝑡) (cid:19) . Discretizing Gauss law, however, yields 𝜖0 [∇]𝑇 [★𝜖] ¯𝐸(𝑡) = ¯𝜌𝑖(𝑡) + ¯𝜌(𝑡) (3.12) (3.13) where ¯𝜌𝑖(𝑡) = [𝜌𝑖(𝑡), · · · , 𝜌𝑁𝑛 (𝑡)]𝑇, 𝜌𝑖(𝑡) = ⟨𝑊 0 (r), 𝜌𝑖(r𝑠 , 𝑡)⟩, and ¯𝜌(𝑡) = [𝜌1(𝑡), · · · , 𝜌𝑁𝑛 (𝑡)]𝑇 = 𝑖 [∇]𝑇 ¯𝐺(𝑡)𝑒−𝑗𝜔0𝑡. Note, in the above equations, the [∇]𝑇 matrix effects a divergence of the flux density (which lies in the dual grid) in terms of quantities defined on the primal grid. In practice, as one only evolves the curl equations, the solution at every time step will automatically satisfy (3.12) provided care is taken in constructing the right hand side [76]. But what is critical for satisfaction of Gauss’ law is that this discrete solution also 47 satisfies (3.13). In the text that follows, we take some liberties in verbiage: (a) it will be understood that (3.12) and (3.13) are not explicitly discretized in time and evolved; (b) the system of equations resulting from taking the discrete divergence of the time discretized Ampere’s laws and those from temporal discretization of (3.12) will be identical; (c) we use the word discretization and measurement interchangeably they imply an inner product with a temporal basis function; and (d) our goal is to understand how one should measure or discretize (3.13) so as to to be consistent with the measurement of (3.12). Indeed, if we choose different temporal basis to measure (3.12) and (3.13), it must be shown that the two measures are consistent. For instance, if the basis functions used to measure (3.12) are first order and those used to measure (3.13) are delta functions, they will yield different results; this is a point that we will return to later. Note, we have specifically, assumed that ¯𝐺(𝑡) is known. Indeed, as shown in [76, 75], using ¯𝐺(𝑡) is critical for exact satisfaction of Gauss’ electric law. To understand these issues better, we analyze two cases; (a) 𝜔0 = 0 and (b) 𝜔0 ≠ 0. To simplify our discussion, we assume both ¯𝐽𝑖(𝑡) = 0 = ¯𝜌𝑖(𝑡). Case 1: 𝜔0 = 0 When 𝜔0 = 0, (3.12) simplifies to 𝜖0 [∇]𝑇 [★𝜖] 𝜕𝑡 ¯𝐸(𝑡) = − [∇]𝑇 𝜕𝑡 ¯𝐺(𝑡) (3.14) Note, parenthetically, we note that [∇]𝑇 ¯𝐺(𝑡) = − ¯𝜌(𝑡). In keeping with the Newmark-𝛽 scheme, to ensure late time stability, we have to use 𝑊𝑛(𝑡) as testing functions in time. If a scheme is charge conserving, then at any point 𝑡𝑛, the system arising from the discrete divergence of Ampere’s law ⟨𝑊𝑛(𝑡), 𝜖0 [∇]𝑇 [★𝜖] 𝜕𝑡 ¯𝐸(𝑡)⟩ = ⟨𝑊𝑛(𝑡), 𝜕𝑡 ¯𝜌(𝑡)⟩ (3.15) should satisfy Gauss’ law, but under what measure. To make the ensuing text notationally less dense, we use ¯𝜙(𝑡) = 𝜖0 [∇]𝑇 [★𝜖] ¯𝐸(𝑡). Thus, (3.15) can be written as ⟨𝑊𝑛(𝑡), 𝜕𝑡 ¯𝜙(𝑡)⟩ = ⟨𝑊𝑛(𝑡), 𝜕𝑡 ¯𝜌(𝑡)⟩. (3.16) 48 Evaluation of this inner product results in ¯𝜙𝑛+1 − ¯𝜙𝑛−1 2 − ¯𝜌𝑛+1 − ¯𝜌𝑛−1 2 = 0, (3.17) which is the relations that coefficients of the electric field will satisfy in keeping with the substitution defined earlier. The questions are two fold; (a) since Gauss’ law in (3.13) is never solved, what is an equivalent discrete system, and (b) will the update equation for this system be consistent with those obtained from (3.16). This may seem trivial as both sides of (3.14) have a time derivative and one can evaluate them analytically to obtain (3.13) together with null initial conditions. But it is important to remember (a) that right hand side of Ampere’s law is deliberately chosen to be different from the conventional one where one discretizes current [71] and (b) we effect the solution of (3.8) by choosing temporal basis sets such that the solution is unconditionally stable, and do not treat it as a first order ODE. Under these circumstances, we can ask what should (cid:101)𝑊𝑛(𝑡) to reduce (3.13) to a discrete system. To make our analysis as general as possible, lets us denote (cid:101)𝑊𝑛(𝑡) as a basis that is used to discretize (3.13): ⟨ (cid:101)𝑊𝑛(𝑡), 𝜖0 [∇]𝑇 [★𝜖] ¯𝐸(𝑡)⟩ = ⟨ (cid:101)𝑊𝑛(𝑡), ¯𝜌(𝑡)⟩. Using the abbreviated notation defined earlier, we get ⟨ (cid:101)𝑊𝑛(𝑡), ¯𝜙(𝑡)⟩ = ⟨ (cid:101)𝑊𝑛(𝑡), ¯𝜌(𝑡)⟩ Consider two potential choices; for (cid:101)𝑊𝑛(𝑡) = 𝛿 (𝑡 − 𝑡𝑛+1), one gets ⟨𝛿(𝑡 − 𝑡𝑛+1), ¯𝜙(𝑡)⟩ = ⟨𝛿(𝑡 − 𝑡𝑛+1), ¯𝜌(𝑡)⟩ ¯𝜙𝑛+1 = ¯𝜌𝑛+1 and for (cid:101)𝑊𝑛(𝑡) = 𝑊𝑛(𝑡), one gets ¯𝜙𝑛+1 + 2 ¯𝜙𝑛 + ¯𝜙𝑛−1 4 − ¯𝜌𝑛+1 + 2 ¯𝜌𝑛 + ¯𝜌𝑛−1 4 = 0. 49 (3.18) (3.19) (3.20) (3.21) In (3.21), if we were to assume an quiescent initial condition at 𝑡 = 0, it is trivial to show that one recovers solutions obtained in (3.17). However, for (3.21) one needs Gauss’ laws to be satisfied at both 𝑡 = {0, Δ𝑡 } for it to be consistent with solutions in (3.17). Indeed, one can use any test function as long as it is piecewise continuous, causal and one imposes additional initial conditions. We digress a little to note that this is a consequence of using G(𝑡, r) and not J(𝑡, r). To set the stage, consider ¯𝐽(𝑡) = (cid:2)˜𝑗1(𝑡), ˜𝑗2(𝑡), · · · , ˜𝑗𝑁𝑒 (𝑡)(cid:3) 𝑇 where ˜𝑗𝑖(𝑡) = standard discretized Ampere’s yields (cid:68) (1) 𝑖 W , J(r, 𝑡) (cid:69) . The 𝜖0 [∇]𝑇 [★𝜖] 𝜕𝑡 ¯𝐸(𝑡) = [∇]𝑇 ¯𝐽(𝑡). Using ¯𝜂(𝑡) = [∇]𝑇 ¯𝐽(𝑡) and testing with 𝑊𝑛(𝑡) simplifies the notation to (cid:10)𝑊𝑛(𝑡), 𝜕𝑡 ¯𝜙(𝑡)(cid:11) = ⟨𝑊𝑛(𝑡), ¯𝜂(𝑡)⟩ Evaluating the integrals as listed above yields the following stencil: ¯𝜙𝑛+1 − ¯𝜙𝑛−1 = Δ𝑡,𝜔bw ¯𝜂𝑛+1 + 2 ¯𝜂𝑛 + ¯𝜂𝑛−1 4 (3.22) (3.23) (3.24) But Gauss’ law is slightly different; indeed, ¯𝜌(𝑡) = ∫ 𝑡 𝑜 𝑑𝜏 ¯𝜂(𝜏). This implies that the discrete evolution of Gauss’ law yields (cid:10)𝑊𝑛(𝑡), ¯𝜙(𝑡)(cid:11) = ⟨𝑊𝑛(𝑡), ¯𝜌(𝑡)⟩ Then (3.25) simplifies to ¯𝜙𝑛+1 + 2 ¯𝜙𝑛 + ¯𝜙𝑛−1 = ¯𝜌𝑛+1 + 2 ¯𝜌𝑛 + ¯𝜌𝑛−1. (3.25) (3.26) But as only ¯𝜂(𝑡) is available, one would need to integrate this to obtain the charge density. Choosing a backward Euler scheme for illustration, we obtain ¯𝜌𝑛 = ¯𝜌𝑛−1 + Δ𝑡,𝜔bw ¯𝜂𝑛. Using (3.27) in (3.24) results in ¯𝜙𝑛+1 − ¯𝜙𝑛−1 = 1 4 (cid:16) ¯𝜌𝑛+1 + ¯𝜌𝑛 − ¯𝜌𝑛−1 − ¯𝜌𝑛−2(cid:17) 50 (3.27) (3.28) It is apparent from above equations, that irrespective of the integration scheme, the coefficient obtained by solving Ampere’s law will not satisfy the discrete Gauss’ law. The above discussion presents the nuances of the challenge in ensuring that the solution of the discrete curl equations satisfy Gauss’ law. When 𝜔0 = 0, one can somewhat get around the challenge by defining ¯𝐺(𝑡). As we will see, this challenge cannot be overcome when 𝜔0 ≠ 0. Case 2: 𝜔0 ≠ 0 Next, we consider the effect of adding an analytically prescribed fast-varying function to our field solution. As before, we begin our analysis with the divergence of Ampere’s Law in (3.12) and obtain 𝜖0 [∇]𝑇 (cid:0)[★𝜖] 𝜕𝑡 ¯𝐸(𝑡) + 𝑗𝜔0 [★𝜖] ¯𝐸(𝑡)(cid:1) = − [∇]𝑇 (cid:16) 𝑒−𝑗𝜔0𝑡 𝜕𝑡 ¯𝐺(𝑡) (cid:17) . (3.29) Using the same notation as the previous subsection, suppose 𝜙 (𝑡) = ˜𝜙 (𝑡) 𝑒 𝑗𝜔𝑡. We can represent this as 𝜙 (𝑡) = 𝑛+1 (cid:213) 𝑖=𝑛−1 ˜𝜙𝑖 𝑁𝑖 (𝑡) 𝑒 𝑗𝜔𝑡 (3.30) where 𝑁𝑖 (𝑡) represents the regular second order Newmark basis functions. 𝜌 is defined as is normally done in Newmark: Now, 𝜕𝑡 𝜙 − 𝜕𝑡𝜌 becomes 𝜌 (𝑡) = 𝑛+1 (cid:213) 𝑖=𝑛−1 𝜌𝑖 𝑁𝑖 (𝑡) 𝑛+1 (cid:213) 𝑖=𝑛−1 (cid:104) ˜𝜙𝑖𝜕𝑡 (cid:16) 𝑁𝑖 (𝑡) 𝑒 𝑗𝜔𝑡 (cid:17) + 𝜌𝑖𝜕𝑡 𝑁𝑖 (𝑡) (cid:105) = 0 (3.31) (3.32) 51 𝑊-testing gives us (cid:42) 𝑊 (𝑡) , 𝑛+1 (cid:213) 𝑖=𝑛−1 (cid:104) ˜𝜙𝑖𝜕𝑡 (cid:16) 𝑁𝑖 (𝑡) 𝑒 𝑗𝜔𝑡 (cid:17) + 𝜌𝑖𝜕𝑡 𝑁𝑖 (𝑡) (cid:43) (cid:105) (cid:42) = 𝑊 (𝑡) , (cid:42) + 𝑊 (𝑡) , 𝑛+1 (cid:213) 𝑖=𝑛−1 𝑛+1 (cid:213) 𝑖=𝑛−1 ˜𝜙𝑖 (cid:16) 𝑁𝑖 (𝑡) 𝜕𝑡 (cid:16) 𝑒 𝑗𝜔𝑡 (cid:17) + 𝑒 𝑗𝜔𝑡 𝜕𝑡 𝑁𝑖 (𝑡) (cid:43) (cid:17) (3.33) 𝜌𝑖𝜕𝑡 𝑁𝑖 (𝑡) (cid:43) = 0 Thus, there are 6 relevant terms in the expansion of (cid:10)𝑊 , 𝜙 (𝑡)(cid:11). They are as follows: (cid:68) 𝑊 (𝑡) , 𝜕𝑡 (cid:16) 𝑁𝑘 𝑒 𝑗𝜔𝑡 (cid:17) (cid:69) = 𝑒 𝑗𝜔𝑡𝑛 𝑛+1 (cid:213) 𝑘=𝑛−1 (cid:2)𝛼𝑘,0 + 𝛼𝑘,Δ𝑡 𝑒 𝑗𝜔Δ𝑡 + 𝛼𝑘,−Δ𝑡 𝑒−𝑗𝜔Δ𝑡 (cid:3) where 𝛼𝑛+1,0 = 4𝑗 + 2𝜔Δ𝑡 2𝜔2Δ3 𝑡 2𝑗 + 𝜔Δ𝑡 2𝜔2Δ3 𝑡 𝛼𝑛+1,−Δ𝑡 = − 𝛼𝑛+1,Δ𝑡 = 𝛼𝑛,0 = 𝛼𝑛,−Δ𝑡 = 𝛼𝑛,Δ𝑡 = 𝛼𝑛−1,0 = 𝛼𝑛−1,−Δ𝑡 = −2𝑗 − 3𝜔Δ𝑡 + 2𝑗 (𝜔Δ𝑡)2 + 2 (𝜔Δ𝑡)3 2𝜔2Δ3 𝑡 −4𝑗 − 2𝑗 (𝜔Δ𝑡)2 𝜔2Δ3 𝑡 2𝑗 − 2𝜔Δ𝑡 𝜔2Δ3 𝑡 2𝑗 + 2𝜔Δ𝑡 𝜔2Δ3 𝑡 4𝑗 − 2𝜔Δ𝑡 2 (𝜔Δ𝑡)3 −2𝑗 + 3𝜔Δ𝑡 + 2𝑗 (𝜔Δ𝑡)2 − 2 (𝜔Δ𝑡)3 2 (𝜔Δ𝑡)3 𝛼𝑛−1,Δ𝑡 = −2𝑗 − 𝜔Δ𝑡 2 (𝜔Δ𝑡)3 (3.34) (3.35a) (3.35b) (3.35c) (3.35d) (3.35e) (3.35f) (3.35g) (3.35h) (3.35i) We can conclude by inspection from the coefficients defined in (3.35a) that for general 𝜔, all nine coefficients can be nonzero. As a result, the quantities 𝛽𝑘 = (cid:2)𝛼𝑘,0 + 𝛼𝑘,Δ𝑡 𝑒 𝑗𝜔Δ𝑡 + 𝛼𝑘,−Δ𝑡 𝑒−𝑗𝜔Δ𝑡 (cid:3) 52 defined for 𝑘 = {𝑛 − 1, 𝑛, 𝑛 + 1} must also generally by nonzero. Then, (cid:10)𝑊 (𝑡) , 𝜕𝑡 𝜙 − 𝜕𝑡𝜌(cid:11) = 𝛽𝑛+1𝜙𝑛+1 + 𝛽𝑛 𝜙𝑛+ 𝛽𝑛−1𝜙𝑛−1𝑒 𝑗𝜔𝑡𝑛 − (cid:18) 𝜌𝑛+1 − 𝜌𝑛−1 2 (cid:19) = 0 (3.36) cannot be captured in a pointwise manner, since all three coefficients multiplying 𝜙𝑘 can generally by nonzero, while the coefficients multiplying 𝜌𝑘 only exist when 𝑘 = {𝑛 − 1, 𝑛 + 1}. If charge conservation is achieved, then we should be able to find a testing function (cid:101)𝑊𝑛(𝑡) that, when used to measure Gauss’ Law will reduce it to a form that is satisfied by the solution to (3.32). Let us first begin by setting (cid:101)𝑊𝑛(𝑡) = 𝛿(𝑡 − 𝑡𝑛). Testing Gauss’ law with this function will lead to (cid:10)𝛿(𝑡 − 𝑡𝑛), ¯𝜙(𝑡)(cid:11) = (cid:10)𝛿(𝑡 − 𝑡𝑛), exp{−𝑗𝜔0𝑡} ¯𝜌(𝑡)(cid:11) which, when evaluated leads to the following relation for ¯𝜙𝑛 ¯𝜙𝑛 = exp{−𝑗𝜔0𝑡𝑛 } ¯𝜌𝑛 (3.37) (3.38) Comparing this relation to (3.32), we find that the pointwise relation for 𝜙𝑛 does not satisfy the stencil for the divergence of Ampere’s Law. Interestingly, the solutions do agree in the limit when 𝜔0 → 0, but otherwise, they are inconsistent. Likewise, if we were to perform a similar analysis with (cid:101)𝑊𝑛(𝑡) = 𝑊𝑛(𝑡), we end up with the following update expression for 𝜙𝑛+1 Δ𝑡,𝜔bw 4 ¯𝜙𝑛+1 = − Δ𝑡,𝜔bw 2 ¯𝜙𝑛 − Δ𝑡,𝜔bw 4 ¯𝜙𝑛−1 + 𝛼𝑛+1(𝜔0) ¯𝜌𝑛+1 (3.39) + 𝛼𝑛(𝜔0) ¯𝜌𝑛 + 𝛼𝑛−1(𝜔0) ¯𝜌𝑛−1 Once again, the coefficients 𝛼𝑛+1(𝜔0), 𝛼𝑛(𝜔0) and 𝛼𝑛−1(𝜔0) refer respectively to the inner products of 𝑊𝑛(𝑡) and exp{−𝑗𝜔0𝑡} multiplying the basis functions used to represent the field quantities in time 𝑁𝑛,0(𝑡), 𝑁𝑛,1(𝑡)and 𝑁𝑛,2(𝑡) respectively (defined in (3.10c)). As before, (3.39) and (3.32) agree in the limit of 𝜔0 → 0, they are inconsistent for finite 𝜔0. While neither choice of (cid:101)𝑊𝑛(𝑡) results in discrete systems that are consistent with (3.32), and there may never be, we take a different and more robust path. In addition to the above 53 difficulties, we note the following: The standard Newmark time stepping scheme excites a null space that is time independent. While the null space can be small, it will not have a trivial divergence and will corrupt the satisfaction of Gauss’ law [83]. A way to overcome both problems is to use a Helmholtz decomposition or impose a Coulomb gauge in the discrete setting. The means to do so is elaborated next. Quasi Helmholtz or Coulomb Gauge Much of the development of what follows has been detailed in our earlier paper [77]. The following discussion is purely for completeness of the paper. In a nutshell, ˜E(r, 𝑡) and ˜B(r, 𝑡) can decomposed into solenoidal (components that are divergence free) and non-solenoidal components (have a finite curl). Hence, the use of the prefix “quasi.” In this framework, Gauss’ laws are explicitly discretized and solved. Furthermore, while the solution for the solenoidal component has a null space, its divergence is exactly zero. As a result, Gauss’ laws are exactly satisfied. In what follows, we use ¯𝐸𝑛 𝑛𝑠 refers to the non-solenoidal coefficients of the electric field at the 𝑛th timestep. Likewise, ¯𝐸𝑛 𝑠 and ¯𝐵𝑛 𝑠 refer to the divergence-free solenoidal coefficients of the electric field and magnetic flux density, respectively. A complete prescription of all sub-matrices involved are provided in Section 1.4. Requisite Proejctors To project the non-solenoidal components from the basis used for representing the electric field, we define projectors [ ¯𝑃]Σ 𝑒 and [ ¯𝑃]Λ 𝑒 that, when operated on the complete electric field, respectively extract the non-solenoidal and solenoidal components (where † represents a Moore-Penrose pseudoinverse): [ ¯𝑃]Σ 𝑒 = Σ(Σ𝑇Σ)†Σ𝑇 [ ¯𝑃]Λ 𝑒 = ℐ − [ ¯𝑃]Σ 𝑒 54 (3.40a) (3.40b) (cid:2) ¯𝑃(cid:3) Λ 𝑏 = ℐ − Σ𝑚 (cid:16) Σ𝑇 𝑚Σ𝑚 (cid:17) † Σ𝑇 𝑚 (3.40c) where [Σ] = 𝜖0[ ¯𝑀𝑔] and [Σ]𝑚 = [∇·]𝑇. Using these projectors„ we can now define a decomposition for the electric flux density as ¯𝐷(𝑡) = Σ ¯𝐸𝑛𝑠(𝑡) + (cid:2) ¯𝑃(cid:3) Λ 𝑒 ¯𝐷(𝑡) and the magnetic flux density as ¯𝐵𝑠(𝑡) = (cid:2) ¯𝑃(cid:3) Λ 𝑏 ¯𝐵(𝑡) (3.41) (3.42) Using this projector, it is rather straightforward to show that the divergence of ¯𝐵(𝑡) is zero. Discrete System To use this decomposition, we use all of Maxwell’s equations. First given the decom- position, the application of the discrete divergence on (3.42) and (3.41) results in zero and [𝐶 𝑒 𝑧]𝑇 [∇]𝑇 [★𝑒] [∇] [𝐶 𝑒 𝑧] ¯𝐸𝑛𝑠(𝑡) = − [𝐶 𝑒 𝑧]𝑇 [∇]𝑇 𝑒−𝑗𝜔0𝑡 ¯𝐺(𝑡). (3.43) Upon delta-testing this expression at timestep 𝑛, we get a direct relation between ¯𝐸𝑛𝑠(𝑡) and ¯𝐺(𝑡). Likewise, using decomposition in (3.8) and with (3.43) results in ¯𝐵𝑠(𝑡) + (cid:2) ¯𝑍(cid:3) ¯𝐸𝑠(𝑡) = − (cid:2) ¯𝑍(cid:3) ¯𝐸𝑛𝑠(𝑡) 13 12 (cid:2) ¯𝑍(cid:3) (cid:2) ¯𝑍(cid:3) 11 21 𝜕𝑡 ¯𝐵𝑠(𝑡)+𝑗𝜔0 (cid:2) ¯𝑍(cid:3) 𝜕𝑡 ¯𝐸𝑠(𝑡)+𝑗𝜔0 (cid:2) ¯𝑍(cid:3) 11 21 −𝑒−𝑗𝜔0𝑡 𝜕𝑡 ¯𝐺(𝑡) − (cid:2) ¯𝑍(cid:3) 𝑗𝜔0 23 ¯𝐸𝑠(𝑡) − (cid:2) ¯𝑍(cid:3) ¯𝐵𝑠(𝑡) = 22 ¯𝐸𝑛𝑠(𝑡) − (cid:2) ¯𝑍(cid:3) 𝜕𝑡 ¯𝐸𝑛𝑠(𝑡) 23 (3.44) where ¯𝐸𝑠(𝑡) and ¯𝐵𝑠(𝑡) are extracted from the simply connected mesh using tree-cotree (cid:3) that are defined in Section 1.4. The evaluation of time derivatives is maps [𝐶 𝑒 𝑐 ] and (cid:2)𝐶𝑏 𝑐 effected through the basis defined earlier; it has been extensively discussed in [76]. As alluded to earlier, since we directly discretize Gauss’ laws and since the discrete divergence of (3.44) is identical to zero, charge conservation is exactly satisfied. 55 Particle Push and Integration Scheme for ¯𝐺 Next, we describe the framework used to evolve particle trajectories following (3.6). We recall from Section 3.2 that the error in the non-solenoidal component of the electric field is tied intimately to the accuracy in the evaluation of ¯𝐺(𝑡), making it important to evaluate the particle positions and velocities to high accuracy. But it is also known that the Fourier transform of the velocity or the position of any particle is not necessarily narrow-band. As a result, the equations of motion and the path integral need to be evaluated at a step size Δ𝑡,𝜔0, while samples of the field are only known at Δ𝑡,𝜔bw . There are two ways to get around this mismatch. First, one can incorporate the analytically known fast varying field components into the Lorentz update (r𝑝 , 𝑡) = 𝑞𝑝 𝑚𝑝 Re (cid:8) ˜E(r𝑝 , 𝑡) exp{𝑗𝜔0𝑡}(cid:9) (3.45) v(r𝑝 , 𝑡) × Re (cid:8) ˜B(r𝑝 , 𝑡) exp{𝑗𝜔0𝑡}(cid:9) , 𝑑v 𝑑𝑡 𝑞𝑝 𝑚𝑝 + which can be numerically integrated with the fast varying components treated analytically. The specifics of the resulting integration rule can be written as v(r 𝑛+1 𝑝 , 𝑡𝑛+1) − v(r 𝑛 𝑝 , 𝑡𝑛) = 𝑞𝑝 𝑚𝑝 2 (cid:213) 𝑘=0 E(r 𝑛−1−𝑘 𝑝 , 𝑡𝑛−1−𝑘) (cid:16) 𝑀 𝑟 𝑛,𝑘 + 𝑀 𝑖 𝑛,𝑘 (cid:17) + 𝑞𝑝 𝑚𝑝 2 (cid:213) 𝑘=0 v(r 𝑛−1−𝑘 𝑝 , 𝑡𝑛−1−𝑘) × B(r 𝑛−1−𝑘 𝑝 , 𝑡𝑛,1,𝑘) (cid:16) 𝑛,𝑘 + 𝐿𝑖 𝐿𝑟 𝑛,𝑘 (cid:17) where the operators used are defined as follows: 𝑡𝑛+1∫ 𝑁𝑛,𝑗(𝑡)𝑁𝑛,𝑘(𝑡) cos(𝜔0𝑡)𝑑𝑡, 𝐿𝑟 𝑛,𝑗,𝑘 = 𝑡𝑛 𝑡𝑛+1∫ 𝑁𝑛,𝑗(𝑡)𝑁𝑛,𝑘(𝑡) sin(𝜔0𝑡)𝑑𝑡, 𝐿𝑖 𝑛,𝑗,𝑘 = − 𝑡𝑛 56 (3.46) (3.47a) (3.47b) 𝑡𝑛+1∫ 𝑁𝑛,𝑗(𝑡) cos(𝜔0𝑡)𝑑𝑡, 𝑀 𝑟 𝑛,𝑗 = 𝑡𝑛 𝑡𝑛+1∫ 𝑀 𝑖 𝑛,𝑗 = − 𝑁𝑛,𝑗(𝑡) sin(𝜔0𝑡)𝑑𝑡. (3.47c) (3.47d) The particle positions can then be integrated consistently in time at Δ𝑡,𝜔0 through a fourth 𝑡𝑛 order Adams-Bashforth stencil: 𝑛+1 𝑝 = r 𝑛 𝑝 + r Δ𝑡,𝜔0 24 (55v 𝑛 𝑝 − 59v 𝑛−1 𝑝 + 37v 𝑛−2 𝑝 − 9v 𝑛−3 𝑝 ) (3.48) Since the particle positions cannot be downshifted in the same way as the velocity update, some oversampling is still required to maintain accuracy, but as we show in Sec. 3.3, this factor is relatively small. The computed trajectories can then be used to obtain ¯𝐺(𝑡) (cid:42) ¯𝐺𝑛 𝑙 = (1) 𝑙 W (r𝑝(𝑛Δ𝑡,𝜔bw )), 𝑁𝑝 (cid:213) 𝑝=1 𝑞 r𝑝(𝑛Δ𝑡,𝜔bw ∫ ) (cid:43) (cid:101)r𝑆(r − 𝑑 (cid:101)r) , r𝑝(0) (3.49) where 𝑛 refers to the number of timesteps evolved by the EM system, and 𝑙 refers to a specific edge in the EM system. In systems where the number of particles is not very large, and therefore the cost of a particle update is negligible in relation to the cost of a field solve, we can follow a simpler integration setup and simply interpolate the electric field to time points that are spaced at Δ𝑡,𝜔0. In essence, the magnitudes of the electric fields and magnetic flux density at the location of a given particle 𝑝, can be reconstructed from the downshifted quantities at a given time 𝑡 ∈ [𝑡𝑛−1, 𝑡𝑛+1] as E(r𝑝 , 𝑡) = B(r𝑝 , 𝑡) = 𝑁𝑒(cid:213) 2 (cid:213) 𝑖=1 𝑁 𝑓 (cid:213) 𝑘=0 2 (cid:213) 𝑖=1 𝑘=0 𝑁𝑛,𝑘(𝑡) ¯𝐸𝑘 𝑖 exp{𝑗𝜔0𝑡𝑛−1+𝑘 }W 1 𝑖 (r𝑝) 𝑁𝑛,𝑘(𝑡) ¯𝐵𝑘 𝑖 exp{𝑗𝜔0𝑡𝑛−1+𝑘 }W 2 𝑖 (r𝑝) (3.50a) (3.50b) 57 These fields can then be used in a Lorentz update at the smaller stepsize. Since the particle paths are now known at the finer timestep size Δ𝑡,𝜔0, the path integral in (3.49) can be evaluated as ) (cid:101)r𝑆(r − 𝑑 (cid:101)r) = r𝑝(𝑛Δ𝑡,𝜔bw ∫ r𝑝(0) 𝑀−1 (cid:213) 𝑗=0 r𝑝((𝑗+1)Δ𝑡,𝜔0 ∫ ) (cid:101)r𝑆(r − 𝑑 (cid:101)r). r𝑝(𝑗Δ𝑡,𝜔0 ) (3.51) where 𝑀 refers to the number of timesteps needed at Δ𝑡,𝜔0 to advance to a time of 𝑛Δ𝑡,𝜔bw . We show, once again, in Section 3.3 that this approach can be used to get precision similar to the actively downshifted integration scheme described earlier in the section, but the rate of oversampling required is larger. Note, in both these methods, we have assumed non-relativistic motion. Studies are underway on efficient methods in relativistic regimes. 3.3 Numerical Experiments In this section, we detail results obtained from an implementation of the ET-FEMPIC scheme described above. The results will be structured as follows: (A) First, we demonstrate the viability and computational gains of using envelope tracking in a linear EM system without particles by analysing the radiated power due to a monopole antenna. (B) Next. we demonstrate the accuracy of the integration scheme developed in Section 3.2. Then, we proceed to analyse two systems with particles, thereby including non-linear effects from the active media. As a prelude, we note that the modulated Gaussian functions used in some of the numerical examples are defined as follows: 𝑣(𝑡) = cos(𝜔0𝑡) exp (cid:18) − (cid:19) (𝑡 − 6𝜎)2 2𝜎2 𝜎 = 2 𝜔bw (3.52) Radiated Power from a Monopole Antenna We consider a conducting strip suspended over a finite ground plane, as specified in Fig. 3.2a. The coupling between the EM system and the driving circuit is achieved 58 (a) Geometry of the monopole antenna. The di- mensions are in cm. (b) Power radiated due to a 1 mV source compared to measured data and regular MFEM Figure 3.2: Geometry and radiated power results for a monopole antenna. across a vertical 1 cm edge going from the conducting plane to the strip. The current driving the antenna is generated by a Thevenin source characterized by 𝑓0 = 300 MHz and 𝑓bw = 200 MHz connected in series to a 100 Ω resistor. The circuit subsystem itself was modelled through a Modified Nodal Analysis network [99], with the constituent equations appropriately modified to account for the analytically known fast field components. The voltage and current across the port feed were then used to compute the complex impedance of the feed as a function frequency. The radiated power curves for the antenna were computed from the impedance obtained from a regular MFEM solve and through the envelope tracking technique described in this work. As can be seen in Fig. 3.2b, there is good agreement in the radiated power as a function of frequency between a regular MFEM and the envelope tracking method, despite the timestep size in the latter being 2.5 times larger. Fidelity of the Particle Push Routine Since the particle positions and velocities are not known to conform to the same frequency spread as the fields, we proposed two different time-stepping schemes in Section 59 𝑓bw (MHz) 𝑁Regular 20 10 5 1 100 200 400 2000 𝜖ℒ2 (cid:0)𝑁Regular(cid:1) 𝑁Downshift 4.135 × 10−8 3.308 × 10−8 3.591 × 10−8 1.195 × 10−8 21 54 117 501 𝜖ℒ2 (𝑁Downshift) 3.916 × 10−8 6.392 × 10−8 7.997 × 10−8 2.410 × 10−8 Table 3.1: Errors for the two particle integration methods, compared against numerically obtained data at Δ𝑡,bench = Δ𝑡,𝜔max/10. 𝑁Regular and 𝑁Downshift refer to the oversampling factor for the naively oversampled and downshifted methods respectively, and 𝜖ℒ2 refers to the ℒ2 error for a given method compared against the benchmark data. We note that the oversampling factor required for the downshifted method is significantly lower than with naive oversampling. In each case, 𝑓0 = 2 GHz. 3.2 to accurately evolve Newton’s equations. The numerical results presented in this section demonstrate the viability of both methods. To set the stage, we consider a system consisting of one particle moving under the influence of E(r, 𝑡) = ˆ𝑥𝑣(𝑡) and B(r, 𝑡) = ˆ𝑧𝑣(𝑡) with 𝑣(𝑡) as defined in (3.52). The positions of the fields and velocities are then solved for using two different methods. First, we use a particle solver that operates at a smaller timestep size Δ𝑡,𝜔max than the field solve. The values of the field are then interpolated to the finer timesteps and used to evolve Newton’s equations. Since the fields conform to the bandwidth amenable for an envelope tracking analysis, the interpolation should be as accurate as a time-marching routine that evolves at steps of Δ𝑡,𝜔max. Second, we evolve the particle using the downshifted integration scheme described in Sec. 3.2. The final position curves obtained from both methods was compared against a solution obtained using a fourth order Adams Bashforth method evolved at Δ𝑡,bench = Δ𝑡,𝜔max/10. The relative error of the two methods in relation to this benchmark is reported in Table 3.1. As is evident, both methods yield the same order of error over a range of center frequency/bandwidth combinations. Klystron exicted by a gap voltage Next, we analyze the EM response for a device with strong particle effects. Specifically, we examine the reduction in the quality factor of a Klystron under beam loading, with the 60 𝑡max(ns) 𝑁𝑡,ET-MFEM 𝑄ET-MFEM 𝑁𝑡,MFEM 𝑄MFEM 𝑄EM-FEMPIC 𝑄ET-FEMPIC 13500 9000 6750 1079.1 556.4 231.4 1084 561.4 246.7 903.7 484.9 211.5 899.4 488.4 207.8 135.6 90.4 67.8 1500 1000 750 Table 3.2: Tabulated quality factor data in the Klystron solve. 𝑁𝑡 and 𝑄 refer respectively to the number of timesteps used in the analysis and the respective quality factor obtained. As can be seen in the table, the Envelope tracking approach closely tracks the results predicted by the MFEM solve. set-up as shown in Fig. 3.3. A sinusoidal RF source placed at the connecting neck, a 40 kV, 3A beam was introduced on one end of the feed-tube. The gap voltage as a function of frequency was then computed from the Fourier transform of the measured electric field across an edge spanning the walls of the neck. The quality factor of the cavity was estimated by locating the half power points beneath the resonance peak, and computing 𝑄 as 𝑄 = 𝑓peak Δ 𝑓 (3.53) We note from Fig. 3.4 and Table. 3.2 that the quality factor when the klystron is loaded drops slightly in comparison to an unloaded run, with some of the energy of the cavity used to accelerate the particles. This acceleration can be seen by observing a histogram of particle velocities as a function of position along the length of the tube, as shown in Fig. 3.6. Furthermore, the quality factor derived using ET-FEMPIC are close to those derived from a traditional EM-FEMPIC solve but computed at approximately the tenth the number of time steps. Furthermore, the drop observed is similar to that observed in [100] simulated using an axi-symmetric finite difference time domain code. Finally, we note from Fig. 3.5 that Gauss’ Law is satisfied to machine tolerance. Relativistic MILO To stress the viability of the algorithm for a complex real world device, we considered a setup with a relativistic MILO, as discussed in [101]. The layout of an axial cross-section the device is as shown in Fig. 3.8, with red sections denoting PEC surfaces, green sections 61 Figure 3.3: Schematic of the Klystron. The figure represents a cross-section in the 𝑦-𝑧 plane. The device is assumed to be cylindrically symmetric about the 𝑧 axis. Furthermore, all walls are PEC. representing surfaces with an applied potential difference 𝑉in(𝑡), dark green sections representing locations of particle emitters, and finally dashed blue sections representing surfaces where the output power 𝑃out(𝑡) in the form of the Poynting flux across the surface is measured. 𝑉in(𝑡) = sin(2𝜋 𝑓0𝑡) was defined as a sinusoidal impulse with magnitude 1.8 MV at a center frequency of 𝑓0 = 1.2 GHz. Likewise, the particles were emitted from inner conductor with a purely radial velocity of 2.12 × 108 𝑚𝑠−1. Due to the high particle velocities in this example, the particle push algorithm described earlier was modified to solve for the relativistic momentum. The particle current was set to 50 kA. This setup was simulated using the proposed method with a timestep size Δ𝑡 = 33.3 𝑝𝑠. To provide a 62 Figure 3.4: Gap voltage vs frequency in the loaded and unloaded case – both run in the downshifted frame. We note that the peaks line up at the same frequency and the loaded response shows a dip in the saturated gap voltage thereby reducing the quality factor. direct comparison, we also simulated this setup on XOOPIC [102] for the same parameters with Δ𝑡 = 1 𝑝𝑠. A 𝑧-𝑟 phase plot of the particles is shown in Fig. 3.8. Likewise, the spectrum of the output power is reported in Fig. 3.9. We note from both figures that the proposed method predicts particle distributions that agree closely with XOOPIC and further predicts an output spike at 1.2 GHz, as expected from other similar results in the literature [101]. 63 Figure 3.5: Plot of relative error per particle in satisfaction of discrete Gauss’ Law for the Klystron setup simulated through ET-FEMPIC. Computational Complexity Finally, we discuss the computational complexity of the method and the range devices for with the technique maximizes savings in compute time. We wish to emphasize at this point that any time domain simulation method approach requires some means of fixing Δ𝑡. For explicit schemes with a mesh dependant stability constraint, this is done quite trivially by looking at the smallest mesh features. For implicit methods that are similarly constrained by the mesh, the user needs to estimate a good Δ𝑡 based on a presumptive highest frequency of interest. Typically, this timestep size is inversely related to a factor 64 Figure 3.6: Velocity histogram at 𝑡 = 100 ns along the length of the Klystron showing particle acceleration when exposed to the RF source. multiplied by the highest frequency, i.e Δ𝑡 = 1 𝜉 ( 𝑓0 + 𝑓bw) . (3.54) As noted previously, when 𝜔0 = 0, the method reduces back to a traditional EM-FEMPIC solve, so for highly-broadband devices, the method is no more expensive than a traditional implicit EM-FEMPIC solve. For narrowband devices whose step size is heavily determined by the frequency offset in the fields, as opposed to the feature scales of the geometry, the savings in the number of timesteps can be estimated as follows. Under the proposed method, the frequency offset is analytically provided for, leading to a new timestep size 65 Figure 3.7: Comparison between the loaded cavity responses obtained through the regular and downshifted frames. given by Δ𝑡 = 1 𝜉 ( 𝑓bw) . (3.55) Therefore, the number of timesteps required to go to a certain simulation time is reduced by a factor of 𝑓bw/( 𝑓0 + 𝑓bw). This figure represents the reduction in number of steps for a device whose stepsize is determined entirely by the bandwidth of the fields. If the geometry of the simulation setup demands a smaller stepsize (as in the case of the relativistic MILO), one may have to choose a smaller stepsize than what would be demanded by the physics. 66 Figure 3.8: Layout and 𝑧-𝑟 phase distribution of particle positions along the MILO at 50 ns. Red sections denote conducting surfaces, green sections denote places with imposed potential differences and dashed blue segments represent surfaces where the output power spectrum is measured. 3.4 Summary In this paper, we have proposed a technique to greatly improve the computational performance of EM-FEMPIC codes for a class of narrowband high frequency devices. Using a Quasi-Helmholtz setup, we show that our method exactly satisfies charge conservation and achieves similar fidelity to a traditional FEMPIC solve. As an aside, we have spent considerable fraction of the paper examining nuances of charge conservation from a different perspective. The upshot of this discussion is that the underlying rubric of quasi- Helmholtz decomposition is always robust and immune null spaces that would otherwise corrupt the system. Additionally, our results demonstrate real world speedups equal to the 67 Figure 3.9: Plot of the spectrum of output power obtained from the MILO using the method proposed in this work. We recover a peak at 1.2 GHz, as predicted by similar experiments in the literature [101]. ratio between the frequency shift and the bandwidth. In particular, the Klystron example used close to a tenth the number of timesteps compared to a traditional EM-FEMPIC solve. This is before any kind of additional speed up for parallel processing is applied. 68 ANALYSIS AND BOUNDS FOR ENERGY CONSERVATION IN EM-FEMPIC CHAPTER 4 The full wave simulation of charged particles interacting with electromagnetic fields is of vast importance in many applications across physics and engineering, including pulsed power, high precision etching, accelerator design among many others. One of the most popular techniques involved in such analysis is the electromagnetic particle-in-cell (EM-PIC) method. While PIC solvers have traditionally employed FD-TD, advances in finite element based solvers have recently made it increasingly viable – particularly due its ability to robustly handle complex geometries. Furthermore, novel contributions in current mapping and the use of quasi-Helmholtz decomposition to strongly enforce the Coulomb gauge [77] have made it possible to use unconditionally stable implicit EM solvers while conserving charge. One area within PIC development that has seen relatively little advance is in the use of more sophisticated particle-push methods, as opposed to the ubiquitous Boris integrator [88]. This is particularly relevant when using an implicit method for the field solve. Since Boris is explicit, the movement of the charges is computed solely using fields from the previous timestep, leading to loss in accuracy. At the same time, using a fully implicit particle updater, such as is done in so-called ’Kinetic enslavement’ methods [92] comes with high computational cost. The goal of this work is to contribute to the existing state of the art by (i) proposing a rigorous set of conditions that, when followed, ensures overall energy conservation while using an unconditionally stable Newmark-𝛽 field integrator, and (ii) by suggesting ’velocity-correction’ schemes that can be used after a field solve to satisfy the proposed rubrics. We show through numerical examples that the method conserves average energy over asymptotically large timescales, both for a single particle as well as within an EM-FEMPIC simulation. 69 4.1 Problem Statement Consider a 3D region Ω containing a distribution of charged particles. This region is subjected to an external field due to which the charged species accelerate, and in turn produce spatially and temporally varying electric and magnetic fields denoted by E(r, 𝑡) and B(r, 𝑡), respectively, with r ∈ Ω and 𝑡 ∈ [0, ∞). The dynamics of the particles in phase space can be represented by a distribution function (PSDF) 𝑓 (𝑡, r, v) that follows the Vlasov equation. Furthermore, we assume that the background media in Ω is free space. 4.2 Formulation Following the Particle-in-Cell approach, we represent the charge and current dis- tributions in Ω using the moments of the PSDF through 𝜌(r, 𝑡) = 𝑞 ∫ 𝑓 (𝑡, r, v)𝑑v and J(r, 𝑡) = 𝑞 ∫ Ω v 𝑓 (𝑡, r, v)𝑑v. Representing these moments using 𝑁𝑝 macroparticles, one can evolve their positions and velocity together with Maxwell’s equations. Assuming a shape Ω function 𝑆(r), we obtain 𝜌(r, 𝑡) = 𝑞 𝑁𝑝 (cid:213) 𝑝=1 𝑆(r − r𝑝(𝑡)), J(r, 𝑡) = 𝑞 𝑁𝑝 (cid:213) 𝑝=1 v𝑝(𝑡)𝑆(r − r𝑝(𝑡)). (4.1a) (4.1b) where r𝑝(𝑡) and v𝑝(𝑡) refer to the positions and velocities as functions of time of the 𝑝th macroparticle. The evolution of the fields E(r, 𝑡) and B(r, 𝑡) over space and time within Ω follow Maxwell’s equations, given by where, ∇ × E(r, 𝑡) = −𝜕𝑡B(r, 𝑡), ∇ × 𝜇−1 0 B(r, 𝑡) = 𝜕𝑡G(r, 𝑡) + 𝜖0𝜕𝑡E(r, 𝑡), G(r, 𝑡) = 𝑡 ∫ 0 J(r, 𝜏)𝑑𝜏 = 𝑞 𝑁𝑝 (cid:213) r𝑝(𝑡) ∫ 𝑑˜r𝛿(r − ˜r) 𝑝=1 r𝑝(0) 70 (4.2a) (4.2b) (4.3) and J(r, 𝑡) describes impressed currents within Ω. Furthermore, the solutions to the curl equations in (4.2) also need to satisfy Gauss’ Laws, ∇ · 𝜖0E(r, 𝑡) = 𝜌𝑖(r, 𝑡) + 𝜌(r, 𝑡), ∇ · B(r, 𝑡) = 0. (4.4a) (4.4b) where in 𝜌𝑖(r, 𝑡) are impressed charges. In what follows, we will assume that both the impressed current and the corresponding charge densities are zero. If they are not, it is trivial to include then in the analysis framework rubric described in the next section. As usual, boundary conditions need to be imposed on E(r, 𝑡) and B(r, 𝑡) on sections of the outer boundary 𝜕Ω, to ensure unique solutions. These are assumed to either Dirichlet, Neumann or impedance boundary conditions on non-overlapping surfaces 𝜕Ω𝐷, 𝜕Ω𝑁 and 𝜕Ω𝐼, with 𝜕Ω = 𝜕Ω𝐷 + 𝜕Ω𝑁 + 𝜕Ω𝐼, and are defined as follows: ˆ𝑛 × E(r, 𝑡) = Ψ𝐷(r, 𝑡) on Ω𝐷 , ˆ𝑛 × 𝜇−1 B(r, 𝑡) = Ψ𝑁 (r, 𝑡) on Ω𝑁 , ˆ𝑛 × 𝜇−1 B(r, 𝑡) − 𝑌 ˆ𝑛 × ˆ𝑛 × E(r, 𝑡) = Ψ𝐼(r, 𝑡) on Ω𝐼 , (4.5a) (4.5b) (4.5c) where the functions Ψ𝐷(r, 𝑡), Ψ𝑁 (r, 𝑡) and Ψ𝐼(r, 𝑡) refer to the imposed Dirichlet, Neumann and Impedance boundary condition respectively. The field equations in (4.2) are discretized using a mixed finite element scheme and evolved in time using a Newmark-𝛽 [74] scheme with parameters 𝛾 = 0.5 and 𝛽 = 0.25, as described in [76]. This formulation is known to be unconditionally stable [83]. 4.3 Rubrics for Energy Conservation To set the stage for analysing energy conservation, consider a cavity with perfectly conducting walls. Using matrix definitions from Section 1.4, we begin by writing the semi-discrete form of Maxwell’s equations in the following form [★𝜇−1] 0 0 [★𝜖]       (cid:164)¯𝑏(𝑡) (cid:164)¯𝑒(𝑡)       (cid:169) (cid:173) (cid:171) + (cid:170) (cid:174) (cid:172)       0 −[M𝑐]𝑇  [M𝑐]      0 ¯𝑒(𝑡) (cid:170) (cid:174) ¯𝑏(𝑡) (cid:172) (cid:169) (cid:173) (cid:171) . 0 (cid:170) (cid:174) ¯𝑗(𝑡) (cid:172) = (cid:169) (cid:173) (cid:171) (4.6) 71 Here, ¯𝑒(𝑡) and ¯𝑏(𝑡) refer to the vector of time varying coefficients of the elctric and magnetic field unknowns discretized using the standard Whitney edge and face basis functions, and (r), J(r, 𝑡)(cid:11). To derive a set of rules that need to be followed for energy to be ¯𝑗𝑖(𝑡) = (cid:10)W conserved, we take the following approach. We shall first demonstrate that in a source free 1 𝑖 cavity (i.e with 𝑗(𝑡) = 0), the unconditionally stable Newmark stencil guarantees that the difference of electric and magnetic energy stays constant over a timestep. Then, we shall consider the electromagnetic side of an EM-FEMPIC with ¯𝑗(𝑡) = 𝑞 (cid:205)𝑖 v𝑡(𝑡). 4.4 Analysis of Energy Conservation Velocity Correction Scheme The evolution of the macroparticles in space and time is determined by solving for the relativistic equations of motion with the acceleration determined by the Lorentz force. This yields the following coupled system of equations for ordinary differential equations (ODEs) for v𝑝(𝑡) and r𝑝(𝑡): 𝑑𝛾𝑝v𝑝(𝑡) 𝑑𝑡 𝑞 𝑚 = (cid:2)E(r𝑝(𝑡), 𝑡) + v𝑝(𝑡) × B(r𝑝(𝑡), 𝑡)(cid:3) 𝑑r𝑝(𝑡) 𝑑𝑡 = v𝑝(𝑡) (4.7a) (4.7b) To solve (4.7), we make use of a particle integration system comprised of an explicit 4th order Adams-Bashforth update given by 𝑛+1 𝑝 = v 𝑛 𝑝 + v 𝑛+1 𝑝 = r 𝑛 𝑝 + r Δ𝑡 24 Δ𝑡 24 (55a 𝑛 𝑝 − 59a 𝑛−1 𝑝 + 37a 𝑛−2 𝑝 − 9a 𝑛−3 𝑝 ) (55v 𝑛 𝑝 − 59v 𝑛−1 𝑝 + 37v 𝑛−2 𝑝 − 9v 𝑛−3 𝑝 ). (4.8) (4.9) where a𝑝(𝑡, r𝑝) = 𝑞𝑝 𝑚𝑝 (cid:0)E(𝑡, r𝑝) + v𝑝(𝑡, r𝑝) × B(𝑡, r𝑝)(cid:1) and v𝑛 𝑝 = v𝑝(𝑡 = 𝑡𝑛), r𝑛 𝑝 = r𝑝(𝑡 = 𝑡𝑛). In general, the stencil in (4.8) does not conserve energy asymptotically, however, the locally higher order error convergence is essential to accurately map currents and thus compute fields. 72 To enable the particles to be energy conserving, we couple the stencil in (4.8) with an implicit Newmark-𝛽 integrator with 𝛾 = 0.5 and 𝛽 = 0.25, given by 𝑛+1 𝑝 = v 𝑛−1 𝑝 + v (cid:16) Δ𝑡 2 𝑛+1 𝑝 + 2a 𝑛 𝑝 + a 𝑛−1 𝑝 a (cid:17) (4.10) Utilizing the spectral theorem, we can analyse the stability of (4.10) by analysing its associated 𝑧-transform. In the absence of an electric field, (4.10) becomes, 𝑛+1 𝑝 = v 𝑛−1 𝑝 + v (cid:16) Ω𝑛+1 · v 𝑛+1 𝑝 + 2Ω𝑛 · v 𝑝 + Ω𝑛−1 · v 𝑛 𝑛−1 𝑝 (cid:17) (4.11) where Ω can be written as Ω = 0 𝜔𝑧 −𝜔𝑦 −𝜔𝑧 0 𝜔𝑦 −𝜔𝑥 𝜔𝑥 0 (cid:169) (cid:173) (cid:173) (cid:173) (cid:173) (cid:173) (cid:171) (cid:170) (cid:174) (cid:174) (cid:174) (cid:174) (cid:174) (cid:172) , (4.12) . To show that the proposed scheme conserves energy over long timescales, it is sufficient to show that the roots of one step update of the proposed scheme lie on the unit circle. The eigenvalues of Ω are given by 0 (cid:113) 𝑗 𝜔2 𝑥 + 𝜔2 𝑦 + 𝜔2 𝑧 (cid:113) −𝑗 𝜔2 𝑥 + 𝜔2 𝑦 + 𝜔2 𝑧 𝜆 =       . As a result, the 𝑧-transform is as follows: 𝑧2 = 1 + (cid:16) 𝜆𝑧2 + 2𝜆𝑧 + 𝜆 + 1 (cid:17) (4.13) (4.14) The roots of 𝑧 lie almost exactly on the unit circle when 𝑞𝑝Δ𝑡 |B(r, 𝑡)|/𝑚𝑝 ≪ 1, where Δ𝑡 denotes the timestep size. This condition is easily satisfied since the timestep size is usually chosen in order to appropriately discretize the cyclotron frequency 𝜔𝑐 = 𝑞𝑝 |B(r, 𝑡)|/𝑚𝑝. While the Boris integrator itself is not symplectic, it ensures by construction that the particle velocity under the influcence of a magnetic field does not change magnitude [88]. 73 Energy Balance in the Field Solver Definition 4.4.1 (Electric Field Energy). The electric field energy at any time 𝑡 can be written as 𝑊𝐸(𝑡) = 1 2E(r, 𝑡) · D(r, 𝑡) = 1 2 ¯𝑒(𝑡)𝑇 · [★𝜖] · ¯𝑒(𝑡). Definition 4.4.2 (Magnetic Field Energy). The magnetic field energy at any time 𝑡 is given by 𝑊𝐵(𝑡) = 1 ¯𝑏(𝑡)𝑇 · [★𝜇−1] · ¯𝑏(𝑡). 2B(r, 𝑡) · H(r, 𝑡) = 1 2 and the energy in the particles 𝑊𝑃 as 𝑊𝑃(𝑡𝑛) = 𝑁𝑝 (cid:213) 𝑝=1 𝑚𝑝 𝑐(𝛾𝑝 − 1), (4.15) where 𝛾𝑝 refers to the relativistic correction. As an abuse of notation, we shall denote ¯𝑒 𝑛 to be the same as ¯𝑒(𝑡𝑛). Next, we note that the Hodge matrices [★𝜖] and [★𝜇−1] are exactly symmetrical. Furthermore, the difference of either field quantity between timesteps 𝑛 and 𝑛 + 1 can be written through the Newmark relation as Lemma 4.4.1. ¯𝑒 𝑛+1 − ¯𝑒 𝑛 = 1 2Δ𝑡((cid:164)¯𝑒 𝑛+1 + (cid:164)¯𝑒 𝑛) + 𝒪(Δ2 𝑡 ) for 𝛾 = 2𝛽 Proof. From [103], the newmark relation for an unknown 𝑢(𝑡) can be written as 𝑢𝑛+1 = 𝑢𝑛 + Δ𝑡 (cid:164)𝑢𝑛 + (cid:19) − 𝛽 (cid:18) 1 2 𝑡 (cid:165)𝑢𝑛 + 𝛽ℎ2 (cid:165)𝑢𝑛+1 + 𝒪(Δ2 Δ2 𝑡 ). (4.16) Likewise, the velocity (cid:164)𝑢(𝑡) can be related to the acceleration through (cid:164)𝑢𝑛+1 = (cid:164)𝑢𝑛 + Δ𝑡(1 − 𝛾) (cid:165)𝑢𝑛 + 𝛾Δ𝑡 (cid:165)𝑢𝑛+1 + 𝒪(Δ2 𝑡 ). (4.17) Subtracting Δ𝑡/2 times (4.17) from (4.16) yields 𝑢𝑛+1 − 𝑢𝑛 = 1 2 Δ𝑡( (cid:164)𝑢𝑛+1 + (cid:164)𝑢𝑛) + (𝛽 − 1 2 𝛾)Δ2 𝑡 ( (cid:165)𝑢𝑛+1 − (cid:165)𝑢𝑛) + 𝒪(Δ2 𝑡 ). (4.18) Replacing 𝑢(𝑡) ≡ ¯𝐸(𝑡) and setting 𝛾 = 2𝛽, gives us the required result. Lemma 4.4.2. 𝛿𝑛 𝜖𝐹 = 𝑊𝐸(𝑡 𝑛+1) − 𝑊𝐸(𝑡 𝑛) + 𝑊𝐵(𝑡 𝑛+1) − 𝑊𝐵(𝑡 𝑛) = 0 ∀𝑛, if 𝜕𝑡 ¯𝐺 = 0 and 𝛾 = 2𝛽. 74 Proof. The fractional change in field energy can be written as 𝛿𝑛 𝜖𝐹 = [★𝜖] Δ𝑛 ¯𝐸 ( ¯𝐸𝑛+1 + ¯𝐸𝑛) 2 + ( ¯𝐵𝑛+1 + ¯𝐵𝑛) 2 (cid:2)★𝜇−1(cid:3) Δ𝑛 ¯𝐵. From Lemma 4.4.1, the differences in field unknowns can be written as 𝛿𝑛 𝜖𝐹 =Δ𝑡 [★𝜖] ( (cid:164)¯𝐸𝑛+1 + (cid:164)¯𝐸𝑛) ( ¯𝐸𝑛+1 + ¯𝐸𝑛) 4 + Δ𝑡 ( ¯𝐵𝑛+1 + ¯𝐵𝑛) 4 (cid:2)★𝜇−1(cid:3) ( (cid:164)¯𝐵𝑛+1 + (cid:164)¯𝐵𝑛). From the semi-discrete form of Ampere’s Law, 𝛿𝑛 𝜖𝐹 =Δ𝑡 ( ¯𝐸𝑛+1 + ¯𝐸𝑛) 4 [★𝜖] ( (cid:164)¯𝐸𝑛+1 + (cid:164)¯𝐸𝑛) − Δ𝑡 ( ¯𝐸𝑛+1 + ¯𝐸𝑛) 4 ( ¯𝐸𝑛+1 + ¯𝐸𝑛) 2 ¯𝐺 = 0, 𝛿𝑛 𝜖𝐹 = 0 (upto 𝒪(Δ2 = Δ𝑡 𝑡 )). [∇×]𝑇 ( (cid:164)¯𝐵𝑛+1 + (cid:164)¯𝐵𝑛) ¯𝐺𝑛+1 + 𝜕𝑡 2 ¯𝐺𝑛) (𝜕𝑡 Clearly, when 𝜕𝑡 (4.19a) (4.20) (4.21) We note at this point that the Newmark-𝛽 approach assumes a truncated form of the intermediate value theorem. Thus, the relation between function values and its derivatives will always admit an error of Δ2 𝑡 for a mixed finite element setup. However, 𝑧-transform analysis has demonstrated that the method as a whole is asymptotically non-dissipative [83]. As a result, there may be local oscillations at any given timestep of order 𝒪(Δ2 𝑡 ), but there will be no large scale gain or loss of energy. In essence average energy over long time will remain conserved. Current Generated from Particles Consider a generic Particle-in-Cell scheme with J(r, 𝑡) = (cid:205)𝑝 𝑞𝑝v𝑝(r𝑝 , 𝑡). Thus, 𝑗(𝑡) = (cid:205)𝑝 𝑞𝑝 ¯𝑣𝑝(𝑡). The semi-discrete Maxwell system can now be written as  [★𝜇−1]      0 0     [★𝜖]   (cid:164)¯𝑏(𝑡) (cid:170) (cid:174) (cid:172) (cid:164)¯𝑒(𝑡) (cid:169) (cid:173) (cid:171) +       0 −[M𝑐]𝑇  [M𝑐]      0 ¯𝑒(𝑡) (cid:170) (cid:174) ¯𝑏(𝑡) (cid:172) (cid:169) (cid:173) (cid:171) = (cid:169) (cid:173) (cid:171) 0 (cid:205)𝑝 𝑞𝑝 ¯𝑣𝑝(𝑡) . (cid:170) (cid:174) (cid:172) (4.22) 75 To construct an energy balance, consider the inner product of ¯𝑒 𝑛+1 and Ampere’s Law minus the similar product with ¯𝑒 𝑛. Using similar Manipulations from the previous section, this difference can be written as (¯𝑒 𝑛+1+¯𝑒 𝑛)𝑇·[★𝜖]·((cid:164)¯𝑒 𝑛+1+(cid:164)¯𝑒 𝑛)+( ¯𝑏𝑛+1+ ¯𝑏𝑛)𝑇·[★𝜇−1]·( (cid:164)¯𝑏𝑛+1+ (cid:164)¯𝑏𝑛) = (¯𝑒 𝑛+1)𝑇· (cid:32) (cid:213) 𝑝 (cid:33) −(¯𝑒 𝑛)𝑇 (cid:32) (cid:213) (cid:33) 𝑞𝑝 ¯𝑣𝑛 𝑝 𝑞𝑝 ¯𝑣𝑛+1 𝑝 𝑝 (4.23) The LHS of (4.23) is identical to Δ𝑡 𝜖𝐹. On the RHS, the electric fields can be taken inside the sum to yield 𝛿𝜖𝐹 = Δ𝑡 (cid:32) (cid:213) 𝑝 𝑞𝑝(¯𝑒 𝑛+1)𝑇 · ¯𝑣𝑛+1 𝑝 (cid:33) − (cid:32) (cid:213) 𝑝 (cid:33) 𝑞𝑝(¯𝑒 𝑛)𝑇 · ¯𝑣𝑛 𝑝 Now, if our particle integrator evaluates cross products correctly, 𝛿𝜖𝐹 = Δ𝑡 (cid:32) (cid:213) 𝑝 𝑞𝑝(¯𝑒 𝑛+1 + ¯𝑣𝑛+1 𝑝 · Ω( ¯𝑏𝑛+1))𝑇 · ¯𝑣𝑛+1 𝑝 (cid:33) − (cid:32) (cid:213) 𝑝 𝑞𝑝(¯𝑒 𝑛 + ¯𝑣𝑛 𝑝 · Ω( ¯𝑏𝑛))𝑇 · ¯𝑣𝑛 𝑝 The product of the Lorentz force RHS with the velocity at time 𝑡𝑛 is simply 𝑚𝑝 ¯𝑢𝑛 𝑝 · ¯𝑣𝑛 𝑝 = 𝑐 𝑑(𝑚𝑝(𝛾𝑝 − 1)) 𝑑𝑡 = 𝛿𝜖𝑃 Δ𝑡 Using (4.26) in (4.25) gives us 𝛿𝜖𝐹 = 𝛿𝜖𝑝 (4.24) (cid:33) (4.25) (4.26) (4.27) implying the any change in particle energy manifests as a change in field energy. Importantly, three rules need to be followed for this to occur: 1. The field solver must use 𝛾 = 0.5 and 𝛽 = 0.25, both for stability and for energy conservation. 2. The particle solver has to evaluate a pure rotation as part of the Lorentz force update. In other words v × B has to be exactly perpendicular to v. 3. Finally, the particle update stencil must be constructed such that it integrates 𝑞 ¯𝑒(𝑡) correctly, without spuriously gaining or losing energy. This condition forces the particle update by itself to be symplectic. 76 Figure 4.1: Average kinetic energy of the particle as a function of time. We note that the proposed method conserves energy better than a 4th order multistep method. One potential way to satisfy these conditions is to simply use a Boris rotation on the fields at 𝑛 + 1 as a corrector after the field solve. Likewise, one can use a Newmark integrator with timestep sizes chosen to be small in relation to the cyclotron frequency to ’almost’ satisfy these conditions. In the following section, we use the latter approach and show that energy is conserved within the scope of both a single-particle and complete PIC simulation. 4.5 Results Cyclotron Motion over Asymptotically Large Time In order to show that the proposed predictor-corrector method conserves energy over large timescales, we consider a single particle moving under the influence of a 𝑧-directed magnetic field. We choose a coordinate system wherein 𝑚𝑝 = 𝑐 = 𝑞𝑝 = 1, with the particle starting at the origin with relativistic velocity u(𝑡 = 0) = 𝑢0 ˆ𝑥, with 𝑢0 = 1. The timestep size was set to Δ𝑡 = 0.1, allowing for 𝑞𝑝 |B(r, 𝑡)|/𝑚𝑝Δ𝑡 = 0.1. The particle was integrated for 77 Parameter 𝑞 𝑚 Beam Voltage Beam Current Δ𝑡 Δ𝑡,XOOPIC 𝑁𝑝/timestep Value 1.6 × 10−19 C 9.1 × 10−31 kg 174 kV 10 A 10 ps 1 ps 50 Table 4.1: Parameters for the expanding beam run. a thousand timesteps (corresponding roughly to 16.6 gyrations). The average energy was then computed by finding a best fit line through the relativistic energy 𝑚𝑝 𝑐2/(cid:112)1 − v(𝑡)2/𝑐2. We note from Fig. 4.1 that the proposed method conserves average energy far better than the 4th order Adams-Bashforth/Adams-Moulton predictor corrector method. The error in the particle velocity was found to converge quadratically with stepsize in the proposed method, outperforming the first order Boris method, while preserving average energy. Expanding Particle Beam Next, consider a conducting cylindrical cavity of length 10 cm and radius 2 cm oriented such that the axis of rotation aligned along ˆ𝑧. The walls of the cavity were assumed to be perfectly conducting and a particle beam composed of electrons was initialized at 𝑧 = 0. The parameters of the simulation are as in Table. 4.1. The tetrahedral mesh used to discretize the system comprised of 3229 tetrahedra had an average edge length of 2.3 mm. Single Particle Test To demonstrate that merely cobbling together an energy conserving EM scheme with an energy conserving particle scheme is insufficient, we looked at the overall difference in field and particle energy per timestep for a single particle injected into the system. The energy difference curve shown in Fig. 4.2 clearly demonstrates that when a correcter step as proposed is not employed, there are spurious spikes in particle energy that result from 78 Figure 4.2: Plot of energy change over one timestep for a single particle. Note the spurious spikes when a corrector step is not employed. a mismatch between particle velocities and currents on the mesh. Likewise, we also see proof that the Newmark system is not perfectly energy conserving, due to the relation between zeroth and first derivatives only being accurate to Δ2 𝑡 . These manifest as harmonic oscillations in the solution, but do not grow over time, since the Newmark stencil by itself is non-dissipative over large timescales [83]. 79 Figure 4.3: Plot of energy change over timestep for the full beam run. The spikes from the previous case are still present with the traditional approach, but is a lot more noisy due to the presence of multiple particles Full Expanding Beam A snapshot of the 𝑧-𝑟 phase space distribution particle beam at 5 ns – compared with an equivalent simulation set-up on XOOPIC[98] – is reported in Fig. 4.4 and the satisfaction of Gauss’ Law over the course of the entire run is shown in Fig. 4.5. As is evident, the implicit method with the proposed velocity correction shows very good agreement to the reference method, while satisfying Gauss’ Law to machine precision. Looking at the energy curve in Fig. 4.3, we note similar spikes in particle energy as in the previous example. Due to many particles being present, however, these oscillations make the overall conservation 80 Figure 4.4: Snapshot of the expanding plasma beam at 𝑡 = 5 ns. We note that the distribution predicted by EM-FEMPIC using a 4th order Adams integrator closely agrees with equivalent results from XOOPIC. Figure 4.5: Satisfaction of Gauss’ Law for the expanding particle beam. We note that the error is at machine precision over the course of the run. 81 Figure 4.6: Particle distribution within a cylindrical klystron at 𝑡 = 7.6 𝑛𝑠. We note that both the EM-FEMPIC using an energy preserving predictor-corrector particle integrator agrees with a leapfrog based method and exhibits particle bunching at 𝑧 ∼ 20 mm. Parameter 𝑞 𝑚 Beam Voltage Beam Current Δ𝑡 Δ𝑡,Leapfrog 𝑁𝑝/timestep Value 1.6 × 10−19 C 9.1 × 10−31 kg 31.05 kV 1 A 11.6 ps 1 ps 100 Table 4.2: Parameters for the klystron run. chart far noisier. In contrast, the method employing the corrector step is smoother and is dominated by the 𝒪(Δ2 𝑡 ) error in the Newmark approximation. Particle Beam in a Klystron Next, we considered the performance of the implicit EM-FEMPIC method to an FEMPIC method built using leapfrog, by analysing the behavior of a particle beam accelerated into 82 Figure 4.7: Satisfaction of Gauss’ Law for the klystron. We note that the error is at machine precision over the course of the run. Figure 4.8: Particle and field energy predicted by the implicit EM-FEMPIC case for the klystron. 83 a cylindrical klystron. The geometry of the device is shown in Fig. 4.6, with all walls assumed to be perfectly conducting. At the neck of the device (𝑟 = 4 mm, 𝑧 = 14 − 16 mm), we placed a current source with frequency 3.9 GHz, oriented along the ˆ𝑧-direction. Once again, the system was discretized by a tetrahedral mesh comprised of 3229 tetrahedra, this time with an average edge length of 2.63 mm. The solver was run with a step size of 11.6 ps and compared to the leapfrog based method running at 1 ps. The parameters of the beam are as reported in Table 4.2. Once again, we note very good agreement between the implicit and explicit EM-FEMPIC implementations in Fig. 2.13, despite the former running at approximately 11 times the stepsize. Likewise, we note machine precision satisfaction of Gauss’ Law in Fig. 4.7 and good energy conservation in Fig. 4.8. 4.6 Conclusion In this chapter, we have presented a set of rubrics that need to be followed in order to exactly preserve average energy while using an unconditionally stable implicit EM-solve. These rubrics were demonstrated through a number of numerical examples that made use of a velocity correction scheme that satisfied the required conditions when using a small stepsize in relation to the cyclotron frequency. We further demonstrated good theoretical agreement in each result as well as conservation of charge and energy. 84 CHAPTER 5 A TRANSIENT PARAMETER EXTRACTION TECHNIQUE FOR FINITE ELEMENT PARTICLE IN CELL SOLVERS The methods and techniques outlined in the thesis so far have dealt with either mitigating the effects of using a larger stepsize afforded by an implicit field solver or coming up with ways to use even larger stepsizes while exactly satisfying all of Maxwell’s equations. In order, we have looked at 1. Using highly accurate particle push methods to evolve particles in phase space more accurately than with widely used polynomial predictor corrector methods. Doing this allows us to accurately trace particles despite having a larger stepsize than would be afforded by an explicit PIC scheme. 2. Using envelope tracking to use even larger step-sizes for narrowband devices. 3. Deriving rubrics for energy conservation in a PIC scheme making use of an implicit field solver. We also suggested a simple velocity correction scheme if having a high order polynomial predictor is essential to accurately map currents to the mesh. However, none of these address the fundamental drawback with finite element based solvers, namely that one has to invert a matrix (or iteratively solve an A · 𝑥 = 𝑏 problem) at every timestep. This is fundamentally different to FDTD based solvers where the update matrix is strictly diagonal [7]. One way to do this is to increase the timestep size and reduce the number of matrix solves needed to evolve a system for a prescribed span of time. However, as noted in Chapter 3, there are limits to how far we can push the stepsize, dictated primarily by the frequency response of the device being analysed. In this chapter, we attempt to tackle this problem using a slightly different approach. A similar issue is present in the analysis of devices that involve lumped nonlinear com- ponents. Such simulations are of great interest in modern RF design, particularly as the 85 characterization of radiative coupling effects becomes important early in the design process. Some examples of designs that benefit from such analysis include packaged systems, active antenna devices/arrays and high speed interconnects [104, 105, 106]. Traditionally, simulations of such systems would involve a circuit or device model run self-consistently in lockstep with either an integral equation [107, 108] or finite element [109] solver. As one can imagine, if the attached lumped device is nonlinear, the number of matrix solves required goes from one per timestep to one per nonlinear iteration per timestep, making it very expensive for real world devices. One potential way to get around this problem is to use parameter-extraction, where the significantly larger EM solver is reduced to a set of time-varying port responses. Since the nonlinearities exist only the device system, any computational cost associated with nonlinear iteration is restricted purely to the device model. Furthermore, once extracted, the transient port response can be used with any attached device so long as the overall EM geometry remains unchanged (we refer to such methods as being ’device agnostic’). This approach has been applied to nonlinear spice-like nodal circuit networks [99] attached to integral equation based systems [110] and, more recently, to a general physical device model attached to a finite element electromagnetic solver [111]. The latter citation is attached as an Appendix to this dissertation. Before we consider applying this idea to PIC simulations, there is still a problem to mitigate: As noted in [111], the computational savings in port extraction are achieved primarily due to the fact that the number of nonlinear devices in a typical simulation is significantly smaller than the number of electromagnetic unknowns in the system. This assumption is simply not true in a general PIC simulation, where any field unknown can potentially be associated with a particle current. In this work, we attempt to solve this problem by coupling parameter extraction with domain decomposition. In particular, we will attempt to break up a general EM geometry into a set of non-overlapping volumes, and extract transient responses on each edge on the interfaces connecting these volumes. Therefore, we drastically reduce the number of parameters that need to be extracted. 86 Domain decomposition, in the form of the finite element tearing and interconnecting FETI method, has been used in conjunction with EM-FEMPIC in the recent past [80]. We will use this as a starting point for our method. 5.1 Transient Port Extraction for Static Ports Taking a short break from plasma physics, we consider the use of full-wave electro- magnetic analysis in more general RF and microwave design applications. As mentioned earler, these devices are often characterized by the presence of strongly nonlinear elements (amplifiers, FETs, diodes etc.) as well as the presence of radiative coupling effects between connected device elements. As a result, any simulation scheme we use must employ self-consistent, time-domain analysis. The predominant approach to transient analysis of EM-circuit system is to solve the system self consistently [112]. This involves solving both the linear and non-linear system at every time step. Obviously, the tight integration implies that the solution is not circuit agnostic. Alternatives that have been explored is to use frequency domain methods to construct a transient “impulse response” that take the form either as 𝑅𝐶 extraction [113]) or 𝑆 parameter methods [114]. The extracted response is readily incorporated into a circuit simulator. While this approach is somewhat effective, the advantages and limitation are apparent; (a) the approach is independent of the circuit system; (b) given the bandwidth of excitation, the harmonics generated due to non-linearity and need to capture early time behavior, the number of frequency samples necessary can be very high; and (c) often only a subset of these frequencies are used. In weakly non-linear systems or when the coupling is not strong the errors accrued may be tolerable. When analysis of the circuit system is possible using frequency domain techniques (harmonic balance) under the assumption of weakly non-linear systems, one often takes recourse to using a Schur complement approach to couple EM to circuit systems [115, 110]. In addition to being circuit agnostic, this is computationally more efficient as there are fewer ports than number of spatial degrees of 87 freedom. It follows, that developing such a methodology for transient analysis will have the same benefits, in addition to potential integration with multiphysics codes that model device physics. Extracting port parameters of the EM system is analogous to computing its numerical impulse response. Doing this in time domain is challenging because of known instabilities associated with deconvolution [116]. A recently proposed technique [110] for solving coupled circuits with time domain integral equation (TDIE) solvers overcame this fundamental bottleneck. Extending this technique to finite element based solvers involves several changes in the extraction process. First, the extracted signal manifests itself as a transient admittance in the circuit system as opposed to an impedance in [110]. As a result, the feed model used is changed to a current probe as opposed to a delta-gap feed, leading to differences both in coupling and in the extraction process. Finally, using a finite element scheme allows for integration with different set differential equations used to model the device subsystem; in this work, we demonstrate this capability via coupling with a non-linear Drift-Diffusion equations to model a Schottky diode. The principal contributions of this section are (a) the development of a method for extracting transient port parameters in the EM-circuit interface for finite element systems, (b) the demonstration that solutions obtained through this method are identical to those obtained using a fully coupled solution to solver precision, and (c) integration with device specific differential equations. Furthermore, we also briefly demonstrate the implementation of a Perfectly Matched Layer (PML) system for mixed finite element electromagnetic solvers. Via numerous examples, we will demonstrate the application of these for analysis of linear and nonlinear circuits coupled to EM systems. Where possible, we will show comparison against data that exists in the literature (either measured or modeled). We note that while our results are obtained using a implicit mixed FEM system, the prescribed procedure is applicable to the traditional wave equation solvers. Our rationale for using mixed FEM as opposed to the usual wave equation is that the latter 88 has a time growing null-space of the form 𝑡∇𝜓(r), whereas the former has a null space of the form ∇𝜓(r). In mixed FEM, the magnitude of the null-space excited depends on the threshold used for the iterative solver. That said, it should noted that a gauging constraint as described in [95, 96, 117, 87] eliminates this null space. Finally, the nonlinearities are assumed to be lumped or pointwise. While the proposed method can potentially be used to isolate small regions of continuous nonlinear materials, this is relegated to future work. Preliminaries Consider an object ΩEM ∈ R3 bounded by a surface 𝜕ΩEM, that describes the geometry of an electromagnetic object containing 𝑁𝑝 ports, each associated with a lumped circuit subsystem. The currents flowing across these ports are collectively represented as JCKT(r, 𝑡) with r ∈ Ω𝐸𝑀. We assume that any voltage sources in the circuit system are bandlimited to some frequency range [ 𝑓min, 𝑓max] with 𝑓min > 0. Furthermore, we assume that the amplitude of these sources are zero when 𝑡 ≤ 0. The entire system is excited by a signal, either at a port or elsewhere. Our goal is to obtain a self-consistent transient solution that couples full wave electromagnetics with the attached device subsystems. In what follows, all regions/quantities will tagged as belonging either to the linear electromagnetic (EM) or device (DEV) system. To construct a numerical model, we use a finite element scheme with Newmark-Beta temporal basis functions to model the Maxwell system and higher order temporal basis functions for the device model. The discrete EM system arising from using these basis sets can be written as ℒ ◦ (cid:2) ¯𝐸EM(𝑡)(cid:3) + 𝒞EM ◦ (cid:2)¯𝐽DEV(𝑡)(cid:3) = 0 (5.1) where ℒ is a discrete linear differential operator, 𝒞EM is a discrete coupling operator that relates quantities in the device subsystems to the EM system, ¯𝐸EM(𝑡) are the coefficients of electric fields for all space and ¯𝐽DEV(𝑡) represents impressed current distributions in the device system. Likewise, the behavior of the DEV systems can be described by operators 89 𝒟, ℱ and couples to (5.1) through 𝒞DEV, forming 𝒟 ◦ (cid:2)¯𝐽DEV(𝑡), ¯𝐸EM(𝑡)(cid:3) + 𝒞DEV ◦ (cid:2) ¯𝐸EM(𝑡)(cid:3) = ℱ ◦ (cid:2) ¯𝐸EM(𝑡)(cid:3) . (5.2) We note, at this point, that 𝒟, 𝒞DEV and ℱ can have arbitrarily complex dependencies and are not restricted to a simple MNA circuit system. As will be shown in Sec. 4.5, 𝒟 can also depend on factors such as carrier density and temperature. Mechanics of Port Extraction Fundamentally, transient port extraction works by exploiting the linearity of the Operator ℒ. For a one port system, suppose the electric field and currents in (5.1) are defined as follows: ¯𝐸(𝑡) = ¯𝐽DEV(𝑡) = 𝑁𝑡(cid:213) 𝑖=0 𝑁𝑡(cid:213) 𝑖=0 𝑒𝑖𝜆(𝑡 − 𝑡𝑖) 𝑗𝑖𝜆(𝑡 − 𝑡𝑖) (5.3) where 𝜆(𝑡) refers to a prescribed, compactly supported basis time basis function. Suppose ¯𝐺(𝑡) denotes the solution of (5.1) due to a current defined as a single time basis function defined at the system’s singular port (this current is denoted as ¯𝐹Δ𝑡 (𝑡)), ℒ ◦ ¯𝐺(𝑡) + 𝒞EM ◦ ¯𝐹Δ𝑡 (𝑡) = 0 =⇒ ¯𝐺(𝑡) = −ℒ−1 ◦ 𝒞EM ◦ ¯𝐹Δ𝑡 (𝑡) (5.4) Since we know that ¯𝐽DEV(𝑡) is expanded in terms of 𝜆(𝑡), we can rewrite (5.1) as ¯𝐸(𝑡) = ℒ−1 ◦ 𝒞EM ◦ 𝑁𝑡(cid:213) 𝑖=0 𝑗𝑖𝜆(𝑡 − 𝑡𝑖) = 𝑁𝑡(cid:213) 𝑖=0 ¯𝐺(𝑡 𝑁𝑡 −𝑖) = ¯𝑗 ★ ¯𝑔 𝑗𝑖 (5.5) where ¯𝑔 refers to the temporal weighting coefficients of the extracted signal ¯𝐺 and ★ describes a discrete convolution operation. To describe the method succinctly: Because the EM response is represented as a collection of temporal basis sets, all we need is a function ¯𝐺(𝑡) such that ℒ ◦ ¯𝐺(𝑡) + 𝒞EM ◦ ¯𝐹Δ𝑡 (𝑡) = 0 where ¯𝐹Δ𝑡 refers to a single basis function 90 excitation. After such a function is obtained, ¯𝐸EM(𝑡) can be reconstructed for arbitrary ¯𝐽DEV(𝑡) through ¯𝐸EM(𝑡) = ¯𝐺(𝑡) ★ ¯𝐽DEV(𝑡) without any further input from the EM system. Discretization of EM and Device Systems We construct a Maxwell solver following a mixed finite element scheme using Whitney edge and face basis functions E(r, 𝑡) = (cid:205)𝑁𝑒 𝑖=1 𝑁𝑒 and 𝑁 𝑓 are the number of edges and faces respectively of the tetrahedral mesh to (r) and B(r, 𝑡) = (cid:205)𝑁 𝑓 𝑖=1 (r) , where 2 𝑏𝑖(𝑡)W 𝑖 1 𝑒𝑖(𝑡)W 𝑖 discretize the domain; see [118] and references therein. The EM unknowns are represented in time as e(𝑡) = (cid:205)𝑁𝑡 𝑗=1 of these functions are defined in [119]. A Newmark-𝛽 time stepping stencil with 𝛾 = 0.5 𝑏 𝑗 𝑁(𝑡 − 𝑡𝑗), and tested by 𝑊(𝑡 − 𝑡𝑖). Both 𝑒𝑗 𝑁(𝑡 − 𝑡𝑗) and b(𝑡) = (cid:205)𝑁𝑡 𝑗=1 and 𝛽 = 0.25 is used to solve for e(𝑡) and b(𝑡) and an appropriately configured PML to truncate the computational domain. Contemporary implementations of PML systems follow the general framework first outlined by Berenger [120] with more recent additions, including the use of stretched coordinates [121]. The implementation of these systems is done by either directly evaluating the convolutions resulting from the use of a stretched coordinate system or defining and solving for two auxiliary variables in addition to the regular field unknowns to achieve the same effect. The PML implementation used in this chapter directly evaluates the convolution integrals. To do so, we define a stretched coordinate system via the following transform Λ(𝜔) = 𝑠𝑦 𝑠𝑧 𝑠𝑥 0 0 (cid:169) (cid:173) (cid:173) (cid:173) (cid:173) (cid:173) (cid:171) 0 𝑠𝑥 𝑠𝑧 𝑠𝑦 0 0 0 𝑠𝑥 𝑠𝑦 𝑠𝑧 (cid:170) (cid:174) (cid:174) (cid:174) (cid:174) (cid:174) (cid:172) (5.6) with 𝑠𝑖 = 1 + 𝜎𝑖 𝑗𝜔𝜖0 to match the absorbing layers to free space. Here, 𝜎𝑖 are the components of an anisotropic conductivity that governs the field loss. Stretching coordinates in this 91 manner alters Maxwell’s equations as follows in frequency domain: Λ(𝜔)−1 · ∇ × E(r, 𝜔) = −Λ(𝜔)−1 · 𝑗𝜔B(r, 𝜔) ∇ × 𝜇−1 Λ(𝜔)−1 · B(r, 𝜔) = J(r, 𝜔) + 𝜖(r)Λ(𝜔) · 𝑗𝜔E(r, 𝜔) (5.7) Obtaining a time marching scheme involves inverse Fourier transforming (5.7) to obtain L2(𝑡) ∗ ∇ × E(r, 𝑡) = −L2(𝑡) ∗ 𝜕B(r, 𝑡) 𝜕𝑡 ∇ × 𝜇−1 L2(𝑡) ∗ B(r, 𝑡) = J(r, 𝑡) + 𝜖0L1(𝑡) ∗ E(r, 𝑡) (5.8a) (5.8b) where 𝐿1(𝑡) = F −1 (𝑗𝜔Λ(𝜔)) and 𝐿2 = F −1 (cid:16) by testing (5.8a) with a W2(r) basis function and (5.8b) with W1(r). Furthermore, the . We discretize these equations Λ(𝜔)−1(cid:17) convolution terms are evaluated as done in [121]. The behavior of the attached devices at each port can be described generally by operators 𝒟, ℱ and couples to the EM system through 𝒞CKT, forming 𝒟 ◦ (cid:2)J CKT(r, 𝑡), e(𝑡)(cid:3) + 𝒞CKT ◦ [e(𝑡)] = ℱ ◦ [e(𝑡)] . (5.9) Where 𝒟 and ℱ are general nonlinear operators and 𝒞CKT is a coupling operator that relates quantities in the EM system to those in the attached device. For the results presented in this work, we restrict 𝒟 to either be an circuit network implemented through Modified Nodal Analysis [99]; used entirely by itself or in conjunction system governed by a set of Drift-Diffusion equations to model diodes [122, 123]. In the case of the former, we temporally represent the voltage and circuit unknowns using a pth order backward Lagrange interpolation function 𝐿𝑝(𝑡 − 𝑡𝑖). We note that this choice of representation is what is commonly used in contemporary implementations of MNA, but are in no way the only feasible choice. Upon using our chosen representation, we obtain the following system YV CKT(𝑡) = f CKT(𝑡) + f CKT 𝑛𝑙 CKT, 𝑡). (V (5.10) which is subsequently delta tested to obtain a time marching scheme. Here VCKT is a vector containing both the nodal voltages and branch currents in the circuit, fCKT and f refer to CKT 𝑛𝑙 92 the excitations due to linear and nonlinear components. The linearized form in (5.10) can be solved at each timestep using a multi-dimensional Newton-Raphson scheme similar to [109]. Similarly, when employing a drift diffusion operator to model diodes in the system, the currents due to electrons and holes J𝑛(r, 𝑡) and J𝑝(r, 𝑡) running through the device are related to carrier densities 𝑛(r, 𝑡), 𝑝(r, 𝑡) and potential 𝜙(r, 𝑡) through J𝑛(r, 𝑡) = 𝑞𝐷𝑛∇𝑛(r, 𝑡) + 𝑞𝜇𝑛(E(r, 𝑡))𝑛(r, 𝑡)∇𝜙(r, 𝑡) J𝑝(r, 𝑡) = −𝑞𝐷𝑝∇𝑝(r, 𝑡) + 𝑞𝜇𝑝(E(r, 𝑡))𝑝(r, 𝑡)∇𝜙(r, 𝑡) (5.11a) (5.11b) where 𝜇𝑛 and 𝜇𝑝 are field dependent mobility rates for the electrons and holes respectively; and 𝐷𝑝 and 𝐷𝑛 are corresponding diffusion coefficients. The currents and carrier densities are further related through a set of continuity equations: 𝜕𝑛(r, 𝑡) 𝜕𝑡 = ∇J𝑛(r, 𝑡) 𝑞 − 𝑅 + 𝐺 𝜕𝑝(r, 𝑡) 𝜕𝑡 = − ∇J𝑝(r, 𝑡) 𝑞 − 𝑅 + 𝐺 (5.11c) (5.11d) where 𝑅 and 𝐺 respectively denote the electron-hole recombination and the collision ionization rates. Finally, the carrier densities are related to the potential through Poisson’s equation ∇ · (𝜖∇𝜙(r, 𝑡)) = −𝑞 (𝑝(r, 𝑡) − 𝑛(r, 𝑡) + 𝑁𝑡(r, 𝑡)) (5.11e) where 𝑁𝑡(r, 𝑡) refers to the doping concentration. Solution to the drift diffusion system can be obtained by discretizing (5.11) using an appropriate finite element or finite difference method; see [122, 123, 124] and the references therein for a detailed analysis. The results presented in this chapter only involve 1D drift-diffusion systems and as a result we discretize (5.11) using a corresponding 1D finite difference system. We describe the interaction between EM and device subsystems in two parts. First, we consider a device system modelled using MNA. In this instance, the quantities involved in the device system are voltages and currents, which need to be related to fields and current 93 densities in the EM system. Specifically, if the 𝑘th FEM edge (denoted by l𝑘) is attached to the 𝑗th circuit subsystem, the current impressed on the EM system is given by ⟨𝑊(𝑡 − 𝑡𝑖), 𝐽CKT 𝑘 (𝑡)⟩ = ⟨𝑊(𝑡 − 𝑡𝑖), 𝐼CP 𝑗 (𝑡) ∫ |l𝑘 | ˆ 1 𝑘 𝑑r⟩ l𝑘 · W (5.12) = ⟨𝑊(𝑡 − 𝑡𝑖), 𝐼CP 𝑗 (𝑡)𝐶𝑘 𝑗⟩ with 𝐶𝑘 𝑗 denoting a coupling coefficient that relates quantities in the device subsystem to the EM solver. Furthermore, 𝐼CP 𝑗 (𝑡) refers to the magnitude of the current impressed by the circuit subsystem over the coupling edge l𝑘. Similarly, the voltage across the coupling branch can be related to the electric field across the 𝑘th FEM edge ⟨𝛿(𝑡 − 𝑡𝑖), 𝑉 CKT 𝑗 (𝑡)⟩ = ⟨𝛿(𝑡 − 𝑡𝑖), 𝑒𝑘(𝑡) ∫ |l𝑘 | ˆ 1 𝑘(r)𝑑r⟩ l𝑘 · W (5.13) = ⟨𝛿(𝑡 − 𝑡𝑖), 𝑒𝑘(𝑡)𝐶𝑗𝑘⟩. 𝐶𝑗𝑘 here likewise denotes a coupling coefficient that relates quantities in EM solver to the device. We observe from (5.13) and (5.12) that our choice of testing/representation functions leads the two coupling coefficients to be identical. For a drift diffusion setup, the electric field at the location of the port is related to the electron and hole mobilities 𝜇𝑛(E(r, 𝑡)) and 𝜇𝑛(E(r, 𝑡)) respectively. Since the devices are assumed to be lumped, E(r, 𝑡) at the location of the port can be used directly to compute the carrier mobilities, since the field is assumed to be spatially constant within the device. Likewise, we can use J𝑝(r, 𝑡) and J𝑛(r, 𝑡) are to construct the net current passing through the diode, which can then be reintroduced to the EM system following (5.12). Extraction of the Numerical Impulse Response The computational bottlenecks involved with solving a coupled system as described in the previous Section are twofold: (1) Resolving nonlinear elements in the circuit system involves performing a solve of the combined matrix equation and (2) changing any of the attached circuit subsystems would require the coupled problem to be solved again, despite 94 the EM system remaining unaltered. A potential way to exploit the linearity of the EM system is to extract its impulse response at each EM-circuit interface and use it through (5.13) in the circuit solve. Unfortunately, it is well known that deconvolution required to implement this is unstable [116]. The key insight in the method proposed herein is as follows. The current deposited on a given port edge is represented in time through a linear combination of 𝑁𝑡 basis functions. As a result, given the EM response due to a single temporal basis function, we can exploit the linearity of Maxwell’s equations and reconstruct the field anywhere in the system. Since this sequence of operations only involves reconstructing the current at a given port in terms of basis functions by which it is represented in the coupled solve, the respective fields computed by both methods should be numerically indistinguishable. With 𝑝(𝑞) denoting the set of FEM edges associated with the port 𝑞, we define an excitation vector e𝑞(𝑡) defined as follows 𝑒𝑞,𝑘(𝑡) = 𝑁(𝑡 − 𝑡𝛿) 𝑘 ∈ 𝑝(𝑞) 0 otherwise    (5.14) where 𝛿 is the timestep at which the excitation is applied. e𝑞 is then used to define the forcing function JCKT(𝑡) through (5.12) with 𝐽CKT(𝑡) = 𝐶𝑘𝑞e𝑞(𝑡). This function is then applied to the RHS of (5.8) to obtain a solution vector x𝑞. In order to solve the device equations, however, we only require the coefficients associated with each port, allowing us to construct a matrix 𝐺𝑘𝑞 = 𝑥 𝑘,𝑝(𝑞) of dimensions 𝑁𝑝 × 𝑁𝑝 × 𝑁𝑡. Each column of 𝐺 represents a discrete impulse response for a pulse centered at the edge 𝑝(𝑞) measured at the 𝑘th edge. As a result, constructing the electric field at port 𝑘, in response to an arbitrary set of currents can be done by simply summing the convolutions of 𝐺𝑘𝑞 with 𝐼𝑞 for each attached circuit port. The reconstructed fields can then be related using the appropriate coupling equations to quantities the device subsystems. For instance, if the attached port is governed through MNA, the voltage across the 𝑗th port 𝑉 CKT 𝑗 (𝑡) in (5.13) can now be 95 (a) Geometry of the monopole system. The di- mensions are in cm. (b) Plots of the input admittance in S measured from 0.5 GHz to 4.5 GHz compared to existing results in the literature [112]. Figure 5.1: Description of the geometry and obtained results for the cylindrical monopole antenna. written in terms of 𝐼CP(𝑡) 𝑉 CKT 𝑗 (𝑡𝑖) = ⟨𝛿(𝑡 − 𝑡𝑖), 𝐶𝑗𝑘 𝑁𝑝 (cid:213) 𝑞=1 𝐺𝑘𝑞(𝑡) ∗ 𝐼CP 𝑞 (𝑡)⟩. (5.15) yielding a standalone matrix equation for the device system. Numerical Examples For the results presented in the remainder of this section, 𝑁𝑡 denotes the number of timesteps that the simulation is run over and 𝑁EM, 𝑁CKT denote the numbers of EM and circuit unknowns respectively in the system. Unless specified otherwise, voltage sources are defined using 𝑣(𝑡) = cos(2𝜋 𝑓0𝑡)𝑒−𝑡2/2𝜎2 where 𝜎 = 3 × (2𝜋 𝑓bw)−1, with 𝑓max = 𝑓0 + 𝑓bw. The timestep size Δ𝑡 = (30 𝑓𝑚𝑎𝑥)−1. Finally, GMRES was used to solve the system iteratively to a tolerance of 10−12. 96 Input Impedance of a Monopole Antenna In this first example, we validate our technique by analysing a cylindrical monopole suspended above an infinite ground plane. Specifically, the monopole has a length of 5 cm, radius of 1.52 mm and is suspended 1.6 mm above a conducting square of side length 10 cm, as shown in Fig. 5.1a. To mimic an infinite ground plane, the truncating walls of the simulation domain are in direct contact with the ends of the square. The ground plane is coupled to the cylinder by a single, vertically oriented edge, across which is connected a driving circuit given by a time varying voltage source connected in series to a 100Ω resistor. The voltage fed to the resistor is assumed to be a modulated Gaussian with center frequency 𝑓0 = 2.5 GHz and bandwidth 𝑓BW = 2 GHz The timestep size Δ𝑡 was set to be (30 𝑓max)−1. The mesh used to discretize the domain had an average edge length of (20 𝑓max)−1, resulting in 𝑁EM = 512, 436 and the simulation was run for 𝑁𝑡 = 2001. The setup is geometrically identical to an example in [112] and looking at Fig 5.1b, we see good agreement between the admittance curves generated through port extraction and a coupled time domain solver for the same simplified probe model. The solve time per timestep performing the extraction as detailed in Section 5.1 was approximately 8 seconds per timestep, with the subsequent circuit solve completing its entire run of 2001 timesteps in under 20 ms. Input impedance of a strip above a Finite Ground Plane We consider a conducting strip suspended over a finite ground plane, as specified in Fig. 5.2a. The coupling between the EM system and the driving circuit is achieved across a vertical 1 cm edge going from the conducting plane to the strip. The circuit is assumed to be a Thevenin source characterized by 𝑓0 = 1 GHz and 𝑓bw = 999 MHz connected in series to a 100 Ω resistor. The simulation domain is discretized using a tetrahedral mesh with approximate average edge length set to (20 𝑓max)−1, yielding 𝑁EM = 2, 000, 936. The system was run for 𝑁𝑡 = 4000 timesteps (with each timestep taking approximately 13 seconds 97 (a) Geometry of the monopole antenna. The di- mensions are in cm. (b) Power radiated due to a 1 mV source compared to measured data and FDTD [104] Figure 5.2: Description of the geometry and obtained results for the monopole strip suspended over a finite ground plane. (a) Geometry of the microstrip amplifier with a FET attached between G and D [109]. The dimensions are in mm. (b) Comparison of 𝑆11 and 𝑆21 for a microwave amplifier as described in [109] Figure 5.3: Geometry description and calculated 𝑆-parameters for the microwave amplifier. to converge) and the port parameters were extracted through Fourier transforms of the time-series data. As is evident from Fig. 5.2b, the radiated power curve shows very good agreement to measured data and FD-TD. 98 (a) Schematic of the microstrip rectifier. All di- mensions are in mm. (b) Computed conversion efficiency for a diode rectifier with a simulation done through a physical model compared to ADS [125]. Figure 5.4: Geometry layout and comparison of conversion efficiency for a microstrip rectifier circuit. Microstrip Amplifier Next, we validate the proposed technique for nonlinear circuit systems by comparing the reflection coefficient and gain for a microstrip amplifier. The geometry and driving circuits are exactly as in [109] and we obtain the S parameters through small signal analysis, with 𝑓0 = 5.5 GHz, 𝑓bw = 3.5 GHz and 𝑓max = 𝑓0 + 𝑓bw. The tetrahedral mesh used to discretize the domain has an average edge length of (15 𝑓max)−1 with 𝑁EM = 5, 134, 732. The data used to compute the scattering parameters was obtained by running this setup for 𝑁𝑡 = 6000 timesteps. We note from Fig. 5.3b that the measured S parameters show good agreement to results from [109]. Extracting this response took approximately 37 seconds per timestep, and the nonlinear circuit solve completed in just over 5 seconds. Miscrostrip Rectifier modelled through Drift-Diffusion Until now we have demonstrated the use of port extraction on linear and nonlinear systems connected to nodal circuit networks. Next, we aim to show that the proposed 99 method works with systems where the devices are governed nonlinear differential equations. The system under analysis is a microstrip rectifier circuit as shown in Fig. 5.4a. The thickness of the board was 1 mm and the relative permittivity of the substrate 2.65. A HSMS-282B diode is placed across port 𝑃 (with physical parameters for (5.11) as in [125]) with a 10 pF filter capacitor attached to a variable load across port 𝐶. The input source was assumed to be a modulated Gaussian with 𝑓0 = 2.45 GHz and 𝑓bw = 0.25 GHz. We performed two experiments on a microstrip rectifier circuit: (1) First, the Schottky diode in the layout was modeled using an equivalent circuit network, mimicking a similar setup simulated on ADS. (2) Next, using the same extracted port response as in the first experiment, we modeled the diode using a set of Drift-Diffusion [125] equations in (5.11). In each case, the conversion efficiency of the rectifier 𝜂 = 𝑃DC 𝑃source · 100% (5.16) where 𝑃DC denotes the power measured at the output end 𝑃source the corresponding quantity at the soruce was compared against data from [125]. As is evident in Fig. 5.4b, in the first experiment, results obtained through the proposed method agree well with corresponding results obtained through ADS EM Co-simulation. In the second experiment, our results better match measured data of the rectifier circuit than the corresponding efficiency curve predicted by ADS. We emphasize the fact that the results from the equivalent circuit do not agree with experimental measurements, due to the network not being representative of the actual diode for the parameters chosen, thereby illustrating a situation where the ability to couple the EM layout with a general device model is a significant advantage. Strip above a Finite Ground Plane driven by different circuits In keeping with objectives stated earlier, we first extracted the port parameters following the procedure outlined earlier in the chapter for the strip-monopole example with Δ𝑡 = 16 ps. This extracted response was then attached to a Chebyshev filter and a Diode Mixer circuit 100 50 Ω 16.308 nH 16.308 nH 𝑉𝑠(𝑡) 9.05 pF 13.48 pF 9.05 pF + 𝑉port(𝑡) − (a) Schematic of the Chebyshev filter used. (b) Plot of the port voltage for the Chebyshev filter from Fig. 5.5a obtained through the fully coupled and extracted responses. Figure 5.5: Circuit description and comparison of port voltages between the port extraction and fully coupled methods for a linear circuit system. respectively, and the obtained port voltages in time were compared to equivalent results obtained from a direct solution of the coupled system. Chebyshev filter First, we use the extracted transient port parameters on a Chebyshev filter as shown in Fig. 5.5a. 𝑉𝑠(𝑡) was characterized by 𝑓0 = 1.5 GHz and 𝑓bw = 0.5 GHz. The timestep size in the circuit system was set to the same size used in the extraction of the EM response. The comparison of the port voltages as a function of time are shown in Fig. 5.5b. The 𝐿2 error between the two solutions was 3.1 × 10−12. Diode Mixer Next, we use extracted port parameter with a nonlinear Diode Mixer as shown in Fig. 5.6a. The diode between nodes 3 and 7 has a saturation current 𝐼𝑠 = 2 nA, emission coefficient 𝜂 = 2.0 and 𝑘𝐵𝑇/𝑞 = 25.6 mV. The RF and LO sources were assumed to be sine waves of magnitude 0.4 V with frequencies 900 MHz and 800 MHz respectively. The 101 1 100 Ω 2 10 Ω 1 pF RF 1 M Ω 100 Ω 1 pF LO 5 7 .1 pF 8 6 nH 3 nH 1𝑘Ω 10 Ω 4 3 6 9 + 𝑉𝑏𝑖𝑎𝑠 − (a) Schematic of the Diode Mixer. The EM system is attached between nodes 8 and ground (b) Plot of the port voltage for the Diode Mixer from Fig. 5.6a obtained through the fully coupled and extracted responses. Figure 5.6: Circuit description and comparison of port voltages between the port extraction and fully coupled methods for a nonlinear circuit system. current across the diode was modelled using the Shockley equation and the bias voltage was set to 0.7 V to activate the diode. The relative 𝐿2 error between the two curves in Fig. 5.6b was 4.7 × 10−12. Computational Complexity Finally, we discuss the asymptotic cost complexity of the proposed method. Let 𝑁EM and 𝑁CKT denote the number of degrees of freedom of the EM and circuit systems, respectively; 𝑁𝑡 the number of timesteps; 𝑁NL the number of nonlinear iterations per time step; 𝑁𝑝 the number of circuit ports and 𝑁GMRES the number of matrix multiplications required to solve the linearized system denoted in the superscript. The cost for solving the fully coupled system is 𝐶coupled = 𝒪 (cid:16) 𝑁𝑡 𝑁 coupled NL 𝑁 coupled GMRES (𝑁EM + 𝑁CKT) (cid:17) . On the other hand, the cost of port extraction is 𝐶PE = 𝐶PE,1 + 𝐶PE,2 (5.17) (5.18) = 𝒪(𝑁𝑡 𝑁𝑝 𝑁 PE GMRES𝑁EM) + 𝒪(𝑁𝑡 𝑁 PE NL𝑁 CKT GMRES𝑁CKT). 102 Figure 5.7: Cumulative solution time of the linearized system from Section 5.1 as a function of the number of iterations. Before, we proceed, we note the following. Typically, 𝑁EM ≫ 𝑁CKT. As a result, we ignore the cost of computing the Jacobian in (5.17). In (5.18), the first portion refers to the cost of exciting each port and obtaining the corresponding response at other ports. It is a one-time cost and not incurred as one marches through (indeed, it is the characteristic of the EM systems and circuit agnostic). The second term in (5.18), is the cost of non-linear solve at each port. Typically, the number of non-linear solves, 𝑁 coupled NL 𝑁 coupled GMRES in (5.17), is significantly larger than 𝑁𝑝 𝑁 PE GMRES as it involves a fully coupled solve involving all the degrees of freedom in the system. In order to meaningfully compare computational costs, it is important to incorporate the contributions of both 𝐶PE,1 and 𝐶PE,2 against 𝐶coupled. To do this, we considered the solve time per nonlinear iteration within a single timestep of each 103 solve for the numerical example involving the diode mixer attached to a strip-monopole antenna. This includes the cost for evaluating a single timestep in the extraction process as well as the cost of computing 𝑁NL nonlinear iterations, where 𝑁NL is the average number of nonlinear iterations required to achieve convergence per timestep. In this example, 𝑁NL= 6, per time step. Extracting the impulse response took 13s per 𝑁𝑡, i.e., 𝐶PE,1 = 13𝑁𝑡. What is compared in Fig. 5.7 are 𝐶PE,2/𝑁𝑡 and 𝐶Coupled/𝑁𝑡. As is evident, 𝐶PE,2 ≪ 𝐶coupled. 5.2 Port Extraction with EM-FEMPIC Fundamentally, the challenge of extending port extraction to a full EM-FEMPIC problem is due to the complexity of the method scaling badly with the number of ports 𝑁𝑝. As shown in the previous section, the complexity of parameter extraction can be written as follows 𝐶PE = 𝐶PE,1 + 𝐶PE,2 (5.19) = 𝒪(𝑁𝑡 𝑁𝑝 𝑁 PE GMRES𝑁EM) + 𝒪(𝑁𝑡 𝑁 PE NL𝑁 CKT GMRES𝑁CKT), for a solver that evaluates the EM and CKT systems using an iterative method like GMRES. We immediately see from (5.19) that as 𝑁𝑝 grows large, the cost of port extraction quickly dwarfs any benefit gained by relegating the nonlinear solve to the circuit. Unfortunately, in an EM-FEMPIC problem, the ’nonlinear’ contributions of the system can generally exist everywhere in the mesh, and as a result, we need to do a bit more work to get port extraction to be viable. The solution is to first reduce the volume EM solve to an interface problem, and then extract port parameters for every interface unknown in the system. The remainder of this chapter details this process as well as explores the regimes in which the proposed method is more efficient than a traditional EM-FEMPIC solve. 104 Spatial Discretization The FEM formulation used in this work is the mixed finite element method. The basis functions used are the 1-form W(1)(r) ∈ 𝐻(𝑐𝑢𝑟𝑙; Ω) which allows for tangential continuity of fields, and the 2-form W(2)(r) ∈ 𝐻(𝑑𝑖𝑣; Ω) which allows for normal continuity of fluxes. The reconstructed fields used to accelerate the particles are defined as E(𝑡, r) = (cid:205)𝑁𝑒 𝑒𝑖(𝑡)W(1)(r) 𝑖=1 and B(𝑡, r) = (cid:205)𝑁 𝑓 𝑏𝑖(𝑡)W(2)(r) where 𝑁𝑒 are the number of edges and 𝑁 𝑓 are the number 𝑖=1 of faces. Testing Faraday’s law with the 2-form and Ampere’s law with the 1-form leads to the semidiscrete Maxwell system [★𝜇−1] 0 (cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)       (cid:124) 0     [★𝜖0]   (cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32) (cid:125) ¯𝐵 𝜕𝑡 ¯𝐸 𝜕𝑡             (cid:123)(cid:122) ¯¯𝐴1 +       (cid:124) 0 −𝑐2[𝑀𝑐]𝑇 (cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32) (cid:123)(cid:122) ¯¯𝐴0 [𝑀𝑐]     𝑐[★𝐼]   (cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32) (cid:125) (5.20) = ¯𝐵 ¯𝐸             0       ¯𝐿     𝜀   (cid:124)(cid:123)(cid:122)(cid:125) ¯¯𝐹 where the degree of freedom vectors ¯𝐸 = [𝑒1(𝑡), 𝑒2(𝑡), . . . , 𝑒𝑁𝑒 (𝑡)], ¯𝐵 = [𝑏1(𝑡), 𝑏2(𝑡), . . . , 𝑏𝑁 𝑓 and ¯𝐿 = [𝑙1(𝑡), 𝑙2(𝑡), ...𝑙𝑁𝑒 (𝑡)] with 𝑙𝑖(𝑡) = ∫ currents J𝑠(𝑡, r) exist on 𝜕Ω due to a Neumann or Robin boundary condition. (r) · (J𝑠(𝑡, r) − 𝜕𝑡G(𝑡, r)). The surface 𝑑ΩW (1) 𝑖 Ω (𝑡)], Temporal Discretization This work utilizes the Newmark-𝛽 time marching scheme to evolve the fields in time with parameters 𝛾 and 𝛽, which creates an unconditionally stable time marching scheme when 𝛾 = .5 and 𝛽 = .25. Other choices can lead to different stability conditions. The parameter choices effect the definition of a temporal basis function 𝑁(𝑡) that is tested by a function 𝑊(𝑡) [74]. When (5.20) is discretized with Newmark-𝛽 with 𝛾 = 0.5 and 𝛽 = 0.25, 105 Figure 5.8: Breakup of the simulation domain Ω into subvolumes the fully discretized system becomes (0.5 ¯¯𝐴1 + 0.25Δ𝑡 + (0.5 ¯¯𝐴1 + .25Δ𝑡 ¯¯𝐴0) ¯𝑋 𝑛+1 − .5Δ𝑡 ¯¯𝐴0 ¯¯𝐴0) ¯𝑋 𝑛−1 + 0.5Δ𝑡 ¯𝑋 𝑛 ˜𝐺𝑛+1 + 0.5Δ𝑡 ˜𝐺𝑛−1 (5.21) + 0.25Δ𝑡 ˜𝐹𝑛+1 − 0.5Δ𝑡 ˜𝐹𝑛 + 0.25 ˜𝐹𝑛−1 = 0. Here, ¯𝑋 𝑚 = [ ¯𝐵𝑚 ¯𝐸𝑚]𝑇, ˜𝐺𝑚 = [0 𝜀−1 ¯𝐺]𝑚,𝑇, and ˜𝐹𝑚 = [0 𝜀−1 ¯𝐽 𝑚 𝑠 ] at time step 𝑚 with time step size Δ𝑡. This time marching scheme is separate from that used to evolve the currents through Newton’s equations. However, the time step sizes and indices between the two systems are consistent. The boundary current ¯𝐽 𝑚 𝑠 is tangential to a surface. As it has no normal component, it does not contribute to Gauss’ law and therefore does violate charge conservation. Domain Decomposition Preliminaries Consider the region Ω depicted in Fig. 5.8 that is divided into 𝑁𝑣 non-overlapping subdomains where subdomain Ω𝑖 and Ω𝑗 are separated by boundary Γ𝑖𝑗. The junction of more than two volumes is a "corner" denoted by Γ𝑐. In each subdomain Ω𝑖, (5.21) is defined 106 and is assumed to be a self contained “primal" problem that has fictitious excitations from "dual" unknowns. The dual unknowns, the Lagrange multipliers denoted by Λ, affect either Neumann or Robin boundary conditions such that the surface equivalence theorem can be applied for each Ω𝑖 with respect to Ω using the external currents on 𝜕Ω𝑖, the fictitious boundary current from Λ, and the particle current in Ω𝑖. By solving for the dual unknowns and a small subset of the primal unknowns, the overall computational cost of solving the original monolithic system in (5.21) is reduced. Finite Element Tearing and Interconnecting The boundaries between Ω𝑖 and Ω𝑗, Γ𝑖𝑗 and Γ𝑗𝑖, are distinct and have a unique Lagrange multiplier defined for each subdomain. As described in [80], the boundary condition on the Lagrange multipliers becomes Λ(𝑖)(𝑡, r) = ˆ𝑛𝑖 × 𝜇−1 0 B(𝑡, r) − 𝜂−1 ˆ𝑛𝑖 × ˆ𝑛𝑖 × E(𝑡, r) ∈ Γ𝑖𝑗. (5.22) The edges on Γ𝑖𝑗, as well as the interior unknowns and unknowns on 𝜕Ω𝑖 are included in a vector ¯𝑋(𝑖) 𝐼 . The Lagrange multipliers for the corner edges remain the same, with a unique Lagrange multiplier for each feature. The faces on an Γ𝑖𝑗 are grouped together with the corner edges, which we will denote as ¯𝑋𝑁 . The system of equations solved in each subdomain is ¯¯𝐴(𝑖) 𝐼𝐼 ¯¯𝐴(𝑖) 𝑁 𝐼 where ¯¯𝐴 = .5 ¯¯𝐴1 + .25Δ𝑡       ¯𝑋(𝑖),𝑛+1 ¯𝑋(𝑖),𝑛+1 𝑁 𝐼 ¯¯𝐴(𝑖) 𝐼𝑁 ¯¯𝐴(𝑖) 𝑁 𝑁       ¯¯𝐴0 and given       =       ℛ(𝑖) 𝐼 ℛ(𝑖) 𝑁             −       .5Δ𝑡 (𝑖),𝑛+1 ¯Λ 𝐼 (𝑖) ¯Λ 𝑁       ∈ Ω𝑖 (5.23) ¯¯𝐴0 ¯𝑋 𝑛 ℛ = .5Δ𝑡 − (.5 ¯¯𝐴1 + .25Δ𝑡 ¯¯𝐴0) ¯𝑋 𝑛−1 − .5Δ𝑡 ˜𝐺𝑛+1 − .5Δ𝑡 ˜𝐺𝑛−1 (5.24) − .25Δ𝑡 ˜𝐹𝑛+1 + .5Δ𝑡 ˜𝐹𝑛 − .25 ˜𝐹𝑛−1 − .5Δ𝑡Λ𝑛 + .25Λ𝑛−1, ℛ(𝑖) 𝐼 contains ℛ associated with interior unknowns, external unknowns, and unknowns of edges on Γ𝑖𝑗 and ℛ(𝑖) 𝑁 contain the remaining unknowns in subdomain Ω𝑖. To derive the set 107 of equations to solve for the dual unknowns, first consider the summation of (5.22) for two volumes Λ(𝑖)(𝑡, r) + Λ(𝑗)(𝑡, r) = (𝑌𝑖 + 𝑌𝑗) ˆ𝑛𝑖 × ˆ𝑛𝑖 × E(𝑡, r) (5.25) where 𝑌𝑖 is the surface impedance on Γ𝑖𝑗 and 𝑌𝑗 the surface impedance on Γ𝑗𝑖. Define an unsigned Boolean matrix such that ¯¯𝐵(𝑖) ¯𝑋𝐼 = ¯𝑒(𝑖) , the unknowns for the electric field on Γ𝑖𝑗. 𝐼 𝐼 Also, define the map from the boundary edges on Γ𝑖𝑗 to its counterpart on Γ𝑗𝑖, ¯𝑇𝑖𝑗. Testing (5.25) with curl-conforming basis functions W𝑒, generalizing the case to any number of subdomains, and using ¯¯𝐵(𝑖) 𝐼 and ¯¯𝑇𝑗𝑖 restrict and map the unknowns of Ω𝑗 to Ω𝑖 yields Λ(𝑖) + (cid:213) ¯¯𝑇𝑖𝑗(Λ(𝑗) − (𝑌𝑖 + 𝑌𝑗) ¯¯𝑀(𝑗) 𝑇 ¯¯𝐵(𝑗) 𝐼 ¯𝑋(𝑗) 𝐼 ) = 0 (5.26) 𝑗∈𝜎(𝑖) where 𝜎(𝑖) are all subdomains Ω𝑗 which share a boundary with Ω𝑖 and the matrix ¯¯𝑀(𝑖) 𝑑Γ𝑖𝑗 ˆ𝑛 × W𝑒 · ˆ𝑛 × W𝑒. Rewriting the first set of equations in (5.23) as 𝑇 = ∫ Γ𝑖𝑗 ¯𝑋(𝑖),𝑛+1 𝐼 = ¯¯𝐴(𝑖),−1 𝐼𝐼 (ℛ(𝑖) 𝐼 − .5Δ𝑡 ¯Λ (𝑖),𝑛+1 𝐼 − ¯¯𝐴(𝑖) 𝐼𝑁 ¯𝑋(𝑖),𝑛+1 𝑁 ) (5.27) Now, (5.27) can be plugged into (5.26) to yield Λ(𝑖)+(cid:213) 𝑇𝑖𝑗(¯¯𝐼(𝑗) − (𝑌𝑖 + 𝑌𝑗) ¯¯𝑀(𝑗) 𝑇 ¯¯𝐵(𝑗) 𝐼 ¯¯𝐴(𝑗),−1 𝐼𝐼 ¯¯𝐵(𝑗),𝑇 𝐼 )Λ(𝑗) 𝑗∈𝜎(𝑖) (cid:213) = 𝑗∈𝜎(𝑖) 𝑇𝑖𝑗 ¯¯𝐵(𝑗) 𝐼 ¯¯𝐴(𝑗),−1 𝐼𝐼 ( ¯¯𝐴(𝑗) 𝐼𝑁 ¯¯𝐵(𝑗) 𝑁 ¯𝑋𝑁 − ℛ(𝑗) 𝐼 ). (5.28) In order to complete the definition of the equation to solve for Λ𝐼 over ∪Γ𝑖𝑗, we first define the equation for ¯𝑋𝑁 , the unknowns associated with faces on Γ𝑖𝑗 and edges on Γ𝑐. The derivation of the equation to solve for ¯𝑋𝑁 follows a similar path as ¯𝑋𝑐 from MFET-DP1. First, substitute (5.27) into the second set of equations in (5.26) to yield (cid:16) ¯¯𝐴(𝑖) 𝑁 𝑁 − ¯¯𝐴(𝑖),−1 𝑁 𝐼 (cid:17) ¯𝑋(𝑖),𝑛+1 ¯¯𝐴(𝑖),−1 𝐼𝐼 − ¯¯𝐴(𝑖) 𝑁 𝐼 ¯¯𝐴(𝑖) 𝐼𝑁 ¯¯𝐴(𝑖),−1 𝐼𝐼 𝑁 (cid:16) ℛ(𝑖) 𝐼 = ℛ(𝑖) 𝑁 (cid:17) − ¯Λ (𝑖) 𝐼 The Boolean matrix ¯¯𝐵(𝑖) 𝑁 is defined as ¯¯𝐵(𝑖) 𝑁 = 0 ¯¯𝐵(𝑖) 𝑐       ¯¯𝐵(𝑖) 𝑟, 𝑓 0       108 − ¯Λ (𝑖) 𝑁 . (5.29) (5.30) to enforce the impedance boundary conditions for the faces on Ω𝑖 and edges on corners where ¯¯𝐵(𝑖) 𝑟, 𝑓 that acts on the face degrees of freedom. UAs a result, is the portion of ¯¯𝐵(𝑖) 𝑟 where ¯¯𝐾𝑁 𝑁 𝑁 = ¯ℛ𝑁 + ¯¯𝐾𝑁 𝐼Λ𝐼 . ¯𝑋 𝑛+1 ¯¯𝐾𝑁 𝑁 = 𝑁𝑣(cid:213) 𝑖=1 ¯¯𝐵(𝑖) 𝑁 ( ¯¯𝐴(𝑖) 𝑁 𝑁 − ¯¯𝐴(𝑖), 𝑁 𝐼 ¯¯𝐴(𝑖),−1 𝐼𝐼 ¯¯𝐴(𝑖),−1 𝐼𝑁 ) ¯¯𝐵(𝑖),𝑇 𝑁 𝑁𝑣(cid:213) ¯¯𝐾𝑁 𝐼 = ¯¯𝐵(𝑖) 𝑁 ¯¯𝐴(𝑖) 𝑁 𝐼 ¯¯𝐴(𝑖),−1 𝐼𝐼 ¯¯𝐵(𝑖),𝑇 𝐼 ¯¯𝐵(𝑖),𝑇 𝐿 𝑖=1 𝑁𝑣(cid:213) ¯¯𝐵(𝑖) 𝑁 𝑖=1 ¯ℛ𝑁 = (ℛ(𝑖) 𝑁 − ¯¯𝐴(𝑖) 𝑁 𝐼 ¯¯𝐴(𝑖),−1 𝐼𝐼 ℛ(𝑖) 𝐼 ). (5.31) (5.32a) (5.32b) (5.32c) The global system of equations for the edge Lagrange multipliers Define a Boolean matrix ¯¯𝐵(𝑖) 𝐿 that maps the local Lagrange multipliers to a global vector. The Boolean matrix is applied to (5.28) and discretized in time to define the equation ¯¯𝐾𝐼𝐼Λ𝐼 = ¯ℛ𝐼 − ¯¯𝐾𝐼𝑁 ¯𝑁 𝑛+1 𝑁 where ¯¯𝐾𝐼𝐼 = 𝐼 + 𝑁𝑣(cid:213) 𝐵(𝑖) 𝐿 𝑖=1 ¯¯𝑇𝑖𝑗(¯¯𝐼(𝑗) − 𝑀(𝑗) 𝑇 (cid:213) 𝑗∈𝜎(𝑖) ¯¯𝐵(𝑗) 𝐼 ¯¯𝐴(𝑗),−1 𝐼𝐼 ) ¯¯𝐵(𝑗),𝑇 𝐼 𝐵(𝑗),𝑇 𝐿 ¯¯𝐾𝐼𝑁 = − 𝑁𝑣(cid:213) (cid:213) 𝐵(𝑖) 𝐿 𝑇 𝑗𝑖 𝑀(𝑗) 𝑇 𝐵(𝑗) 𝐼 ¯¯𝐴(𝑗),−1 𝐼𝐼 ¯¯𝐴(𝑗) 𝐼𝑁 𝐵(𝑗),𝑇 𝑁 𝑖=1 𝑁𝑣(cid:213) ¯ℛ𝐼 = − 𝑗∈𝜎(𝑖) (cid:213) 𝐵(𝑗) 𝐿 𝑇 𝑗𝑖 ¯¯𝑀(𝑗) 𝑇 𝐵(𝑗) 𝐼 ¯¯𝐴(𝑗),−1 𝐼𝐼 ℛ(𝑖) 𝐼 ) 𝑖=1 𝑗∈𝜎(𝑖) (5.33) (5.34a) (5.34b) (5.34c) Now, we can apply the port extraction method proposed earlier in the chapter to every unknown in (5.31) and (5.33). 109 Figure 5.9: Complexity curves as a function of number of interface to volume unknowns for different values of 𝑁𝑡. Curves in blue represent savings in relation to a traditional EM-FEMPIC scheme and curves in red represent configurations where the extraction method is less efficient. Complexity Analysis To understand the computational implications of using transient port extraction coupled with a FETI based domain decomposition scheme, we first need to identify the costs associated with different parts of the overall method. First, suppose there are 𝑁 unknowns in the overall domain. By sectioning the domain into 𝑁𝑣 partitions, we obtain 𝑁𝑣 block matrices that each have a solution complexity of (𝑁/𝑁𝑣)3 using naive matrix inversion. Since mesh subsectioning tools in the literature traditionally follow a paradigm of each subvolume having roughly the same number of unknowns, we can restrict our analysis without loss of generality to subsectioning a cube into 𝑁 equal volumes. Upon doing so, it is (cid:17) fairly easy to demonstrate that the cost of the interface solution is (𝑁/𝑁 𝑓 )2 (cid:16) where 𝑁 𝑓 refers to 𝑁 1/3. Since the port extraction only operates on the interface system, 3/𝑁 𝑓 − 3/𝑁 2 𝑓 , 110 Figure 5.10: 𝑟-𝑧 phase space distribution for the expanding particle beam compared between a regular EM-FEMPIC method and the proposed extraction based method and the block matrices are only computed once and stored, the overall complexity can therefore be written as: 𝐶PEPIC = (𝑁/𝑁𝑣)3 + 𝑁𝑡(𝑁/𝑁 𝑓 )2 (cid:16) 3/𝑁 𝑓 − 3/𝑁 2 𝑓 (cid:17) (5.35) The trend of this complexity function is plotted in Fig. 5.9. We note as the number of timesteps rises in relation to the number of unknowns in the system, the extraction method gets progressively less viable as most of the cost is relegated to the interface problem. However, with an unconditionally stable time-stepper as we have used throughout this work 𝑁𝑡 ≤ 𝑁 is a fairly easy condition to meet for any representative problem. Numerical PIC Example: Expanding Particle Beam Consider a conducting cylindrical cavity (as we have seen in the previous chapters) of length 10 cm and radius 2 cm. The walls of the cavity were assumed to be perfectly conducting and a particle beam composed of electrons was initialized at 𝑧 = 0. The 111 Parameter 𝑞 𝑚 Beam Voltage Beam Current Δ𝑡 Δ𝑡,XOOPIC 𝑁𝑝/timestep Value 1.6 × 10−19 C 9.1 × 10−31 kg 120 kV 10 A 10 ps 1 ps 50 Table 5.1: Parameters for the expanding beam run. parameters of the simulation are as in Table. 5.1. The mesh used to discretize the system comprised of 3229 tetrahedra (resulting in 9879 unknowns) had an average edge length of 2.3 mm. The geometry was split into two volumes with the interface containing 37 unknowns. The extraction was done for 𝑁𝑡 = 2000 and we note from Fig. 5.10 that the extraction method agrees closely with the standard EM-FEMPIC method. The discrepancy in particle positions are primarily due to the individual particle orbits being chaotic and they this diverge from a small initial offset exponentially fast. However, the overall physical characteristics (beam radius, etc.) are very closely matched. 5.3 Conclusion In this chapter, we introduced a transient parameter extraction method that allows for seamless decoupling of a self-consistent electromagnetic problem from lumped nonlinear devices attached at ports in the simulation domain. This method was further extended to be applicable to an EM-FEMPIC system. We demonstrated through numerical examples good agreement between the proposed method and their equivalent self-consistent simulation method. The complexity of the method was also rigorously derived, with specific conditions based on problem size and timesteps proposed. 112 CHAPTER 6 CONCLUSION The development of finite element particle-in-cell methods has been a very active area of research in recent years, bringing with it the ability to robustly discretize complex 3D devices and allow access to more complex function spaces to represent the fields Through a sequence of seminal papers, it is now possible to combine implicit FEM solvers with arbitrary particle push routines while exactly conserving charge. The main objectives of this work were to (1) make the implicit EM-FEMPIC method more applicable to large scale problems, and (2) analyse potential means to ensure conservation of other fundamental physical quantities. We first presented a novel formulation in Chapter 2 that coupled an unconditionally stable implicit FEM field solver to a particle push scheme that utilized a predictor-corrector loop. Our method was shown to conserve charge to machine precision this paradigm and as a result, made the method as a whole more applicable to highly relativistic problems where the traditionally used explicit particle pusher can become unstable at the time-step sizes afforded by the implicit field solver. Additionally, the energy conserving characteristics of different particle push methods were examined, with the exponential method shown to be non-dissipative under the numerical examples tested. However, it remains an open problem at time of writing as to whether this behavior is simply due to enhanced accuracy, or if the exponential stencil is fundamentally energy-conserving. Next in Chapter 3, we borrowed the well-established idea of envelope tracking methods from mainstream computational electromagnetics and extended it to be applicable to particle-in-cell formulations. Breaking up the fields into a slowly varying envelope modulated with a fast harmonic carrier was shown to have enormous consequences on charge conservation. As a result, new formalism was derived to come up with guidelines for choosing testing and representation functions under a general field breakup. The 113 proposed method was shown to achieve significant computational speedup for plasma systems that were manipulated by high power RF sources, which are commonplace in many relevant pulsed-power and accelerator systems. In Chapter 4, we formalized the requirements for energy conservation in an implicit EM-FEMPIC solver. In particular, we re-iterated the idea that having a non-dissipative field solver attached to a non-dissipative particle solver is not sufficient to ensure energy conservation. The mapping of currents from particle movements has to be done with care in order to relate energies in the EM solver to the particle push. In our analysis we proposed a velocity correction method that uses an implicit Tajima scheme as a post-processing step after the standard EM-FEMPIC run. We show through our numerical results that the resulting stencil removes any spurious energy gain from the particles and conserves energy as well as the field solver without active sources. Finally in Chapter 5, we proposed another approach to speed up an implicit EM- FEMPIC solver by employing a transient parameter extraction method originally proposed for electromagnetic system driven by lumped nonlinear devices. The port extraction method was derived in detail and its complexity rigorously derived. However, since the particle-in-cell method can impose nonlinearities everywhere in the mesh (since the particles are allowed to move freely inside the domain), naive port extraction at every edge is not computationally feasible. To solve this problem, we incorporated port extraction into a domain decomposition algorithm and derived relations between the number of timesteps and the number of degrees of freedom in the mesh where port extraction would yield computational benefit. Future Work In totality, this work has made progress on addressing a number of open problems in implicit EM-FEMPIC solver techniques. A number of interesting avenues of future research remains open. First, the derived rubrics for energy conservation in Chapter 4 114 were strictly limited to a standard Maxwell solver. Extending this analyis to the envelope tracking method proposed in Chapter 3 would be of great interest as the partcle update currently used is known to not conserve energy over asymptotic timescales. Likewise, a proof (or disproof) of energy conservation for the exponential particle update used in chapter 2 is again an open problem. The rubrics for energy conservation derived in chapter 5 is still bounded by oscillations in the field solver, which conserves energy on average, but admits an error of 𝒪(Δ2 𝑡 ) at any given timestep. Deriving a Newmark based method that either does not admit this error, or relegates it to a higher power of Δ𝑡 would make the method even more precise. Additionally, the quasi-Helmholtz scheme used only removed the effect of nullspace excitations from the non-solenoidal components of the field. Spurious contributions to the solenoidal fields can still have an impact on energy conservation. Thus, more work needs to be done to analyse and mitigate these effects. Similar issues exist in the formulation of the envelope tracking approach and is an interesting avenue of future work. Finally, the domain-decomposition port extraction method derived in Chapter 5 was fairly simplistic without much emphasis placed on the effects of matrix inverse storage and assuming no preconditioning. With more efficient linear algebra modules, the complexity bounds can be made even more robust and thus make the method applicable to a larger range of problems. 115 BIBLIOGRAPHY [1] [2] [3] Richard Marchand. PTETRA, a tool to simulate low orbit satellite–plasma interaction. IEEE Transactions on Plasma Science, 40(2):217–229, 2011. RW Lemke, TC Genoni, and TA Spencer. Three-dimensional particle-in-cell simula- tion study of a relativistic magnetron. Physics of Plasmas, 6(2):603–613, 1999. E Fourkal, B Shahine, M Ding, JS Li, T Tajima, and C-M Ma. Particle in cell simulation of laser-accelerated proton beams for radiation therapy. Medical Physics, 29(12):2788–2798, 2002. [4] Dragica Vasileska and Stephen M. Goodnick. Computational Electronics. Springer International Publishing, 2006. [5] Christian Klingenberg and Knut Waagan. Relaxation solvers for ideal mhd equations -a review. Acta Mathematica Scientia, 30(2):621–632, 2010. [6] Alexander A. Schekochihin. MHD turbulence: a biased review. Journal of Plasma Physics, 88(5):155880501, October 2022. Publisher: Cambridge University Press. [7] [8] [9] [10] [11] Charles K Birdsall and A Bruce Langdon. Plasma physics via computer simulation. CRC press, 2004. Jay P Boris. Relativistic plasma simulation-optimization of a hybrid code. In Proc. Fourth Conf. Num. Sim. Plasmas, pages 3–67, 1970. Kane Yee. Numerical solution of initial boundary value problems involving maxwell’s equations in isotropic media. IEEE Transactions on antennas and propagation, 14(3):302– 307, 1966. John Villasenor and Oscar Buneman. Rigorous charge conservation for local electro- magnetic field solvers. Computer Physics Communications, 69(2-3):306–316, 1992. John P Verboncoeur. Particle simulation of plasmas: review and advances. Plasma Physics and Controlled Fusion, 47(5A):A231, 2005. [12] R. Holland. Pitfalls of staircase meshing. Compatibility, 35(4):434–439, 1993. IEEE Transactions on Electromagnetic [13] J.B. Schneider and K.L. Shlager. Fdtd simulations of tem horns and the implications for staircased representations. IEEE Transactions on Antennas and Propagation, 45(12):1830– 1838, 1997. [14] Saul Abarbanel, Adi Ditkowski, and Amir Yefet. Bounded error schemes for the wave equation on complex domains. Journal of Scientific Computing, 26(1):67–81, Jan 2006. [15] A. Akyurtlu, D.H. Werner, V. Veremey, D.J. Steich, and K. Aydin. Staircasing errors in fdtd at an air-dielectric interface. IEEE Microwave and Guided Wave Letters, 9(11):444–446, 1999. 116 [16] J.P. Verboncoeur. Aliasing of electromagnetic fields in stair step boundaries. Computer Physics Communications, 164(1):344–352, 2004. Proceedings of the 18th International Conferene on the Numerical Simulation of Plasmas. [17] Korada Umashankar and Allen Taflove. A novel method to analyze electromagnetic scattering of complex objects. IEEE Transactions on Electromagnetic Compatibility, EMC-24(4):397–405, 1982. [18] A. Taflove and K. Umashankar. A hybrid moment method/finite-difference time- domain approach to electromagnetic coupling and aperture penetration into complex geometries. IEEE Transactions on Antennas and Propagation, 30(4):617–627, 1982. [19] Kenneth K. Mei, Andreas Cangellaris, and Diogenes J. Angelakos. Conformal time domain finite difference method. Radio Science, 19(5):1145–1147, 1984. [20] W.K. Gwarek. Analysis of an arbitrarily-shaped planar circuit a time-domain approach. IEEE Transactions on Microwave Theory and Techniques, 33(10):1067–1072, 1985. [21] G. Kriegsmann, A. Taflove, and K. Umashankar. A new formulation of electromag- netic wave scattering using an on-surface radiation boundary condition approach. IEEE Transactions on Antennas and Propagation, 35(2):153–161, 1987. [22] T.G. Jurgens, A. Taflove, K. Umashankar, and T.G. Moore. Finite-difference time- domain modeling of curved surfaces (em scattering). IEEE Transactions on Antennas and Propagation, 40(4):357–366, 1992. [23] T.G. Jurgens and A. Taflove. Three-dimensional contour fdtd modeling of scattering from single and multiple bodies. IEEE Transactions on Antennas and Propagation, 41(12):1703–1708, 1993. [24] Chris J. Railton, Ian James Craddock, and John B. Schneider. Improved locally distorted cpfdtd algorithm with provable stability. Electronics Letters, 31:1585–1586, 1995. [25] M.Celuch Marcysiak and Wojciech K. Gwarek. Higher-order modelling of media interfaces for enhanced fdtd analysis of microwave circuits. In 1994 24th European Microwave Conference, volume 2, pages 1530–1535, 1994. [26] N. Kaneda, B. Houshmand, and T. Itoh. Fdtd analysis of dielectric resonators with curved surfaces. IEEE Transactions on Microwave Theory and Techniques, 45(9):1645–1649, 1997. [27] S. Dey, R. Mittra, and S. Chebolu. A technique for implementing the fdtd algorithm on a nonorthogonal grid. Microwave and Optical Technology Letters, 14(4):213–215, 1997. [28] S. Dey and R. Mittra. A locally conformal finite-difference time-domain (fdtd) algo- rithm for modeling three-dimensional perfectly conducting objects. IEEE Microwave and Guided Wave Letters, 7(9):273–275, 1997. 117 [29] S. Dey, R. Mittra, and S. Chebolu. A technique for implementing the fdtd algorithm on a nonorthogonal grid. Microwave and Optical Technology Letters, 14(4):213–215, 1997. [30] Supriyo Dey and Raj Mittra. A modified locally conformal finite-difference time- domain algorithm for modeling three-dimensional perfectly conducting objects. Microwave and Optical Technology Letters, 17(6):349–352, 1998. [31] Supriyo Dey and R. Mittra. A conformal finite-difference time-domain technique for modeling cylindrical dielectric resonators. IEEE Transactions on Microwave Theory and Techniques, 47(9):1737–1739, 1999. [32] W. Yu and R. Mittra. A conformal fdtd software package modeling antennas and microstrip circuit components. IEEE Antennas and Propagation Magazine, 42(5):28–39, 2000. [33] [34] I.A. Zagorodnov, R. Schuhmann, and T. Weiland. A uniformly stable conformal fdtd-method in cartesian grids. International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 16(2):127–141, 2003. Igor Zagorodnov, Rolf Schuhmann, and Thomas Weiland. Conformal fdtd-methods to avoid time step reduction with and without cell enlargement. Journal of Computational Physics, 225(2):1493–1507, 2007. [35] Tian Xiao and Q.H. Liu. Enlarged cells for the conformal fdtd method to avoid the time step reduction. IEEE Microwave and Wireless Components Letters, 14(12):551–553, 2004. [36] Tian Xiao and Qing Huo Liu. A 3-d enlarged cell technique (ect) for the conformal fdtd method. IEEE Transactions on Antennas and Propagation, 56(3):765–773, 2008. [37] Collin S Meierbachtol, Andrew D Greenwood, John P Verboncoeur, and Balasub- ramaniam Shanker. Conformal electromagnetic particle in cell: A review. IEEE Transactions on Plasma Science, 43(11):3778–3793, 2015. [38] Niel K. Madsen and Richard W. Ziolkowski. A three-dimensional modified finite volume technique for maxwell’s equations. Electromagnetics, 10(1-2):147–161, 1990. [39] Vijaya Shankar, Alireza H. Mohammadian, and William F. Hall. A time-domain, finite-volume treatment for the maxwell equations. Electromagnetics, 10(1-2):127–145, 1990. [40] Alireza H. Mohammadian, Vijaya Shankar, and William F. Hall. Computation of electromagnetic scattering and radiation using a time-domain finite-volume discretization procedure. Computer Physics Communications, 68(1):175–196, 1991. [41] R. Holland, V.P. Cable, and L.C. Wilson. Finite-volume time-domain (fvtd) techniques for em scattering. IEEE Transactions on Electromagnetic Compatibility, 33(4):281–294, 1991. 118 [42] S.D. Gedney and F. Lansing. A parallel discrete surface integral equation method for the analysis of three-dimensional microwave circuit devices with planar symmetry. In Proceedings of IEEE Antennas and Propagation Society International Symposium and URSI National Radio Science Meeting, volume 3, pages 1778–1781 vol.3, 1994. [43] Niel K. Madsen. Divergence preserving discrete surface integral methods for Journal of maxwell’s curl equations using non-orthogonal unstructured grids. Computational Physics, 119(1):34–45, 1995. [44] François Hermeline. A finite volume method for solving maxwell equations in inho- mogeneous media on arbitrary meshes. Comptes Rendus Mathematique, 339(12):893– 898, 2004. [45] Masami Matsumoto and Shigeo Kawata. Tripic: Triangular-mesh particle-in-cell code. Journal of Computational Physics, 87(2):488–493, 1990. [46] Alan M. Winslow. Numerical solution of the quasilinear poisson equation in a nonuniform triangle mesh. Journal of Computational Physics, 135(2):128–138, 1997. [47] G. Lapenta, F. Iinoya, and J.U. Brackbill. Particle-in-cell simulation of glow discharges in complex geometries. IEEE Transactions on Plasma Science, 23(4):769–779, 1995. [48] C-D Munz, Pascal Omnes, Rudolf Schneider, Eric Sonnendrücker, and Ursula Voss. Divergence correction techniques for maxwell solvers based on a hyperbolic model. Journal of Computational Physics, 161(2):484–511, 2000. [49] C.-D. Munz, P. Omnes, R. Schneider, E. Sonnendrucker, E. Stein, U. Voss, and T. Westermann. Kad12d-a particle-in-cell code based on finite-volume methods. In 12th International Conference on High-Power Particle Beams. BEAMS’98. Proceedings (Cat. No.98EX103), volume 1, pages 541–544 vol.1, 1998. [50] Claus-Dieter Munz, Rudolf Schneider, Eric Sonnendrücker, E. Stein, Ursula Voß, and Thomas Westermann. A finite-volume particle-in-cell method for the numerical treatment of maxwell-lorentz equations on boundary-fitted meshes. International Journal for Numerical Methods in Engineering, 44:461–487, 1999. [51] Nikolaos Gatsonis and Anton Spirkin. Unstructured 3D Simulations of Field Emission Array Cathodes for Micropropulsion Applications. 2002. [52] Anton Spirkin and Nikolaos Gatsonis. Unstructured 3D PIC Simulation of Plasma Flow in a Segmented Microchannel. 2003. [53] Anton Spirkin and Nikolaos A. Gatsonis. Unstructured 3d pic simulations of the flow in a retarding potential analyzer. Computer Physics Communications, 164(1):383–389, 2004. Proceedings of the 18th International Conferene on the Numerical Simulation of Plasmas. [54] J.S Hesthaven and T Warburton. Nodal high-order methods on unstructured grids: I. time-domain solution of maxwell’s equations. Journal of Computational Physics, 181(1):186–221, 2002. 119 [55] Masoud Movahhedi, Abdolali Abdipour, Alexandre Nentchev, Mehdi Dehghan, and Siegfried Selberherr. Alternating-direction implicit formulation of the finite- element time-domain method. IEEE Transactions on Microwave Theory and Techniques, 55(6):1322–1331, 2007. [56] Serge Piperno and Loula Fatima Fezoui. A centered Discontinuous Galerkin Finite Volume scheme for the 3D heterogeneous Maxwell equations on unstructured meshes. Technical Report RR-4733, INRIA, February 2003. [57] Arne Taube, Michael Dumbser, Claus-Dieter Munz, and Rudolf Schneider. A high- order discontinuous galerkin method with time-accurate local time stepping for the maxwell equations. International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 22(1):77–103, 2009. [58] G.B. Jacobs and J.S. Hesthaven. High-order nodal discontinuous galerkin particle-in- cell method on unstructured grids. Journal of Computational Physics, 214(1):96–121, 2006. [59] Gustaaf Jacobs, Jan Hesthaven, and Giovanni Lapenta. Simulations of the Weibel Instability with a High-Order Discontinous Galerkin Particle-In-Cell Solver. 2006. [60] G.B. Jacobs and J.S. Hesthaven. Implicit–explicit time integration of a high-order particle-in-cell method with hyperbolic divergence cleaning. Computer Physics Communications, 180(10):1760–1767, 2009. [61] Ross Evan Heath. Analysis of the discontinuous Galerkin method applied to collisionless plasma physics. PhD thesis, University of Texas, Austin, January 2007. [62] James A. Rossmanith and David C. Seal. A positivity-preserving high-order semi- lagrangian discontinuous galerkin scheme for the vlasov–poisson equations. Journal of Computational Physics, 230(16):6203–6232, 2011. [63] Blanca Ayuso, José A. Carrillo, and Chi-Wang Shu. Discontinuous galerkin methods for the one-dimensional vlasov-poisson system. Kinetic and Related Models, 4(4):955– 989, 2011. [64] Yingda Cheng, Irene M. Gamba, Fengyan Li, and Philip J. Morrison. Discontinuous galerkin methods for the vlasov–maxwell equations. SIAM Journal on Numerical Analysis, 52(2):1017–1049, 2014. [65] Yingda Cheng, Andrew J. Christlieb, and Xinghui Zhong. Energy-conserving dis- continuous galerkin methods for the vlasov–ampère system. Journal of Computational Physics, 256:630–655, 2014. [66] Jian-Ming Jin. The finite element method in electromagnetics. John Wiley & Sons, 2015. [67] Martin Campos Pinto, Sébastien Jund, Stéphanie Salmon, and Eric Sonnendrücker. Charge-conserving fem–pic schemes on general grids. Comptes Rendus Mecanique, 342(10-11):570–582, 2014. 120 [68] Haksu Moon, Fernando L Teixeira, and Yuri A Omelchenko. Exact charge-conserving scatter–gather algorithm for particle-in-cell simulations on unstructured grids: A geometric perspective. Computer Physics Communications, 194:43–53, 2015. Jonathan Squire, Hong Qin, and William M Tang. Geometric integration of the vlasov-maxwell system with a variational particle-in-cell scheme. Physics of Plasmas, 19(8):084501, 2012. [69] [70] Haksu Moon, Fernando L Teixeira, and Yuri A Omelchenko. Exact charge-conserving scatter–gather algorithm for particle-in-cell simulations on unstructured grids: A geometric perspective. Computer Physics Communications, 194:43–53, 2015. [71] S O’Connor, ZD Crawford, J Verboncoeur, J Lugisland, and B Shanker. A set of benchmark tests for validation of 3d particle in cell methods. arXiv preprint arXiv:2101.09299, 2021. [72] Michael Kraus, Katharina Kormann, Philip J Morrison, and Eric Sonnendrücker. Gempic: geometric electromagnetic particle-in-cell methods. Journal of Plasma Physics, 83(4), 2017. [73] XIAO Jianyuan, QIN Hong, and LIU Jian. Structure-preserving geometric particle-in- cell methods for vlasov-maxwell systems. Plasma Science and Technology, 20(11):110501, 2018. [74] Olgierd Cecil Zienkiewicz. A new look at the newmark, houbolt and other time stepping formulas. a weighted residual approach. Earthquake Engineering & Structural Dynamics, 5(4):413–418, 1977. [75] Zane D Crawford, Scott O’Connor, John Luginsland, and B Shanker. Rubrics for charge conserving current mapping in finite element particle in cell methods. arXiv preprint arXiv:2101.12128, 2021. [76] Scott O’Connor, Zane D. Crawford, O. H. Ramachandran, John Luginsland, and B. Shanker. Time integrator agnostic charge conserving finite element pic. Physics of Plasmas, 28(9):092111, 2021. [77] Scott O’Connor, Zane D. Crawford, O.H. Ramachandran, John Luginsland, and B. Shanker. Quasi-helmholtz decomposition, gauss’ laws and charge conservation for finite element particle-in-cell. Computer Physics Communications, 276:108345, 2022. [78] Omkar H. Ramachandran, Zane D. Crawford, Scott O’Connor, John Luginsland, and B. Shanker. An envelope tracking approach for particle in cell simulations, 2022. [79] Zane D. Crawford, O. H. Ramachandran, Scott O’Connor, John Luginsland, and B. Shanker. Higher order charge conserving electromagnetic finite element particle in cell method, 2021. [80] Zane D. Crawford, O. H. Ramachandran, Scott O’Connor, Daniel L. Dault, John Luginsland, and B. Shanker. Domain decomposition framework for maxwell finite element solvers and application to pic, 2022. 121 [81] Omkar H. Ramachandran, Leo C. Kempel, John Luginsland, and B. Shanker. A charge conserving exponential predictor corrector fempic formulation for relativistic particle simulations, 2022. [82] A. Bossavit. Whitney forms: a class of finite elements for three-dimensional compu- tations in electromagnetism. IEE Proceedings A - Physical Science, Measurement and Instrumentation, Management and Education - Reviews, 135(8):493–500, Nov 1988. [83] Zane Crawford, Jie Li, Andrew Christlieb, and Balasubramaniam Shanker. Uncon- ditionally stable time stepping method for mixed finite element maxwell solvers. Progress In Electromagnetics Research, 103:17–30, 2020. [84] Man-Fai Wong, O. Picon, and V. Fouad Hanna. A finite element method based on whitney forms to solve maxwell equations in the time domain. IEEE Transactions on Magnetics, 31(3):1618–1621, May 1995. [85] Bo He and F.L. Teixeira. Geometric finite element discretization of maxwell equations in primal and dual spaces. Physics Letters A, 349(1–4):1 – 14, 2006. [86] A. Bossavit and L. Kettunen. Yee-like schemes on a tetrahedral mesh, with diagonal lumping. International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 12(1-2):129–142, 1999. [87] Scott O’Connor, Zane D Crawford, OH Ramachandran, John Luginsland, and B Shanker. Quasi-helmholtz decomposition, gauss’ laws and charge conservation for finite element particle-in-cell. arXiv preprint arXiv:2103.06737, 2021. [88] J. P. Boris. Relativistic plasma simulation-optimization of a hybrid code. Proceeding of Fourth Conference on Numerical Simulations of Plasmas, November 1970. [89] Hong Qin, Shuangxi Zhang, Jianyuan Xiao, Jian Liu, Yajuan Sun, and William M. Tang. Why is boris algorithm so good? Physics of Plasmas, 20(8):084503, 2013. [90] Seiji Zenitani and Takayuki Umeda. On the boris solver in particle-in-cell simulation. Physics of Plasmas, 25(11):112110, 2018. [91] Fabio Bacchini, Jorge Amaya, and Giovanni Lapenta. The relativistic implicit particle- in-cell method. Journal of Physics: Conference Series, 1225(1):012011, may 2019. [92] Stefano Markidis and Giovanni Lapenta. The energy conserving particle-in-cell method. Journal of Computational Physics, 230(18):7037–7052, 2011. [93] Stefano Markidis. Development of Implicit Kinetic Simulation Methods, and their Ap- plication to Ion Beam Propagation in Current and Future Neutralized Drift Compression Experiments. ProQuest Dissertations Publishing, 2010. [94] Andreas Glaser and Vladimir Rokhlin. A new class of highly accurate solvers for ordinary differential equations. Journal of Scientific Computing, 38:368–399, 2009. 122 [95] Rui Wang, Douglas J Riley, and Jian-Ming Jin. Application of tree-cotree splitting to the time-domain finite-element analysis of electromagnetic problems. IEEE transactions on antennas and propagation, 58(5):1590–1600, 2010. [96] Neelakantam V Venkatarayalu and Jin-Fa Lee. Removal of spurious dc modes in edge element solutions for modeling three-dimensional resonators. IEEE transactions on microwave theory and techniques, 54(7):3019–3025, 2006. [97] L. D. Landau and E. M. Lifshitz. The Classical Theory of Fields. Butterworth-Heinemann, January 1980. [98] J.P. Verboncoeur, A.B. Langdon, and N.T. Gladd. An object-oriented electromagnetic pic code. Computer Physics Communications, 87(1):199–211, 1995. Particle Simulation Methods. [99] Chung-Wen Ho, A. Ruehli, and P. Brennan. The modified nodal approach to network analysis. IEEE Transactions on Circuits and Systems, 22(6):504–509, 1975. [100] C.B. Wilsen, J.W. Luginsland, Yue Ying Lau, T.M. Antonsen, D.P. Chernin, P.M. Tchou, M.W. Keyser, R.M. Gilgenbach, and L.D. Ludeking. A simulation study of beam loading on a cavity. IEEE Transactions on Plasma Science, 30(3):1160–1168, 2002. [101] Jianguo Wang, Zaigao Chen, Yue Wang, Dianhui Zhang, Chunliang Liu, Yongdong Li, Hongguang Wang, Hailiang Qiao, Meiyan Fu, and Yuan Yuan. Three-dimensional parallel unipic-3d code for simulations of high-power microwave devices. Physics of Plasmas, 17(7):073107, 2010. [102] J.P. Verboncoeur, A.B. Langdon, and N.T. Gladd. An object-oriented electromagnetic pic code. Computer Physics Communications, 87(1):199–211, 1995. Particle Simulation Methods. [103] Nathan M Newmark. A method of computation for structural dynamics. Journal of the engineering mechanics division, 85(3):67–94, 1959. [104] K. Li, A. M. Tassoudji, S. Y. Poh, M. Tsuk, R. T. Shin, and J. A. Kong. Fd-td analysis of electromagnetic radiation from modules-on-backplane configurations. IEEE Transactions on Electromagnetic Compatibility, 37(3):326–332, 1995. [105] Er-Ping Li. Electrical Modeling and Design for 3D System Integration. John Wiley & Sons, Ltd, 2012. [106] Yui-Sheng Tsuei, A. C. Cangellaris, and J. L. Prince. Rigorous electromagnetic IEEE Transactions on modeling of chip-to-package (first-level) interconnections. Components, Hybrids, and Manufacturing Technology, 16(8):876–883, 1993. [107] K. Aygun, B. C. Fischer, Jun Meng, B. Shanker, and E. Michielssen. A fast hybrid field-circuit simulator for transient analysis of microwave circuits. IEEE Transactions on Microwave Theory and Techniques, 52(2):573–583, 2004. 123 [108] A. E. Yilmaz, Jian-Ming Jin, and E. Michielssen. A parallel fft accelerated tran- sient field-circuit simulator. IEEE Transactions on Microwave Theory and Techniques, 53(9):2851–2865, 2005. [109] R. Wang and J. Jin. A symmetric electromagnetic-circuit simulator based on the extended time-domain finite element method. IEEE Transactions on Microwave Theory and Techniques, 56(12):2875–2884, 2008. [110] S. O’Connor, S. Hughey, D. Dault, A. J. Pray, J. M. Villa-Giron, and B. Shanker. A novel port/network parameter extraction technique for coupling circuits with full-wave time-domain integral equation solvers. IEEE Transactions on Microwave Theory and Techniques, 67(2):553–564, 2019. [111] Omkar H. Ramachandran, Scott O’Connor, Zane D. Crawford, Leo C. Kempel, and Balasubramaniam Shanker. Port parameter extraction-based self-consistent coupled em-circuit fem solvers. IEEE Transactions on Components, Packaging and Manufacturing Technology, 12:1040–1048, 2021. [112] Jianming Jin and Douglas J. Riley. Finite Element Analysis of Antennas and Arrays. Wiley-IEEE Press, 2009. [113] W.H. Kao, Chi-Yuan Lo, M. Basel, and R. Singh. Parasitic extraction: current state of the art and future trends. Proceedings of the IEEE, 89(5):729–739, 2001. [114] G. Antonini, A.C. Scogna, A. Orlandi, V. Ricchiuti, G. Selli, S. Luan, and J.L. Drewniak. Validation of circuit extraction procedure by means of frequency and time domain measurement. In 2005 International Symposium on Electromagnetic Compatibility, 2005. EMC 2005., volume 1, pages 45–50, 2005. [115] Yong Wang, D. Gope, V. Jandhyala, and C. . R. Shi. Generalized kirchoff’s current and voltage law formulation for coupled circuit-electromagnetic simulation with surface integral equations. IEEE Transactions on Microwave Theory and Techniques, 52(7):1673–1682, 2004. [116] Paule Charland, Ellen El-Khatib, and John Wolters. The use of deconvolution and total least squares in recovering a radiation detector line spread function. Medical Physics, 25(2):152–160, 1998. [117] Yan-Lin Li, Sheng Sun, Qi I Dai, and Weng Cho Chew. Finite element implementation of the generalized-lorenz gauged a-\phi formulation for low-frequency circuit modeling. IEEE Transactions on Antennas and Propagation, 64(10):4355–4364, 2016. [118] Zane Crawford, Jie Li, Andrew Christlieb, and Balasubramaniam Shanker. Uncon- ditionally stable time stepping method for mixed finite element maxwell solvers. Progress In Electromagnetics Research C, 103:17–30, 01 2020. [119] O. C. Zienkiewicz. A new look at the newmark, houbolt and other time stepping formulas. a weighted residual approach. Earthquake Engineering & Structural Dynamics, 5(4):413–418, 1977. 124 [120] Jean-Pierre Berenger. A perfectly matched layer for the absorption of electromagnetic waves. Journal of Computational Physics, 114(2):185–200, 1994. [121] T. Rylander and Jian-Ming Jin. Perfectly matched layer in three dimensions for the time-domain finite element method applied to radiation problems. IEEE Transactions on Antennas and Propagation, 53(4):1489–1499, 2005. [122] D.L. Scharfetter and H.K. Gummel. Large-signal analysis of a silicon read diode oscillator. IEEE Transactions on Electron Devices, 16(1):64–77, 1969. [123] M. Kurata. Design considerations of step recovery diodes with the aid of numerical large-signal analysis. IEEE Transactions on Electron Devices, 19(11):1207–1215, 1972. [124] Jun-quan Chen, Xing Chen, Chang-Jun Liu, Kama Huang, and Xiao-Bang Xu. Analysis of temperature effect on p-i-n diode circuits by a multiphysics and circuit cosimulation algorithm. IEEE Transactions on Electron Devices, 59(11):3069–3077, 2012. [125] Hongzheng Zeng, Yuzhu Tang, Xin Duan, and Xing Chen. A physical model-based FDTD field-circuit co-simulation method for schottky diode rectifiers. IEEE Access, 7:87265–87272, 2019. [126] Charles K Birdsall and A Bruce Langdon. Plasma physics via computer simulation. CRC press, 2018. [127] S Laha, P Gupta, CE Simien, H Gao, J Castro, T Pohl, and TC Killian. Experimental realization of an exact solution to the vlasov equations for an expanding plasma. Physical review letters, 99(15):155001, 2007. [128] Johan Carlsson, Alexander Khrabrov, Igor Kaganovich, Timothy Sommerer, and David Keating. Validation and benchmarking of two particle-in-cell codes for a glow discharge. Plasma Sources Science and Technology, 26(1):014003, 2016. [129] M Surendra. Radiofrequency discharge benchmark model comparison. Plasma Sources Science and Technology, 4(1):56, 1995. [130] Miles M Turner, Aranka Derzsi, Zoltan Donko, Denis Eremin, Sean J Kelly, Trevor Lafleur, and Thomas Mussenbrock. Simulation benchmarks for low-pressure plasmas: Capacitive discharges. Physics of Plasmas, 20(1):013507, 2013. [131] Miles M Turner. Verification of particle-in-cell simulations with monte carlo collisions. Plasma Sources Science and Technology, 25(5):054007, 2016. [132] Fabio Riva, Carrie F Beadle, and Paolo Ricci. A methodology for the rigorous verification of particle-in-cell simulations. Physics of Plasmas, 24(5):055703, 2017. [133] Su Yan and Jian-Ming Jin. A continuity-preserving and divergence-cleaning algorithm based on purely and damped hyperbolic maxwell equations in inhomogeneous media. Journal of computational physics, 334:392–418, 2017. 125 [134] Martin Reiser and Patrick O’Shea. Theory and design of charged particle beams, volume 312. Wiley Online Library, 1994. [135] VF Kovalev and V Yu Bychenkov. Analytic solutions to the vlasov equations for expanding plasmas. Physical review letters, 90(18):185004, 2003. [136] Peter Monk. Finite element methods for Maxwell’s equations. Oxford University Press, 2003. [137] David Smithe, Peter Stoltz, John Loverich, Chet Nieter, and Seth Veitzer. Development and application of particle emission algorithms from cut-cell boundaries in the vorpal em-fdtd-pic simulation tool. In 2008 IEEE International Vacuum Electronics Conference, pages 217–218. IEEE, 2008. [138] Chet Nieter, John R Cary, Gregory R Werner, David N Smithe, and Peter H Stoltz. Application of dey–mittra conformal boundary algorithm to 3d electromagnetic modeling. Journal of Computational Physics, 228(21):7902–7916, 2009. [139] Alexander S Glasser and Hong Qin. The geometric theory of charge conservation in particle-in-cell simulations. arXiv preprint arXiv:1910.12395, 2019. [140] Guangye Chen, Luis Chacón, and Daniel C Barnes. An energy-and charge-conserving, Journal of Computational Physics, implicit, electrostatic particle-in-cell algorithm. 230(18):7018–7036, 2011. [141] T Zh Esirkepov. Exact charge conservation scheme for particle-in-cell simulation with an arbitrary form-factor. Computer Physics Communications, 135(2):144–153, 2001. [142] Katharina Kormann and Eric Sonnendrücker. Energy-conserving time propaga- tion for a structure-preserving particle-in-cell vlasov–maxwell solver. Journal of Computational Physics, 425:109890, 2021. [143] Jianyuan Xiao and Hong Qin. Explicit structure-preserving geometric particle-in-cell algorithm in curvilinear orthogonal coordinate systems and its applications to whole- device 6d kinetic simulations of tokamak physics. Plasma Science and Technology, 2021. [144] Annalisa Buffa, Giancarlo Sangalli, and Rafael Vázquez. Isogeometric analysis in electromagnetics: B-splines approximation. Computer Methods in Applied Mechanics and Engineering, 199(17-20):1143–1152, 2010. [145] Roberto D Graglia, Donald R Wilton, and Andrew F Peterson. Higher order interpolatory vector bases for computational electromagnetics. IEEE transactions on antennas and propagation, 45(3):329–342, 1997. [146] Roberto D Graglia and Andrew F Peterson. Hierarchical divergence-conforming nédélec elements for volumetric cells. IEEE transactions on antennas and propagation, 60(11):5215–5227, 2012. 126 [147] Ralf Hiptmair. Higher order whitney forms. Progress in Electromagnetics Research, 32:271–299, 2001. [148] Robert J. Barker and Edl Schamiloglu. High-power microwave sources and technolo- gies, 2001. [149] J.J. Petillo, E.M. Nelson, J.F. DeFord, N.J. Dionne, and B. Levush. Recent developments to the michelle 2-d/3-d electron gun and collector modeling code. IEEE Transactions on Electron Devices, 52(5):742–748, 2005. [150] A. Cangellaris, Chung-Chi Lin, and Kenneth Mei. Point-matched time domain finite element methods for electromagnetic radiation and scattering. IEEE Transactions on Antennas and Propagation, 35(10):1160–1173, 1987. [151] A. E. Yilmaz, Jian-Ming Jin, and E. Michielssen. A parallel fft accelerated tran- sient field-circuit simulator. IEEE Transactions on Microwave Theory and Techniques, 53(9):2851–2865, 2005. [152] B. Toland, J. Lin, B. Houshmand, and T. Itoh. Fdtd analysis of an active antenna. IEEE Microwave and Guided Wave Letters, 3(11):423–425, 1993. [153] V. A. Thomas, M. E. Jones, M. Piket-May, A. Taflove, and E. Harrigan. The use of spice lumped circuits as sub-grid models for fdtd analysis. IEEE Microwave and Guided Wave Letters, 4(5):141–143, 1994. [154] P. Ciampolini, P. Mezzanotte, L. Roselli, and R. Sorrentino. Accurate and efficient cir- cuit simulation with lumped-element fdtd technique. IEEE Transactions on Microwave Theory and Techniques, 44(12):2207–2215, 1996. [155] Chien-Nan Kuo, B. Houshmand, and T. Itoh. Full-wave analysis of packaged microwave circuits with active and nonlinear devices: an fdtd approach. IEEE Transactions on Microwave Theory and Techniques, 45(5):819–826, 1997. [156] P. Mezzanotte, M. Mongiardo, L. Roselli, R. Sorrentino, and W. Heinrich. Analysis of packaged microwave integrated circuits by fdtd. IEEE Transactions on Microwave Theory and Techniques, 42(9):1796–1801, 1994. [157] Yong Wang, D. Gope, V. Jandhyala, and C. . R. Shi. Generalized kirchoff’s current and voltage law formulation for coupled circuit-electromagnetic simulation with surface integral equations. IEEE Transactions on Microwave Theory and Techniques, 52(7):1673–1682, 2004. [158] Chien-Nan Kuo, V. A. Thomas, Siou Teck Chew, B. Houshmand, and T. Itoh. Small signal analysis of active circuits using fdtd algorithm. IEEE Microwave and Guided Wave Letters, 5(7):216–218, 1995. [159] Garry Rodrigue and Daniel White. A vector finite element time-domain method for solving maxwell’s equations on unstructured hexahedral grids. SIAM Journal on Scientific Computing, 23(3):683–706, 2001. 127 [160] Zheng Lou and Jian-Ming Jin. Time-domain finite-element modeling and simulation of broadband antennas. In 2006 IEEE Antennas and Propagation Society International Symposium, pages 2809–2812, 2006. [161] R. Subramanian, M. Swaminathan, and M. Behar. Parasitic extraction and perfor- mance evaluation of a high i/o package. In Proceedings of 2nd Electronics Packaging Technology Conference (Cat. No.98EX235), pages 99–106, 1998. [162] Jianyuan XIAO, Hong QIN, and Jian LIU. Structure-preserving geometric particle-in- cell methods for vlasov-maxwell systems. Plasma Science and Technology, 20(11):110501, sep 2018. [163] Martin Campos Pinto, Katharina Kormann, and Eric Sonnendrücker. Variational framework for structure-preserving electromagnetic particle-in-cell methods. Journal of Scientific Computing, 91(2):46, Mar 2022. 128