EFFICIENT CHARGE CONSERVING UNCONDITIONALLY STABLE FINITE ELEMENT
FORMULATIONS FOR PARTICLE-IN-CELL SIMULATIONS

By

Omkar Harishankar Ramachandran

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

Electrical Engineering – Doctor of Philosophy

2023

ABSTRACT

The simulation of systems involving charged particles moving in the presence of

electromagnetic fields is of great interest in a number of domains in physics, with

applications including the characterization of pulsed power devices and accelerators,

design of high precision etching and sterilization implements. As a result, several methods

have been proposed to accurately simulate such systems. One such method is the particle-in-

cell (PIC) technique, which characterizes the distribution of a plasma in phase space through

a collection of statistically significant macroparticles. While contemporary implementations

of electromagnetic PIC (EM-PIC) have typically relied on a finite-difference time-domain

(FDTD) stencil to evaluate the fields, there has been a push for the adoption finite element

methods that allow for the use of better geometry representations and more robust function

spaces. In particular, recent developments in the field have focused on developing implicit

finite element solvers that are free of mesh dependent stability constraints while natively

conserving fundamental quantities such as charge and energy.

The goals of this dissertation are to develop efficient, charge-conserving, implicit finite

element particle-in-cell (EM-FEMPIC) methods. First, (i) we construct a formulation of

PIC built around expontential predictor-corrector particle integrators. We demonstrate

that this approach has significantly better error convergence than equivalent polynomial

methods, thus allowing for accurate evaluation of particle trajectories even at the large

step-sizes afforded by implicit EM solvers. Next, (ii) for devices of a narrowband tendency,

we construct a novel EM-FEMPIC method based on envelope tracking. This allows us

to accurately simulate the EM response of such a device while sampling at the narrow

bandwidth, rather than at the highest absolute frequency of interest. Furthermore, we

explore the consequences on charge-conservation for such a method and propose a rubric

to ensure exact satisfaction of Gauss’ laws. We then consider (iii) the matter of energy

conservation in an implicit EM-FEMPIC scheme and propose a set of guidelines that

ensure the conservation of average energy over the course of a simulation. Finally, (iv) we

reformulate a parameter extraction method originally proposed for efficient device-agnostic

simulation of EM systems attached to lumped nonlinear devices to make it applicable to

a system of moving particles. We couple this approach with a domain-decomposition

framework to construct an efficient, ’particle-agnostic’ extraction framework. Taken together

these contributions address several open problems in the field and extend the applicability

of EM-FEMPIC methods to larger, more relevant problems.

Copyright by
OMKAR HARISHANKAR RAMACHANDRAN
2023

To my parents
for their unconditional love and support

v

ACKNOWLEDGMENTS

Reserach is very fundamentally a team sport and while there is only one author

associated with this work, there are many people whose contributions were indispensable

to its creation. I first wish to extend my gratitude to my advisor Dr. Balasubramaniam

Shanker for his patience and guidance. Without the many meetings, sometimes at odd

hours in the evenings debugging code and bouncing ideas, none of this work would’ve

been done. I would also like to extend my gratitude to members of my guidance committee

Dr. Leo Kempel, and Dr. John Luginsland, Dr. Prem Chahal and Dr. Michael Murillo for

their guidance and continued support. I have had many interesting conversations with

them during my time as a graduate student and almost every chapter of this work involved

ideas that directly stemmed from these talks.

Next, I want to sincerely thank the other graduate students in the computational

electromagnetics group for their friendship and support. Unlike a lot of other groups in

academia, the CEML was extremely collaborative, with students often ringing each other

to get help or advise on solving research problems. First and foremost, I want to thank Dr.

Zane Crawford and Dr. Scott O’Connor whose work laid the foundations of the material

presented in this thesis. I had the immense fortune of joining the group when Scott and

Zane were both senior students who understood computational EM far better than I did.

I remember asking for documentation of our 200K+ line fortran code when I first joined

the group only for Zane to laugh and say "I AM the documentation!". I would like to

thank Dr. Abdel Alsnayyan for being the ultimate partner-in-crime throughout my time in

grad school. We never published together, but a lot of my ideas would never have come

to fruition without the many discussions I had with Abdel. Next, I want to acknowledge

the other members of the finite element subgroup, Chad Moorman and Sean DePalma for

the many fun discussions and meetings we’ve had over the past couple of years. Sean, in

particular, got up to speed with my early port extraction results and extended it to solve

complex optimization problems in his short stint as a Master’s student. Chad, being the

vi

sole surviver from the group’s time at MSU, is now set to inherit this mess and do great

things with it. Additionally, I’d like to acknowledge the other students in my group, Jacob

Hawkins, Luke Baumann and Elliot Lu for many interesting conversations and interactions

over the years.

I’d next like to thank the many students of the wider electromagnetics research group

(EMRG) for making the work environment lively, fun and intellectually stimulating. In no

particular order, I’d like to thank Dr. Stavros Vakalis, Dr. Serge Mghabhab, Anton Schlegel,

Cory Hilton, Jacob Randall, Jason Merlo, Daniel Chen, Jorge Colon-Berrios, William Torres,

Ahona Bhattacharrya, Niam Shandi, Dr. Saranraj Karuppuswami, Dr. Saikat Mondal, Dr.

Michael Craton, and Amer Abu Arisheh.

Having interests and hobbies outside research was critical to my success in grad school.

To that end, I’d like to thank everyone involved with the MSU Fencing Club for more or

less being the entirity of my social life, as well as the coaches and staff at the MSU tennis

center for many fun weekly lessons.

Finally, this section would be grossly incomplete without mention of my family, whose

unconditional love and support was absolutely critical to my productivity and success.

In particular, I’d like to thank my parents Dr. V. Bharathi Harishankar and Dr. Hari

Ramachandran. My almost daily conversations with them has been the source of many

great ideas that made this work possible.

I’d also like to thank my aunt Dr. Vĳaya

Ramachandran and my uncle Dr. Ramesh Narayan for giving me a home in the US and

making my time here comfortable and fun.

vii

PREFACE

In reference to IEEE copyrighted material which is used with permission in this

thesis, the IEEE does not endorse any of Michigan State University’s products or ser-

vices.

Internal or personal use of this material is permitted.

If interested in reprint-

ing/republishing IEEE copyrighted material for advertising or promotional purposes

or for creating new collective works for resale or redistribution, please go to http:

//www.ieee.org/publications_standards/publications/rights/rights_link.html to learn

how to obtain a License from RightsLink.

viii

TABLE OF CONTENTS

CHAPTER 1

INTRODUCTION AND BACKGROUND . . . . . . . . . . . . . . . .

1

CHAPTER 2 AN EXPONENTIAL PREDICTOR CORRECTOR FRAMEWORK

FOR RELATIVISTIC EM-FEMPIC . . . . . . . . . . . . . . . . . . . . . 16

CHAPTER 3 AN ENVELOPE TRACKING APPROACH FOR PARTICLE IN CELL

SIMULATIONS .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

CHAPTER 4 ANALYSIS AND BOUNDS FOR ENERGY CONSERVATION IN

EM-FEMPIC .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

.

CHAPTER 5 A TRANSIENT PARAMETER EXTRACTION TECHNIQUE FOR

FINITE ELEMENT PARTICLE IN CELL SOLVERS . . . . . . . . . . . 85

CHAPTER 6 CONCLUSION .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. 113

BIBLIOGRAPHY .

.

.

.

.

.

.

.

.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

ix

CHAPTER 1

INTRODUCTION AND BACKGROUND

The simulation of moving charged particles in an electromagnetic field distribution is of

great interest in a number of engineering applications. These include fundamental plasma

physics, such as the design of particle accelerators, plasma processing applications such as

sterilization of medical implements and etching of high precision integrated chips [1, 2, 3],

among many others. Likewise, high fidelity simulation of migrating electrons and positive

holes are frequently required in the design and analysis of cutting edge semiconductor

devices [4]. Due to the varied subject areas where these systems show up, a number of

methods have been proposed over the last few decades to accurately and efficiently solve

for the physics of these systems. These methods include fluid based methods, which

involve directly solving the kinetic equations to obtain phase space distributions of the

charged species [5, 6], and particle based methods, wherein a finite number of samples

of the distribution function are used in conjunction with an appropriate electromagnetic

solver to characterize the physics of the system [7]. While many interesting problems and

applications exist with fluid based solvers, we restrict our analysis in this work purely

to particle based methods and their application to electromagnetic problems. For the

remainder of this chapter, we will introduce and describe the general setup of the problems

considered in the remainder of the thesis (Section 1.1) and a brief overview of the many

implementations of the Particle-in-Cell method (Sections 1.2 and 1.3), before providing an

overview of the current state of the art in electromagnetic finite element particle in cell

(EM-FEMPIC) in the subsequent sections. Finally, we will conclude this chapter with a

brief overview of the chapters to follow.

1

Figure 1.1: A general schematic of the simulation domain.

1.1 Problem Statement

Consider a region Ω ∈ R3 bounded by a surface 𝜕Ω containing a single charged species.

This region is subjected to an external field due to which the charged species accelerate,

and in turn, produce spatially and temporally varying electric and magnetic fields denoted

by E(r, 𝑡) and B(r, 𝑡), respectively, with r ∈ Ω and 𝑡 ∈ [0, ∞). The dynamics of the particles

in phase space can be represented by a Phase Space Distribution Function (PSDF) 𝑓 (𝑡, r, v)

that follows the Vlasov equation:

𝜕𝑡 𝑓 (𝑡, r, v) + v · ∇ 𝑓 (𝑡, r, v)+

𝑞
𝑚

[E(r, 𝑡) + v × B(r, 𝑡)] · ∇𝑣 𝑓 (𝑡, r, v) =0.

(1.1)

In what follows, we assume that the background media in Ω is free space. As a result, we

denote the permittivity and permeability of free space by 𝜖0 and 𝜇0, respectively, and the

speed of light by 𝑐. Finally, we will also assume that the system is quiescent for 𝑡 ≤ 0.

1.2 The Particle-in-Cell (PIC) Technique

The hallmark of PIC techniques to solve for systems of moving charges is that the

simulator elects to not solve (1.1) directly, but rather, represent the charge and current

distributions in Ω using a finite number of samples – or moments – of the PSDF through
𝜌(r, 𝑡) = ∫

𝑞v 𝑓 (𝑡, r, v)𝑑v. These samples usually manifest as

𝑞 𝑓 (𝑡, r, v)𝑑v and J(r, 𝑡) = ∫

Ω

Ω

2

Particle Push

Map Fields to
Particles

Map particle in current
and charge

Update Maxwell’s

Equations

Figure 1.2: The simulation cycle of a standard electromagnetic PIC scheme.

using 𝑁𝑝 macroparticles – with charges and mass corresponding to a clump of multiple

particles. One can evolve the positions and velocity of these particles together with

Maxwell’s equations through the imposition of a spatial shape function 𝑆(r),

𝜌(r, 𝑡) =

𝑁𝑝
(cid:213)

𝑝=1

𝑞𝑝𝑆(r − r𝑝(𝑡)),

J(r, 𝑡) =

𝑁𝑝
(cid:213)

𝑝=1

𝑞𝑝v𝑝(𝑡)𝑆(r − r𝑝(𝑡)).

(1.2a)

(1.2b)

where r𝑝(𝑡) and v𝑝(𝑡) refer to the positions and velocities as functions of time of the 𝑝th

macroparticle. The evolution of the fields E(r, 𝑡) and B(r, 𝑡) over space and time within Ω

follow Maxwell’s equations, given by

∇ × E(r, 𝑡) = −𝜕𝑡B(r, 𝑡),

∇ × 𝜇−1

0 B(r, 𝑡) = J𝑖(r, 𝑡) + J(r, 𝑡) + 𝜖0𝜕𝑡E(r, 𝑡),

(1.3a)

(1.3b)

Here J𝑖(r, 𝑡) describes impressed currents within Ω. Furthermore, the solutions to the curl

equations in (1.3) also need to satisfy Gauss’ Laws,

∇ · 𝜖0E(r, 𝑡) = 𝜌𝑖(r, 𝑡) + 𝜌(r, 𝑡),

∇ · B(r, 𝑡) = 0.

(1.4a)

(1.4b)

wherein 𝜌𝑖(r, 𝑡) are impressed charges. In what follows, we will assume that both the

impressed current and the corresponding charge densities are zero. If they are not, it is

trivial to include then in the analysis framework rubric described in the next section.

3

As usual, boundary conditions need to be imposed on E(r, 𝑡) and B(r, 𝑡) on sections of

the outer boundary 𝜕Ω, to ensure unique solutions. These are assumed to either Dirichlet,

Neumann or impedance boundary conditions on non-overlapping surfaces 𝜕Ω𝐷, 𝜕Ω𝑁 and

𝜕Ω𝐼, with 𝜕Ω = 𝜕Ω𝐷 + 𝜕Ω𝑁 + 𝜕Ω𝐼, and are defined as follows:

ˆ𝑛 × E(r, 𝑡) = Ψ𝐷(r, 𝑡) on Ω𝐷 ,

ˆ𝑛 × 𝜇−1

B(r, 𝑡) = Ψ𝑁 (r, 𝑡) on Ω𝑁 ,

ˆ𝑛 × 𝜇−1

B(r, 𝑡) − 𝑌 ˆ𝑛 × ˆ𝑛 × E(r, 𝑡) = Ψ𝐼(r, 𝑡) on Ω𝐼 .

(1.5a)

(1.5b)

(1.5c)

The evolution of the macroparticles in space and time is determined by solving for

the relativistic equations of motion with the acceleration determined by the Lorentz force.

This yields the following coupled system of equations for ordinary differential equations

(ODEs) for v𝑝(𝑡) and r𝑝(𝑡):

𝑑𝛾𝑝v𝑝(𝑡)
𝑑𝑡

=

𝑞𝑝
𝑚𝑝

(cid:2)E(r𝑝(𝑡), 𝑡) + v𝑝(𝑡) × B(r𝑝(𝑡), 𝑡)(cid:3)

𝑑r𝑝(𝑡)
𝑑𝑡

= v𝑝(𝑡)

(1.6a)

(1.6b)

1.3 Contemporary EM-PIC Solution Methods

In EM-PIC, the solution cycle typically consists of two parts: (1) a full-wave electromag-

netic field solver to update fields within the simulation domain as a function of space and

time used in conjunction with (2) a Newton solver that evolves the particle trajectories

in response to these fields. These two steps need to be done self-consistently in order to

simulate the physics of the moving charges.

Due to the simplicity of this scheme as well as its ability to generate accurate solutions,

many different combinations of field and particle solvers have been developed. The

remainder of this section will provide a brief overview of these methods.

4

Finite Difference Time Domain

The oldest and by far the most well analyzed EM-PIC method involves using a finite

difference time domain (FD-TD) stencil to update the fields along with a modified leapfrog

stencil developed by Boris [8] to evolve the particles. The simplest FD-TD methods work

primarily by breaking up the simulation domain into a Yee grid [9] and placing electric

and magnetic field quantities on the primal and dual meshes respectively. The history of

FD-TD based PIC methods are vast and well studied; we refer the reader to these reviews –

and the references therein – for a deeper explanation of the method [7, 10, 11].

One of the fundamental bottlenecks with using an FDTD based method is the relative

difficulty involved with representing curved surfaces, leading to well known ‘staircasing’

errors, as documented in [12, 13, 14, 15, 16]. To mitigate this, several ‘conformal’ methods

have been proposed and analysed. While the earliest implementations were limited to

straight edged domains [17, 18, 19, 20], these were quickly extended to curved/curvilinear

boundaries [21, 22, 23] with several updates proposed to resolve problems with stability

[24] and better incorporation of dielectric near a curved interface [25, 26].

Currently, popular implementations of conformal FDTD include methods that modify

boundary cell with an appropriate polygon and enforce the correct boundary condition

on the fields. The best exponent of this is the Dey-Mittra scheme [27, 28, 29, 30, 31, 32],

with subsequent contributions resolving issues with the maximum allowable step-size

[33, 34, 35, 36], allowing the method to be used without a prohibitively small stability

constraint. Likewise, irregular interfaces separating dielectrics have also been analysed

using various weighting tricks and constraints [25]. We note at this point that this is only

a small sampling of the methods that exist to deal with applying FDTD to systems with

curved interfaces. We refer the reader to this review for a deeper explanation [37].

5

Finite Volume Methods

Similar to FDTD, Finite Volume Time Domain (FVTD) methods have also been used

extensively to solve general electromagnetic problems, with the earliest implementations

as in [38, 39, 40]. Improvements in higher order representation and extension to higher

order grids [41]; and stencils that exactly preserve charge [42, 43] and energy [44] were

achieved in recent years. FVTD based solvers have been implemented in conjunction with

particle pushers in a PIC scheme, notably including charge conserving schemes with higher

order particle evolution in time [45, 46], drift diffusion based methods for simulating glow

discharges [47], along with a number of updates for charge correction [48, 49, 50] and

modelling stochastic collisions [51, 52, 53].

Discontinuous Galerkin Methods

In a similar manner, there exists a significant body of work on using discontinuous

Galerkin time domain methods (DGTD) for PIC [54, 55, 56, 57]. DGTD-PIC methods have

been used with success to simulate a number of particle systems [58, 59, 60] in addition to

Vlasov-Poisson [61, 62, 63] and Vlasov-Ampere systems [64, 65].

1.4 Electromagnetic Finite Element PIC (EM-FEMPIC)

While the aforementioned methods and have been widely used, we note that that over

the past three decades, research has allowed finite element methods in electromagnetics

to mature [66]. Today, it is at a state where it has become the de-facto modeling algorithm

used by commercial software companies. With this background, it follows that FEM could

potentially be a robust tool for PIC, and examination of the bottlenecks and methods

developed to overcome them constitutes the rest of the chapter. The viability of the

finite element field solver was proven through a series of seminal papers [67, 68, 69,

70, 71] demonstrated both the constraints on charge mapping, as well as basis function

representation that enabled an FEM solver to be integrated with a PIC scheme without

6

breaking important conservation properties. These results in totality are often denoted as

structure-preserving methods. These results have been augmented with the development

of symplectic formulations methods to better conserve charge and energy [72, 73]. By

and large, the methods developed rely on explicit field and particle updates, as these

naturally conserving charge. Unfortunately, explicit methods are conditionally stable and

the stability criterion is related to the finest feature in the mesh. In other words, the time

step size that can be used is governed by the smallest mesh element. This implies that

analysis of geometrically complex systems requires significant computational resources.

Indeed, this feature is shared by all other methods discussed thus far. An obvious remedy

is an implicit scheme that is unconditionally stable. Such schemes for FEM are well known

[74]. The main question is, how can one adapt such methods to satisfy the conservation

of quantities necessary for a PIC scheme? This question was partially addressed in a

collection of recent papers [75, 76, 77], wherein a collection of fundamental rules were

prescribed. These need to be satisfied by the EM solver and particle evolution scheme for

charge to be innately conserved without needing expensive post-processing measures like

divergence cleaning. The framework established through the aforementioned papers have

been expanded further to more efficiently solve systems with narrowband field responses

by implementing a time-marching scheme built around envelope tracking [78], higher

order basis sets [79], domain decomposition to glean efficiency [80], and relativistic motion

[81]. Fundamentally, [75, 78] identified the failure to conserve charge as being the result

of inconsistent choices of representation and testing functions (in space and time) in the

discretization of Ampere’s and Gauss’ Laws. This understanding led to the proposal of

a slight modification to the current term in Ampere’s Law, which was shown to ensure

charge conservation to machine precision [76]. But implicit time stepping schemes come

with their own challenge; they admit null spaces that correspond to a DC field (in the case

of Maxwell solvers) or a time growing DC field (in case of wave equation field solvers).

The existence of these is clearly evident in [77]. Overcoming this bottleneck involved a

7

fundamental change in spatial representation, and through the use of a quasi-Helmholtz

decomposition[77].

Matrix definitions involved in EM-FEMPIC discretization

As a precursor to the description of the EM-FEMPIC scheme, we first detail a number

of commonly used matrix definitions. This section will be referred to when the appropriate

quantities are used. To begin, the sets 𝒩, ℰ, ℱ and 𝒯 are defined as the set of nodes, edges,

faces and tets respectively having 𝑁𝑛, 𝑁𝑒, 𝑁 𝑓 and 𝑁𝑡 elements. The various submatrices

used are as follows:

[★𝜖]𝑖,𝑗 = ⟨W

(1)
𝑖

(r), 𝜀 · W

(1)
𝑗

(r)⟩; 𝑖, 𝑗 ∈ ℰ

[★𝜇−1]𝑖,𝑗 = ⟨W

(2)
𝑖

(r), 𝜇−1 · W

(2)
𝑗

(r)⟩; 𝑖, 𝑗 ∈ ℱ

[★𝜌]𝑖,𝑗 = ⟨𝑊 (3)

𝑖

(r), 𝑊 (3)

𝑗

(r)⟩; 𝑖, 𝑗 ∈ 𝒯

(1.7a)

(1.7b)

(1.7c)

where W

(1)
𝑖

, W

(2)
𝑖

and 𝑊 (3)

𝑖

are the Whitney edge, face and volume basis functions respec-

tively. Further, we define the following matrices:

[M𝑔]𝑖,𝑗 = ⟨W

(1)
𝑖

(r), ∇𝑊 (0)

𝑗

(r)⟩; 𝑖 ∈ ℰ, 𝑗 ∈ 𝒩

[M𝑐]𝑖,𝑗 = ⟨W

(2)
𝑖

(r), [∇×]W

(1)
𝑗

(r)⟩; 𝑖 ∈ ℱ , 𝑗 ∈ ℰ

[M𝑑]𝑖,𝑗 = ⟨𝑊 (3)

𝑖

(r), [∇·]W

(2)
𝑗

(r)⟩; 𝑖 ∈ 𝒯 , 𝑗 ∈ ℱ

[∇] = 𝜀[★𝜖]−1[M𝑔]

[∇×] = 𝜇−1[★𝜇−1]−1[M𝑐]

[∇·] = [★𝜌]−1[M𝑑]

Likewise the submatrices involved in (3.44) are as follows:

[Z]11 = [C

𝑐 ]𝑇[P]Λ
𝑏

𝑏 [C

𝑏
𝑐 ]

[Z]12 = [C

𝑐 ]𝑇[∇×][★𝜀]−1[P]Λ
𝑏

𝑒 [★𝜀][C

𝑒
𝑐]

8

(1.8a)

(1.8b)

(1.8c)

(1.8d)

(1.8e)

(1.8f)

(1.9a)

(1.9b)

0 - form 1 - form 2 - form 3 - form

Primal

Dual

𝜙

𝜌

∇

E

∇×

B

∇·

[★𝜖]

[★𝜇]

∇·

D

H

∇×

∇

0

0

3 - form 2 - form 1 - form 0 - form

Figure 1.3: Overview of the de-Rham complex and the discrete operations needed to map
functions from one space to another.

[Z]13 = [C

𝑐 ]𝑇[∇×][★𝜀]−1Σ[C
𝑏

𝑒
𝑧]

[Z]21 = [C

𝑒
𝑐]𝑇[P]Λ

𝑒 [★𝜀][C

𝑒
𝑐]

[Z]22 = [C

𝑐]𝑇[∇×]𝑇[★𝜇−1][P]Λ
𝑒

𝑏 [C

𝑏
𝑐 ]

[Z]23 = [C

𝑐]𝑇Σ[C
𝑒

𝑒
𝑧]

(1.9c)

(1.9d)

(1.9e)

(1.9f)

where the [𝐶]𝑏 matrices are mappings that identify unknowns that reside on the cotree.

Constructing this mapping is trivial for simply connected structures, but is trickier for

multiply connected geometries.

Discretization in Space

FEM discretization of Eq. (1.4) requires choosing appropriate spatial basis functions

that respect the unique continuity conditions demanded by Maxwell’s equations: Namely,

the tangential continuity of E(r, 𝑡) and the normal continuity of B(r, 𝑡). Traditionally, these

conditions have been met by choosing Whitney edge and face basis functions that live

on and appropriately transform following the de-Rham complex (depicted in Fig. 1.3).

Assuming a tetrahedral discretization of Ω with 𝑁𝑒 and 𝑁 𝑓 faces, E(r, 𝑡) and B(r, 𝑡) are

9

interpolated in space as

E(r, 𝑡) =

B(r, 𝑡) =

𝑁𝑒(cid:213)

𝑖=1
𝑁 𝑓
(cid:213)

𝑖=1

1
𝑖 (r),
𝑒𝑖(𝑡)W

2
𝑖 (r).
𝑏𝑖(𝑡)W

(1.10)

1
Here, W
𝑖

(r) ∈ 𝐻(𝑐𝑢𝑟𝑙; Ω) and W

2
𝑡 (r) ∈ 𝐻(𝑑𝑖𝑣; Ω) represent the Whitney edge function
defined on the 𝑖th edge and Whitney face function defined on the 𝑡th face, respectively.

The function spaces 𝐻(𝑐𝑢𝑟𝑙; Ω) and 𝐻(𝑑𝑖𝑣; Ω) are defined as

𝐻(𝑐𝑢𝑟𝑙; Ω) = {u ∈ L

2(Ω); ∇ × u ∈ L

2(Ω)},

𝐻(𝑑𝑖𝑣; Ω) = {u ∈ L

2(Ω); ∇ · u ∈ L

2(Ω)},

(1.11)

where L2(Ω) refers to the space of square integrable functions on Ω. Further details on

mixed finite elements can be found in [82, 83, 84, 85, 86] and in the references therein. One

can obtain a discrete system of equations by Galerkin testing, resulting in the following

matrix ODE to solve for the vector of field coefficients ¯𝐵(𝑡) and ¯𝐸(𝑡) at a given instance of

time:

(cid:105)

(cid:104) ¯¯𝑆

·

(cid:105)

(cid:104) ¯¯𝑀

·

+

¯𝐵(𝑡)







¯𝐸(𝑡)








𝜕𝑡

𝜕𝑡








¯¯𝐹

=

¯𝐵(𝑡)







¯𝐸(𝑡)

where the various matrix definitions are as follows:

,


[∇×]






0

,

0





[★𝜖]



(cid:104) ¯¯𝑆

(cid:104) ¯¯𝑀

(cid:105)

(cid:105)

0

=





−[∇×]𝑇



[★𝜇−1]





¯¯𝐹 = −

=

0

0

.















¯𝐽(𝑡)
𝜖0

(1.12)

(1.13)

Furthermore, ¯𝐵(𝑡) = (cid:2)𝑏1(𝑡), · · · , 𝑏𝑁 𝑓
where 𝑗𝑗(𝑡) =

, ˜J(r𝑠 , 𝑡)

W

(cid:68)

(cid:69)

(1)
𝑗

(𝑡)(cid:3) 𝑇, ¯𝐸(𝑡) = [𝑒1(𝑡), · · · , 𝑒𝑁𝑒 (𝑡)]𝑇 and ¯𝐽(𝑡) = [𝑗1(𝑡), · · · , 𝑗𝑁𝑒 (𝑡)]𝑇

. One can likewise trivially formulate a wave-equation solver

10

for just the electric or magnetic fields, as done in [76]. The matrices in (1.13) are defined in

1.4.

Charge conservation and Temporal Discretization

To convert (1.12) into a discrete update stencil in time, one has to choose representation

and testing functions in time as done in the spatial setup.

In general, the vector of

coefficients ¯𝐵(𝑡) and ¯𝐸(𝑡) can be interpolated by a set of temporal basis functions 𝑁𝑖(𝑡),

such that at any given time 𝑡,

¯𝐵(𝑡)

¯𝐸(𝑡)

(cid:169)
(cid:173)
(cid:171)

=

(cid:170)
(cid:174)
(cid:172)

𝑁𝑡(cid:213)

𝑘=0

¯𝐵(𝑡𝑘)
¯𝐸(𝑡𝑘)

(cid:170)
(cid:174)
(cid:172)

𝑁𝑘(𝑡) (cid:169)
(cid:173)
(cid:171)

(1.14a)

and tested by an appropriate function 𝑊𝑛(𝑡) to obtain a marching scheme. Both 𝑁𝑖(𝑡) and

𝑊𝑛(𝑡) have local support and are determined by the order of these functions. While the

traditional approach to PIC largely relies on explicit updates, i.e, where the testing scheme

for (1.14) is chosen such that the fields at the next timestep depend only on information

from previous timesteps, a lot of work has been done on extending the applicability of PIC

to implicit methods. The choice of temporal discretization has significant consequences on

conservation of charge and satisfaction of Gauss’ Laws, with potential methods to mitigate

these presented in [76, 77]. Fundamentally, charge conserving implicit EM-FEMPIC

methods are highly desirable due to the availability of unconditionally stable EM solvers in

time [83].

A deeper analysis of temporal representation and its consequences on charge conserva-

tion will be presented in Chapter 3.

1.5 Nullspaces in implicit EM-FEMPIC and Quasi-Helmholtz Decompo-

sition

Nullspaces always exist in implicit EM-FEMPIC, owing to the solutions to the curl

equations not directly satisfying Gauss’ Laws. Because of this, taking a discrete divergence

11

as is needed to populate the space charges, leads to corruption of charge conservation.

This corruption is of the form ∇𝜙(r) in a implicit mixed EM-FEMPIC solver and of the form

𝑡∇𝜙(r) in implicit wave equation EM-FEMPIC. This problem can be mitigated to some

extent by solving for the fields at very high tolerance, with the level of spurious excitation

dependant upon the tolerance used for computation. Significantly for EM-FEMPIC, the

nullspaces will corrupt the satisfaction of Gauss’ Laws despite using careful testing as

proposed in [76].

A robust means to mitigate the effect of nullspaces in implicit EM-FEMPIC solvers

was presented in [77]. We note that fields E(r, 𝑡) and B(r, 𝑡) can be decomposed into
solenoidal and non-solenoidal components. As a primer on notation, ¯𝐸𝑛𝑠(𝑡) refers to
the non-solenoidal coefficients of the electric field. Similarly, ¯𝐸𝑠(𝑡) and ¯𝐵𝑠(𝑡) refer to the

solenoidal coefficients of the electric field and magnetic flux density, respectively. All

relevent sub-matrices involved are defined entirely in Section 1.4.

Projectors

Projectors to separate out the non-solenoidal components from the Whitney basis

functions used in a traditional FEM solve are defined as

[ ¯𝑃]Σ

𝑒 = Σ(Σ𝑇Σ)†Σ𝑇

[ ¯𝑃]Λ

𝑒 = ℐ − [ ¯𝑃]Σ

𝑒

(cid:2) ¯𝑃(cid:3) Λ

𝑏 = ℐ − Σ𝑚

(cid:16)

Σ𝑇

𝑚Σ𝑚

(cid:17) †

Σ𝑇
𝑚

(1.15a)

(1.15b)

(1.15c)

where † represents a Moore-Penrose pseudoinverse, [Σ] = 𝜖0[ ¯𝑀𝑔] and [Σ]𝑚 = [∇·]𝑇.

Numerically this is done by separating the field unknowns using a minimum spanning

tree and it’s associated cotree [66]. Applying these projectors to the field coefficients will

have the effect of separating out the solenoidal components as

¯𝐷𝑛 = Σ ¯𝐸𝑛

𝑛𝑠 + (cid:2) ¯𝑃(cid:3) Λ

𝑒

¯𝐷𝑛

(1.16)

12

for the electric field and

𝑠 = (cid:2) ¯𝑃(cid:3) Λ
¯𝐵𝑛

𝑏

¯𝐵𝑛

(1.17)

for the the magnetic flux density. By definition, the discrete divergence of the projectors are

zero and as a result, the magnetic flux density ¯𝐵𝑛 will have an identically zero divergence.

Note that the use of these projectors is tantamount to the imposition of the discrete Coulomb

gauge. Therefore, by design, these will satisfy both Gauss’ laws.

Discrete System

To construct a stencil from these projectors, one can apply a discrete divergence operator

to (1.16) to obtain

[𝐶 𝑒

𝑧]𝑇 [∇]𝑇 [★𝑒] [∇] [𝐶 𝑒

𝑧] ¯𝐸𝑛𝑠(𝑡) = − [𝐶 𝑒

𝑧]𝑇 [∇]𝑇 ¯𝐺(𝑡).

(1.18)

Therefore, the non-solenoidal components of the electric field can be related exactly to the

charge density. The operation of (1.18) is the numerical analogue of strongly enforcing the

Coulomb Gauge.

Upon using a mapping to find only the ’cotree’ unknowns in the mesh (which house

the solenoidal) components, these can be solved for through

(cid:2) ¯𝑍(cid:3)

11

𝜕𝑡

¯𝐵𝑠(𝑡) + (cid:2) ¯𝑍(cid:3)

¯𝐸𝑠(𝑡) = − (cid:2) ¯𝑍(cid:3)

¯𝐸𝑛𝑠(𝑡)

13

12

(cid:2) ¯𝑍(cid:3)

21

𝜕𝑡

¯𝐸𝑠(𝑡) − (cid:2) ¯𝑍(cid:3)

22

¯𝐵𝑠(𝑡) = ¯𝐺(𝑡) − (cid:2) ¯𝑍(cid:3)

𝜕𝑡

¯𝐸𝑛𝑠(𝑡)

23

(1.19)

The time derivatives can be evaluated by using an implicit Newmark-𝛽 operator. Note

that the solutions to ¯𝐸𝑠(𝑡) and ¯𝐵𝑠(𝑡) will still contain nullspace excitations. However, when

used to obtain the fields through (1.16) and (1.17), these excitations will have no influence

on charge conservation.

1.6 Goals and Outline

The overarching goals of this work are to take the advances outlined in this chapter and

construct improvements to make them more applicable to complex, real-world devices. To

13

that end, the chapters presented in the document address the following problems:

• Chapter 2 extends the EM-FEMPIC method, complete with quasi-Helmholtz, to

systems where the particle motion is relativistic. While some analysis exists in

the literature for the use of non-standard particle-integartion schemes, this has not

been extended to the use of predictor-corrector schemes or integrators using non-

polynomial representation. This is particularly critical in implicit systems involving

relativistic particles as the standard means to push particles in PIC schemes are

explicit and suffer from instability due to the stiff nature of the particle update. We

present a rigorous overview of these methods from the standpoint of charge and

energy conservation and demonstrate the inclusion of a predictor-corrector scheme

in an EM-FEMPIC scheme using an implicit electromagnetic solver.

• Next, we explore possible optimizations to the electromagnetic solver in Chapter 3.

Specifically, we expore the potential use of envelope-tracking methods in a Particle-

in-Cell scheme to make the method far more efficient for devices characterized by

narrowband high-frequency field responses (as is typical in high power microwave

applications). We also analyse the consequences of temporal representation and

testing on charge conservation in great detail. Some of the insight presented in this

introductory paper found its origin in the work that eventually made up the second

chapter.

• Chapter 4 extends the analysis done in Chapters 2 and 3 and uses them to construct a

method that exactly conserves energy. Similar to the technique proposed in [76], we

propose a rubric of constraints that, when followed, will guarantee energy conserving

when using the unconditionally stable Newmark-𝛽 method to evolve Maxwell’s

equations. We further show that this method satisfies Gauss’ Laws to machine

precision, following the framework laid out in [87].

14

• Finally, in Chapter 5, we explore a technique to extract port information at a discrete

number of edges of a finite element mesh to decouple a linear electromagnetic solver

from arbitary constraints on said edges enforced by nonlinear ODEs. While the initial

extraction method was originally proposed for electromagnetic systems attached

to lumped circuit ports, we propose an approach that combines extraction with a

domain decomposition framework to improve computational complexity and this

make the implicit EM-FEMPIC scheme applicable to larger, more complex problems.

15

CHAPTER 2

AN EXPONENTIAL PREDICTOR CORRECTOR FRAMEWORK FOR RELATIVISTIC
EM-FEMPIC

One of the principal challenges in particle-in-cell methods is to ensure self-consistency

at every time step; i.e the fields, positions and velocities need to be self consistent. What

is typically done is to reduce both Maxwell and Newton equations to a set of coupled

first order partial differential equations. These are then collectively evolved using an

appropriate integrator. Alternately, one offsets the evolution of Maxwell’s equations from

that of Newton’s by one time step, i.e., the evolution is explicit. While the former is

self-consistent, the latter is not. Delving into the latter for the sake of exposition, we note

the following: (a) evolution of Newton’s laws using Boris push (a very robust and popular

technique) and leap-frog for Maxwell systems is one of the most popular approaches

in setting up a PIC scheme [67]; (b) Boris time stepping is either momentum or energy

conserving, preserves the volume of phase space in either case, and has low computational

overhead [88, 89]; (c) phase errors are known to exist as one increases the timestep size

[90]; (d) leap-frog evolution for Maxwell systems is conditionally stable with time step sizes

depending on the smallest feature. There is significant interest in constructing methods

that are self-consistent [91]. Existing implementations of such methods typically follow a

so-called ‘kinetic enslavement’ approach, with Maxwell and particle systems combined

together into a large coupled solver, which can then be evaluated through an appropriate

nonlinear solver [92]. Such an approach comes with the obvious downside of being very

computationally expensive, since the size of the system is in the order of the number

of particles. As a result, many implementations of kinetic enslavement often rely on

matrix-free Jacobian free nonlinear solvers to avoid the enormous storage footprint of

the combined system matrix [93]. As is evident, relying on a specific nonlinear solver

comes with downsides. Further, time step sizes used for kinetic enslavement is also subject

16

to stability constraints, and much like the Boris push, position and velocity updates are

polynomial based with convergence rates following the order of approximation.

The primary goal of this work is to investigate another approach for self-consistent

evolution of particle and Maxwell equations with the following requirements; (a) the

evolution of Maxwell systems should be unconditionally stable, i.e., the time step sizes

should really be only governed by physics and not feature size used to describe the

geometry; (b) it should naturally satisfy Gauss’ Laws, thereby conserving charge and (c) it

should show superior error convergence.

The objectives of this work, therefore, are twofold: (a) First, we analyse the accuracy

and error convergence for a novel predictor-corrector integrator based on exponential basis

sets [94] over a number of particle-only numerical experiments. (b) Using this integrator,

we construct a charge conserving EM-FEMPIC scheme that functions under a predictor

corrector update. We validate both the proposed integrator and the overall PIC scheme

through a number of analytical and numerical results.

The outline of the paper will be as follows: Sec. 2.1 and Sec. 2.1 will outline the

construction of the quasi-Helmholtz based EM solver. Sec. 2.1 and following will detail

and analyze energy conservation for a number of different particle evolution schemes. Sec.

2.2 will then validate both the proposed particle evolution scheme and the overall PIC

method through a number of analytical and numerical tests, before concluding remarks

are presented in Sec. 2.3.

2.1 Formulation

Problem Statement

Consider a region Ω ∈ R3 bounded by a surface 𝜕Ω containing a single charged species.

This region is subjected to an external field due to which the charged species accelerate,

and in turn produce spatially and temporally varying electric and magnetic fields denoted

by E(r, 𝑡) and B(r, 𝑡), respectively, with r ∈ Ω and 𝑡 ∈ [0, ∞). The dynamics of the particles

17

in phase space can be represented by a distribution function (PSDF) 𝑓 (𝑡, r, v) that follows

the Vlasov equation:

𝜕𝑡 𝑓 (𝑡, r, v) + v · ∇ 𝑓 (𝑡, r, v)+

𝑞
𝑚

[E(r, 𝑡) + v × B(r, 𝑡)] · ∇𝑣 𝑓 (𝑡, r, v) =0.

(2.1)

𝑞 and 𝑚 here refer to the particle charge and mass respectively. In what follows, we assume

that the background media in Ω is free space. As a result, we denote the permittivity and

permeability of free space by 𝜖0 and 𝜇0, respectively, and the speed of light by 𝑐.

Discrete Solution

Solutions to (2.1) is found through a Particle-in-Cell formulation, wherein the charge

and current distributions within the domain are represented in terms of a finite number of

moments of the PSDF 𝑓 (𝑡, v, r)

𝜌(r, 𝑡) =

𝑁𝑝
(cid:213)

𝑝=1

𝑞𝑝𝑆(r − r𝑝(𝑡)),

J(r, 𝑡) =

𝑁𝑝
(cid:213)

𝑝=1

𝑞𝑝v𝑝(𝑡)𝑆(r − r𝑝(𝑡)).

(2.2a)

(2.2b)

Here, 𝑁𝑝 refers to the total number of such samples or macroparticles in the simulation.

And likewise 𝑞𝑝 and 𝑚𝑝 refer to the charge and mass of the 𝑝th particle respectively. The

dynamics of the fields within the simulation domain Ω obey Maxwell’s Equations which

can be written as

∇ × E(r, 𝑡) = −𝜕𝑡B(r, 𝑡),

∇ × 𝜇−1

0 B(r, 𝑡) = 𝜕𝑡G(r, 𝑡) + 𝜕𝑡 𝜖0E(r, 𝑡),

(2.3a)

(2.3b)

where G(r, 𝑡) = ∫ 𝑡
magnetic laws

0 J(r, 𝜏)𝑑𝜏. Further, the fields are constrained by Gauss’ electric and

∇ · 𝜖0E(r, 𝑡) = 𝜌(r, 𝑡),

(2.4a)

18

The curl equations in (2.3) are discretized in space using a finite element formulation over

∇ · B(r, 𝑡) = 0.

(2.4b)

a tetrahedral mesh with 𝑁𝑒 edges and 𝑁 𝑓 faces, wherein the fields are represented in
space using Whitney basis forms, E(r, 𝑡) = (cid:205)𝑁𝑒
𝑖=1
some time-varying coefficient 𝑒𝑖(𝑡) and 𝑏𝑖(𝑡) respectively and subsequently Galerkin tested.

(r), B(r, 𝑡) = (cid:205)𝑁 𝑓
𝑖=1

𝑏𝑖(𝑡)W

(r), for

𝑒𝑖(𝑡)W

(1)
𝑖

(2)
𝑖

Temporally, this system is discretized using a Newmark-𝛽 [74] formulation with 𝛾 = 0.5 and
(𝑡)(cid:3). We
𝛽 = 0.25. In what follows, ¯𝐸 = [𝑒1(𝑡), 𝑒2(𝑡), · · · , 𝑒𝑁𝑒 (𝑡)], ¯𝐵 = (cid:2)𝑏1(𝑡), 𝑏2(𝑡), · · · , 𝑏𝑁 𝑓
denote evaluations of quantities at 𝑡 𝑛 = 𝑛Δ𝑡 where Δ𝑡 is the time step size, via a superscript
𝑛 such as ¯𝐸𝑛 = ¯𝐸(𝑡 𝑛). Complete details on the mixed FEM discretization can be found in

[82, 83, 84, 85, 86] and the references therein.

Quasi Helmholtz

Using the formulation described in Sec. 2.1 with an implicit time-marching scheme

is well known to not satisfy Gauss’ Laws due to null spaces for both Maxwell solver or

the wave-equation formulation [95, 96], both with implicit unconditionally stable time

stepping. The most comprehensive method to mitigate this issue is to use a so-called

Quasi-Helmholtz decomposition to partition the field components into solenoidal and

non-solenoidal components. The non-solenoidal components can then be explicitly solved

for, thereby ensuring exact satisfaction of the Coulomb gauge. What follows is a brief

overview of the quasi-Helmholtz setup described in great detail in [77].

In what follows, variables with the subscript ‘𝑛𝑠’ will refer to non-solenoidal quantities

and ‘𝑠’ will likewise denote solenoidal quantities.

Relevant Matrix Definitions

For ease of reading, the relevent matrix definitions from Chapter 1 are repeated here.

Let the sets 𝒩, ℰ, ℱ and 𝒯 be defined as the set of nodes, edges, faces and tets within

the mesh of the simulation domain respectively having 𝑁𝑛, 𝑁 𝑓 , 𝑁𝑒 and 𝑁𝑡 elements. The

19

various submatrices that will be used in describing the Quasi-Helmholtz framework are as

follows:

[★𝜖]𝑖,𝑗 = ⟨W

(1)
𝑖

(r), 𝜀 · W

(1)
𝑗

(r)⟩; 𝑖, 𝑗 ∈ ℰ

[★𝜇−1]𝑖,𝑗 = ⟨W

(2)
𝑖

(r), 𝜇−1 · W

(2)
𝑗

(r)⟩; 𝑖, 𝑗 ∈ ℱ

[★𝜌]𝑖,𝑗 = ⟨𝑊 (3)

𝑖

(r), 𝑊 (3)

𝑗

(r)⟩; 𝑖, 𝑗 ∈ 𝒯

(2.5a)

(2.5b)

(2.5c)

where W

(1)
𝑖

, W

(2)
𝑖

and 𝑊 (3)

𝑖

are the Whitney edge, face and volume basis functions respec-

tively. Further, we define the following matrices:

[M𝑔]𝑖,𝑗 = ⟨W

(1)
𝑖

(r), ∇𝑊 (0)

𝑗

(r)⟩; 𝑖 ∈ ℰ, 𝑗 ∈ 𝒩

[M𝑐]𝑖,𝑗 = ⟨W

(2)
𝑖

(r), [∇×]W

(1)
𝑗

(r)⟩; 𝑖 ∈ ℱ , 𝑗 ∈ ℰ

[M𝑑]𝑖,𝑗 = ⟨𝑊 (3)

𝑖

(r), [∇·]W

(2)
𝑗

(r)⟩; 𝑖 ∈ 𝒯 , 𝑗 ∈ ℱ

[∇] = 𝜀[★𝜖]−1[M𝑔]

[∇×] = 𝜇−1[★𝜇−1]−1[M𝑐]

[∇·] = [★𝜌]−1[M𝑑]

(2.6a)

(2.6b)

(2.6c)

(2.6d)

(2.6e)

(2.6f)

Projector Definitions

To separate the non-solenoidal components from the basis forms used for representing

the electric field, we define projectors [ ¯𝑃]Σ

𝑒 and [ ¯𝑃]Λ

𝑒 to break up the electric field as follows

(where † represents a Moore-Penrose pseudoinverse):

[ ¯𝑃]Σ

𝑒 = Σ(Σ𝑇Σ)†Σ𝑇

[ ¯𝑃]Λ

𝑒 = ℐ − [ ¯𝑃]Σ

𝑒

(cid:2) ¯𝑃(cid:3) Λ

𝑏 = ℐ − Σ𝑚

(cid:16)

Σ𝑇

𝑚Σ𝑚

(cid:17) †

Σ𝑇
𝑚

(2.7a)

(2.7b)

(2.7c)

where [Σ] = 𝜖0[M𝑔] and [Σ]𝑚 = [∇·]𝑇. Using these projectors, we can now define a

decomposition for the electric flux density as

¯𝐷(𝑡) = Σ ¯𝐸𝑛𝑠(𝑡) + (cid:2) ¯𝑃(cid:3) Λ
𝑒

¯𝐷(𝑡)

(2.8)

20

and the magnetic flux density as

¯𝐵𝑠(𝑡) = (cid:2) ¯𝑃(cid:3) Λ
𝑏

¯𝐵(𝑡)

(2.9)

Put simply, these projectors allow us to separate out ¯𝐵(𝑡) and ¯𝐷(𝑡) into terms that have

exactly zero divergence (i.e, the ’solenoidal components’ ¯𝐵𝑠(𝑡) and (cid:2) ¯𝑃(cid:3) Λ
𝑒
with non-zero divergence (Σ ¯𝐸𝑛𝑠(𝑡)). Since the application of a discrete divergence operator

¯𝐷(𝑡)) and terms

on either projector is 0 (as demonstrated in [77]), one can see that the formulation forces

the divergence of ¯𝐵𝑠(𝑡) to zero.

Discrete System

As stated earlier, our ultimate goal is to solve all of Maxwell’s equations. First given the

decomposition, the application of the discrete divergence on (3.42) and (3.41) results in an

identically zero matrix and

[𝐶 𝑒

𝑧]𝑇 [∇]𝑇 [★𝑒] [∇] [𝐶 𝑒

𝑧] ¯𝐸𝑛𝑠(𝑡) = − [𝐶 𝑒

𝑧]𝑇 [∇]𝑇 ¯𝐺(𝑡).

(2.10)

Upon rewriting Maxwell’s equations to make use of the decomposition in (3.41) and (3.42);

and with ¯𝐸𝑛𝑠(𝑡) in (2.10) explicitly solved for, we obtain

(cid:2) ¯𝑍(cid:3)

(cid:2) ¯𝑍(cid:3)

11

21

𝜕𝑡

¯𝐵𝑠(𝑡)+ (cid:2) ¯𝑍(cid:3)

𝜕𝑡

¯𝐸𝑠(𝑡)− (cid:2) ¯𝑍(cid:3)

12

22

¯𝐸𝑠(𝑡) = − (cid:2) ¯𝑍(cid:3)

¯𝐸𝑛𝑠(𝑡)

13

¯𝐵𝑠(𝑡) = −𝜕𝑡

¯𝐺(𝑡) − (cid:2) ¯𝑍(cid:3)

𝜕𝑡

¯𝐸𝑛𝑠(𝑡)

23

(2.11)

where ¯𝐸𝑠 and ¯𝐵𝑠 are extracted from the simply connected mesh using tree-cotree maps
(cid:3). Constructing this mapping is trivial for simply connected structures, but is

[𝐶 𝑒

𝑐 ] and (cid:2)𝐶𝑏

𝑐

trickier for multiply connected geometries. The various submatrices involved in (3.44) are

defined as:

[Z]11 = [C

𝑏
𝑐 ]𝑇[P]Λ

𝑏 [C

𝑏
𝑐 ]

[Z]12 = [C

𝑐 ]𝑇[∇×][★𝜀]−1[P]Λ
𝑏

𝑒 [★𝜀][C

𝑒
𝑐]

[Z]13 = [C

𝑐 ]𝑇[∇×][★𝜀]−1Σ[C
𝑏

𝑒
𝑧]

21

(2.12a)

(2.12b)

(2.12c)

[Z]21 = [C

𝑐]𝑇[P]Λ
𝑒

𝑒 [★𝜀][C

𝑒
𝑐]

[Z]22 = [C

𝑐]𝑇[∇×]𝑇[★𝜇−1][P]Λ
𝑒

𝑏 [C

𝑏
𝑐 ]

[Z]23 = [C

𝑒
𝑐]𝑇Σ[C

𝑒
𝑧]

(2.12d)

(2.12e)

(2.12f)

As alluded to earlier, the solution to (3.44) is obtained using a Newmark-𝛽 time stepping

scheme. The solution to ¯𝐸𝑠(𝑡) and ¯𝐵𝑠(𝑡) still has a 𝐷𝐶 null space, but their divergence is

exactly zero. As a result, Gauss’ laws are still exactly satisfied.

Particle Evolution

A particle evolution scheme is primarily involved in obtaining solutions in space

and time of the position and velocity of a particle in response to a prescribed set of

electromagnetic fields. Specifically, the particle motion can be obtained by

𝜕𝛾[v𝑝]v𝑝(𝑡)
𝜕𝑡

= a𝑝(𝑡) =

𝑞
𝑚

(cid:0)E(r𝑝 , 𝑡) + v𝑝 × B(r𝑝 , 𝑡)(cid:1)

𝜕r𝑝(𝑡)
𝜕𝑡

= v𝑝(𝑡)

(2.13a)

(2.13b)

where v𝑝(𝑡) and r𝑝(𝑡) are each vectors in R3 and refer to a given particle’s velocity and
position respectively at time 𝑡, and 𝛾[v] = (1 − |v|2 /𝑐2) is the relativistic time dialation

functional. In what follows, we analyze different approaches to solving these equations as

well as their stability properties.

Boris Push

The Boris method is fundamentally a modified leap-frog solver for (2.13a) and (2.13b)

that breaks the update for the relativistic velocity u𝑝 into three fundamental steps. Defining
𝑝 and u𝑛+1
u𝑛

as particle 3-velocities at timestep 𝑛,

𝑝

1. First, half of the electric force is added to a dummy variable u−
𝑝 .

−
𝑝 = u

𝑛
𝑝 + 𝜖Δ𝑡

u

22

(2.14)

where 𝜖 = (𝑞𝑝/2𝑚𝑝)E(r

𝑛+1/2
𝑝

, 𝑡 𝑛+1/2) and Δ𝑡 is the timestep size.

2. Next, the magnetic contribution is added,

+
𝑝 = u
u

−
𝑝 + Δ𝑡

(cid:16)

𝑞𝑝
𝑚𝑝

𝑛+1/2
𝑝

v

× B(r

𝑛+1/2
𝑝

, 𝑡 𝑛+1/2)

(cid:17)

3. Finally, the other half of the electric force is added to obtain u𝑛+1

𝑝

.

𝑛+1
𝑝 = u

+
𝑝 + 𝜖Δ𝑡

u

(2.15)

(2.16)

While different ways of evaluating (2.15) have been proposed over the years, with some

introducing phase errors [90], they all share the fundamental property that they execute

a perfect rotation of u−

𝑝 and as a result, an applied magnetic force will never do work

on a particle. Hence, the method as a whole is energy conserving under the action

of an arbitrary magnetic field. While many different methods exist to prove energy

conservation, we considered the 𝑧-transform of the single step update for the implicit

leap-frog representation with the forcing terms evaluated at half steps. Proving that this

stencil conserves energy is sufficient to show that (2.14)-(2.16) broken up into substeps will

conserve energy. We can write the velocity update stencil for a particle moving due to a

general magnetic field as follows:

(I − Ω) v

𝑛+1
𝑝 = (I + Ω) v

𝑛
𝑝 .

(2.17)

Here Ω can be written as

Ω =

𝜔𝑧 −𝜔𝑦

0

−𝜔𝑧

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)
, 𝑡 𝑛+1/2)

𝜔𝑦 −𝜔𝑥

𝜔𝑥

0

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

where 𝜔𝑘 = Δ𝑡(𝑞/𝑚𝛾𝑛+1/2
matrix in R3×3. Defining M = (I + Ω)−1 (I − Ω), the 𝑧-transform of (2.17) becomes

for 𝑘 ∈ 𝑥, 𝑦, 𝑧 and I is the regular identity

𝑛+1/2
𝑝

)B𝑘(r

𝑝

0

,

(2.18)

M𝑧 = 𝜆𝑧 = 1

23

(2.19)

where 𝜆 refers to the eigenvalues of M. Under the spectral theorem, for a system to be

stable, |𝑧| <= 1 for any combination of eigenvalues chosen. However, we can define a

stronger condition |𝑧| = 1 wherein the update stencil strictly defines a perfect rotation on

the complex plane under the action of a magnetic field. This approach has been used in the

past to appraise numerical methods for energy conservation; see [74, 83] and the references

therein. As a result,

which implies that the particle’s kinetic energy is

exactly conserved regardless of the form of the magnetic field. For the Boris update, the

(cid:12)
(cid:12)
(cid:12)
(cid:12)

(cid:16)

v𝑛+1
𝑝

(cid:17) 2(cid:12)
(cid:12)
(cid:12)
(cid:12)

(cid:16)

v𝑛
𝑝

=

(cid:12)
(cid:12)
(cid:12)
(cid:12)

(cid:17) 2(cid:12)
(cid:12)
(cid:12)
(cid:12)

(2.20)

eigenvalues 𝜆 are

𝜆 =

1

√

1−(𝜔2

𝑥+𝜔2

1−(𝜔2

𝑥+𝜔2

𝑥+𝜔2
1+𝜔2
𝑥+𝜔2
1+𝜔2

𝑥)−2𝑗
𝑥+𝜔2
𝑥)+2𝑗
𝑥+𝜔2

𝜔2
𝑦+𝜔2
√
𝑧
𝜔2
𝑦+𝜔2
𝑧

𝑥+𝜔2

𝑥+𝜔2
𝑥

𝑥+𝜔2

𝑥+𝜔2
𝑥











It can be shown through simple algebraic manipulation that that each of the three

eigenvalues in (2.20) has exactly unit magnitude for any 𝜔𝑥 , 𝜔𝑦 , 𝜔𝑧 ∈ R. Therefore, when

combined with (2.19), we see that the roots of the 𝑧-transform has to lie exactly on the unit

circle, thereby conserving energy.

Polynomial Predictor Corrector

Next, we describe the Adams predictor-corrector update. These methods are generally

derived by representing the forcing terms a𝑝(𝑡) = 𝑞𝑝/𝑚𝑝 (cid:0)E(r𝑝 , 𝑡) + v𝑝(𝑡) × B(r𝑝 , 𝑡)(cid:1) and

v𝑝(𝑡) in terms of polynomial functions for a prescribed order and testing them with a pulse

function defined over the length of one timestep. Specifically, the explicit predictor step

of the process (typically referred to as an Adams-Bashforth update), involves writing the

acceleration a𝑝 and relativistic velocity u𝑝 as

a𝑝(𝑡) =

v𝑝(𝑡) =

𝑁𝑡(cid:213)

𝑖=1

𝑁𝑡(cid:213)

𝑖=1

𝑝𝐿𝑘(𝑡 − 𝑡 𝑖)
𝑖

a

𝑝𝐿𝑘(𝑡 − 𝑡 𝑖)
𝑖

v

24

(2.21a)

(2.21b)

where 𝐿𝑘(𝑡 − 𝑡𝑖) refers to a 𝑘th order Lagrange polynomial involving the sample at 𝑡𝑖 and

the 𝑘 − 1 samples before it in time. Next, a stencil is obtained from the differential equation

by taking an inner product with a pulse function

𝑃𝑛(𝑡) =

1

if 𝑡 ∈ (cid:2)𝑡 𝑛 , 𝑡 𝑛+1(cid:3)

,

0 otherwise





(2.22)

giving us the following update stencil upon evaluation of the requisite integrals for 𝑘 = 4:

𝑛+1
𝑝 = u

𝑛
𝑝 +

u

Δ𝑡
24

(55a

𝑛
𝑝 − 59a

𝑛−1
𝑝 + 37a

𝑛−2
𝑝 − 9a

𝑛−3
𝑝

),

𝑛+1
𝑝 = r

𝑛
𝑝 +

r

Δ𝑡
24

(55v

𝑛
𝑝 − 59v

𝑛−1
𝑝 + 37v

𝑛−2
𝑝 − 9v

𝑛−3
𝑝

).

(2.23a)

(2.23b)

One can similarly device an implicit scheme by modifying (2.21) such that the Lagrange

polynomials center on 𝑛 + 1 and go back 𝑘 − 1 steps. Upon using the same testing function,

we obtain a so-called Adams-Moulton update, which for 𝑘 = 4 looks as follows:

𝑛+1
𝑝 = u

𝑛
𝑝 +

u

Δ𝑡
24

(9a

𝑛+1
𝑝 + 19a

𝑛
𝑝 − 5a

𝑛−1
𝑝 + a

𝑛−2
𝑝

)

𝑛+1
𝑝 = r

𝑛
𝑝 +

r

Δ𝑡
24

(9v

𝑛+1
𝑝 + 19v

𝑛
𝑝 − 9v

𝑛−1
𝑝 + v

𝑛−2
𝑝

).

(2.24a)

(2.24b)

To construct a predictor-corrector scheme using these stencils, at each timestep, a guess

velocity ˜v𝑝 and position ˜r𝑝 are computed following (2.23). These are then inserted into
(2.24) in place of v𝑛+1

. Going forward, we shall refer to this predictor corrector

and r𝑛+1

𝑝

𝑝

setup with fourth order polynomials as ’Adams4’, and likewise refer to a similar method

using third order polynomials as ’Adams3’.

Analysis of Energy Conservation

We apply the same means of analysing energy conservation here as we did for the Boris

update. Consider a simple system wherein the magnetic field is constant and only pointing

along the ˆ𝑧 direction. To give the method the best possible chance of conserving energy, let

us assume that u𝑛

𝑝 , u𝑛−1
𝑝

and u𝑛−2

𝑝

have identical magnitude. Therefore, 𝛾𝑛

𝑝 = 𝛾𝑛−1

𝑝 = 𝛾𝑛−2

𝑝

.

25

If under these very favorable conditions, the method fails to conserve energy, then it is

sufficient to assume that it will fail to do so under a general field. Under our prescriptions,

the corrector step of the Adams4 velocity update can be written as follows

(I + 9Ω) u

𝑛+1
𝑝 − (I − 19Ω) u

𝑛
𝑝 + 5Ωu

𝑛−1
𝑝 − Ωu

𝑛−2
𝑝 = 0

(2.25)

where Ω has the same form as (2.18), but with 𝜔𝑥 = 𝜔𝑦 = 0 and 𝜔𝑧 = (Δ𝑡 𝑞𝑝/24𝑚𝑝)𝛾𝑛

𝑝 B(r𝑛

𝑝 , 𝑡 𝑛).

Replacing each matrix product with its eigenvalues and taking a 𝑧-transform gives us

(cid:16)

𝜆1𝑧3 − 𝜆2𝑧2 + 5𝜆3𝑧 + 𝜆4

(cid:17)

u𝑝(𝑧) = 0

For our chosen system, the eigenvalues can take the following values:

𝜆1 = {1, 1 − 𝑗9𝜔𝑧 , 1 + 𝑗9𝜔𝑧 }

𝜆2 = {1, 1 + 𝑗19𝜔𝑧 , 1 − 𝑗19𝜔𝑧 }

𝜆3 = 𝜆4 = {0, 𝑗𝜔𝑧 , −𝑗𝜔𝑧 }

(2.26)

(2.27a)

(2.27b)

(2.27c)

By inspection, it is fairly clear that only one combination of eigenvalue choices yields a root

for 𝑧 that is on the unit circle, i.e when 𝜆1 = 𝜆2 = 1 and 𝜆3 = 𝜆4 = 0, giving us,

𝑧3 − 𝑧2 = 𝑧2(𝑧 − 1) = 0

(2.28)

which admits a root of 𝑧 = 1. Every other combination of eigenvalues leads to roots with

magnitudes larger or smaller than 1 in the general case. While one might be able to find

specific values of 𝜔𝑧 for which multiple combinations of eigenvalues yield roots on the

unit sphere, this would make the method as a whole extremely sensitive to step-size and

generally impractical as an energy-conserving method to advance particles.

Exponential Predictor-Corrector

So far, we have considered update schemes that use polynomials to interpolate the

particle parameter samples. However, it is possible to use exponential functions for this

26

purpose. One method to construct a 𝑃𝐸(𝐶𝐸)𝑚 method for ODEs was proposed in [94],

wherein the update for the relativistic velocity and position expressions can be written as

𝑛+1
𝑝 =

u

𝑘
(cid:213)

(cid:104)

𝑖=1

𝑝𝑖v

𝑛+1−𝑖
𝑝

+ 𝑝𝑘+𝑖a

𝑛+1−𝑖
𝑝

(cid:105)

𝑛+1
𝑝 =

r

𝑘
(cid:213)

(cid:104)

𝑖=1

𝑝𝑖r

𝑛+1−𝑖
𝑝

+ 𝑝𝑘+𝑖v

𝑛+1−𝑖
𝑝

(cid:105)

(2.29a)

(2.29b)

for an appropriate set of coefficients 𝑝𝑖, the specifics of constructing which can be found in

[94]. This choice of coefficients encodes an exponential interpolation for the unknowns

v𝑝(𝑡) and r𝑝(𝑡) in the interval [𝑡 𝑛 , 𝑡 𝑛+𝑘], which can be written as

𝑛+𝑘
(cid:213)

𝑖=𝑛

𝑛+𝑘
(cid:213)

u𝑝(𝑡) =

r𝑝(𝑡) =

𝛼𝑢

𝑖 u

𝑝 𝑒𝜆𝑖 𝑡
𝑖

𝛼𝑟
𝑖 r

𝑝 𝑒𝜆𝑖 𝑡
𝑖

(2.30a)

(2.30b)

𝑖=𝑛
where 𝛼𝑘 , 𝜆𝑘 ∈ C refer to appropriately chosen coefficients in the complex plane for the

parameters in the setup. Given the number of prior timesteps involved at any given update,

it is not possible to prove stability/energy conservation for an arbitrary applied field.

However, unlike the polynomial predictor corrector method, we show through numerical

examples that the method does conserve energy for both single particle systems and when

incorporated into a full EM-FEMPIC simulation.

Computational Complexity

Since the coefficients for the exponential PC method are not known analytically, there is

computational overhead in computing them. However, solving for 𝑝𝑖 only needs to be done

once after which it can be tabulated and used to solve different ODEs discretized using the

same number of basis sets. The costs, therefore, of using this scheme within a PIC method

is identical to any 𝑚-step predictor corrector method. Suppose one uses an iterative solver

for the fields, the complexity of evolving Maxwell’s equations at any given timestep can

27

be written as 𝒪(𝑁iter𝑁dof) where 𝑁iter denotes the number of iterations the solver takes to

converge and 𝑁dof the total number of unknowns in the system. In practice, this cost is

significantly higher than the particle evolution. Furthermore, the unconditionally stable

nature of the field solve allows for taking much larger timesteps than would otherwise be

possible using an explicit scheme. As a result, even though the Boris and Adams particle

update require fewer multiplications than the PC1 scheme used in the paper, the added

complexity is negligible in relation to the cost associated with solving for the fields.

2.2 Results

The numerical examples in this section are broken up into three parts. (1) First, we

consider systems forced by uniform electric and magnetic fields; (2) We then consider

systems with nonuniform fields; (3) Finally, the entire EM-FEMPIC scheme is validated

through an expanding particle beam simulation.

Uniform fields

In this section, we will consider the motion of particles in response to a combination

of electric and magnetic fields that are assumed to be spatially and temporally constant.

Further, for the sake of simplicity, we consider a coordinate system where 𝑞𝑝 = 𝑚𝑝 = 𝑐 = 1.

Since all of our experiments have analytical solutions, the initial conditions of both the

multi step polynomial and exponential methods are appropriately populated. Finally, in

each numerical example, the exponential PC method used is the ’PC1’ stencil described in

[94]. Specifically, we effected the SVD required to obtain the coefficients to a tolerance 10−12

(this serves as a lower bound for the error) as well as chose a semidisk radius 𝜌 = 3.15 and

constructed a method that uses 22 steps of prior timestep information interpolated with 18

exponential basis sets.

28

Figure 2.1: Comparison of particle trajectory between exponential PC and an analytical
solution.

Figure 2.2: Convergence of error (defined as the relative norm between computed and
analytical particle trajectories) for the linear acceleration example. Note that the curve for
the exponential PC is truncated due to hitting the floor of the tolerance of the method.

Quantity Natural

𝑞𝑝
𝑚𝑝
𝑐
E𝑝(r, 𝑡)
𝑣0

1
1
1
1
√
1/

2

MKS
1.6 × 10−19 C
9.1 × 10−31 kg
3 × 108 m/s
5.6 × 10−12 V/m

√

𝑐/

2 m/s

Table 2.1: Conversion between natural and MKS units for the direct E acceleration example

29

Quantity Natural

𝑞𝑝
𝑚𝑝
𝑐
B𝑝(r, 𝑡)
𝑟𝑐
𝑣0

1
1
1
1
1
√
1/

2

MKS
1.6 × 10−19 C
9.1 × 10−31 kg
3 × 108 m/s
1.2 × 10−3 T
1 m
√
2 m/s

𝑐/

Table 2.2: Conversion between natural and MKS units for the cyclotron motion example

Linear Acceleration E(r, 𝑡) = (1, 0, 0), B(r, 𝑡) = ¯0

To begin, we consider the simple example of a particle being accelerated by an electric

field of peak amplitude 1 oriented along the ˆ𝑥 axis. A mapping from natural to MKS

units are provided in 2.1. The initial relativistic momentum of the particle was set such

that p0 = 𝛾0

𝑝v0

𝑝 = (1, 0, 0). Fig. 2.1. shows a plot of the trajectories that the particles were

predicted to follow by the exponential method, laid over the analytical result. All of the

methods closely agreed with the predicted curve, though the error in the position was

significantly lower in the exponential method. This is demonstrated in Fig. 2.2, where we

see Boris and the Adams predictor corrector methods roughly follow the trends expected

from the polynomial order. The coarsest and finest timestep size used are Δ𝑡 = 0.1 and

Δ𝑡 = 0.005 respectively. The exponential method, on the other hand, performs much

better, showing better than 8th order convergence before flattening out due to hitting

the tolerance of the method. The analytical expression used is as follows, with the full

derivation available at [97]:

(cid:115)

(cid:18)

1 +

1 +

(cid:19) 2

𝑞|𝐸|𝑡
𝑚

r(𝑡) =

𝑚
𝑞|𝐸|

(cid:169)
(cid:173)
(cid:171)

.

− 𝛾(0)(cid:170)
(cid:174)
(cid:172)

(2.31)

The factor |𝐸| refers to the magnitude of the electric field used, and 𝛾(0) refers to the initial

relativistic factor.

30

Figure 2.3: Comparison of particle energy over multiple cyclotron cycles. We note that the
polynomial PC schemes spuriously gain energy for the timestep size chosen.

Figure 2.4: Comparison of particle trajectories obtained from the exponential method
compared to Boris and the analytical solution. Note how the improved accuracy of the
exponential method causes the trajectory to lie exactly on the analytical curve, whereas the
Boris solution is slightly shifted for the same timestep size.

31

Figure 2.5: Convergence of position error (defined as the relative norm between computed
and analytical particle trajectories)

over a single cycle for different evolution schemes. Once again, we note that the
polynomial methods show approximately the expected convergence for the order of the
polynomial used, while the exponential method converges much faster (11th order) before
it hits the set tolerance.

Cyclotron motion E(r, 𝑡) = ¯0, B(r, 𝑡) = (0, 0, 1)

Next, we considered the dynamics of a particle moving under a constant ˆ𝑧-directed

magnetic field. As before, the initial momentum of the particle was set to p0 = 𝛾0

𝑝v0

𝑝 =

(1, 0, 0). A mapping from natural to MKS units are provided in 2.2. The canonical solution

for this problem predicts a shortening of the non-relativistic cyclotron frequency and an

increase in the radius of the loop by exactly a factor of 𝛾. The cyclotron test stresses a

number of different aspects of the Newton solver. First, since the particle is moving only

under the influence of a magnetic field, this is the perfect setup to validate our predictions

about energy conservation. We note from Fig. 2.3 that while Boris and the exponential

method conserve the relativistic energy 𝑚𝑝 𝛾𝑝 𝑐 to machine precision, the polynomial

methods exhibit spurious heating. Second, since the location of the gyrocenter is extremely

sensitive to early time shifts, one can look at how well the different methods predict its

location. As is evident from Fig. 2.4, despite conserving energy exactly, the Boris method

has a slight shift of the orbit that is not seen in the more accurate exponential method.

We can also look at error convergence in the position over a single loop, as shown in Fig.

32

Figure 2.6: Convergence of error (defined as the relative norm between computed and
analytical particle trajectories)
for the |E| = |B| setup. The exponential method still outperforms Boris and the polynomial
PC methods, but by a smaller order.

2.5. The coarsest and finest stepsizes in the figure correspond to Δ𝑡 = 0.5 and Δ𝑡 = 0.05

respectively. Here, we note that the Boris push and polynomial predictor corrector schemes

converge at a slope approximately corresponding to the order of representation. The

exponential method on the other hand, shows much better convergence. To stress the

system further, the orbits were made more relativistic by setting the initial particle velocity

such that 𝛾 = 2. Looking once again at Fig. 2.5, we see that the convergence is even better

with the method hitting its tolerance with as little as 10 steps per cycle. As in the previous

case, the analytical solution is derived in [97], and looks as follows:

r(𝑡) =

𝑣0𝛾0
|𝐵|

(cid:20)

sin

(cid:19)

(cid:18) 𝛾0𝑞|𝐵|𝑡
𝑚

(cid:18)

ˆ𝑥 +

−1 + cos

(cid:18) 𝛾0𝑞|𝐵|𝑡
𝑚

(cid:19) (cid:19)

(cid:21)

ˆ𝑦

.

(2.32)

The term |𝐵| refers to the magnitude of the ˆ𝑧 directed applied magnetic field.

|E| = |B|

Finally, we considered the more complex case of relativistic motion forced both by

mutually perpendicular electric and magnetic fields of the same magnitude. As before,

a mapping from natural to MKS units are provided in 2.3. In this instance, we chose

E(r, 𝑡) = (0, 1, 0) and B(r, 𝑡) = (0, 0, 1), with the particle starting at rest at the origin. While

33

Quantity Natural

𝑞𝑝
𝑚𝑝
𝑐
B𝑝(r, 𝑡)
E𝑝(r, 𝑡)
𝑣0

1
1
1
1
1
√
1/

2

MKS
1.6 × 10−19 C
9.1 × 10−31 kg
3 × 108 m/s
1.2 × 10−3 T
5.6 × 10−12 V/m

√

𝑐/

2 m/s

Table 2.3: Conversion between natural and MKS units for the |E| = |B| case. The conversion
ensures that the electric and magnetic forces acting on the positron at 𝑡 = 0 are equal.

the dynamics of the particle are more complicated as one reaches relativistic speeds, it

is possible to derive an analytical solution, as done in [97]. The final expression for our

choice of parameters simplifies to

where

r(𝑡) =

𝑈 3
6

ˆ𝑥 +

𝑈 2
2

ˆ𝑦,

(cid:18) (cid:16)√

9𝑡2 + 8 + 3𝑡

(cid:17) 2/3

(cid:19)

− 2

𝑈 =

(cid:16)√

9𝑡2 + 8 + 3𝑡

(cid:17) 1/3

.

(2.33)

(2.34)

Upon running this setup with each of our solvers, we once again observe from Fig. 2.6 that

the exponential method significantly outperforms the polynomial based methods, though

by a smaller polynomial factor than the other examples presented thus far. The coarsest

and finest stepsizes in Fig. 6 correspond to Δ𝑡 = 1.0 and Δ𝑡 = 0.02 respectively.

Space Varying Fields

Next, we tested the long-time stability of the exponential method by considering a system
wherein the scalar potential and magnetic fields varied as follows, 𝜙(r, 𝑡) = 0.01(cid:112)𝑥2 + 𝑦2,
B(r, 𝑡) = ˆ𝑧(cid:112)𝑥2 + 𝑦2. Once again employing units of 𝑚𝑝 = 𝑞𝑝 = 𝑐 = 1, the particle was
𝑝 = (0.1, 0, 0). Further, we chose a step size Δ𝑡 = 1.5 × 10−5
initialized at r0

𝑝 = (0.9, 0, 0) and u0

and ran the system for 50000 timesteps. We note from Fig. 2.7 that exponential PC method

accurately tracks both the fast gyrations and the slow rotations without losing or gaining

energy.

34

Figure 2.7: Plot of particle trajectory for the first and 80th orbits, showing good preservation
of phase space volume

Figure 2.8: Plot of particle trajectories for the magnetic bottle predicted by (left) Boris,
(center) Exp PC and (right) 4th order Adams overlaid on a lineplot of the magnetic field
along the 𝑦-𝑧 plane. We note that spurious energy gain in the Adams method causes the
trapped particle to escape despite being initialized with a velocity outside the loss cone.

Particle trapped in a Magnetic Bottle

To stress energy conservation further, we considered a setup where an electron was

initialized within a magnetic bottle defined using two magnetic dipole oriented along the

𝑧-direction, with field strength

B(r) =

𝜇0
4𝜋

(cid:18) 3r(m · r)
|r|5

− m
|r|3

(cid:19)

.

(2.35)

The two dipoles were separated by a distance of 20 m and the magnetic moment was set to

𝑚 = 108 ˆ𝑧. The mass, charge and speed of light were represented in MKS units. The particle

initially begins at rest at position (0, 5, 0) and is accelerated for 10 ns by a 𝑦-directed electric

field before it is switched off and moves only in the magnetic field from that point. The

35

Figure 2.9: Snapshot of the particle spread at 𝑡 = 10 ns. We note that the particle spead
generally follows the expansion expected from the physics, and further agrees well with
equivalent results from XOOPIC

Figure 2.10: Satisfaction of Gauss’ Law over the run for the expanding particle beam. The
figure shown is the relative norm between the ∇ · D(r, 𝑡) and ∇ · G(r, 𝑡).

field lines of the dipoles, as well as the trajectory predicted by Boris, exponential PC and

the 4th order Adams predictor corrector method are shown in Fig. 2.8. We note that the

spurious energy gain of the Adams method permits the particle to escape the bottle, while

the other methods predict trajectories that keep the particle contained.

Expanding Particle Beam

To test the validity of the exponential predictor-corrector scheme on a full EM-FEMPIC

scheme, we analysed the physics of an expanding beam contained within a conducting

36

Figure 2.11: Time history of particle and field energy for the expanding beam. Due to the
high initial particle momentum, the energy is largely dominated by the kinetic energy
of the particles. However, we note from the data that both field and particle energy are
conserved across the run.

Figure 2.12: Snapshot of the particle spread at 𝑡 = 10 ns predicted by the proposed method
compared to results from an EM-FEMPIC formulation run using a leapfrog integrator.
Again, we note very good agreement between the two methods.

37

cylindrical cavity of length 10 cm and radius 2 cm oriented such that the axis of rotation

aligned along ˆ𝑧. The walls of the cavity were assumed to be perfectly conducting and a

particle beam composed of electrons was initialized at 𝑧 = 0. The beam was characterized

by a driving voltage of 500 𝑘𝑉, a driving current of 1 𝐴 and an emitter radius of 0.8 cm.

This yielded an initial setup wherein the electrons were accelerated at roughly 𝛾 = 2 or

𝑝 ∼ 2.6 × 108 m/s. The exponential predictor-corrector was used with a timestep size equal
v0

to Δ𝑡 = 15 ps – while the comparison runs with XOOPIC[98] and an explicit EM-FEMPIC

code were run at Δ𝑡 = 1 ps. The tetrahedral mesh used to discretize the system had an

average cell size of 8.2 mm and the maximum number of particles within the system at any

one time was approximately 8000 particles, resulting in approximately 3.5 particles per cell.

At each step, one predictor step was taken followed by as many corrector steps required as

to reach the desired tolerance of 10−9. In this specific run, achieving this threshold rarely

required more than one corrector step. The particle beam also had a turn-on time of 1 ns

so as to have a smooth increase in field strength. A snapshot of of the particle beam at 10

ns – compared with an equivalent simulation set-up on XOOPIC[98] – is shown in Fig. 2.9

and the satisfaction of Gauss’ Law over the course of the entire run is shown in Fig. 2.10.

As is evident, the proposed method both agrees with the expected physics and satisfies

Gauss’ Laws to machine precision. Likewise, we note from Fig. 2.11 that particle and field

energies are conserved over the course of the simulation. Finally, we note from Fig. 2.12

that the proposed method compares well to an explicit leapfrog based EM-FEMPIC solver,

albeit at a stepsize that was 15 times larger.

Particle Beam in a Klystron

To stress the proposed solver in a system with excited cavity modes, we analysed the

behavior of a simple cylindrical klystron, with dimensions as in Fig. 2.13. Once again,

the system was discretized by a tetrahedral mesh with average edge length of 2.93 mm.

The solver was run with a step size of 10 ps and compared to XOOPIC running at 1 ps.

38

Figure 2.13: Snapshot of the position distribution at 𝑡 = 4 ns. The proposed method agrees
well with XOOPIC and predicts the expected particle bunching behavior around the same
region as XOOPIC (at approximately 𝑧 = 2.5 cm).

Figure 2.14: Time history of particle and field energy for the klystron. Unlike in the
previous run, most of the energy is in the fields – we note here that once again average
energy is conserved across time for both fields and particles.

A sinusoidal current source was defined on a cylindrical surface existing at the middle

of the neck of the klystron, with an amplitude of 100 A and frequency of 4.1 GHz. These

dimensions correspond to an average of around 2 particles per cell, though there are

regions in the domain with higher density as shown by the bunching observed in Fig. 2.13.

Once again, we note good agreement with XOOPIC, and observe from Fig. 2.14 that both

field and particle energies are conserved on average over multiple cycles of the run.

39

2.3 Summary

In this work, we have proposed and validated a fully relativistic charge conserving

EM-FEMPIC scheme that makes use of a novel exponential predictor corrector framework.

We have numerically benchmarked our proposed framework over both single particle

full PIC runs, demonstrating superior accuracy in the former and good agreement with

existing validated methods in the latter. The improvement of scalability of this scheme to

make it more applicable to complex devices that are of interest in high power design is

currently being analysed and will be reported on in the near future.

40

AN ENVELOPE TRACKING APPROACH FOR PARTICLE IN CELL SIMULATIONS

CHAPTER 3

The advances in EM-FEMPIC outlined in the introductory chapter have made it possible

to (1) simulate plasma phenomena while capturing the underlying geometry to very good

precision and (2) analyse the system in time at the frequency of interest for capturing the

desired physics as opposed to the satisfaction of mesh-based stability criteria. Despite the

massive progress that has been accomplished in recent years, there are still classes of devices

for which the current state-of-the-art is inefficient to simulate for large scale problems. One

such example is the analysis of devices that have a narrowband high-frequency response.

Similarly, devices with features smaller than the electromagnetic wavelength that are

nonetheless important for the electrostatic physics are computationally untenable to solve

with contemporary techniques. Despite having a relatively small window of significant

frequency content, to simulate such a device, one would need to discretize the simulation

in time at a rate determined by the maximal frequency of interest.

Efficient ways to simulate these devices exist when one considers a purely electromag-

netic analysis (i.e. without particles). Termed envelope tracking, these methods solve an

altered form of Maxwell’s equations wherein the high frequency offset present in the fields

is analytically prescribed. Performing this operation, effectively a Hilbert Transform of

Faraday’s and Ampere’s laws, yields a new set of equations that still vary in time, but

can be discretized at the bandwidth of the signal, which is significantly lower than the

maxmial frequency in the unshifted system.

Extending these results to a system with moving macroparticles is challenging for

two primary reasons: (1) While the fields themselves oscillate within a narrowband high

frequency window, the particle positions and velocities generally have significant DC

components that force analysis of Newton’s equations at the highest frequency of interest.

(2) While recent advances make it possible to conserve charge within an implicit time

41

marching scheme [76], these techniques break down when used with an analytically

prescribed plane wave component (We discuss this in greater detail in Section 3.2).

In the remainder of this chapter, we construct an EM-FEMPIC method wherein the fields

are solved for using an envelope tracking method, henceforth referred to as ET-FEMPIC. We

address both of the challenges listed above by (1) constructing an integration scheme for the

particle system that allows Newton’s laws to be solved at the highest frequency of interest

while being mapped self consistently with the Maxwell solver and (2) by constructing a

Quasi-Helmholtz framework to explicitly satisfy the Coulomb gauge. We show analytically

and through numerical experiments that this framework conserves charge to machine

precision while producing results that match a traditional EM-FEMPIC implementation.

3.1 Problem Statement

Consider a region Ω ∈ R3 bounded by a surface 𝜕Ω containing a single charged species.

This region is subjected to an external field due to which the charged species accelerate,

and in turn produce spatially and temporally varying electric and magnetic fields denoted

by E(r, 𝑡) and B(r, 𝑡), respectively, with r ∈ Ω and 𝑡 ∈ [0, ∞). The dynamics of the particles

in phase space can be represented by a distribution function (PSDF) 𝑓 (𝑡, r, v) that follows

the Vlasov equation:

𝜕𝑡 𝑓 (𝑡, r, v) + v · ∇ 𝑓 (𝑡, r, v)+

𝑞
𝑚

[E(r, 𝑡) + v × B(r, 𝑡)] · ∇𝑣 𝑓 (𝑡, r, v) =0.

(3.1)

In what follows, we assume that the background media in Ω is free space. As a result,

we denote the permittivity and permeability of free space by 𝜖0 and 𝜇0, respectively, and

the speed of light by 𝑐. In what follows, we will assume that either the external fields

impressed on Ω or the source exciting fields is narrowband in that the center frequency of

the excitation 𝑓0 ≫ 𝑓𝑏𝑤, where 𝑓𝑏𝑤 is the bandwidth. Likewise, we will assume that the

system is quiescent for 𝑡 ≤ 0.

42

3.2 Formulation

We follow the usual path of not solving (3.1) directly, instead representing the moments

of the PSDF using a charge and current density as, 𝜌(𝑡, r) = 𝑞 ∫
𝑞 ∫

𝑓 (𝑡, r, v)𝑑v and J(𝑡, r) =
Ω v 𝑓 (𝑡, r, v)𝑑v. We then use a particle approximation of these moments, and evolve their
location and velocity together with Maxwell’s equations. As a result, assuming a shape

Ω

functions 𝑆(r), one obtains

𝜌(𝑡, r) = 𝑞

𝑁𝑝
(cid:213)

𝑝=1

𝑆(r − r𝑝(𝑡)),

J(𝑡, r) = 𝑞

𝑁𝑝
(cid:213)

𝑝=1

v𝑝(𝑡)𝑆(r − r𝑝(𝑡)).

(3.2a)

(3.2b)

where 𝑁𝑝 denotes the number of macroparticles. Furthermore, r𝑝(𝑡) and v𝑝(𝑡) refer to the

positions and velocities as a function of time of the 𝑝th macroparticle. With no loss of

generality, we will assume that there exist a source J𝑖(r𝑠 , 𝑡) = ˜J𝑖(r𝑠 , 𝑡)𝑒 𝑗𝜔0𝑡 at points r𝑠 that
excites the system. Given this temporal dependence, we posit that the field and fluxes have

a similar behavior, i.e., E(r, 𝑡) = ˜E(r, 𝑡)𝑒 𝑗𝜔0𝑡 and B(r, 𝑡) = ˜B(r, 𝑡)𝑒 𝑗𝜔0𝑡. In these expressions,
the quantities with a tilde are slowly varying with respect to time. Using these expressions

in Maxwell’s equations, we can write

∇ × ˜E(r, 𝑡) = −𝑗𝜔0

˜B(r, 𝑡) − 𝜕𝑡

˜B(r, 𝑡),

∇ × 𝜇−1
0

˜B(r, 𝑡) = ˜J𝑖(r𝑠 , 𝑡) + J(r, 𝑡)𝑒−𝑗𝜔0𝑡

+ 𝑗𝜔0𝜖0

˜E(r, 𝑡) + 𝜖0𝜕𝑡

˜E(r, 𝑡),

Following [76], we replace J(r, 𝑡) with its time integral

G(r, 𝑡) =

𝑡

∫

0

𝑡

𝑁𝑝
(cid:213)

∫

J(r, 𝜏)𝑑𝜏 = 𝑞

v𝑝(𝜏)𝛿(r − r𝑝(𝜏))𝑑𝜏

𝑝=1

𝑁𝑝
(cid:213)

= 𝑞

0
r𝑝(𝑡)
∫

𝑑˜r𝛿(r − ˜r)

𝑝=1

r𝑝(0)

43

(3.3a)

(3.3b)

(3.3c)

in (3.3b), to obtain

∇ × 𝜇−1
0

˜B(r, 𝑡) = ˜J𝑖(r𝑠 , 𝑡) + 𝑒−𝑗𝜔0𝑡 𝜕𝑡G(r, 𝑡)

+ 𝑗𝜔0𝜖0

˜E(r, 𝑡) + 𝜖0𝜕𝑡

˜E(r, 𝑡)

and satisfy Gauss’ electric and magnetic laws:

∇ · 𝜖0

˜E(r, 𝑡) = 𝜌𝑖(r𝑠 , 𝑡) + 𝜌(r, 𝑡) exp{−𝑗𝜔0𝑡},

∇ · ˜B(r, 𝑡) = 0.

(3.3d)

(3.4a)

(3.4b)

It is important to note at this point that G(r, 𝑡) and 𝜌(r, 𝑡) are not necessarily narrowband.

As a result, they cannot be decomposed into fast and slow varying components. As usual,

boundary conditions need to be imposed on ˜E(r, 𝑡) and ˜B(r, 𝑡) on sections of the outer
boundary 𝜕Ω. These are assumed to either Dirichlet, Neumann or impedance boundary

conditions on non-overlapping surfaces 𝜕Ω𝐷, 𝜕Ω𝑁 and 𝜕Ω𝐼, and are defined as follows:

ˆ𝑛 × ˜E(r, 𝑡) = Ψ𝐷(r, 𝑡) on Ω𝐷 ,

ˆ𝑛 × 𝜇−1 ˜B(r, 𝑡) = Ψ𝑁 (r, 𝑡) on Ω𝑁 ,

ˆ𝑛 × 𝜇−1 ˜B(r, 𝑡) − 𝑌 ˆ𝑛 × ˆ𝑛 × ˜E(r, 𝑡) = Ψ𝐼(r, 𝑡) on Ω𝐼 .

(3.5a)

(3.5b)

(3.5c)

Note, it is assumed that 𝜕Ω𝐷 + 𝜕Ω𝑁 + 𝜕Ω𝐼 = 𝜕Ω. The evolution of the macroparticles in

space and time is determined by solving for the equations of motion with the acceleration

determined by the Lorentz force. This yields the following coupled system of equations

for ordinary differential equations (ODEs) for v𝑝(𝑡) and r𝑝(𝑡):

𝑑v𝑝(𝑡)
𝑑𝑡

𝑞
𝑚

=

(cid:2) ˜E(r𝑝(𝑡), 𝑡) + v𝑝(𝑡) × ˜B(r𝑝(𝑡), 𝑡)(cid:3) exp{𝑗𝜔0𝑡}

𝑑r𝑝(𝑡)
𝑑𝑡

= v𝑝(𝑡)

(3.6a)

(3.6b)

In what follows, we present a method to self-consistently solve both Maxwell’s equations

and equations of motion, especially under the narrow band approximation.

44

Figure 3.1: Depiction of electric field ¯𝑒 𝑛 and magnetic flux density ¯𝑏𝑛 coefficients along a
cell consisting of a single tetrahedron. The arrows denote the directions of the unit vector
along the edges and unit normal along the faces.

Discretization in Space and Time

Solutions to Eq. (3.3) are obtained by spatially representing ˜E(r, 𝑡) and ˜B(r, 𝑡) in terms
of Whitney edge and face basis functions respectively (shown for a single tet in Fig. 3.1):

˜E(r, 𝑡) =

˜B(r, 𝑡) =

𝑁𝑒(cid:213)

𝑖=1
𝑁 𝑓
(cid:213)

𝑖=1

1
𝑒𝑖(𝑡)W
𝑖 (r),

2
𝑖 (r).
𝑏𝑖(𝑡)W

(3.7)

1
Here, W
𝑖

2
𝑡 (r) represent the lowest order Whitney edge function defined on the 𝑖th
(r) and W
edge and face function defined on the 𝑡th face, respectively. Further, 𝑁𝑒 and 𝑁 𝑓 denote the

number of edges and faces in the tetrahedral mesh used to discretize Ω; details on mixed

finite elements can be found [82, 83, 84, 85, 86] and reference therein. Using Gakerkin

testing results in the following matrix ODE to solve for the vector of field coefficients ¯𝐵(𝑡)

and ¯𝐸(𝑡) in time:

(cid:105)

(cid:104) ¯¯𝑀

·

+

(cid:105)

(cid:104) ¯¯𝑆

·

¯𝐵(𝑡)







¯𝐸(𝑡)








¯¯𝐹

=

𝜕𝑡

𝜕𝑡








¯𝐵(𝑡)







¯𝐸(𝑡)

(3.8)

45

where the various matrix definitions are as follows:








+ 𝑗𝜔0


[★𝜇−1]






0

,

0





[★𝜖]



(3.9)

(cid:104) ¯¯𝑆

(cid:104) ¯¯𝑀

(cid:105)

(cid:105)

=













¯¯𝐹 = −

=

0

[∇×]

−[∇×]𝑇

0

,

0





[★𝜖]



−








𝜕𝑡

[★𝜇−1]

0

0

¯𝐽𝑖(𝑡)
𝜖0















0

.








exp{−𝑗𝜔0𝑡}

¯𝐺(𝑡)
𝜖0
(𝑡)(cid:3) 𝑇, ¯𝐸(𝑡) = [𝑒1(𝑡), 𝑒2(𝑡), · · · , 𝑒𝑁𝑒 (𝑡)]𝑇,

Furthermore, ¯𝐵(𝑡) = (cid:2)𝑏1(𝑡), 𝑏2(𝑡), · · · , 𝑏𝑁 𝑓
(cid:68)
[𝑔1(𝑡), 𝑔2(𝑡), · · · , 𝑔𝑁𝑒 (𝑡)]𝑇 where 𝑔𝑖(𝑡) =
(1)
𝑗

¯𝐺(𝑡) =
and ¯𝐽𝑖(𝑡) = [𝑗1(𝑡), 𝑗2(𝑡), · · · , 𝑗𝑁𝑒 (𝑡)]𝑇
. We preface at this point that the method reduces to

where 𝑗𝑗(𝑡) =

, ˜J𝑖(r𝑠 , 𝑡)

, G(r, 𝑡)

(1)
𝑖

W

W

(cid:69)

(cid:68)

(cid:69)

a traditional EM-FEMPIC solve when 𝜔0 = 0, so the proposed method represents a

generalization of the EM-FEMPIC formulation. To convert (3.8) into a discrete stencil at

different intervals of time, we utilize a Newmark-𝛽 [74] scheme with 𝛾 = 0.5 and 𝛽 = 0.25.

Furthermore, we define Δ𝑡,𝜔0 = (30𝜉 𝑓max)−1 and Δ𝑡,𝜔𝑏𝑤 = (30𝜉 𝑓bw)−1 as the timestep size

appropriate for the fast varying and downshifted systems respectively, with 𝜉 being

a real oversampling factor. This choice of parameters results in representing the field

coefficients in terms of second order Lagrange polynomials in time and testing with an

average acceleration condition (represented by 𝑊(𝑡)) such that for 𝑡 ∈ [𝑡𝑛−1, 𝑡𝑛+1] where

𝑡𝑛 = 𝑛Δ𝑡,𝜔bw

¯𝐵(𝑡)
(cid:170)
(cid:174)
(cid:172)

¯𝐸(𝑡)

(cid:169)
(cid:173)
(cid:171)

=

=

2
(cid:213)

𝑘=0

2
(cid:213)

𝑘=0

𝑁𝑛,𝑘(𝑡) (cid:169)
(cid:173)
(cid:171)

¯𝐵(𝑡𝑛+𝑘−1)
(cid:170)
(cid:174)
¯𝐸(𝑡𝑛+𝑘−1)
(cid:172)

𝑁𝑛,𝑘(𝑡) (cid:169)
(cid:173)
(cid:171)

¯𝐵𝑛+𝑘−1

¯𝐸𝑛+𝑘−1

,

(cid:170)
(cid:174)
(cid:172)

𝐿𝑛,𝑘(𝑡) =

𝑡 − 𝑡𝑛+1−𝑗
𝑡𝑛+1−𝑘 − 𝑡𝑛+1−𝑗

2
(cid:214)

𝑗=0
𝑗≠𝑘

46

(3.10a)

(3.10b)

𝐿𝑛,𝑘(𝑡)

𝑁𝑛,𝑘(𝑡) =

𝑡 ∈ [𝑡𝑛−1, 𝑡𝑛+1]

otherwise.

𝑊𝑛(𝑡) =

𝑡𝑛−𝑡
Δ𝑡

𝑡−𝑡𝑛
Δ𝑡

𝑡 ∈ [𝑡𝑛−1, 𝑡𝑛]

𝑡 ∈ [𝑡𝑛 , 𝑡𝑛+1]

.

0

otherwise

0









(3.10c)

(3.11)

Discretization and Exact Charge Conservation

Next, we re-examine rubrics of a charge conserving PIC scheme from a slightly different

perspective. The discrete divergence of the set of equations derived (3.8) should yield

Gauss’ laws. But a word of caution, the discrete equations are a result of measuring their

continuous counterparts in time. More precisely, we note that the discrete divergence of

Ampere’s laws yields

𝜖0 [∇]𝑇 (cid:0)[★𝜖] 𝜕𝑡

¯𝐸(𝑡) + 𝑗𝜔0 [★𝜖] ¯𝐸(𝑡)(cid:1) = − [∇]𝑇

(cid:18)

¯𝐽𝑖(𝑡)

+ 𝑒−𝑗𝜔0𝑡 𝜕𝑡

¯𝐺(𝑡)

(cid:19)

.

Discretizing Gauss law, however, yields

𝜖0 [∇]𝑇 [★𝜖] ¯𝐸(𝑡) = ¯𝜌𝑖(𝑡) + ¯𝜌(𝑡)

(3.12)

(3.13)

where ¯𝜌𝑖(𝑡) = [𝜌𝑖(𝑡), · · · , 𝜌𝑁𝑛 (𝑡)]𝑇, 𝜌𝑖(𝑡) = ⟨𝑊 0
(r), 𝜌𝑖(r𝑠 , 𝑡)⟩, and ¯𝜌(𝑡) = [𝜌1(𝑡), · · · , 𝜌𝑁𝑛 (𝑡)]𝑇 =
𝑖
[∇]𝑇 ¯𝐺(𝑡)𝑒−𝑗𝜔0𝑡. Note, in the above equations, the [∇]𝑇 matrix effects a divergence of the

flux density (which lies in the dual grid) in terms of quantities defined on the primal

grid. In practice, as one only evolves the curl equations, the solution at every time step

will automatically satisfy (3.12) provided care is taken in constructing the right hand side

[76]. But what is critical for satisfaction of Gauss’ law is that this discrete solution also

47

satisfies (3.13). In the text that follows, we take some liberties in verbiage: (a) it will be

understood that (3.12) and (3.13) are not explicitly discretized in time and evolved; (b) the

system of equations resulting from taking the discrete divergence of the time discretized

Ampere’s laws and those from temporal discretization of (3.12) will be identical; (c) we use

the word discretization and measurement interchangeably they imply an inner product

with a temporal basis function; and (d) our goal is to understand how one should measure

or discretize (3.13) so as to to be consistent with the measurement of (3.12). Indeed, if we

choose different temporal basis to measure (3.12) and (3.13), it must be shown that the two

measures are consistent. For instance, if the basis functions used to measure (3.12) are first

order and those used to measure (3.13) are delta functions, they will yield different results;

this is a point that we will return to later. Note, we have specifically, assumed that ¯𝐺(𝑡) is

known. Indeed, as shown in [76, 75], using ¯𝐺(𝑡) is critical for exact satisfaction of Gauss’

electric law. To understand these issues better, we analyze two cases; (a) 𝜔0 = 0 and (b)
𝜔0 ≠ 0. To simplify our discussion, we assume both ¯𝐽𝑖(𝑡) = 0 = ¯𝜌𝑖(𝑡).

Case 1: 𝜔0 = 0

When 𝜔0 = 0, (3.12) simplifies to

𝜖0 [∇]𝑇 [★𝜖] 𝜕𝑡

¯𝐸(𝑡) = − [∇]𝑇 𝜕𝑡

¯𝐺(𝑡)

(3.14)

Note, parenthetically, we note that [∇]𝑇 ¯𝐺(𝑡) = − ¯𝜌(𝑡). In keeping with the Newmark-𝛽

scheme, to ensure late time stability, we have to use 𝑊𝑛(𝑡) as testing functions in time. If

a scheme is charge conserving, then at any point 𝑡𝑛, the system arising from the discrete

divergence of Ampere’s law

⟨𝑊𝑛(𝑡), 𝜖0 [∇]𝑇 [★𝜖] 𝜕𝑡

¯𝐸(𝑡)⟩ = ⟨𝑊𝑛(𝑡), 𝜕𝑡 ¯𝜌(𝑡)⟩

(3.15)

should satisfy Gauss’ law, but under what measure. To make the ensuing text notationally

less dense, we use ¯𝜙(𝑡) = 𝜖0 [∇]𝑇 [★𝜖] ¯𝐸(𝑡). Thus, (3.15) can be written as

⟨𝑊𝑛(𝑡), 𝜕𝑡

¯𝜙(𝑡)⟩ = ⟨𝑊𝑛(𝑡), 𝜕𝑡 ¯𝜌(𝑡)⟩.

(3.16)

48

Evaluation of this inner product results in

¯𝜙𝑛+1 − ¯𝜙𝑛−1
2

−

¯𝜌𝑛+1 − ¯𝜌𝑛−1
2

= 0,

(3.17)

which is the relations that coefficients of the electric field will satisfy in keeping with the

substitution defined earlier.

The questions are two fold; (a) since Gauss’ law in (3.13) is never solved, what is an

equivalent discrete system, and (b) will the update equation for this system be consistent

with those obtained from (3.16). This may seem trivial as both sides of (3.14) have a time

derivative and one can evaluate them analytically to obtain (3.13) together with null initial

conditions.

But it is important to remember (a) that right hand side of Ampere’s law is deliberately

chosen to be different from the conventional one where one discretizes current [71] and

(b) we effect the solution of (3.8) by choosing temporal basis sets such that the solution is

unconditionally stable, and do not treat it as a first order ODE. Under these circumstances,

we can ask what should (cid:101)𝑊𝑛(𝑡) to reduce (3.13) to a discrete system. To make our analysis
as general as possible, lets us denote (cid:101)𝑊𝑛(𝑡) as a basis that is used to discretize (3.13):

⟨ (cid:101)𝑊𝑛(𝑡), 𝜖0 [∇]𝑇 [★𝜖] ¯𝐸(𝑡)⟩ = ⟨ (cid:101)𝑊𝑛(𝑡), ¯𝜌(𝑡)⟩.

Using the abbreviated notation defined earlier, we get

⟨ (cid:101)𝑊𝑛(𝑡), ¯𝜙(𝑡)⟩ = ⟨ (cid:101)𝑊𝑛(𝑡), ¯𝜌(𝑡)⟩

Consider two potential choices; for (cid:101)𝑊𝑛(𝑡) = 𝛿 (𝑡 − 𝑡𝑛+1), one gets

⟨𝛿(𝑡 − 𝑡𝑛+1), ¯𝜙(𝑡)⟩ = ⟨𝛿(𝑡 − 𝑡𝑛+1), ¯𝜌(𝑡)⟩

¯𝜙𝑛+1 = ¯𝜌𝑛+1

and for (cid:101)𝑊𝑛(𝑡) = 𝑊𝑛(𝑡), one gets

¯𝜙𝑛+1 + 2 ¯𝜙𝑛 + ¯𝜙𝑛−1
4

−

¯𝜌𝑛+1 + 2 ¯𝜌𝑛 + ¯𝜌𝑛−1
4

= 0.

49

(3.18)

(3.19)

(3.20)

(3.21)

In (3.21), if we were to assume an quiescent initial condition at 𝑡 = 0, it is trivial to show

that one recovers solutions obtained in (3.17). However, for (3.21) one needs Gauss’ laws to

be satisfied at both 𝑡 = {0, Δ𝑡 } for it to be consistent with solutions in (3.17). Indeed, one

can use any test function as long as it is piecewise continuous, causal and one imposes

additional initial conditions.

We digress a little to note that this is a consequence of using G(𝑡, r) and not J(𝑡, r).

To set the stage, consider ¯𝐽(𝑡) = (cid:2)˜𝑗1(𝑡), ˜𝑗2(𝑡), · · · , ˜𝑗𝑁𝑒 (𝑡)(cid:3) 𝑇 where ˜𝑗𝑖(𝑡) =
standard discretized Ampere’s yields

(cid:68)

(1)
𝑖

W

, J(r, 𝑡)

(cid:69)

. The

𝜖0 [∇]𝑇 [★𝜖] 𝜕𝑡

¯𝐸(𝑡) = [∇]𝑇 ¯𝐽(𝑡).

Using ¯𝜂(𝑡) = [∇]𝑇 ¯𝐽(𝑡) and testing with 𝑊𝑛(𝑡) simplifies the notation to

(cid:10)𝑊𝑛(𝑡), 𝜕𝑡

¯𝜙(𝑡)(cid:11) = ⟨𝑊𝑛(𝑡), ¯𝜂(𝑡)⟩

Evaluating the integrals as listed above yields the following stencil:

¯𝜙𝑛+1 − ¯𝜙𝑛−1 = Δ𝑡,𝜔bw

¯𝜂𝑛+1 + 2 ¯𝜂𝑛 + ¯𝜂𝑛−1
4

(3.22)

(3.23)

(3.24)

But Gauss’ law is slightly different; indeed, ¯𝜌(𝑡) = ∫ 𝑡

𝑜

𝑑𝜏 ¯𝜂(𝜏). This implies that the discrete

evolution of Gauss’ law yields

(cid:10)𝑊𝑛(𝑡), ¯𝜙(𝑡)(cid:11) = ⟨𝑊𝑛(𝑡), ¯𝜌(𝑡)⟩

Then (3.25) simplifies to

¯𝜙𝑛+1 + 2 ¯𝜙𝑛 + ¯𝜙𝑛−1 = ¯𝜌𝑛+1 + 2 ¯𝜌𝑛 + ¯𝜌𝑛−1.

(3.25)

(3.26)

But as only ¯𝜂(𝑡) is available, one would need to integrate this to obtain the charge density.

Choosing a backward Euler scheme for illustration, we obtain

¯𝜌𝑛 = ¯𝜌𝑛−1 + Δ𝑡,𝜔bw

¯𝜂𝑛.

Using (3.27) in (3.24) results in

¯𝜙𝑛+1 − ¯𝜙𝑛−1 =

1
4

(cid:16) ¯𝜌𝑛+1 + ¯𝜌𝑛 − ¯𝜌𝑛−1 − ¯𝜌𝑛−2(cid:17)

50

(3.27)

(3.28)

It is apparent from above equations, that irrespective of the integration scheme, the

coefficient obtained by solving Ampere’s law will not satisfy the discrete Gauss’ law.

The above discussion presents the nuances of the challenge in ensuring that the solution

of the discrete curl equations satisfy Gauss’ law. When 𝜔0 = 0, one can somewhat get
around the challenge by defining ¯𝐺(𝑡). As we will see, this challenge cannot be overcome

when 𝜔0 ≠ 0.

Case 2: 𝜔0 ≠ 0

Next, we consider the effect of adding an analytically prescribed fast-varying function

to our field solution. As before, we begin our analysis with the divergence of Ampere’s

Law in (3.12) and obtain

𝜖0 [∇]𝑇 (cid:0)[★𝜖] 𝜕𝑡

¯𝐸(𝑡) + 𝑗𝜔0 [★𝜖] ¯𝐸(𝑡)(cid:1) = − [∇]𝑇 (cid:16)

𝑒−𝑗𝜔0𝑡 𝜕𝑡

¯𝐺(𝑡)

(cid:17)

.

(3.29)

Using the same notation as the previous subsection, suppose 𝜙 (𝑡) = ˜𝜙 (𝑡) 𝑒 𝑗𝜔𝑡. We can

represent this as

𝜙 (𝑡) =

𝑛+1
(cid:213)

𝑖=𝑛−1

˜𝜙𝑖 𝑁𝑖 (𝑡) 𝑒 𝑗𝜔𝑡

(3.30)

where 𝑁𝑖 (𝑡) represents the regular second order Newmark basis functions. 𝜌 is defined as

is normally done in Newmark:

Now, 𝜕𝑡 𝜙 − 𝜕𝑡𝜌 becomes

𝜌 (𝑡) =

𝑛+1
(cid:213)

𝑖=𝑛−1

𝜌𝑖 𝑁𝑖 (𝑡)

𝑛+1
(cid:213)

𝑖=𝑛−1

(cid:104) ˜𝜙𝑖𝜕𝑡

(cid:16)

𝑁𝑖 (𝑡) 𝑒 𝑗𝜔𝑡 (cid:17)

+ 𝜌𝑖𝜕𝑡 𝑁𝑖 (𝑡)

(cid:105)

= 0

(3.31)

(3.32)

51

𝑊-testing gives us

(cid:42)

𝑊 (𝑡) ,

𝑛+1
(cid:213)

𝑖=𝑛−1

(cid:104) ˜𝜙𝑖𝜕𝑡

(cid:16)

𝑁𝑖 (𝑡) 𝑒 𝑗𝜔𝑡 (cid:17)

+ 𝜌𝑖𝜕𝑡 𝑁𝑖 (𝑡)

(cid:43)

(cid:105)

(cid:42)

=

𝑊 (𝑡) ,

(cid:42)

+

𝑊 (𝑡) ,

𝑛+1
(cid:213)

𝑖=𝑛−1
𝑛+1
(cid:213)

𝑖=𝑛−1

˜𝜙𝑖 (cid:16)

𝑁𝑖 (𝑡) 𝜕𝑡

(cid:16)

𝑒 𝑗𝜔𝑡 (cid:17)

+ 𝑒 𝑗𝜔𝑡 𝜕𝑡 𝑁𝑖 (𝑡)

(cid:43)

(cid:17)

(3.33)

𝜌𝑖𝜕𝑡 𝑁𝑖 (𝑡)

(cid:43)

= 0

Thus, there are 6 relevant terms in the expansion of (cid:10)𝑊 , 𝜙 (𝑡)(cid:11). They are as follows:

(cid:68)

𝑊 (𝑡) , 𝜕𝑡

(cid:16)

𝑁𝑘 𝑒 𝑗𝜔𝑡 (cid:17) (cid:69)

= 𝑒 𝑗𝜔𝑡𝑛

𝑛+1
(cid:213)

𝑘=𝑛−1

(cid:2)𝛼𝑘,0 + 𝛼𝑘,Δ𝑡 𝑒 𝑗𝜔Δ𝑡 + 𝛼𝑘,−Δ𝑡 𝑒−𝑗𝜔Δ𝑡 (cid:3)

where

𝛼𝑛+1,0 =

4𝑗 + 2𝜔Δ𝑡
2𝜔2Δ3
𝑡
2𝑗 + 𝜔Δ𝑡
2𝜔2Δ3
𝑡

𝛼𝑛+1,−Δ𝑡 = −

𝛼𝑛+1,Δ𝑡 =

𝛼𝑛,0 =

𝛼𝑛,−Δ𝑡 =

𝛼𝑛,Δ𝑡 =

𝛼𝑛−1,0 =

𝛼𝑛−1,−Δ𝑡 =

−2𝑗 − 3𝜔Δ𝑡 + 2𝑗 (𝜔Δ𝑡)2 + 2 (𝜔Δ𝑡)3
2𝜔2Δ3
𝑡

−4𝑗 − 2𝑗 (𝜔Δ𝑡)2
𝜔2Δ3
𝑡
2𝑗 − 2𝜔Δ𝑡
𝜔2Δ3
𝑡
2𝑗 + 2𝜔Δ𝑡
𝜔2Δ3
𝑡
4𝑗 − 2𝜔Δ𝑡
2 (𝜔Δ𝑡)3
−2𝑗 + 3𝜔Δ𝑡 + 2𝑗 (𝜔Δ𝑡)2 − 2 (𝜔Δ𝑡)3
2 (𝜔Δ𝑡)3

𝛼𝑛−1,Δ𝑡 =

−2𝑗 − 𝜔Δ𝑡
2 (𝜔Δ𝑡)3

(3.34)

(3.35a)

(3.35b)

(3.35c)

(3.35d)

(3.35e)

(3.35f)

(3.35g)

(3.35h)

(3.35i)

We can conclude by inspection from the coefficients defined in (3.35a) that for general 𝜔, all
nine coefficients can be nonzero. As a result, the quantities 𝛽𝑘 = (cid:2)𝛼𝑘,0 + 𝛼𝑘,Δ𝑡 𝑒 𝑗𝜔Δ𝑡 + 𝛼𝑘,−Δ𝑡 𝑒−𝑗𝜔Δ𝑡 (cid:3)

52

defined for 𝑘 = {𝑛 − 1, 𝑛, 𝑛 + 1} must also generally by nonzero. Then,

(cid:10)𝑊 (𝑡) , 𝜕𝑡 𝜙 − 𝜕𝑡𝜌(cid:11) = 𝛽𝑛+1𝜙𝑛+1 + 𝛽𝑛 𝜙𝑛+

𝛽𝑛−1𝜙𝑛−1𝑒 𝑗𝜔𝑡𝑛 −

(cid:18) 𝜌𝑛+1 − 𝜌𝑛−1
2

(cid:19)

= 0

(3.36)

cannot be captured in a pointwise manner, since all three coefficients multiplying 𝜙𝑘

can generally by nonzero, while the coefficients multiplying 𝜌𝑘 only exist when 𝑘 =

{𝑛 − 1, 𝑛 + 1}. If charge conservation is achieved, then we should be able to find a testing

function (cid:101)𝑊𝑛(𝑡) that, when used to measure Gauss’ Law will reduce it to a form that is
satisfied by the solution to (3.32).

Let us first begin by setting (cid:101)𝑊𝑛(𝑡) = 𝛿(𝑡 − 𝑡𝑛). Testing Gauss’ law with this function will

lead to

(cid:10)𝛿(𝑡 − 𝑡𝑛), ¯𝜙(𝑡)(cid:11) = (cid:10)𝛿(𝑡 − 𝑡𝑛), exp{−𝑗𝜔0𝑡} ¯𝜌(𝑡)(cid:11)

which, when evaluated leads to the following relation for ¯𝜙𝑛

¯𝜙𝑛 = exp{−𝑗𝜔0𝑡𝑛 } ¯𝜌𝑛

(3.37)

(3.38)

Comparing this relation to (3.32), we find that the pointwise relation for 𝜙𝑛 does not satisfy

the stencil for the divergence of Ampere’s Law. Interestingly, the solutions do agree in the limit

when 𝜔0 → 0, but otherwise, they are inconsistent. Likewise, if we were to perform a similar
analysis with (cid:101)𝑊𝑛(𝑡) = 𝑊𝑛(𝑡), we end up with the following update expression for 𝜙𝑛+1

Δ𝑡,𝜔bw
4

¯𝜙𝑛+1 = −

Δ𝑡,𝜔bw
2

¯𝜙𝑛 −

Δ𝑡,𝜔bw
4

¯𝜙𝑛−1 + 𝛼𝑛+1(𝜔0) ¯𝜌𝑛+1

(3.39)

+ 𝛼𝑛(𝜔0) ¯𝜌𝑛 + 𝛼𝑛−1(𝜔0) ¯𝜌𝑛−1

Once again, the coefficients 𝛼𝑛+1(𝜔0), 𝛼𝑛(𝜔0) and 𝛼𝑛−1(𝜔0) refer respectively to the inner

products of 𝑊𝑛(𝑡) and exp{−𝑗𝜔0𝑡} multiplying the basis functions used to represent the

field quantities in time 𝑁𝑛,0(𝑡), 𝑁𝑛,1(𝑡)and 𝑁𝑛,2(𝑡) respectively (defined in (3.10c)). As

before, (3.39) and (3.32) agree in the limit of 𝜔0 → 0, they are inconsistent for finite 𝜔0.

While neither choice of (cid:101)𝑊𝑛(𝑡) results in discrete systems that are consistent with (3.32),
and there may never be, we take a different and more robust path. In addition to the above

53

difficulties, we note the following: The standard Newmark time stepping scheme excites a

null space that is time independent. While the null space can be small, it will not have a

trivial divergence and will corrupt the satisfaction of Gauss’ law [83]. A way to overcome

both problems is to use a Helmholtz decomposition or impose a Coulomb gauge in the

discrete setting. The means to do so is elaborated next.

Quasi Helmholtz or Coulomb Gauge

Much of the development of what follows has been detailed in our earlier paper [77].

The following discussion is purely for completeness of the paper. In a nutshell, ˜E(r, 𝑡)
and ˜B(r, 𝑡) can decomposed into solenoidal (components that are divergence free) and
non-solenoidal components (have a finite curl). Hence, the use of the prefix “quasi.” In

this framework, Gauss’ laws are explicitly discretized and solved. Furthermore, while the

solution for the solenoidal component has a null space, its divergence is exactly zero. As a

result, Gauss’ laws are exactly satisfied.

In what follows, we use ¯𝐸𝑛

𝑛𝑠 refers to the non-solenoidal coefficients of the electric field

at the 𝑛th timestep. Likewise, ¯𝐸𝑛

𝑠 and ¯𝐵𝑛

𝑠 refer to the divergence-free solenoidal coefficients

of the electric field and magnetic flux density, respectively. A complete prescription of all

sub-matrices involved are provided in Section 1.4.

Requisite Proejctors

To project the non-solenoidal components from the basis used for representing the

electric field, we define projectors [ ¯𝑃]Σ

𝑒 and [ ¯𝑃]Λ

𝑒 that, when operated on the complete

electric field, respectively extract the non-solenoidal and solenoidal components (where †

represents a Moore-Penrose pseudoinverse):

[ ¯𝑃]Σ

𝑒 = Σ(Σ𝑇Σ)†Σ𝑇

[ ¯𝑃]Λ

𝑒 = ℐ − [ ¯𝑃]Σ

𝑒

54

(3.40a)

(3.40b)

(cid:2) ¯𝑃(cid:3) Λ

𝑏 = ℐ − Σ𝑚

(cid:16)

Σ𝑇

𝑚Σ𝑚

(cid:17) †

Σ𝑇
𝑚

(3.40c)

where [Σ] = 𝜖0[ ¯𝑀𝑔] and [Σ]𝑚 = [∇·]𝑇. Using these projectors„ we can now define a

decomposition for the electric flux density as

¯𝐷(𝑡) = Σ ¯𝐸𝑛𝑠(𝑡) + (cid:2) ¯𝑃(cid:3) Λ
𝑒

¯𝐷(𝑡)

and the magnetic flux density as

¯𝐵𝑠(𝑡) = (cid:2) ¯𝑃(cid:3) Λ
𝑏

¯𝐵(𝑡)

(3.41)

(3.42)

Using this projector, it is rather straightforward to show that the divergence of ¯𝐵(𝑡) is zero.

Discrete System

To use this decomposition, we use all of Maxwell’s equations. First given the decom-

position, the application of the discrete divergence on (3.42) and (3.41) results in zero

and

[𝐶 𝑒

𝑧]𝑇 [∇]𝑇 [★𝑒] [∇] [𝐶 𝑒

𝑧] ¯𝐸𝑛𝑠(𝑡) = − [𝐶 𝑒

𝑧]𝑇 [∇]𝑇 𝑒−𝑗𝜔0𝑡 ¯𝐺(𝑡).

(3.43)

Upon delta-testing this expression at timestep 𝑛, we get a direct relation between ¯𝐸𝑛𝑠(𝑡)
and ¯𝐺(𝑡). Likewise, using decomposition in (3.8) and with (3.43) results in

¯𝐵𝑠(𝑡) + (cid:2) ¯𝑍(cid:3)

¯𝐸𝑠(𝑡) = − (cid:2) ¯𝑍(cid:3)

¯𝐸𝑛𝑠(𝑡)

13

12

(cid:2) ¯𝑍(cid:3)

(cid:2) ¯𝑍(cid:3)

11

21

𝜕𝑡

¯𝐵𝑠(𝑡)+𝑗𝜔0 (cid:2) ¯𝑍(cid:3)

𝜕𝑡

¯𝐸𝑠(𝑡)+𝑗𝜔0 (cid:2) ¯𝑍(cid:3)

11

21

−𝑒−𝑗𝜔0𝑡 𝜕𝑡

¯𝐺(𝑡) − (cid:2) ¯𝑍(cid:3)

𝑗𝜔0

23

¯𝐸𝑠(𝑡) − (cid:2) ¯𝑍(cid:3)

¯𝐵𝑠(𝑡) =

22
¯𝐸𝑛𝑠(𝑡) − (cid:2) ¯𝑍(cid:3)

𝜕𝑡

¯𝐸𝑛𝑠(𝑡)

23

(3.44)

where ¯𝐸𝑠(𝑡) and ¯𝐵𝑠(𝑡) are extracted from the simply connected mesh using tree-cotree
(cid:3) that are defined in Section 1.4. The evaluation of time derivatives is

maps [𝐶 𝑒

𝑐 ] and (cid:2)𝐶𝑏

𝑐

effected through the basis defined earlier; it has been extensively discussed in [76].

As alluded to earlier, since we directly discretize Gauss’ laws and since the discrete

divergence of (3.44) is identical to zero, charge conservation is exactly satisfied.

55

Particle Push and Integration Scheme for

¯𝐺

Next, we describe the framework used to evolve particle trajectories following (3.6). We

recall from Section 3.2 that the error in the non-solenoidal component of the electric field is

tied intimately to the accuracy in the evaluation of ¯𝐺(𝑡), making it important to evaluate

the particle positions and velocities to high accuracy. But it is also known that the Fourier

transform of the velocity or the position of any particle is not necessarily narrow-band.

As a result, the equations of motion and the path integral need to be evaluated at a step

size Δ𝑡,𝜔0, while samples of the field are only known at Δ𝑡,𝜔bw

. There are two ways to get

around this mismatch. First, one can incorporate the analytically known fast varying field

components into the Lorentz update

(r𝑝 , 𝑡) =

𝑞𝑝
𝑚𝑝

Re (cid:8) ˜E(r𝑝 , 𝑡) exp{𝑗𝜔0𝑡}(cid:9)

(3.45)

v(r𝑝 , 𝑡) × Re (cid:8) ˜B(r𝑝 , 𝑡) exp{𝑗𝜔0𝑡}(cid:9) ,

𝑑v
𝑑𝑡
𝑞𝑝
𝑚𝑝

+

which can be numerically integrated with the fast varying components treated analytically. The

specifics of the resulting integration rule can be written as

v(r

𝑛+1
𝑝

, 𝑡𝑛+1) − v(r

𝑛
𝑝 , 𝑡𝑛) =

𝑞𝑝
𝑚𝑝

2
(cid:213)

𝑘=0

E(r

𝑛−1−𝑘
𝑝

, 𝑡𝑛−1−𝑘)

(cid:16)

𝑀 𝑟

𝑛,𝑘 + 𝑀 𝑖

𝑛,𝑘

(cid:17)

+

𝑞𝑝
𝑚𝑝

2
(cid:213)

𝑘=0

v(r

𝑛−1−𝑘
𝑝

, 𝑡𝑛−1−𝑘) × B(r

𝑛−1−𝑘
𝑝

, 𝑡𝑛,1,𝑘)

(cid:16)

𝑛,𝑘 + 𝐿𝑖
𝐿𝑟

𝑛,𝑘

(cid:17)

where the operators used are defined as follows:

𝑡𝑛+1∫

𝑁𝑛,𝑗(𝑡)𝑁𝑛,𝑘(𝑡) cos(𝜔0𝑡)𝑑𝑡,

𝐿𝑟
𝑛,𝑗,𝑘 =

𝑡𝑛

𝑡𝑛+1∫

𝑁𝑛,𝑗(𝑡)𝑁𝑛,𝑘(𝑡) sin(𝜔0𝑡)𝑑𝑡,

𝐿𝑖
𝑛,𝑗,𝑘 = −

𝑡𝑛

56

(3.46)

(3.47a)

(3.47b)

𝑡𝑛+1∫

𝑁𝑛,𝑗(𝑡) cos(𝜔0𝑡)𝑑𝑡,

𝑀 𝑟

𝑛,𝑗 =

𝑡𝑛
𝑡𝑛+1∫

𝑀 𝑖

𝑛,𝑗 = −

𝑁𝑛,𝑗(𝑡) sin(𝜔0𝑡)𝑑𝑡.

(3.47c)

(3.47d)

The particle positions can then be integrated consistently in time at Δ𝑡,𝜔0 through a fourth

𝑡𝑛

order Adams-Bashforth stencil:

𝑛+1
𝑝 = r

𝑛
𝑝 +

r

Δ𝑡,𝜔0
24

(55v

𝑛
𝑝 − 59v

𝑛−1
𝑝 + 37v

𝑛−2
𝑝 − 9v

𝑛−3
𝑝

)

(3.48)

Since the particle positions cannot be downshifted in the same way as the velocity update,

some oversampling is still required to maintain accuracy, but as we show in Sec. 3.3, this

factor is relatively small. The computed trajectories can then be used to obtain ¯𝐺(𝑡)

(cid:42)

¯𝐺𝑛

𝑙 =

(1)
𝑙

W

(r𝑝(𝑛Δ𝑡,𝜔bw

)),

𝑁𝑝
(cid:213)

𝑝=1

𝑞

r𝑝(𝑛Δ𝑡,𝜔bw
∫

)

(cid:43)

(cid:101)r𝑆(r −
𝑑

(cid:101)r)

,

r𝑝(0)

(3.49)

where 𝑛 refers to the number of timesteps evolved by the EM system, and 𝑙 refers to a

specific edge in the EM system.

In systems where the number of particles is not very large, and therefore the cost of a

particle update is negligible in relation to the cost of a field solve, we can follow a simpler

integration setup and simply interpolate the electric field to time points that are spaced

at Δ𝑡,𝜔0. In essence, the magnitudes of the electric fields and magnetic flux density at the

location of a given particle 𝑝, can be reconstructed from the downshifted quantities at a

given time 𝑡 ∈ [𝑡𝑛−1, 𝑡𝑛+1] as

E(r𝑝 , 𝑡) =

B(r𝑝 , 𝑡) =

𝑁𝑒(cid:213)

2
(cid:213)

𝑖=1
𝑁 𝑓
(cid:213)

𝑘=0
2
(cid:213)

𝑖=1

𝑘=0

𝑁𝑛,𝑘(𝑡) ¯𝐸𝑘

𝑖 exp{𝑗𝜔0𝑡𝑛−1+𝑘 }W

1
𝑖 (r𝑝)

𝑁𝑛,𝑘(𝑡) ¯𝐵𝑘

𝑖 exp{𝑗𝜔0𝑡𝑛−1+𝑘 }W

2
𝑖 (r𝑝)

(3.50a)

(3.50b)

57

These fields can then be used in a Lorentz update at the smaller stepsize. Since the particle

paths are now known at the finer timestep size Δ𝑡,𝜔0, the path integral in (3.49) can be

evaluated as

)

(cid:101)r𝑆(r −
𝑑

(cid:101)r) =

r𝑝(𝑛Δ𝑡,𝜔bw
∫

r𝑝(0)

𝑀−1
(cid:213)

𝑗=0

r𝑝((𝑗+1)Δ𝑡,𝜔0
∫

)

(cid:101)r𝑆(r −
𝑑

(cid:101)r).

r𝑝(𝑗Δ𝑡,𝜔0

)

(3.51)

where 𝑀 refers to the number of timesteps needed at Δ𝑡,𝜔0 to advance to a time of 𝑛Δ𝑡,𝜔bw

.

We show, once again, in Section 3.3 that this approach can be used to get precision similar

to the actively downshifted integration scheme described earlier in the section, but the

rate of oversampling required is larger. Note, in both these methods, we have assumed

non-relativistic motion. Studies are underway on efficient methods in relativistic regimes.

3.3 Numerical Experiments

In this section, we detail results obtained from an implementation of the ET-FEMPIC

scheme described above. The results will be structured as follows: (A) First, we demonstrate

the viability and computational gains of using envelope tracking in a linear EM system

without particles by analysing the radiated power due to a monopole antenna. (B) Next.

we demonstrate the accuracy of the integration scheme developed in Section 3.2. Then, we

proceed to analyse two systems with particles, thereby including non-linear effects from

the active media. As a prelude, we note that the modulated Gaussian functions used in

some of the numerical examples are defined as follows:

𝑣(𝑡) = cos(𝜔0𝑡) exp

(cid:18)

−

(cid:19)

(𝑡 − 6𝜎)2
2𝜎2

𝜎 =

2
𝜔bw

(3.52)

Radiated Power from a Monopole Antenna

We consider a conducting strip suspended over a finite ground plane, as specified

in Fig. 3.2a. The coupling between the EM system and the driving circuit is achieved

58

(a) Geometry of the monopole antenna. The di-
mensions are in cm.

(b) Power radiated due to a 1 mV source compared
to measured data and regular MFEM

Figure 3.2: Geometry and radiated power results for a monopole antenna.

across a vertical 1 cm edge going from the conducting plane to the strip. The current

driving the antenna is generated by a Thevenin source characterized by 𝑓0 = 300 MHz and

𝑓bw = 200 MHz connected in series to a 100 Ω resistor. The circuit subsystem itself was

modelled through a Modified Nodal Analysis network [99], with the constituent equations

appropriately modified to account for the analytically known fast field components. The

voltage and current across the port feed were then used to compute the complex impedance

of the feed as a function frequency. The radiated power curves for the antenna were

computed from the impedance obtained from a regular MFEM solve and through the

envelope tracking technique described in this work. As can be seen in Fig. 3.2b, there is

good agreement in the radiated power as a function of frequency between a regular MFEM

and the envelope tracking method, despite the timestep size in the latter being 2.5 times

larger.

Fidelity of the Particle Push Routine

Since the particle positions and velocities are not known to conform to the same

frequency spread as the fields, we proposed two different time-stepping schemes in Section

59

𝑓bw (MHz) 𝑁Regular

20
10
5
1

100
200
400
2000

𝜖ℒ2 (cid:0)𝑁Regular(cid:1) 𝑁Downshift
4.135 × 10−8
3.308 × 10−8
3.591 × 10−8
1.195 × 10−8

21
54
117
501

𝜖ℒ2 (𝑁Downshift)
3.916 × 10−8
6.392 × 10−8
7.997 × 10−8
2.410 × 10−8

Table 3.1: Errors for the two particle integration methods, compared against numerically
obtained data at Δ𝑡,bench = Δ𝑡,𝜔max/10. 𝑁Regular and 𝑁Downshift refer to the oversampling
factor for the naively oversampled and downshifted methods respectively, and 𝜖ℒ2 refers
to the ℒ2 error for a given method compared against the benchmark data. We note that the
oversampling factor required for the downshifted method is significantly lower than with
naive oversampling. In each case, 𝑓0 = 2 GHz.

3.2 to accurately evolve Newton’s equations. The numerical results presented in this

section demonstrate the viability of both methods. To set the stage, we consider a system

consisting of one particle moving under the influence of E(r, 𝑡) = ˆ𝑥𝑣(𝑡) and B(r, 𝑡) = ˆ𝑧𝑣(𝑡)

with 𝑣(𝑡) as defined in (3.52). The positions of the fields and velocities are then solved

for using two different methods. First, we use a particle solver that operates at a smaller

timestep size Δ𝑡,𝜔max than the field solve. The values of the field are then interpolated to

the finer timesteps and used to evolve Newton’s equations. Since the fields conform to

the bandwidth amenable for an envelope tracking analysis, the interpolation should be as

accurate as a time-marching routine that evolves at steps of Δ𝑡,𝜔max. Second, we evolve the

particle using the downshifted integration scheme described in Sec. 3.2. The final position

curves obtained from both methods was compared against a solution obtained using a

fourth order Adams Bashforth method evolved at Δ𝑡,bench = Δ𝑡,𝜔max/10. The relative error

of the two methods in relation to this benchmark is reported in Table 3.1. As is evident,

both methods yield the same order of error over a range of center frequency/bandwidth

combinations.

Klystron exicted by a gap voltage

Next, we analyze the EM response for a device with strong particle effects. Specifically,

we examine the reduction in the quality factor of a Klystron under beam loading, with the

60

𝑡max(ns) 𝑁𝑡,ET-MFEM 𝑄ET-MFEM 𝑁𝑡,MFEM 𝑄MFEM 𝑄EM-FEMPIC 𝑄ET-FEMPIC
13500
9000
6750

1079.1
556.4
231.4

1084
561.4
246.7

903.7
484.9
211.5

899.4
488.4
207.8

135.6
90.4
67.8

1500
1000
750

Table 3.2: Tabulated quality factor data in the Klystron solve. 𝑁𝑡 and 𝑄 refer respectively to
the number of timesteps used in the analysis and the respective quality factor obtained. As
can be seen in the table, the Envelope tracking approach closely tracks the results predicted
by the MFEM solve.

set-up as shown in Fig. 3.3. A sinusoidal RF source placed at the connecting neck, a 40

kV, 3A beam was introduced on one end of the feed-tube. The gap voltage as a function

of frequency was then computed from the Fourier transform of the measured electric

field across an edge spanning the walls of the neck. The quality factor of the cavity was

estimated by locating the half power points beneath the resonance peak, and computing 𝑄

as

𝑄 =

𝑓peak
Δ 𝑓

(3.53)

We note from Fig. 3.4 and Table. 3.2 that the quality factor when the klystron is loaded

drops slightly in comparison to an unloaded run, with some of the energy of the cavity

used to accelerate the particles. This acceleration can be seen by observing a histogram of

particle velocities as a function of position along the length of the tube, as shown in Fig.

3.6. Furthermore, the quality factor derived using ET-FEMPIC are close to those derived

from a traditional EM-FEMPIC solve but computed at approximately the tenth the number of

time steps. Furthermore, the drop observed is similar to that observed in [100] simulated

using an axi-symmetric finite difference time domain code. Finally, we note from Fig. 3.5

that Gauss’ Law is satisfied to machine tolerance.

Relativistic MILO

To stress the viability of the algorithm for a complex real world device, we considered a

setup with a relativistic MILO, as discussed in [101]. The layout of an axial cross-section

the device is as shown in Fig. 3.8, with red sections denoting PEC surfaces, green sections

61

Figure 3.3: Schematic of the Klystron. The figure represents a cross-section in the 𝑦-𝑧 plane.
The device is assumed to be cylindrically symmetric about the 𝑧 axis. Furthermore, all
walls are PEC.

representing surfaces with an applied potential difference 𝑉in(𝑡), dark green sections

representing locations of particle emitters, and finally dashed blue sections representing

surfaces where the output power 𝑃out(𝑡) in the form of the Poynting flux across the surface

is measured. 𝑉in(𝑡) = sin(2𝜋 𝑓0𝑡) was defined as a sinusoidal impulse with magnitude 1.8

MV at a center frequency of 𝑓0 = 1.2 GHz. Likewise, the particles were emitted from

inner conductor with a purely radial velocity of 2.12 × 108 𝑚𝑠−1. Due to the high particle

velocities in this example, the particle push algorithm described earlier was modified to

solve for the relativistic momentum. The particle current was set to 50 kA. This setup was

simulated using the proposed method with a timestep size Δ𝑡 = 33.3 𝑝𝑠. To provide a

62

Figure 3.4: Gap voltage vs frequency in the loaded and unloaded case – both run in the
downshifted frame. We note that the peaks line up at the same frequency and the loaded
response shows a dip in the saturated gap voltage thereby reducing the quality factor.

direct comparison, we also simulated this setup on XOOPIC [102] for the same parameters

with Δ𝑡 = 1 𝑝𝑠. A 𝑧-𝑟 phase plot of the particles is shown in Fig. 3.8. Likewise, the

spectrum of the output power is reported in Fig. 3.9. We note from both figures that

the proposed method predicts particle distributions that agree closely with XOOPIC and

further predicts an output spike at 1.2 GHz, as expected from other similar results in the

literature [101].

63

Figure 3.5: Plot of relative error per particle in satisfaction of discrete Gauss’ Law for the
Klystron setup simulated through ET-FEMPIC.

Computational Complexity

Finally, we discuss the computational complexity of the method and the range devices

for with the technique maximizes savings in compute time. We wish to emphasize at this

point that any time domain simulation method approach requires some means of fixing

Δ𝑡. For explicit schemes with a mesh dependant stability constraint, this is done quite

trivially by looking at the smallest mesh features. For implicit methods that are similarly

constrained by the mesh, the user needs to estimate a good Δ𝑡 based on a presumptive

highest frequency of interest. Typically, this timestep size is inversely related to a factor

64

Figure 3.6: Velocity histogram at 𝑡 = 100 ns along the length of the Klystron showing
particle acceleration when exposed to the RF source.

multiplied by the highest frequency, i.e

Δ𝑡 =

1
𝜉 ( 𝑓0 + 𝑓bw)

.

(3.54)

As noted previously, when 𝜔0 = 0, the method reduces back to a traditional EM-FEMPIC

solve, so for highly-broadband devices, the method is no more expensive than a traditional

implicit EM-FEMPIC solve. For narrowband devices whose step size is heavily determined

by the frequency offset in the fields, as opposed to the feature scales of the geometry, the

savings in the number of timesteps can be estimated as follows. Under the proposed

method, the frequency offset is analytically provided for, leading to a new timestep size

65

Figure 3.7: Comparison between the loaded cavity responses obtained through the regular
and downshifted frames.

given by

Δ𝑡 =

1
𝜉 ( 𝑓bw)

.

(3.55)

Therefore, the number of timesteps required to go to a certain simulation time is reduced

by a factor of 𝑓bw/( 𝑓0 + 𝑓bw). This figure represents the reduction in number of steps for a

device whose stepsize is determined entirely by the bandwidth of the fields. If the geometry

of the simulation setup demands a smaller stepsize (as in the case of the relativistic MILO),

one may have to choose a smaller stepsize than what would be demanded by the physics.

66

Figure 3.8: Layout and 𝑧-𝑟 phase distribution of particle positions along the MILO at 50
ns. Red sections denote conducting surfaces, green sections denote places with imposed
potential differences and dashed blue segments represent surfaces where the output power
spectrum is measured.

3.4 Summary

In this paper, we have proposed a technique to greatly improve the computational

performance of EM-FEMPIC codes for a class of narrowband high frequency devices. Using

a Quasi-Helmholtz setup, we show that our method exactly satisfies charge conservation

and achieves similar fidelity to a traditional FEMPIC solve. As an aside, we have spent

considerable fraction of the paper examining nuances of charge conservation from a

different perspective. The upshot of this discussion is that the underlying rubric of quasi-

Helmholtz decomposition is always robust and immune null spaces that would otherwise

corrupt the system. Additionally, our results demonstrate real world speedups equal to the

67

Figure 3.9: Plot of the spectrum of output power obtained from the MILO using the method
proposed in this work. We recover a peak at 1.2 GHz, as predicted by similar experiments
in the literature [101].

ratio between the frequency shift and the bandwidth. In particular, the Klystron example

used close to a tenth the number of timesteps compared to a traditional EM-FEMPIC solve.

This is before any kind of additional speed up for parallel processing is applied.

68

ANALYSIS AND BOUNDS FOR ENERGY CONSERVATION IN EM-FEMPIC

CHAPTER 4

The full wave simulation of charged particles interacting with electromagnetic fields

is of vast importance in many applications across physics and engineering, including

pulsed power, high precision etching, accelerator design among many others. One of the

most popular techniques involved in such analysis is the electromagnetic particle-in-cell

(EM-PIC) method. While PIC solvers have traditionally employed FD-TD, advances in

finite element based solvers have recently made it increasingly viable – particularly due its

ability to robustly handle complex geometries. Furthermore, novel contributions in current

mapping and the use of quasi-Helmholtz decomposition to strongly enforce the Coulomb

gauge [77] have made it possible to use unconditionally stable implicit EM solvers while

conserving charge.

One area within PIC development that has seen relatively little advance is in the use of

more sophisticated particle-push methods, as opposed to the ubiquitous Boris integrator

[88]. This is particularly relevant when using an implicit method for the field solve. Since

Boris is explicit, the movement of the charges is computed solely using fields from the

previous timestep, leading to loss in accuracy. At the same time, using a fully implicit

particle updater, such as is done in so-called ’Kinetic enslavement’ methods [92] comes

with high computational cost. The goal of this work is to contribute to the existing state of

the art by (i) proposing a rigorous set of conditions that, when followed, ensures overall

energy conservation while using an unconditionally stable Newmark-𝛽 field integrator,

and (ii) by suggesting ’velocity-correction’ schemes that can be used after a field solve

to satisfy the proposed rubrics. We show through numerical examples that the method

conserves average energy over asymptotically large timescales, both for a single particle as

well as within an EM-FEMPIC simulation.

69

4.1 Problem Statement

Consider a 3D region Ω containing a distribution of charged particles. This region is

subjected to an external field due to which the charged species accelerate, and in turn

produce spatially and temporally varying electric and magnetic fields denoted by E(r, 𝑡)

and B(r, 𝑡), respectively, with r ∈ Ω and 𝑡 ∈ [0, ∞). The dynamics of the particles in phase

space can be represented by a distribution function (PSDF) 𝑓 (𝑡, r, v) that follows the Vlasov

equation. Furthermore, we assume that the background media in Ω is free space.

4.2 Formulation

Following the Particle-in-Cell approach, we represent the charge and current dis-

tributions in Ω using the moments of the PSDF through 𝜌(r, 𝑡) = 𝑞 ∫
𝑓 (𝑡, r, v)𝑑v and
J(r, 𝑡) = 𝑞 ∫
Ω v 𝑓 (𝑡, r, v)𝑑v. Representing these moments using 𝑁𝑝 macroparticles, one can
evolve their positions and velocity together with Maxwell’s equations. Assuming a shape

Ω

function 𝑆(r), we obtain

𝜌(r, 𝑡) = 𝑞

𝑁𝑝
(cid:213)

𝑝=1

𝑆(r − r𝑝(𝑡)),

J(r, 𝑡) = 𝑞

𝑁𝑝
(cid:213)

𝑝=1

v𝑝(𝑡)𝑆(r − r𝑝(𝑡)).

(4.1a)

(4.1b)

where r𝑝(𝑡) and v𝑝(𝑡) refer to the positions and velocities as functions of time of the 𝑝th

macroparticle. The evolution of the fields E(r, 𝑡) and B(r, 𝑡) over space and time within Ω

follow Maxwell’s equations, given by

where,

∇ × E(r, 𝑡) = −𝜕𝑡B(r, 𝑡),

∇ × 𝜇−1

0 B(r, 𝑡) = 𝜕𝑡G(r, 𝑡) + 𝜖0𝜕𝑡E(r, 𝑡),

G(r, 𝑡) =

𝑡

∫

0

J(r, 𝜏)𝑑𝜏 = 𝑞

𝑁𝑝
(cid:213)

r𝑝(𝑡)
∫

𝑑˜r𝛿(r − ˜r)

𝑝=1

r𝑝(0)

70

(4.2a)

(4.2b)

(4.3)

and J(r, 𝑡) describes impressed currents within Ω. Furthermore, the solutions to the curl

equations in (4.2) also need to satisfy Gauss’ Laws,

∇ · 𝜖0E(r, 𝑡) = 𝜌𝑖(r, 𝑡) + 𝜌(r, 𝑡),

∇ · B(r, 𝑡) = 0.

(4.4a)

(4.4b)

where in 𝜌𝑖(r, 𝑡) are impressed charges. In what follows, we will assume that both the

impressed current and the corresponding charge densities are zero. If they are not, it is

trivial to include then in the analysis framework rubric described in the next section.

As usual, boundary conditions need to be imposed on E(r, 𝑡) and B(r, 𝑡) on sections of

the outer boundary 𝜕Ω, to ensure unique solutions. These are assumed to either Dirichlet,

Neumann or impedance boundary conditions on non-overlapping surfaces 𝜕Ω𝐷, 𝜕Ω𝑁 and

𝜕Ω𝐼, with 𝜕Ω = 𝜕Ω𝐷 + 𝜕Ω𝑁 + 𝜕Ω𝐼, and are defined as follows:

ˆ𝑛 × E(r, 𝑡) = Ψ𝐷(r, 𝑡) on Ω𝐷 ,

ˆ𝑛 × 𝜇−1

B(r, 𝑡) = Ψ𝑁 (r, 𝑡) on Ω𝑁 ,

ˆ𝑛 × 𝜇−1

B(r, 𝑡) − 𝑌 ˆ𝑛 × ˆ𝑛 × E(r, 𝑡) = Ψ𝐼(r, 𝑡) on Ω𝐼 ,

(4.5a)

(4.5b)

(4.5c)

where the functions Ψ𝐷(r, 𝑡), Ψ𝑁 (r, 𝑡) and Ψ𝐼(r, 𝑡) refer to the imposed Dirichlet, Neumann

and Impedance boundary condition respectively. The field equations in (4.2) are discretized

using a mixed finite element scheme and evolved in time using a Newmark-𝛽 [74] scheme

with parameters 𝛾 = 0.5 and 𝛽 = 0.25, as described in [76]. This formulation is known to

be unconditionally stable [83].

4.3 Rubrics for Energy Conservation

To set the stage for analysing energy conservation, consider a cavity with perfectly

conducting walls. Using matrix definitions from Section 1.4, we begin by writing the

semi-discrete form of Maxwell’s equations in the following form

[★𝜇−1]

0

0

[★𝜖]








(cid:164)¯𝑏(𝑡)

(cid:164)¯𝑒(𝑡)








(cid:169)
(cid:173)
(cid:171)

+

(cid:170)
(cid:174)
(cid:172)








0

−[M𝑐]𝑇


[M𝑐]






0

¯𝑒(𝑡)

(cid:170)
(cid:174)
¯𝑏(𝑡)
(cid:172)

(cid:169)
(cid:173)
(cid:171)

.

0

(cid:170)
(cid:174)
¯𝑗(𝑡)
(cid:172)

= (cid:169)
(cid:173)
(cid:171)

(4.6)

71

Here, ¯𝑒(𝑡) and ¯𝑏(𝑡) refer to the vector of time varying coefficients of the elctric and magnetic

field unknowns discretized using the standard Whitney edge and face basis functions, and

(r), J(r, 𝑡)(cid:11). To derive a set of rules that need to be followed for energy to be
¯𝑗𝑖(𝑡) = (cid:10)W
conserved, we take the following approach. We shall first demonstrate that in a source free

1
𝑖

cavity (i.e with 𝑗(𝑡) = 0), the unconditionally stable Newmark stencil guarantees that the

difference of electric and magnetic energy stays constant over a timestep. Then, we shall

consider the electromagnetic side of an EM-FEMPIC with ¯𝑗(𝑡) = 𝑞 (cid:205)𝑖 v𝑡(𝑡).

4.4 Analysis of Energy Conservation

Velocity Correction Scheme

The evolution of the macroparticles in space and time is determined by solving for

the relativistic equations of motion with the acceleration determined by the Lorentz force.

This yields the following coupled system of equations for ordinary differential equations

(ODEs) for v𝑝(𝑡) and r𝑝(𝑡):

𝑑𝛾𝑝v𝑝(𝑡)
𝑑𝑡

𝑞
𝑚

=

(cid:2)E(r𝑝(𝑡), 𝑡) + v𝑝(𝑡) × B(r𝑝(𝑡), 𝑡)(cid:3)

𝑑r𝑝(𝑡)
𝑑𝑡

= v𝑝(𝑡)

(4.7a)

(4.7b)

To solve (4.7), we make use of a particle integration system comprised of an explicit 4th

order Adams-Bashforth update given by

𝑛+1
𝑝 = v

𝑛
𝑝 +

v

𝑛+1
𝑝 = r

𝑛
𝑝 +

r

Δ𝑡
24
Δ𝑡
24

(55a

𝑛
𝑝 − 59a

𝑛−1
𝑝 + 37a

𝑛−2
𝑝 − 9a

𝑛−3
𝑝

)

(55v

𝑛
𝑝 − 59v

𝑛−1
𝑝 + 37v

𝑛−2
𝑝 − 9v

𝑛−3
𝑝

).

(4.8)

(4.9)

where a𝑝(𝑡, r𝑝) =

𝑞𝑝
𝑚𝑝

(cid:0)E(𝑡, r𝑝) + v𝑝(𝑡, r𝑝) × B(𝑡, r𝑝)(cid:1) and v𝑛

𝑝 = v𝑝(𝑡 = 𝑡𝑛), r𝑛

𝑝 = r𝑝(𝑡 = 𝑡𝑛). In

general, the stencil in (4.8) does not conserve energy asymptotically, however, the locally

higher order error convergence is essential to accurately map currents and thus compute

fields.

72

To enable the particles to be energy conserving, we couple the stencil in (4.8) with an

implicit Newmark-𝛽 integrator with 𝛾 = 0.5 and 𝛽 = 0.25, given by

𝑛+1
𝑝 = v

𝑛−1
𝑝 +

v

(cid:16)

Δ𝑡
2

𝑛+1
𝑝 + 2a

𝑛
𝑝 + a

𝑛−1
𝑝

a

(cid:17)

(4.10)

Utilizing the spectral theorem, we can analyse the stability of (4.10) by analysing its

associated 𝑧-transform. In the absence of an electric field, (4.10) becomes,

𝑛+1
𝑝 = v

𝑛−1
𝑝 +

v

(cid:16)

Ω𝑛+1 · v

𝑛+1
𝑝 + 2Ω𝑛 · v

𝑝 + Ω𝑛−1 · v
𝑛

𝑛−1
𝑝

(cid:17)

(4.11)

where Ω can be written as

Ω =

0

𝜔𝑧 −𝜔𝑦

−𝜔𝑧

0

𝜔𝑦 −𝜔𝑥

𝜔𝑥

0

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

,

(4.12)

. To show that the proposed scheme conserves energy over long timescales, it is sufficient

to show that the roots of one step update of the proposed scheme lie on the unit circle. The

eigenvalues of Ω are given by

0

(cid:113)

𝑗

𝜔2

𝑥 + 𝜔2

𝑦 + 𝜔2
𝑧

(cid:113)

−𝑗

𝜔2

𝑥 + 𝜔2

𝑦 + 𝜔2
𝑧

𝜆 =











. As a result, the 𝑧-transform is as follows:

𝑧2 = 1 +

(cid:16)

𝜆𝑧2 + 2𝜆𝑧 + 𝜆 + 1

(cid:17)

(4.13)

(4.14)

The roots of 𝑧 lie almost exactly on the unit circle when 𝑞𝑝Δ𝑡 |B(r, 𝑡)|/𝑚𝑝 ≪ 1, where Δ𝑡

denotes the timestep size. This condition is easily satisfied since the timestep size is usually

chosen in order to appropriately discretize the cyclotron frequency 𝜔𝑐 = 𝑞𝑝 |B(r, 𝑡)|/𝑚𝑝.

While the Boris integrator itself is not symplectic, it ensures by construction that the particle

velocity under the influcence of a magnetic field does not change magnitude [88].

73

Energy Balance in the Field Solver

Definition 4.4.1 (Electric Field Energy). The electric field energy at any time 𝑡 can be written as
𝑊𝐸(𝑡) = 1

2E(r, 𝑡) · D(r, 𝑡) = 1

2 ¯𝑒(𝑡)𝑇 · [★𝜖] · ¯𝑒(𝑡).

Definition 4.4.2 (Magnetic Field Energy). The magnetic field energy at any time 𝑡 is given by
𝑊𝐵(𝑡) = 1

¯𝑏(𝑡)𝑇 · [★𝜇−1] · ¯𝑏(𝑡).

2B(r, 𝑡) · H(r, 𝑡) = 1

2

and the energy in the particles 𝑊𝑃 as

𝑊𝑃(𝑡𝑛) =

𝑁𝑝
(cid:213)

𝑝=1

𝑚𝑝 𝑐(𝛾𝑝 − 1),

(4.15)

where 𝛾𝑝 refers to the relativistic correction. As an abuse of notation, we shall denote ¯𝑒 𝑛

to be the same as ¯𝑒(𝑡𝑛). Next, we note that the Hodge matrices [★𝜖] and [★𝜇−1] are exactly

symmetrical. Furthermore, the difference of either field quantity between timesteps 𝑛 and

𝑛 + 1 can be written through the Newmark relation as

Lemma 4.4.1. ¯𝑒 𝑛+1 − ¯𝑒 𝑛 = 1

2Δ𝑡((cid:164)¯𝑒 𝑛+1 + (cid:164)¯𝑒 𝑛) + 𝒪(Δ2

𝑡 ) for 𝛾 = 2𝛽

Proof. From [103], the newmark relation for an unknown 𝑢(𝑡) can be written as

𝑢𝑛+1 = 𝑢𝑛 + Δ𝑡 (cid:164)𝑢𝑛 +

(cid:19)

− 𝛽

(cid:18) 1
2

𝑡 (cid:165)𝑢𝑛 + 𝛽ℎ2 (cid:165)𝑢𝑛+1 + 𝒪(Δ2
Δ2
𝑡 ).

(4.16)

Likewise, the velocity (cid:164)𝑢(𝑡) can be related to the acceleration through

(cid:164)𝑢𝑛+1 = (cid:164)𝑢𝑛 + Δ𝑡(1 − 𝛾) (cid:165)𝑢𝑛 + 𝛾Δ𝑡 (cid:165)𝑢𝑛+1 + 𝒪(Δ2

𝑡 ).

(4.17)

Subtracting Δ𝑡/2 times (4.17) from (4.16) yields

𝑢𝑛+1 − 𝑢𝑛 =

1
2

Δ𝑡( (cid:164)𝑢𝑛+1 + (cid:164)𝑢𝑛) + (𝛽 −

1
2

𝛾)Δ2

𝑡 ( (cid:165)𝑢𝑛+1 − (cid:165)𝑢𝑛) + 𝒪(Δ2
𝑡 ).

(4.18)

Replacing 𝑢(𝑡) ≡ ¯𝐸(𝑡) and setting 𝛾 = 2𝛽, gives us the required result.

Lemma 4.4.2. 𝛿𝑛 𝜖𝐹 = 𝑊𝐸(𝑡 𝑛+1) − 𝑊𝐸(𝑡 𝑛) + 𝑊𝐵(𝑡 𝑛+1) − 𝑊𝐵(𝑡 𝑛) = 0 ∀𝑛, if 𝜕𝑡

¯𝐺 = 0 and 𝛾 = 2𝛽.

74

Proof. The fractional change in field energy can be written as

𝛿𝑛 𝜖𝐹 =

[★𝜖] Δ𝑛

¯𝐸

( ¯𝐸𝑛+1 + ¯𝐸𝑛)
2

+

( ¯𝐵𝑛+1 + ¯𝐵𝑛)
2

(cid:2)★𝜇−1(cid:3) Δ𝑛

¯𝐵.

From Lemma 4.4.1, the differences in field unknowns can be written as

𝛿𝑛 𝜖𝐹 =Δ𝑡

[★𝜖] ( (cid:164)¯𝐸𝑛+1 + (cid:164)¯𝐸𝑛)

( ¯𝐸𝑛+1 + ¯𝐸𝑛)
4

+ Δ𝑡

( ¯𝐵𝑛+1 + ¯𝐵𝑛)
4

(cid:2)★𝜇−1(cid:3) ( (cid:164)¯𝐵𝑛+1 + (cid:164)¯𝐵𝑛).

From the semi-discrete form of Ampere’s Law,

𝛿𝑛 𝜖𝐹 =Δ𝑡

( ¯𝐸𝑛+1 + ¯𝐸𝑛)
4

[★𝜖] ( (cid:164)¯𝐸𝑛+1 + (cid:164)¯𝐸𝑛)

− Δ𝑡

( ¯𝐸𝑛+1 + ¯𝐸𝑛)
4
( ¯𝐸𝑛+1 + ¯𝐸𝑛)
2
¯𝐺 = 0, 𝛿𝑛 𝜖𝐹 = 0 (upto 𝒪(Δ2

= Δ𝑡

𝑡 )).

[∇×]𝑇 ( (cid:164)¯𝐵𝑛+1 + (cid:164)¯𝐵𝑛)
¯𝐺𝑛+1 + 𝜕𝑡
2

¯𝐺𝑛)

(𝜕𝑡

Clearly, when 𝜕𝑡

(4.19a)

(4.20)

(4.21)

We note at this point that the Newmark-𝛽 approach assumes a truncated form of the

intermediate value theorem. Thus, the relation between function values and its derivatives

will always admit an error of Δ2

𝑡 for a mixed finite element setup. However, 𝑧-transform
analysis has demonstrated that the method as a whole is asymptotically non-dissipative

[83]. As a result, there may be local oscillations at any given timestep of order 𝒪(Δ2

𝑡 ), but
there will be no large scale gain or loss of energy. In essence average energy over long time

will remain conserved.

Current Generated from Particles

Consider a generic Particle-in-Cell scheme with J(r, 𝑡) = (cid:205)𝑝 𝑞𝑝v𝑝(r𝑝 , 𝑡). Thus, 𝑗(𝑡) =

(cid:205)𝑝 𝑞𝑝 ¯𝑣𝑝(𝑡). The semi-discrete Maxwell system can now be written as


[★𝜇−1]






0

0





[★𝜖]



(cid:164)¯𝑏(𝑡)
(cid:170)
(cid:174)
(cid:172)

(cid:164)¯𝑒(𝑡)

(cid:169)
(cid:173)
(cid:171)

+








0

−[M𝑐]𝑇


[M𝑐]






0

¯𝑒(𝑡)

(cid:170)
(cid:174)
¯𝑏(𝑡)
(cid:172)

(cid:169)
(cid:173)
(cid:171)

= (cid:169)
(cid:173)
(cid:171)

0

(cid:205)𝑝 𝑞𝑝 ¯𝑣𝑝(𝑡)

.

(cid:170)
(cid:174)
(cid:172)

(4.22)

75

To construct an energy balance, consider the inner product of ¯𝑒 𝑛+1 and Ampere’s Law

minus the similar product with ¯𝑒 𝑛. Using similar Manipulations from the previous section,

this difference can be written as

(¯𝑒 𝑛+1+¯𝑒 𝑛)𝑇·[★𝜖]·((cid:164)¯𝑒 𝑛+1+(cid:164)¯𝑒 𝑛)+( ¯𝑏𝑛+1+ ¯𝑏𝑛)𝑇·[★𝜇−1]·( (cid:164)¯𝑏𝑛+1+ (cid:164)¯𝑏𝑛) = (¯𝑒 𝑛+1)𝑇·

(cid:32)

(cid:213)

𝑝

(cid:33)

−(¯𝑒 𝑛)𝑇

(cid:32)

(cid:213)

(cid:33)

𝑞𝑝 ¯𝑣𝑛
𝑝

𝑞𝑝 ¯𝑣𝑛+1
𝑝

𝑝
(4.23)

The LHS of (4.23) is identical to Δ𝑡 𝜖𝐹. On the RHS, the electric fields can be taken inside

the sum to yield

𝛿𝜖𝐹 = Δ𝑡

(cid:32)

(cid:213)

𝑝

𝑞𝑝(¯𝑒 𝑛+1)𝑇 · ¯𝑣𝑛+1

𝑝

(cid:33)

−

(cid:32)

(cid:213)

𝑝

(cid:33)

𝑞𝑝(¯𝑒 𝑛)𝑇 · ¯𝑣𝑛
𝑝

Now, if our particle integrator evaluates cross products correctly,

𝛿𝜖𝐹 = Δ𝑡

(cid:32)

(cid:213)

𝑝

𝑞𝑝(¯𝑒 𝑛+1 + ¯𝑣𝑛+1

𝑝

· Ω( ¯𝑏𝑛+1))𝑇 · ¯𝑣𝑛+1

𝑝

(cid:33)

−

(cid:32)

(cid:213)

𝑝

𝑞𝑝(¯𝑒 𝑛 + ¯𝑣𝑛

𝑝 · Ω( ¯𝑏𝑛))𝑇 · ¯𝑣𝑛

𝑝

The product of the Lorentz force RHS with the velocity at time 𝑡𝑛 is simply

𝑚𝑝 ¯𝑢𝑛

𝑝 · ¯𝑣𝑛

𝑝 = 𝑐

𝑑(𝑚𝑝(𝛾𝑝 − 1))
𝑑𝑡

=

𝛿𝜖𝑃
Δ𝑡

Using (4.26) in (4.25) gives us

𝛿𝜖𝐹 = 𝛿𝜖𝑝

(4.24)

(cid:33)

(4.25)

(4.26)

(4.27)

implying the any change in particle energy manifests as a change in field energy. Importantly,

three rules need to be followed for this to occur:

1. The field solver must use 𝛾 = 0.5 and 𝛽 = 0.25, both for stability and for energy

conservation.

2. The particle solver has to evaluate a pure rotation as part of the Lorentz force update.

In other words v × B has to be exactly perpendicular to v.

3. Finally, the particle update stencil must be constructed such that it integrates 𝑞 ¯𝑒(𝑡)

correctly, without spuriously gaining or losing energy. This condition forces the

particle update by itself to be symplectic.

76

Figure 4.1: Average kinetic energy of the particle as a function of time. We note that the
proposed method conserves energy better than a 4th order multistep method.

One potential way to satisfy these conditions is to simply use a Boris rotation on the fields

at 𝑛 + 1 as a corrector after the field solve. Likewise, one can use a Newmark integrator

with timestep sizes chosen to be small in relation to the cyclotron frequency to ’almost’

satisfy these conditions. In the following section, we use the latter approach and show that

energy is conserved within the scope of both a single-particle and complete PIC simulation.

4.5 Results

Cyclotron Motion over Asymptotically Large Time

In order to show that the proposed predictor-corrector method conserves energy over

large timescales, we consider a single particle moving under the influence of a 𝑧-directed

magnetic field. We choose a coordinate system wherein 𝑚𝑝 = 𝑐 = 𝑞𝑝 = 1, with the particle

starting at the origin with relativistic velocity u(𝑡 = 0) = 𝑢0 ˆ𝑥, with 𝑢0 = 1. The timestep

size was set to Δ𝑡 = 0.1, allowing for 𝑞𝑝 |B(r, 𝑡)|/𝑚𝑝Δ𝑡 = 0.1. The particle was integrated for

77

Parameter
𝑞
𝑚
Beam Voltage
Beam Current
Δ𝑡
Δ𝑡,XOOPIC
𝑁𝑝/timestep

Value
1.6 × 10−19 C
9.1 × 10−31 kg
174 kV
10 A
10 ps
1 ps
50

Table 4.1: Parameters for the expanding beam run.

a thousand timesteps (corresponding roughly to 16.6 gyrations). The average energy was
then computed by finding a best fit line through the relativistic energy 𝑚𝑝 𝑐2/(cid:112)1 − v(𝑡)2/𝑐2.
We note from Fig. 4.1 that the proposed method conserves average energy far better than

the 4th order Adams-Bashforth/Adams-Moulton predictor corrector method. The error in

the particle velocity was found to converge quadratically with stepsize in the proposed

method, outperforming the first order Boris method, while preserving average energy.

Expanding Particle Beam

Next, consider a conducting cylindrical cavity of length 10 cm and radius 2 cm oriented

such that the axis of rotation aligned along ˆ𝑧. The walls of the cavity were assumed

to be perfectly conducting and a particle beam composed of electrons was initialized at

𝑧 = 0. The parameters of the simulation are as in Table. 4.1. The tetrahedral mesh used to

discretize the system comprised of 3229 tetrahedra had an average edge length of 2.3 mm.

Single Particle Test

To demonstrate that merely cobbling together an energy conserving EM scheme with

an energy conserving particle scheme is insufficient, we looked at the overall difference in

field and particle energy per timestep for a single particle injected into the system. The

energy difference curve shown in Fig. 4.2 clearly demonstrates that when a correcter step

as proposed is not employed, there are spurious spikes in particle energy that result from

78

Figure 4.2: Plot of energy change over one timestep for a single particle. Note the spurious
spikes when a corrector step is not employed.

a mismatch between particle velocities and currents on the mesh. Likewise, we also see

proof that the Newmark system is not perfectly energy conserving, due to the relation

between zeroth and first derivatives only being accurate to Δ2

𝑡 . These manifest as harmonic
oscillations in the solution, but do not grow over time, since the Newmark stencil by itself

is non-dissipative over large timescales [83].

79

Figure 4.3: Plot of energy change over timestep for the full beam run. The spikes from the
previous case are still present with the traditional approach, but is a lot more noisy due to
the presence of multiple particles

Full Expanding Beam

A snapshot of the 𝑧-𝑟 phase space distribution particle beam at 5 ns – compared with an

equivalent simulation set-up on XOOPIC[98] – is reported in Fig. 4.4 and the satisfaction of

Gauss’ Law over the course of the entire run is shown in Fig. 4.5. As is evident, the implicit

method with the proposed velocity correction shows very good agreement to the reference

method, while satisfying Gauss’ Law to machine precision. Looking at the energy curve

in Fig. 4.3, we note similar spikes in particle energy as in the previous example. Due to

many particles being present, however, these oscillations make the overall conservation

80

Figure 4.4: Snapshot of the expanding plasma beam at 𝑡 = 5 ns. We note that the
distribution predicted by EM-FEMPIC using a 4th order Adams integrator closely agrees
with equivalent results from XOOPIC.

Figure 4.5: Satisfaction of Gauss’ Law for the expanding particle beam. We note that the
error is at machine precision over the course of the run.

81

Figure 4.6: Particle distribution within a cylindrical klystron at 𝑡 = 7.6 𝑛𝑠. We note that
both the EM-FEMPIC using an energy preserving predictor-corrector particle integrator
agrees with a leapfrog based method and exhibits particle bunching at 𝑧 ∼ 20 mm.

Parameter
𝑞
𝑚
Beam Voltage
Beam Current
Δ𝑡
Δ𝑡,Leapfrog
𝑁𝑝/timestep

Value
1.6 × 10−19 C
9.1 × 10−31 kg
31.05 kV
1 A
11.6 ps
1 ps
100

Table 4.2: Parameters for the klystron run.

chart far noisier. In contrast, the method employing the corrector step is smoother and is

dominated by the 𝒪(Δ2

𝑡 ) error in the Newmark approximation.

Particle Beam in a Klystron

Next, we considered the performance of the implicit EM-FEMPIC method to an FEMPIC

method built using leapfrog, by analysing the behavior of a particle beam accelerated into

82

Figure 4.7: Satisfaction of Gauss’ Law for the klystron. We note that the error is at machine
precision over the course of the run.

Figure 4.8: Particle and field energy predicted by the implicit EM-FEMPIC case for the
klystron.

83

a cylindrical klystron. The geometry of the device is shown in Fig. 4.6, with all walls

assumed to be perfectly conducting. At the neck of the device (𝑟 = 4 mm, 𝑧 = 14 − 16 mm),

we placed a current source with frequency 3.9 GHz, oriented along the ˆ𝑧-direction. Once

again, the system was discretized by a tetrahedral mesh comprised of 3229 tetrahedra, this

time with an average edge length of 2.63 mm. The solver was run with a step size of 11.6

ps and compared to the leapfrog based method running at 1 ps. The parameters of the

beam are as reported in Table 4.2. Once again, we note very good agreement between the

implicit and explicit EM-FEMPIC implementations in Fig. 2.13, despite the former running

at approximately 11 times the stepsize. Likewise, we note machine precision satisfaction of

Gauss’ Law in Fig. 4.7 and good energy conservation in Fig. 4.8.

4.6 Conclusion

In this chapter, we have presented a set of rubrics that need to be followed in order to

exactly preserve average energy while using an unconditionally stable implicit EM-solve.

These rubrics were demonstrated through a number of numerical examples that made use

of a velocity correction scheme that satisfied the required conditions when using a small

stepsize in relation to the cyclotron frequency. We further demonstrated good theoretical

agreement in each result as well as conservation of charge and energy.

84

CHAPTER 5

A TRANSIENT PARAMETER EXTRACTION TECHNIQUE FOR FINITE ELEMENT
PARTICLE IN CELL SOLVERS

The methods and techniques outlined in the thesis so far have dealt with either

mitigating the effects of using a larger stepsize afforded by an implicit field solver or

coming up with ways to use even larger stepsizes while exactly satisfying all of Maxwell’s

equations. In order, we have looked at

1. Using highly accurate particle push methods to evolve particles in phase space more

accurately than with widely used polynomial predictor corrector methods. Doing

this allows us to accurately trace particles despite having a larger stepsize than would

be afforded by an explicit PIC scheme.

2. Using envelope tracking to use even larger step-sizes for narrowband devices.

3. Deriving rubrics for energy conservation in a PIC scheme making use of an implicit

field solver. We also suggested a simple velocity correction scheme if having a high

order polynomial predictor is essential to accurately map currents to the mesh.

However, none of these address the fundamental drawback with finite element based

solvers, namely that one has to invert a matrix (or iteratively solve an A · 𝑥 = 𝑏 problem) at

every timestep. This is fundamentally different to FDTD based solvers where the update

matrix is strictly diagonal [7]. One way to do this is to increase the timestep size and

reduce the number of matrix solves needed to evolve a system for a prescribed span of

time. However, as noted in Chapter 3, there are limits to how far we can push the stepsize,

dictated primarily by the frequency response of the device being analysed.

In this chapter, we attempt to tackle this problem using a slightly different approach.

A similar issue is present in the analysis of devices that involve lumped nonlinear com-

ponents. Such simulations are of great interest in modern RF design, particularly as the

85

characterization of radiative coupling effects becomes important early in the design process.

Some examples of designs that benefit from such analysis include packaged systems,

active antenna devices/arrays and high speed interconnects [104, 105, 106]. Traditionally,

simulations of such systems would involve a circuit or device model run self-consistently

in lockstep with either an integral equation [107, 108] or finite element [109] solver. As

one can imagine, if the attached lumped device is nonlinear, the number of matrix solves

required goes from one per timestep to one per nonlinear iteration per timestep, making it

very expensive for real world devices. One potential way to get around this problem is to

use parameter-extraction, where the significantly larger EM solver is reduced to a set of

time-varying port responses. Since the nonlinearities exist only the device system, any

computational cost associated with nonlinear iteration is restricted purely to the device

model. Furthermore, once extracted, the transient port response can be used with any

attached device so long as the overall EM geometry remains unchanged (we refer to such

methods as being ’device agnostic’). This approach has been applied to nonlinear spice-like

nodal circuit networks [99] attached to integral equation based systems [110] and, more

recently, to a general physical device model attached to a finite element electromagnetic

solver [111]. The latter citation is attached as an Appendix to this dissertation.

Before we consider applying this idea to PIC simulations, there is still a problem to

mitigate: As noted in [111], the computational savings in port extraction are achieved

primarily due to the fact that the number of nonlinear devices in a typical simulation is

significantly smaller than the number of electromagnetic unknowns in the system. This

assumption is simply not true in a general PIC simulation, where any field unknown can

potentially be associated with a particle current. In this work, we attempt to solve this

problem by coupling parameter extraction with domain decomposition. In particular, we

will attempt to break up a general EM geometry into a set of non-overlapping volumes,

and extract transient responses on each edge on the interfaces connecting these volumes.

Therefore, we drastically reduce the number of parameters that need to be extracted.

86

Domain decomposition, in the form of the finite element tearing and interconnecting FETI

method, has been used in conjunction with EM-FEMPIC in the recent past [80]. We will

use this as a starting point for our method.

5.1 Transient Port Extraction for Static Ports

Taking a short break from plasma physics, we consider the use of full-wave electro-

magnetic analysis in more general RF and microwave design applications. As mentioned

earler, these devices are often characterized by the presence of strongly nonlinear elements

(amplifiers, FETs, diodes etc.) as well as the presence of radiative coupling effects between

connected device elements. As a result, any simulation scheme we use must employ

self-consistent, time-domain analysis.

The predominant approach to transient analysis of EM-circuit system is to solve the

system self consistently [112]. This involves solving both the linear and non-linear system

at every time step. Obviously, the tight integration implies that the solution is not circuit

agnostic. Alternatives that have been explored is to use frequency domain methods to

construct a transient “impulse response” that take the form either as 𝑅𝐶 extraction [113])

or 𝑆 parameter methods [114]. The extracted response is readily incorporated into a circuit

simulator. While this approach is somewhat effective, the advantages and limitation are

apparent; (a) the approach is independent of the circuit system; (b) given the bandwidth of

excitation, the harmonics generated due to non-linearity and need to capture early time

behavior, the number of frequency samples necessary can be very high; and (c) often only

a subset of these frequencies are used. In weakly non-linear systems or when the coupling

is not strong the errors accrued may be tolerable. When analysis of the circuit system is

possible using frequency domain techniques (harmonic balance) under the assumption of

weakly non-linear systems, one often takes recourse to using a Schur complement approach

to couple EM to circuit systems [115, 110]. In addition to being circuit agnostic, this is

computationally more efficient as there are fewer ports than number of spatial degrees of

87

freedom.

It follows, that developing such a methodology for transient analysis will have the

same benefits, in addition to potential integration with multiphysics codes that model

device physics. Extracting port parameters of the EM system is analogous to computing its

numerical impulse response. Doing this in time domain is challenging because of known

instabilities associated with deconvolution [116]. A recently proposed technique [110] for

solving coupled circuits with time domain integral equation (TDIE) solvers overcame this

fundamental bottleneck. Extending this technique to finite element based solvers involves

several changes in the extraction process. First, the extracted signal manifests itself as

a transient admittance in the circuit system as opposed to an impedance in [110]. As a

result, the feed model used is changed to a current probe as opposed to a delta-gap feed,

leading to differences both in coupling and in the extraction process. Finally, using a finite

element scheme allows for integration with different set differential equations used to

model the device subsystem; in this work, we demonstrate this capability via coupling

with a non-linear Drift-Diffusion equations to model a Schottky diode.

The principal contributions of this section are (a) the development of a method for

extracting transient port parameters in the EM-circuit interface for finite element systems,

(b) the demonstration that solutions obtained through this method are identical to those

obtained using a fully coupled solution to solver precision, and (c) integration with

device specific differential equations. Furthermore, we also briefly demonstrate the

implementation of a Perfectly Matched Layer (PML) system for mixed finite element

electromagnetic solvers. Via numerous examples, we will demonstrate the application of

these for analysis of linear and nonlinear circuits coupled to EM systems. Where possible,

we will show comparison against data that exists in the literature (either measured or

modeled). We note that while our results are obtained using a implicit mixed FEM system,

the prescribed procedure is applicable to the traditional wave equation solvers. Our

rationale for using mixed FEM as opposed to the usual wave equation is that the latter

88

has a time growing null-space of the form 𝑡∇𝜓(r), whereas the former has a null space of

the form ∇𝜓(r). In mixed FEM, the magnitude of the null-space excited depends on the

threshold used for the iterative solver. That said, it should noted that a gauging constraint

as described in [95, 96, 117, 87] eliminates this null space. Finally, the nonlinearities are

assumed to be lumped or pointwise. While the proposed method can potentially be used

to isolate small regions of continuous nonlinear materials, this is relegated to future work.

Preliminaries

Consider an object ΩEM ∈ R3 bounded by a surface 𝜕ΩEM, that describes the geometry

of an electromagnetic object containing 𝑁𝑝 ports, each associated with a lumped circuit

subsystem. The currents flowing across these ports are collectively represented as JCKT(r, 𝑡)

with r ∈ Ω𝐸𝑀. We assume that any voltage sources in the circuit system are bandlimited

to some frequency range [ 𝑓min, 𝑓max] with 𝑓min > 0. Furthermore, we assume that the

amplitude of these sources are zero when 𝑡 ≤ 0. The entire system is excited by a signal,

either at a port or elsewhere. Our goal is to obtain a self-consistent transient solution that

couples full wave electromagnetics with the attached device subsystems. In what follows,

all regions/quantities will tagged as belonging either to the linear electromagnetic (EM)

or device (DEV) system. To construct a numerical model, we use a finite element scheme

with Newmark-Beta temporal basis functions to model the Maxwell system and higher

order temporal basis functions for the device model. The discrete EM system arising from

using these basis sets can be written as

ℒ ◦ (cid:2) ¯𝐸EM(𝑡)(cid:3) + 𝒞EM ◦ (cid:2)¯𝐽DEV(𝑡)(cid:3) = 0

(5.1)

where ℒ is a discrete linear differential operator, 𝒞EM is a discrete coupling operator that
relates quantities in the device subsystems to the EM system, ¯𝐸EM(𝑡) are the coefficients of
electric fields for all space and ¯𝐽DEV(𝑡) represents impressed current distributions in the

device system. Likewise, the behavior of the DEV systems can be described by operators

89

𝒟, ℱ and couples to (5.1) through 𝒞DEV, forming

𝒟 ◦ (cid:2)¯𝐽DEV(𝑡), ¯𝐸EM(𝑡)(cid:3) + 𝒞DEV ◦ (cid:2) ¯𝐸EM(𝑡)(cid:3) = ℱ ◦ (cid:2) ¯𝐸EM(𝑡)(cid:3) .

(5.2)

We note, at this point, that 𝒟, 𝒞DEV and ℱ can have arbitrarily complex dependencies and

are not restricted to a simple MNA circuit system. As will be shown in Sec. 4.5, 𝒟 can also

depend on factors such as carrier density and temperature.

Mechanics of Port Extraction

Fundamentally, transient port extraction works by exploiting the linearity of the Operator

ℒ. For a one port system, suppose the electric field and currents in (5.1) are defined as

follows:

¯𝐸(𝑡) =

¯𝐽DEV(𝑡) =

𝑁𝑡(cid:213)

𝑖=0
𝑁𝑡(cid:213)

𝑖=0

𝑒𝑖𝜆(𝑡 − 𝑡𝑖)

𝑗𝑖𝜆(𝑡 − 𝑡𝑖)

(5.3)

where 𝜆(𝑡) refers to a prescribed, compactly supported basis time basis function. Suppose

¯𝐺(𝑡) denotes the solution of (5.1) due to a current defined as a single time basis function

defined at the system’s singular port (this current is denoted as ¯𝐹Δ𝑡 (𝑡)),

ℒ ◦ ¯𝐺(𝑡) + 𝒞EM ◦ ¯𝐹Δ𝑡 (𝑡) = 0
=⇒ ¯𝐺(𝑡) = −ℒ−1 ◦ 𝒞EM ◦ ¯𝐹Δ𝑡 (𝑡)

(5.4)

Since we know that ¯𝐽DEV(𝑡) is expanded in terms of 𝜆(𝑡), we can rewrite (5.1) as

¯𝐸(𝑡) = ℒ−1 ◦ 𝒞EM ◦

𝑁𝑡(cid:213)

𝑖=0

𝑗𝑖𝜆(𝑡 − 𝑡𝑖) =

𝑁𝑡(cid:213)

𝑖=0

¯𝐺(𝑡 𝑁𝑡 −𝑖) = ¯𝑗 ★ ¯𝑔

𝑗𝑖

(5.5)

where ¯𝑔 refers to the temporal weighting coefficients of the extracted signal ¯𝐺 and ★

describes a discrete convolution operation. To describe the method succinctly: Because

the EM response is represented as a collection of temporal basis sets, all we need is a

function ¯𝐺(𝑡) such that ℒ ◦ ¯𝐺(𝑡) + 𝒞EM ◦ ¯𝐹Δ𝑡 (𝑡) = 0 where ¯𝐹Δ𝑡 refers to a single basis function

90

excitation. After such a function is obtained, ¯𝐸EM(𝑡) can be reconstructed for arbitrary
¯𝐽DEV(𝑡) through ¯𝐸EM(𝑡) = ¯𝐺(𝑡) ★ ¯𝐽DEV(𝑡) without any further input from the EM system.

Discretization of EM and Device Systems

We construct a Maxwell solver following a mixed finite element scheme using Whitney

edge and face basis functions E(r, 𝑡) = (cid:205)𝑁𝑒
𝑖=1
𝑁𝑒 and 𝑁 𝑓 are the number of edges and faces respectively of the tetrahedral mesh to

(r) and B(r, 𝑡) = (cid:205)𝑁 𝑓
𝑖=1

(r) , where

2
𝑏𝑖(𝑡)W
𝑖

1
𝑒𝑖(𝑡)W
𝑖

discretize the domain; see [118] and references therein. The EM unknowns are represented

in time as e(𝑡) = (cid:205)𝑁𝑡
𝑗=1
of these functions are defined in [119]. A Newmark-𝛽 time stepping stencil with 𝛾 = 0.5

𝑏 𝑗 𝑁(𝑡 − 𝑡𝑗), and tested by 𝑊(𝑡 − 𝑡𝑖). Both

𝑒𝑗 𝑁(𝑡 − 𝑡𝑗) and b(𝑡) = (cid:205)𝑁𝑡
𝑗=1

and 𝛽 = 0.25 is used to solve for e(𝑡) and b(𝑡) and an appropriately configured PML to

truncate the computational domain. Contemporary implementations of PML systems

follow the general framework first outlined by Berenger [120] with more recent additions,

including the use of stretched coordinates [121]. The implementation of these systems is

done by either directly evaluating the convolutions resulting from the use of a stretched

coordinate system or defining and solving for two auxiliary variables in addition to the

regular field unknowns to achieve the same effect. The PML implementation used in

this chapter directly evaluates the convolution integrals. To do so, we define a stretched

coordinate system via the following transform

Λ(𝜔) =

𝑠𝑦 𝑠𝑧
𝑠𝑥

0

0

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

0

𝑠𝑥 𝑠𝑧
𝑠𝑦

0

0

0

𝑠𝑥 𝑠𝑦
𝑠𝑧

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

(5.6)

with 𝑠𝑖 = 1 + 𝜎𝑖
𝑗𝜔𝜖0

to match the absorbing layers to free space. Here, 𝜎𝑖 are the components

of an anisotropic conductivity that governs the field loss. Stretching coordinates in this

91

manner alters Maxwell’s equations as follows in frequency domain:

Λ(𝜔)−1 · ∇ × E(r, 𝜔) = −Λ(𝜔)−1 · 𝑗𝜔B(r, 𝜔)

∇ × 𝜇−1

Λ(𝜔)−1 · B(r, 𝜔) = J(r, 𝜔) + 𝜖(r)Λ(𝜔) · 𝑗𝜔E(r, 𝜔)

(5.7)

Obtaining a time marching scheme involves inverse Fourier transforming (5.7) to obtain

L2(𝑡) ∗ ∇ × E(r, 𝑡) = −L2(𝑡) ∗

𝜕B(r, 𝑡)
𝜕𝑡

∇ × 𝜇−1

L2(𝑡) ∗ B(r, 𝑡) = J(r, 𝑡) + 𝜖0L1(𝑡) ∗ E(r, 𝑡)

(5.8a)

(5.8b)

where 𝐿1(𝑡) = F −1 (𝑗𝜔Λ(𝜔)) and 𝐿2 = F −1 (cid:16)
by testing (5.8a) with a W2(r) basis function and (5.8b) with W1(r). Furthermore, the

. We discretize these equations

Λ(𝜔)−1(cid:17)

convolution terms are evaluated as done in [121].

The behavior of the attached devices at each port can be described generally by operators

𝒟, ℱ and couples to the EM system through 𝒞CKT, forming

𝒟 ◦ (cid:2)J

CKT(r, 𝑡), e(𝑡)(cid:3) + 𝒞CKT ◦ [e(𝑡)] = ℱ ◦ [e(𝑡)] .

(5.9)

Where 𝒟 and ℱ are general nonlinear operators and 𝒞CKT is a coupling operator that

relates quantities in the EM system to those in the attached device. For the results

presented in this work, we restrict 𝒟 to either be an circuit network implemented through

Modified Nodal Analysis [99]; used entirely by itself or in conjunction system governed

by a set of Drift-Diffusion equations to model diodes [122, 123]. In the case of the former,

we temporally represent the voltage and circuit unknowns using a pth order backward

Lagrange interpolation function 𝐿𝑝(𝑡 − 𝑡𝑖). We note that this choice of representation is

what is commonly used in contemporary implementations of MNA, but are in no way

the only feasible choice. Upon using our chosen representation, we obtain the following

system

YV

CKT(𝑡) = f

CKT(𝑡) + f

CKT
𝑛𝑙

CKT, 𝑡).

(V

(5.10)

which is subsequently delta tested to obtain a time marching scheme. Here VCKT is a vector
containing both the nodal voltages and branch currents in the circuit, fCKT and f

refer to

CKT
𝑛𝑙

92

the excitations due to linear and nonlinear components. The linearized form in (5.10) can

be solved at each timestep using a multi-dimensional Newton-Raphson scheme similar to

[109]. Similarly, when employing a drift diffusion operator to model diodes in the system,

the currents due to electrons and holes J𝑛(r, 𝑡) and J𝑝(r, 𝑡) running through the device are

related to carrier densities 𝑛(r, 𝑡), 𝑝(r, 𝑡) and potential 𝜙(r, 𝑡) through

J𝑛(r, 𝑡) = 𝑞𝐷𝑛∇𝑛(r, 𝑡) + 𝑞𝜇𝑛(E(r, 𝑡))𝑛(r, 𝑡)∇𝜙(r, 𝑡)

J𝑝(r, 𝑡) = −𝑞𝐷𝑝∇𝑝(r, 𝑡) + 𝑞𝜇𝑝(E(r, 𝑡))𝑝(r, 𝑡)∇𝜙(r, 𝑡)

(5.11a)

(5.11b)

where 𝜇𝑛 and 𝜇𝑝 are field dependent mobility rates for the electrons and holes respectively;

and 𝐷𝑝 and 𝐷𝑛 are corresponding diffusion coefficients. The currents and carrier densities

are further related through a set of continuity equations:

𝜕𝑛(r, 𝑡)
𝜕𝑡

=

∇J𝑛(r, 𝑡)
𝑞

− 𝑅 + 𝐺

𝜕𝑝(r, 𝑡)
𝜕𝑡

= −

∇J𝑝(r, 𝑡)
𝑞

− 𝑅 + 𝐺

(5.11c)

(5.11d)

where 𝑅 and 𝐺 respectively denote the electron-hole recombination and the collision

ionization rates. Finally, the carrier densities are related to the potential through Poisson’s

equation

∇ · (𝜖∇𝜙(r, 𝑡)) = −𝑞 (𝑝(r, 𝑡) − 𝑛(r, 𝑡) + 𝑁𝑡(r, 𝑡))

(5.11e)

where 𝑁𝑡(r, 𝑡) refers to the doping concentration. Solution to the drift diffusion system can

be obtained by discretizing (5.11) using an appropriate finite element or finite difference

method; see [122, 123, 124] and the references therein for a detailed analysis. The results

presented in this chapter only involve 1D drift-diffusion systems and as a result we

discretize (5.11) using a corresponding 1D finite difference system.

We describe the interaction between EM and device subsystems in two parts. First, we

consider a device system modelled using MNA. In this instance, the quantities involved in

the device system are voltages and currents, which need to be related to fields and current

93

densities in the EM system. Specifically, if the 𝑘th FEM edge (denoted by l𝑘) is attached to

the 𝑗th circuit subsystem, the current impressed on the EM system is given by

⟨𝑊(𝑡 − 𝑡𝑖), 𝐽CKT

𝑘

(𝑡)⟩ = ⟨𝑊(𝑡 − 𝑡𝑖), 𝐼CP

𝑗

(𝑡)

∫

|l𝑘 |

ˆ
1
𝑘 𝑑r⟩
l𝑘 · W

(5.12)

= ⟨𝑊(𝑡 − 𝑡𝑖), 𝐼CP

𝑗

(𝑡)𝐶𝑘 𝑗⟩

with 𝐶𝑘 𝑗 denoting a coupling coefficient that relates quantities in the device subsystem to

the EM solver. Furthermore, 𝐼CP

𝑗

(𝑡) refers to the magnitude of the current impressed by

the circuit subsystem over the coupling edge l𝑘. Similarly, the voltage across the coupling

branch can be related to the electric field across the 𝑘th FEM edge

⟨𝛿(𝑡 − 𝑡𝑖), 𝑉 CKT

𝑗

(𝑡)⟩ = ⟨𝛿(𝑡 − 𝑡𝑖), 𝑒𝑘(𝑡)

∫

|l𝑘 |

ˆ
1
𝑘(r)𝑑r⟩
l𝑘 · W

(5.13)

= ⟨𝛿(𝑡 − 𝑡𝑖), 𝑒𝑘(𝑡)𝐶𝑗𝑘⟩.

𝐶𝑗𝑘 here likewise denotes a coupling coefficient that relates quantities in EM solver to

the device. We observe from (5.13) and (5.12) that our choice of testing/representation

functions leads the two coupling coefficients to be identical. For a drift diffusion setup,

the electric field at the location of the port is related to the electron and hole mobilities

𝜇𝑛(E(r, 𝑡)) and 𝜇𝑛(E(r, 𝑡)) respectively. Since the devices are assumed to be lumped, E(r, 𝑡)

at the location of the port can be used directly to compute the carrier mobilities, since the

field is assumed to be spatially constant within the device. Likewise, we can use J𝑝(r, 𝑡)

and J𝑛(r, 𝑡) are to construct the net current passing through the diode, which can then be

reintroduced to the EM system following (5.12).

Extraction of the Numerical Impulse Response

The computational bottlenecks involved with solving a coupled system as described in

the previous Section are twofold: (1) Resolving nonlinear elements in the circuit system

involves performing a solve of the combined matrix equation and (2) changing any of the

attached circuit subsystems would require the coupled problem to be solved again, despite

94

the EM system remaining unaltered. A potential way to exploit the linearity of the EM

system is to extract its impulse response at each EM-circuit interface and use it through

(5.13) in the circuit solve. Unfortunately, it is well known that deconvolution required to

implement this is unstable [116].

The key insight in the method proposed herein is as follows. The current deposited on

a given port edge is represented in time through a linear combination of 𝑁𝑡 basis functions.

As a result, given the EM response due to a single temporal basis function, we can exploit

the linearity of Maxwell’s equations and reconstruct the field anywhere in the system.

Since this sequence of operations only involves reconstructing the current at a given port

in terms of basis functions by which it is represented in the coupled solve, the respective

fields computed by both methods should be numerically indistinguishable.

With 𝑝(𝑞) denoting the set of FEM edges associated with the port 𝑞, we define an

excitation vector e𝑞(𝑡) defined as follows

𝑒𝑞,𝑘(𝑡) =

𝑁(𝑡 − 𝑡𝛿)

𝑘 ∈ 𝑝(𝑞)

0

otherwise





(5.14)

where 𝛿 is the timestep at which the excitation is applied. e𝑞 is then used to define the
forcing function JCKT(𝑡) through (5.12) with 𝐽CKT(𝑡) = 𝐶𝑘𝑞e𝑞(𝑡). This function is then
applied to the RHS of (5.8) to obtain a solution vector x𝑞. In order to solve the device

equations, however, we only require the coefficients associated with each port, allowing

us to construct a matrix 𝐺𝑘𝑞 = 𝑥 𝑘,𝑝(𝑞) of dimensions 𝑁𝑝 × 𝑁𝑝 × 𝑁𝑡. Each column of 𝐺

represents a discrete impulse response for a pulse centered at the edge 𝑝(𝑞) measured at

the 𝑘th edge. As a result, constructing the electric field at port 𝑘, in response to an arbitrary

set of currents can be done by simply summing the convolutions of 𝐺𝑘𝑞 with 𝐼𝑞 for each

attached circuit port. The reconstructed fields can then be related using the appropriate

coupling equations to quantities the device subsystems. For instance, if the attached port

is governed through MNA, the voltage across the 𝑗th port 𝑉 CKT

𝑗

(𝑡) in (5.13) can now be

95

(a) Geometry of the monopole system. The di-
mensions are in cm.

(b) Plots of the input admittance in S measured
from 0.5 GHz to 4.5 GHz compared to existing
results in the literature [112].

Figure 5.1: Description of the geometry and obtained results for the cylindrical monopole
antenna.

written in terms of 𝐼CP(𝑡)

𝑉 CKT
𝑗

(𝑡𝑖) = ⟨𝛿(𝑡 − 𝑡𝑖), 𝐶𝑗𝑘

𝑁𝑝
(cid:213)

𝑞=1

𝐺𝑘𝑞(𝑡) ∗ 𝐼CP

𝑞 (𝑡)⟩.

(5.15)

yielding a standalone matrix equation for the device system.

Numerical Examples

For the results presented in the remainder of this section, 𝑁𝑡 denotes the number of

timesteps that the simulation is run over and 𝑁EM, 𝑁CKT denote the numbers of EM and

circuit unknowns respectively in the system. Unless specified otherwise, voltage sources

are defined using 𝑣(𝑡) = cos(2𝜋 𝑓0𝑡)𝑒−𝑡2/2𝜎2 where 𝜎 = 3 × (2𝜋 𝑓bw)−1, with 𝑓max = 𝑓0 + 𝑓bw.
The timestep size Δ𝑡 = (30 𝑓𝑚𝑎𝑥)−1. Finally, GMRES was used to solve the system iteratively

to a tolerance of 10−12.

96

Input Impedance of a Monopole Antenna

In this first example, we validate our technique by analysing a cylindrical monopole

suspended above an infinite ground plane. Specifically, the monopole has a length of

5 cm, radius of 1.52 mm and is suspended 1.6 mm above a conducting square of side

length 10 cm, as shown in Fig. 5.1a. To mimic an infinite ground plane, the truncating

walls of the simulation domain are in direct contact with the ends of the square. The

ground plane is coupled to the cylinder by a single, vertically oriented edge, across which

is connected a driving circuit given by a time varying voltage source connected in series to

a 100Ω resistor. The voltage fed to the resistor is assumed to be a modulated Gaussian

with center frequency 𝑓0 = 2.5 GHz and bandwidth 𝑓BW = 2 GHz The timestep size Δ𝑡
was set to be (30 𝑓max)−1. The mesh used to discretize the domain had an average edge
length of (20 𝑓max)−1, resulting in 𝑁EM = 512, 436 and the simulation was run for 𝑁𝑡 = 2001.

The setup is geometrically identical to an example in [112] and looking at Fig 5.1b, we see

good agreement between the admittance curves generated through port extraction and

a coupled time domain solver for the same simplified probe model. The solve time per

timestep performing the extraction as detailed in Section 5.1 was approximately 8 seconds

per timestep, with the subsequent circuit solve completing its entire run of 2001 timesteps

in under 20 ms.

Input impedance of a strip above a Finite Ground Plane

We consider a conducting strip suspended over a finite ground plane, as specified in

Fig. 5.2a. The coupling between the EM system and the driving circuit is achieved across a

vertical 1 cm edge going from the conducting plane to the strip. The circuit is assumed to

be a Thevenin source characterized by 𝑓0 = 1 GHz and 𝑓bw = 999 MHz connected in series

to a 100 Ω resistor. The simulation domain is discretized using a tetrahedral mesh with
approximate average edge length set to (20 𝑓max)−1, yielding 𝑁EM = 2, 000, 936. The system

was run for 𝑁𝑡 = 4000 timesteps (with each timestep taking approximately 13 seconds

97

(a) Geometry of the monopole antenna. The di-
mensions are in cm.

(b) Power radiated due to a 1 mV source compared
to measured data and FDTD [104]

Figure 5.2: Description of the geometry and obtained results for the monopole strip
suspended over a finite ground plane.

(a) Geometry of the microstrip amplifier with
a FET attached between G and D [109]. The
dimensions are in mm.

(b) Comparison of 𝑆11 and 𝑆21 for a microwave
amplifier as described in [109]

Figure 5.3: Geometry description and calculated 𝑆-parameters for the microwave amplifier.

to converge) and the port parameters were extracted through Fourier transforms of the

time-series data. As is evident from Fig. 5.2b, the radiated power curve shows very good

agreement to measured data and FD-TD.

98

(a) Schematic of the microstrip rectifier. All di-
mensions are in mm.

(b) Computed conversion efficiency for a diode
rectifier with a simulation done through a physical
model compared to ADS [125].

Figure 5.4: Geometry layout and comparison of conversion efficiency for a microstrip
rectifier circuit.

Microstrip Amplifier

Next, we validate the proposed technique for nonlinear circuit systems by comparing

the reflection coefficient and gain for a microstrip amplifier. The geometry and driving

circuits are exactly as in [109] and we obtain the S parameters through small signal analysis,

with 𝑓0 = 5.5 GHz, 𝑓bw = 3.5 GHz and 𝑓max = 𝑓0 + 𝑓bw. The tetrahedral mesh used to
discretize the domain has an average edge length of (15 𝑓max)−1 with 𝑁EM = 5, 134, 732. The

data used to compute the scattering parameters was obtained by running this setup for

𝑁𝑡 = 6000 timesteps. We note from Fig. 5.3b that the measured S parameters show good

agreement to results from [109]. Extracting this response took approximately 37 seconds

per timestep, and the nonlinear circuit solve completed in just over 5 seconds.

Miscrostrip Rectifier modelled through Drift-Diffusion

Until now we have demonstrated the use of port extraction on linear and nonlinear

systems connected to nodal circuit networks. Next, we aim to show that the proposed

99

method works with systems where the devices are governed nonlinear differential equations.

The system under analysis is a microstrip rectifier circuit as shown in Fig. 5.4a. The

thickness of the board was 1 mm and the relative permittivity of the substrate 2.65. A

HSMS-282B diode is placed across port 𝑃 (with physical parameters for (5.11) as in [125])

with a 10 pF filter capacitor attached to a variable load across port 𝐶. The input source

was assumed to be a modulated Gaussian with 𝑓0 = 2.45 GHz and 𝑓bw = 0.25 GHz. We

performed two experiments on a microstrip rectifier circuit: (1) First, the Schottky diode in

the layout was modeled using an equivalent circuit network, mimicking a similar setup

simulated on ADS. (2) Next, using the same extracted port response as in the first experiment, we

modeled the diode using a set of Drift-Diffusion [125] equations in (5.11). In each case, the

conversion efficiency of the rectifier

𝜂 =

𝑃DC
𝑃source

· 100%

(5.16)

where 𝑃DC denotes the power measured at the output end 𝑃source the corresponding quantity

at the soruce was compared against data from [125]. As is evident in Fig. 5.4b, in the first

experiment, results obtained through the proposed method agree well with corresponding

results obtained through ADS EM Co-simulation. In the second experiment, our results

better match measured data of the rectifier circuit than the corresponding efficiency curve

predicted by ADS. We emphasize the fact that the results from the equivalent circuit do not

agree with experimental measurements, due to the network not being representative of the

actual diode for the parameters chosen, thereby illustrating a situation where the ability to

couple the EM layout with a general device model is a significant advantage.

Strip above a Finite Ground Plane driven by different circuits

In keeping with objectives stated earlier, we first extracted the port parameters following

the procedure outlined earlier in the chapter for the strip-monopole example with Δ𝑡 = 16 ps.

This extracted response was then attached to a Chebyshev filter and a Diode Mixer circuit

100

50 Ω

16.308 nH 16.308 nH

𝑉𝑠(𝑡)

9.05 pF

13.48 pF

9.05 pF

+

𝑉port(𝑡)
−

(a) Schematic of the Chebyshev filter used.

(b) Plot of the port voltage for the Chebyshev filter
from Fig. 5.5a obtained through the fully coupled
and extracted responses.

Figure 5.5: Circuit description and comparison of port voltages between the port extraction
and fully coupled methods for a linear circuit system.

respectively, and the obtained port voltages in time were compared to equivalent results

obtained from a direct solution of the coupled system.

Chebyshev filter

First, we use the extracted transient port parameters on a Chebyshev filter as shown in

Fig. 5.5a. 𝑉𝑠(𝑡) was characterized by 𝑓0 = 1.5 GHz and 𝑓bw = 0.5 GHz. The timestep size in

the circuit system was set to the same size used in the extraction of the EM response. The

comparison of the port voltages as a function of time are shown in Fig. 5.5b. The 𝐿2 error

between the two solutions was 3.1 × 10−12.

Diode Mixer

Next, we use extracted port parameter with a nonlinear Diode Mixer as shown in

Fig. 5.6a. The diode between nodes 3 and 7 has a saturation current 𝐼𝑠 = 2 nA, emission

coefficient 𝜂 = 2.0 and 𝑘𝐵𝑇/𝑞 = 25.6 mV. The RF and LO sources were assumed to be

sine waves of magnitude 0.4 V with frequencies 900 MHz and 800 MHz respectively. The

101

1

100 Ω

2

10 Ω
1 pF

RF

1 M Ω

100 Ω

1 pF

LO

5

7

.1 pF

8

6 nH

3 nH

1𝑘Ω

10 Ω

4

3

6

9

+ 𝑉𝑏𝑖𝑎𝑠
−

(a) Schematic of the Diode Mixer. The EM system
is attached between nodes 8 and ground

(b) Plot of the port voltage for the Diode Mixer
from Fig. 5.6a obtained through the fully coupled
and extracted responses.

Figure 5.6: Circuit description and comparison of port voltages between the port extraction
and fully coupled methods for a nonlinear circuit system.

current across the diode was modelled using the Shockley equation and the bias voltage

was set to 0.7 V to activate the diode. The relative 𝐿2 error between the two curves in Fig.

5.6b was 4.7 × 10−12.

Computational Complexity

Finally, we discuss the asymptotic cost complexity of the proposed method. Let 𝑁EM and

𝑁CKT denote the number of degrees of freedom of the EM and circuit systems, respectively;

𝑁𝑡 the number of timesteps; 𝑁NL the number of nonlinear iterations per time step; 𝑁𝑝 the

number of circuit ports and 𝑁GMRES the number of matrix multiplications required to solve

the linearized system denoted in the superscript. The cost for solving the fully coupled

system is

𝐶coupled = 𝒪

(cid:16)

𝑁𝑡 𝑁 coupled
NL

𝑁 coupled

GMRES (𝑁EM + 𝑁CKT)

(cid:17)

.

On the other hand, the cost of port extraction is

𝐶PE = 𝐶PE,1 + 𝐶PE,2

(5.17)

(5.18)

= 𝒪(𝑁𝑡 𝑁𝑝 𝑁 PE

GMRES𝑁EM) + 𝒪(𝑁𝑡 𝑁 PE

NL𝑁 CKT

GMRES𝑁CKT).

102

Figure 5.7: Cumulative solution time of the linearized system from Section 5.1 as a function
of the number of iterations.

Before, we proceed, we note the following. Typically, 𝑁EM ≫ 𝑁CKT. As a result, we ignore

the cost of computing the Jacobian in (5.17). In (5.18), the first portion refers to the cost

of exciting each port and obtaining the corresponding response at other ports. It is a

one-time cost and not incurred as one marches through (indeed, it is the characteristic of

the EM systems and circuit agnostic). The second term in (5.18), is the cost of non-linear

solve at each port. Typically, the number of non-linear solves, 𝑁 coupled

NL

𝑁 coupled
GMRES

in (5.17), is

significantly larger than 𝑁𝑝 𝑁 PE

GMRES

as it involves a fully coupled solve involving all the

degrees of freedom in the system. In order to meaningfully compare computational costs, it

is important to incorporate the contributions of both 𝐶PE,1 and 𝐶PE,2 against 𝐶coupled. To do

this, we considered the solve time per nonlinear iteration within a single timestep of each

103

solve for the numerical example involving the diode mixer attached to a strip-monopole

antenna. This includes the cost for evaluating a single timestep in the extraction process as

well as the cost of computing 𝑁NL nonlinear iterations, where 𝑁NL is the average number of

nonlinear iterations required to achieve convergence per timestep. In this example, 𝑁NL=

6, per time step. Extracting the impulse response took 13s per 𝑁𝑡, i.e., 𝐶PE,1 = 13𝑁𝑡. What

is compared in Fig. 5.7 are 𝐶PE,2/𝑁𝑡 and 𝐶Coupled/𝑁𝑡. As is evident, 𝐶PE,2 ≪ 𝐶coupled.

5.2 Port Extraction with EM-FEMPIC

Fundamentally, the challenge of extending port extraction to a full EM-FEMPIC problem

is due to the complexity of the method scaling badly with the number of ports 𝑁𝑝. As

shown in the previous section, the complexity of parameter extraction can be written as

follows

𝐶PE = 𝐶PE,1 + 𝐶PE,2

(5.19)

= 𝒪(𝑁𝑡 𝑁𝑝 𝑁 PE

GMRES𝑁EM) + 𝒪(𝑁𝑡 𝑁 PE

NL𝑁 CKT

GMRES𝑁CKT),

for a solver that evaluates the EM and CKT systems using an iterative method like GMRES.

We immediately see from (5.19) that as 𝑁𝑝 grows large, the cost of port extraction quickly

dwarfs any benefit gained by relegating the nonlinear solve to the circuit. Unfortunately,

in an EM-FEMPIC problem, the ’nonlinear’ contributions of the system can generally

exist everywhere in the mesh, and as a result, we need to do a bit more work to get port

extraction to be viable.

The solution is to first reduce the volume EM solve to an interface problem, and then

extract port parameters for every interface unknown in the system. The remainder of this

chapter details this process as well as explores the regimes in which the proposed method

is more efficient than a traditional EM-FEMPIC solve.

104

Spatial Discretization

The FEM formulation used in this work is the mixed finite element method. The basis

functions used are the 1-form W(1)(r) ∈ 𝐻(𝑐𝑢𝑟𝑙; Ω) which allows for tangential continuity of
fields, and the 2-form W(2)(r) ∈ 𝐻(𝑑𝑖𝑣; Ω) which allows for normal continuity of fluxes. The
reconstructed fields used to accelerate the particles are defined as E(𝑡, r) = (cid:205)𝑁𝑒
𝑒𝑖(𝑡)W(1)(r)
𝑖=1
and B(𝑡, r) = (cid:205)𝑁 𝑓
𝑏𝑖(𝑡)W(2)(r) where 𝑁𝑒 are the number of edges and 𝑁 𝑓 are the number
𝑖=1
of faces. Testing Faraday’s law with the 2-form and Ampere’s law with the 1-form leads to

the semidiscrete Maxwell system

[★𝜇−1]

0
(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)







(cid:124)

0





[★𝜖0]


(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)
(cid:125)

¯𝐵

𝜕𝑡

¯𝐸

𝜕𝑡















(cid:123)(cid:122)
¯¯𝐴1

+







(cid:124)

0

−𝑐2[𝑀𝑐]𝑇
(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)

(cid:123)(cid:122)
¯¯𝐴0

[𝑀𝑐]





𝑐[★𝐼]


(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)
(cid:125)

(5.20)

=

¯𝐵

¯𝐸















0







¯𝐿




𝜀


(cid:124)(cid:123)(cid:122)(cid:125)
¯¯𝐹

where the degree of freedom vectors ¯𝐸 = [𝑒1(𝑡), 𝑒2(𝑡), . . . , 𝑒𝑁𝑒 (𝑡)], ¯𝐵 = [𝑏1(𝑡), 𝑏2(𝑡), . . . , 𝑏𝑁 𝑓
and ¯𝐿 = [𝑙1(𝑡), 𝑙2(𝑡), ...𝑙𝑁𝑒 (𝑡)] with 𝑙𝑖(𝑡) = ∫
currents J𝑠(𝑡, r) exist on 𝜕Ω due to a Neumann or Robin boundary condition.

(r) · (J𝑠(𝑡, r) − 𝜕𝑡G(𝑡, r)). The surface

𝑑ΩW

(1)
𝑖

Ω

(𝑡)],

Temporal Discretization

This work utilizes the Newmark-𝛽 time marching scheme to evolve the fields in time

with parameters 𝛾 and 𝛽, which creates an unconditionally stable time marching scheme

when 𝛾 = .5 and 𝛽 = .25. Other choices can lead to different stability conditions. The

parameter choices effect the definition of a temporal basis function 𝑁(𝑡) that is tested by a

function 𝑊(𝑡) [74]. When (5.20) is discretized with Newmark-𝛽 with 𝛾 = 0.5 and 𝛽 = 0.25,

105

Figure 5.8: Breakup of the simulation domain Ω into subvolumes

the fully discretized system becomes

(0.5 ¯¯𝐴1 + 0.25Δ𝑡
+ (0.5 ¯¯𝐴1 + .25Δ𝑡

¯¯𝐴0) ¯𝑋 𝑛+1 − .5Δ𝑡
¯¯𝐴0
¯¯𝐴0) ¯𝑋 𝑛−1 + 0.5Δ𝑡

¯𝑋 𝑛

˜𝐺𝑛+1 + 0.5Δ𝑡

˜𝐺𝑛−1

(5.21)

+ 0.25Δ𝑡

˜𝐹𝑛+1 − 0.5Δ𝑡

˜𝐹𝑛 + 0.25 ˜𝐹𝑛−1 = 0.

Here, ¯𝑋 𝑚 = [ ¯𝐵𝑚 ¯𝐸𝑚]𝑇, ˜𝐺𝑚 = [0 𝜀−1 ¯𝐺]𝑚,𝑇, and ˜𝐹𝑚 = [0 𝜀−1 ¯𝐽 𝑚

𝑠 ] at time step 𝑚 with time

step size Δ𝑡. This time marching scheme is separate from that used to evolve the currents

through Newton’s equations. However, the time step sizes and indices between the two

systems are consistent. The boundary current ¯𝐽 𝑚
𝑠

is tangential to a surface. As it has no

normal component, it does not contribute to Gauss’ law and therefore does violate charge

conservation.

Domain Decomposition Preliminaries

Consider the region Ω depicted in Fig. 5.8 that is divided into 𝑁𝑣 non-overlapping

subdomains where subdomain Ω𝑖 and Ω𝑗 are separated by boundary Γ𝑖𝑗. The junction of

more than two volumes is a "corner" denoted by Γ𝑐. In each subdomain Ω𝑖, (5.21) is defined

106

and is assumed to be a self contained “primal" problem that has fictitious excitations

from "dual" unknowns. The dual unknowns, the Lagrange multipliers denoted by Λ,

affect either Neumann or Robin boundary conditions such that the surface equivalence

theorem can be applied for each Ω𝑖 with respect to Ω using the external currents on 𝜕Ω𝑖,

the fictitious boundary current from Λ, and the particle current in Ω𝑖. By solving for the

dual unknowns and a small subset of the primal unknowns, the overall computational cost

of solving the original monolithic system in (5.21) is reduced.

Finite Element Tearing and Interconnecting

The boundaries between Ω𝑖 and Ω𝑗, Γ𝑖𝑗 and Γ𝑗𝑖, are distinct and have a unique Lagrange

multiplier defined for each subdomain. As described in [80], the boundary condition on

the Lagrange multipliers becomes

Λ(𝑖)(𝑡, r) = ˆ𝑛𝑖 × 𝜇−1

0 B(𝑡, r) − 𝜂−1 ˆ𝑛𝑖 × ˆ𝑛𝑖 × E(𝑡, r) ∈ Γ𝑖𝑗.

(5.22)

The edges on Γ𝑖𝑗, as well as the interior unknowns and unknowns on 𝜕Ω𝑖 are included
in a vector ¯𝑋(𝑖)
𝐼

. The Lagrange multipliers for the corner edges remain the same, with a

unique Lagrange multiplier for each feature. The faces on an Γ𝑖𝑗 are grouped together with
the corner edges, which we will denote as ¯𝑋𝑁 . The system of equations solved in each

subdomain is

¯¯𝐴(𝑖)
𝐼𝐼
¯¯𝐴(𝑖)
𝑁 𝐼
where ¯¯𝐴 = .5 ¯¯𝐴1 + .25Δ𝑡








¯𝑋(𝑖),𝑛+1

¯𝑋(𝑖),𝑛+1
𝑁

𝐼

¯¯𝐴(𝑖)
𝐼𝑁
¯¯𝐴(𝑖)
𝑁 𝑁







¯¯𝐴0 and given








=








ℛ(𝑖)
𝐼
ℛ(𝑖)
𝑁















−








.5Δ𝑡

(𝑖),𝑛+1
¯Λ
𝐼
(𝑖)
¯Λ
𝑁








∈ Ω𝑖

(5.23)

¯¯𝐴0

¯𝑋 𝑛

ℛ = .5Δ𝑡
− (.5 ¯¯𝐴1 + .25Δ𝑡

¯¯𝐴0) ¯𝑋 𝑛−1 − .5Δ𝑡

˜𝐺𝑛+1 − .5Δ𝑡

˜𝐺𝑛−1

(5.24)

− .25Δ𝑡

˜𝐹𝑛+1 + .5Δ𝑡

˜𝐹𝑛 − .25 ˜𝐹𝑛−1 − .5Δ𝑡Λ𝑛 + .25Λ𝑛−1,

ℛ(𝑖)
𝐼

contains ℛ associated with interior unknowns, external unknowns, and unknowns of

edges on Γ𝑖𝑗 and ℛ(𝑖)
𝑁

contain the remaining unknowns in subdomain Ω𝑖. To derive the set

107

of equations to solve for the dual unknowns, first consider the summation of (5.22) for two

volumes

Λ(𝑖)(𝑡, r) + Λ(𝑗)(𝑡, r) = (𝑌𝑖 + 𝑌𝑗) ˆ𝑛𝑖 × ˆ𝑛𝑖 × E(𝑡, r)

(5.25)

where 𝑌𝑖 is the surface impedance on Γ𝑖𝑗 and 𝑌𝑗 the surface impedance on Γ𝑗𝑖. Define an
unsigned Boolean matrix such that ¯¯𝐵(𝑖)
¯𝑋𝐼 = ¯𝑒(𝑖)
, the unknowns for the electric field on Γ𝑖𝑗.
𝐼
𝐼
Also, define the map from the boundary edges on Γ𝑖𝑗 to its counterpart on Γ𝑗𝑖, ¯𝑇𝑖𝑗. Testing

(5.25) with curl-conforming basis functions W𝑒, generalizing the case to any number of
subdomains, and using ¯¯𝐵(𝑖)
𝐼

and ¯¯𝑇𝑗𝑖 restrict and map the unknowns of Ω𝑗 to Ω𝑖 yields

Λ(𝑖) + (cid:213)

¯¯𝑇𝑖𝑗(Λ(𝑗) − (𝑌𝑖 + 𝑌𝑗) ¯¯𝑀(𝑗)

𝑇

¯¯𝐵(𝑗)
𝐼

¯𝑋(𝑗)
𝐼

) = 0

(5.26)

𝑗∈𝜎(𝑖)

where 𝜎(𝑖) are all subdomains Ω𝑗 which share a boundary with Ω𝑖 and the matrix
¯¯𝑀(𝑖)

𝑑Γ𝑖𝑗 ˆ𝑛 × W𝑒 · ˆ𝑛 × W𝑒. Rewriting the first set of equations in (5.23) as

𝑇 = ∫

Γ𝑖𝑗

¯𝑋(𝑖),𝑛+1

𝐼

=

¯¯𝐴(𝑖),−1
𝐼𝐼

(ℛ(𝑖)
𝐼

− .5Δ𝑡

¯Λ

(𝑖),𝑛+1
𝐼

− ¯¯𝐴(𝑖)
𝐼𝑁

¯𝑋(𝑖),𝑛+1
𝑁

)

(5.27)

Now, (5.27) can be plugged into (5.26) to yield

Λ(𝑖)+(cid:213)

𝑇𝑖𝑗(¯¯𝐼(𝑗) − (𝑌𝑖 + 𝑌𝑗) ¯¯𝑀(𝑗)

𝑇

¯¯𝐵(𝑗)
𝐼

¯¯𝐴(𝑗),−1
𝐼𝐼

¯¯𝐵(𝑗),𝑇
𝐼

)Λ(𝑗)

𝑗∈𝜎(𝑖)

(cid:213)

=

𝑗∈𝜎(𝑖)

𝑇𝑖𝑗

¯¯𝐵(𝑗)
𝐼

¯¯𝐴(𝑗),−1
𝐼𝐼

( ¯¯𝐴(𝑗)
𝐼𝑁

¯¯𝐵(𝑗)
𝑁

¯𝑋𝑁 − ℛ(𝑗)
𝐼

).

(5.28)

In order to complete the definition of the equation to solve for Λ𝐼 over ∪Γ𝑖𝑗, we first define
the equation for ¯𝑋𝑁 , the unknowns associated with faces on Γ𝑖𝑗 and edges on Γ𝑐.

The derivation of the equation to solve for ¯𝑋𝑁 follows a similar path as ¯𝑋𝑐 from

MFET-DP1. First, substitute (5.27) into the second set of equations in (5.26) to yield

(cid:16) ¯¯𝐴(𝑖)

𝑁 𝑁

− ¯¯𝐴(𝑖),−1
𝑁 𝐼

(cid:17) ¯𝑋(𝑖),𝑛+1

¯¯𝐴(𝑖),−1
𝐼𝐼
− ¯¯𝐴(𝑖)
𝑁 𝐼

¯¯𝐴(𝑖)
𝐼𝑁
¯¯𝐴(𝑖),−1
𝐼𝐼

𝑁
(cid:16)

ℛ(𝑖)
𝐼

= ℛ(𝑖)
𝑁
(cid:17)

− ¯Λ

(𝑖)
𝐼

The Boolean matrix ¯¯𝐵(𝑖)
𝑁

is defined as

¯¯𝐵(𝑖)
𝑁 =

0
¯¯𝐵(𝑖)
𝑐








¯¯𝐵(𝑖)
𝑟, 𝑓

0








108

− ¯Λ

(𝑖)
𝑁

.

(5.29)

(5.30)

to enforce the impedance boundary conditions for the faces on Ω𝑖 and edges on corners
where ¯¯𝐵(𝑖)
𝑟, 𝑓

that acts on the face degrees of freedom. UAs a result,

is the portion of ¯¯𝐵(𝑖)
𝑟

where

¯¯𝐾𝑁 𝑁

𝑁 = ¯ℛ𝑁 + ¯¯𝐾𝑁 𝐼Λ𝐼 .
¯𝑋 𝑛+1

¯¯𝐾𝑁 𝑁 =

𝑁𝑣(cid:213)

𝑖=1

¯¯𝐵(𝑖)
𝑁

( ¯¯𝐴(𝑖)
𝑁 𝑁

− ¯¯𝐴(𝑖),
𝑁 𝐼

¯¯𝐴(𝑖),−1
𝐼𝐼

¯¯𝐴(𝑖),−1
𝐼𝑁

) ¯¯𝐵(𝑖),𝑇
𝑁

𝑁𝑣(cid:213)

¯¯𝐾𝑁 𝐼 =

¯¯𝐵(𝑖)
𝑁

¯¯𝐴(𝑖)
𝑁 𝐼

¯¯𝐴(𝑖),−1
𝐼𝐼

¯¯𝐵(𝑖),𝑇
𝐼

¯¯𝐵(𝑖),𝑇
𝐿

𝑖=1
𝑁𝑣(cid:213)

¯¯𝐵(𝑖)
𝑁

𝑖=1

¯ℛ𝑁 =

(ℛ(𝑖)
𝑁

− ¯¯𝐴(𝑖)
𝑁 𝐼

¯¯𝐴(𝑖),−1
𝐼𝐼

ℛ(𝑖)
𝐼

).

(5.31)

(5.32a)

(5.32b)

(5.32c)

The global system of equations for the edge Lagrange multipliers Define a Boolean

matrix ¯¯𝐵(𝑖)
𝐿

that maps the local Lagrange multipliers to a global vector. The Boolean matrix

is applied to (5.28) and discretized in time to define the equation

¯¯𝐾𝐼𝐼Λ𝐼 = ¯ℛ𝐼 − ¯¯𝐾𝐼𝑁

¯𝑁 𝑛+1
𝑁

where

¯¯𝐾𝐼𝐼 = 𝐼 +

𝑁𝑣(cid:213)

𝐵(𝑖)
𝐿

𝑖=1
¯¯𝑇𝑖𝑗(¯¯𝐼(𝑗) − 𝑀(𝑗)

𝑇

(cid:213)

𝑗∈𝜎(𝑖)

¯¯𝐵(𝑗)
𝐼

¯¯𝐴(𝑗),−1
𝐼𝐼

) ¯¯𝐵(𝑗),𝑇
𝐼

𝐵(𝑗),𝑇
𝐿

¯¯𝐾𝐼𝑁 = −

𝑁𝑣(cid:213)

(cid:213)

𝐵(𝑖)
𝐿

𝑇 𝑗𝑖 𝑀(𝑗)

𝑇

𝐵(𝑗)
𝐼

¯¯𝐴(𝑗),−1
𝐼𝐼

¯¯𝐴(𝑗)
𝐼𝑁

𝐵(𝑗),𝑇
𝑁

𝑖=1
𝑁𝑣(cid:213)

¯ℛ𝐼 = −

𝑗∈𝜎(𝑖)

(cid:213)

𝐵(𝑗)
𝐿

𝑇 𝑗𝑖

¯¯𝑀(𝑗)
𝑇

𝐵(𝑗)
𝐼

¯¯𝐴(𝑗),−1
𝐼𝐼

ℛ(𝑖)
𝐼

)

𝑖=1

𝑗∈𝜎(𝑖)

(5.33)

(5.34a)

(5.34b)

(5.34c)

Now, we can apply the port extraction method proposed earlier in the chapter to every

unknown in (5.31) and (5.33).

109

Figure 5.9: Complexity curves as a function of number of interface to volume unknowns
for different values of 𝑁𝑡. Curves in blue represent savings in relation to a traditional
EM-FEMPIC scheme and curves in red represent configurations where the extraction
method is less efficient.

Complexity Analysis

To understand the computational implications of using transient port extraction coupled

with a FETI based domain decomposition scheme, we first need to identify the costs

associated with different parts of the overall method. First, suppose there are 𝑁 unknowns

in the overall domain. By sectioning the domain into 𝑁𝑣 partitions, we obtain 𝑁𝑣 block
matrices that each have a solution complexity of (𝑁/𝑁𝑣)3 using naive matrix inversion.

Since mesh subsectioning tools in the literature traditionally follow a paradigm of each

subvolume having roughly the same number of unknowns, we can restrict our analysis

without loss of generality to subsectioning a cube into 𝑁 equal volumes. Upon doing so, it is
(cid:17)
fairly easy to demonstrate that the cost of the interface solution is (𝑁/𝑁 𝑓 )2 (cid:16)
where 𝑁 𝑓 refers to 𝑁 1/3. Since the port extraction only operates on the interface system,

3/𝑁 𝑓 − 3/𝑁 2
𝑓

,

110

Figure 5.10: 𝑟-𝑧 phase space distribution for the expanding particle beam compared
between a regular EM-FEMPIC method and the proposed extraction based method

and the block matrices are only computed once and stored, the overall complexity can

therefore be written as:

𝐶PEPIC = (𝑁/𝑁𝑣)3 + 𝑁𝑡(𝑁/𝑁 𝑓 )2 (cid:16)

3/𝑁 𝑓 − 3/𝑁 2
𝑓

(cid:17)

(5.35)

The trend of this complexity function is plotted in Fig. 5.9. We note as the number of

timesteps rises in relation to the number of unknowns in the system, the extraction method

gets progressively less viable as most of the cost is relegated to the interface problem.

However, with an unconditionally stable time-stepper as we have used throughout this

work 𝑁𝑡 ≤ 𝑁 is a fairly easy condition to meet for any representative problem.

Numerical PIC Example: Expanding Particle Beam

Consider a conducting cylindrical cavity (as we have seen in the previous chapters)

of length 10 cm and radius 2 cm. The walls of the cavity were assumed to be perfectly

conducting and a particle beam composed of electrons was initialized at 𝑧 = 0. The

111

Parameter
𝑞
𝑚
Beam Voltage
Beam Current
Δ𝑡
Δ𝑡,XOOPIC
𝑁𝑝/timestep

Value
1.6 × 10−19 C
9.1 × 10−31 kg
120 kV
10 A
10 ps
1 ps
50

Table 5.1: Parameters for the expanding beam run.

parameters of the simulation are as in Table. 5.1. The mesh used to discretize the system

comprised of 3229 tetrahedra (resulting in 9879 unknowns) had an average edge length

of 2.3 mm. The geometry was split into two volumes with the interface containing 37

unknowns. The extraction was done for 𝑁𝑡 = 2000 and we note from Fig. 5.10 that the

extraction method agrees closely with the standard EM-FEMPIC method. The discrepancy

in particle positions are primarily due to the individual particle orbits being chaotic and

they this diverge from a small initial offset exponentially fast. However, the overall physical

characteristics (beam radius, etc.) are very closely matched.

5.3 Conclusion

In this chapter, we introduced a transient parameter extraction method that allows for

seamless decoupling of a self-consistent electromagnetic problem from lumped nonlinear

devices attached at ports in the simulation domain. This method was further extended to be

applicable to an EM-FEMPIC system. We demonstrated through numerical examples good

agreement between the proposed method and their equivalent self-consistent simulation

method. The complexity of the method was also rigorously derived, with specific conditions

based on problem size and timesteps proposed.

112

CHAPTER 6

CONCLUSION

The development of finite element particle-in-cell methods has been a very active area

of research in recent years, bringing with it the ability to robustly discretize complex 3D

devices and allow access to more complex function spaces to represent the fields Through

a sequence of seminal papers, it is now possible to combine implicit FEM solvers with

arbitrary particle push routines while exactly conserving charge. The main objectives of

this work were to (1) make the implicit EM-FEMPIC method more applicable to large scale

problems, and (2) analyse potential means to ensure conservation of other fundamental

physical quantities.

We first presented a novel formulation in Chapter 2 that coupled an unconditionally

stable implicit FEM field solver to a particle push scheme that utilized a predictor-corrector

loop. Our method was shown to conserve charge to machine precision this paradigm and as

a result, made the method as a whole more applicable to highly relativistic problems where

the traditionally used explicit particle pusher can become unstable at the time-step sizes

afforded by the implicit field solver. Additionally, the energy conserving characteristics

of different particle push methods were examined, with the exponential method shown

to be non-dissipative under the numerical examples tested. However, it remains an open

problem at time of writing as to whether this behavior is simply due to enhanced accuracy,

or if the exponential stencil is fundamentally energy-conserving.

Next in Chapter 3, we borrowed the well-established idea of envelope tracking methods

from mainstream computational electromagnetics and extended it to be applicable to

particle-in-cell formulations. Breaking up the fields into a slowly varying envelope

modulated with a fast harmonic carrier was shown to have enormous consequences on

charge conservation. As a result, new formalism was derived to come up with guidelines

for choosing testing and representation functions under a general field breakup. The

113

proposed method was shown to achieve significant computational speedup for plasma

systems that were manipulated by high power RF sources, which are commonplace in

many relevant pulsed-power and accelerator systems.

In Chapter 4, we formalized the requirements for energy conservation in an implicit

EM-FEMPIC solver. In particular, we re-iterated the idea that having a non-dissipative

field solver attached to a non-dissipative particle solver is not sufficient to ensure energy

conservation. The mapping of currents from particle movements has to be done with care

in order to relate energies in the EM solver to the particle push. In our analysis we proposed

a velocity correction method that uses an implicit Tajima scheme as a post-processing

step after the standard EM-FEMPIC run. We show through our numerical results that

the resulting stencil removes any spurious energy gain from the particles and conserves

energy as well as the field solver without active sources.

Finally in Chapter 5, we proposed another approach to speed up an implicit EM-

FEMPIC solver by employing a transient parameter extraction method originally proposed

for electromagnetic system driven by lumped nonlinear devices. The port extraction

method was derived in detail and its complexity rigorously derived. However, since

the particle-in-cell method can impose nonlinearities everywhere in the mesh (since the

particles are allowed to move freely inside the domain), naive port extraction at every edge

is not computationally feasible. To solve this problem, we incorporated port extraction into

a domain decomposition algorithm and derived relations between the number of timesteps

and the number of degrees of freedom in the mesh where port extraction would yield

computational benefit.

Future Work

In totality, this work has made progress on addressing a number of open problems

in implicit EM-FEMPIC solver techniques. A number of interesting avenues of future

research remains open. First, the derived rubrics for energy conservation in Chapter 4

114

were strictly limited to a standard Maxwell solver. Extending this analyis to the envelope

tracking method proposed in Chapter 3 would be of great interest as the partcle update

currently used is known to not conserve energy over asymptotic timescales. Likewise,

a proof (or disproof) of energy conservation for the exponential particle update used in

chapter 2 is again an open problem.

The rubrics for energy conservation derived in chapter 5 is still bounded by oscillations

in the field solver, which conserves energy on average, but admits an error of 𝒪(Δ2

𝑡 ) at
any given timestep. Deriving a Newmark based method that either does not admit this

error, or relegates it to a higher power of Δ𝑡 would make the method even more precise.

Additionally, the quasi-Helmholtz scheme used only removed the effect of nullspace

excitations from the non-solenoidal components of the field. Spurious contributions to the

solenoidal fields can still have an impact on energy conservation. Thus, more work needs

to be done to analyse and mitigate these effects. Similar issues exist in the formulation of

the envelope tracking approach and is an interesting avenue of future work.

Finally, the domain-decomposition port extraction method derived in Chapter 5 was

fairly simplistic without much emphasis placed on the effects of matrix inverse storage and

assuming no preconditioning. With more efficient linear algebra modules, the complexity

bounds can be made even more robust and thus make the method applicable to a larger

range of problems.

115

BIBLIOGRAPHY

[1]

[2]

[3]

Richard Marchand. PTETRA, a tool to simulate low orbit satellite–plasma interaction.
IEEE Transactions on Plasma Science, 40(2):217–229, 2011.

RW Lemke, TC Genoni, and TA Spencer. Three-dimensional particle-in-cell simula-
tion study of a relativistic magnetron. Physics of Plasmas, 6(2):603–613, 1999.

E Fourkal, B Shahine, M Ding, JS Li, T Tajima, and C-M Ma. Particle in cell
simulation of laser-accelerated proton beams for radiation therapy. Medical Physics,
29(12):2788–2798, 2002.

[4] Dragica Vasileska and Stephen M. Goodnick. Computational Electronics. Springer

International Publishing, 2006.

[5]

Christian Klingenberg and Knut Waagan. Relaxation solvers for ideal mhd equations
-a review. Acta Mathematica Scientia, 30(2):621–632, 2010.

[6] Alexander A. Schekochihin. MHD turbulence: a biased review. Journal of Plasma
Physics, 88(5):155880501, October 2022. Publisher: Cambridge University Press.

[7]

[8]

[9]

[10]

[11]

Charles K Birdsall and A Bruce Langdon. Plasma physics via computer simulation. CRC
press, 2004.

Jay P Boris. Relativistic plasma simulation-optimization of a hybrid code. In Proc.
Fourth Conf. Num. Sim. Plasmas, pages 3–67, 1970.

Kane Yee. Numerical solution of initial boundary value problems involving maxwell’s
equations in isotropic media. IEEE Transactions on antennas and propagation, 14(3):302–
307, 1966.

John Villasenor and Oscar Buneman. Rigorous charge conservation for local electro-
magnetic field solvers. Computer Physics Communications, 69(2-3):306–316, 1992.

John P Verboncoeur. Particle simulation of plasmas: review and advances. Plasma
Physics and Controlled Fusion, 47(5A):A231, 2005.

[12] R. Holland. Pitfalls of staircase meshing.

Compatibility, 35(4):434–439, 1993.

IEEE Transactions on Electromagnetic

[13]

J.B. Schneider and K.L. Shlager. Fdtd simulations of tem horns and the implications for
staircased representations. IEEE Transactions on Antennas and Propagation, 45(12):1830–
1838, 1997.

[14] Saul Abarbanel, Adi Ditkowski, and Amir Yefet. Bounded error schemes for the
wave equation on complex domains. Journal of Scientific Computing, 26(1):67–81, Jan
2006.

[15] A. Akyurtlu, D.H. Werner, V. Veremey, D.J. Steich, and K. Aydin. Staircasing
errors in fdtd at an air-dielectric interface. IEEE Microwave and Guided Wave Letters,
9(11):444–446, 1999.

116

[16]

J.P. Verboncoeur. Aliasing of electromagnetic fields in stair step boundaries. Computer
Physics Communications, 164(1):344–352, 2004. Proceedings of the 18th International
Conferene on the Numerical Simulation of Plasmas.

[17] Korada Umashankar and Allen Taflove. A novel method to analyze electromagnetic
scattering of complex objects. IEEE Transactions on Electromagnetic Compatibility,
EMC-24(4):397–405, 1982.

[18] A. Taflove and K. Umashankar. A hybrid moment method/finite-difference time-
domain approach to electromagnetic coupling and aperture penetration into complex
geometries. IEEE Transactions on Antennas and Propagation, 30(4):617–627, 1982.
[19] Kenneth K. Mei, Andreas Cangellaris, and Diogenes J. Angelakos. Conformal time

domain finite difference method. Radio Science, 19(5):1145–1147, 1984.

[20] W.K. Gwarek. Analysis of an arbitrarily-shaped planar circuit a time-domain
approach. IEEE Transactions on Microwave Theory and Techniques, 33(10):1067–1072,
1985.

[21] G. Kriegsmann, A. Taflove, and K. Umashankar. A new formulation of electromag-
netic wave scattering using an on-surface radiation boundary condition approach.
IEEE Transactions on Antennas and Propagation, 35(2):153–161, 1987.

[22] T.G. Jurgens, A. Taflove, K. Umashankar, and T.G. Moore. Finite-difference time-
domain modeling of curved surfaces (em scattering). IEEE Transactions on Antennas
and Propagation, 40(4):357–366, 1992.

[23] T.G. Jurgens and A. Taflove. Three-dimensional contour fdtd modeling of scattering
from single and multiple bodies. IEEE Transactions on Antennas and Propagation,
41(12):1703–1708, 1993.

[24] Chris J. Railton, Ian James Craddock, and John B. Schneider. Improved locally
distorted cpfdtd algorithm with provable stability. Electronics Letters, 31:1585–1586,
1995.

[25] M.Celuch Marcysiak and Wojciech K. Gwarek. Higher-order modelling of media
interfaces for enhanced fdtd analysis of microwave circuits. In 1994 24th European
Microwave Conference, volume 2, pages 1530–1535, 1994.

[26] N. Kaneda, B. Houshmand, and T. Itoh. Fdtd analysis of dielectric resonators with
curved surfaces. IEEE Transactions on Microwave Theory and Techniques, 45(9):1645–1649,
1997.

[27] S. Dey, R. Mittra, and S. Chebolu. A technique for implementing the fdtd algorithm
on a nonorthogonal grid. Microwave and Optical Technology Letters, 14(4):213–215,
1997.

[28] S. Dey and R. Mittra. A locally conformal finite-difference time-domain (fdtd) algo-
rithm for modeling three-dimensional perfectly conducting objects. IEEE Microwave
and Guided Wave Letters, 7(9):273–275, 1997.

117

[29] S. Dey, R. Mittra, and S. Chebolu. A technique for implementing the fdtd algorithm
on a nonorthogonal grid. Microwave and Optical Technology Letters, 14(4):213–215,
1997.

[30] Supriyo Dey and Raj Mittra. A modified locally conformal finite-difference time-
domain algorithm for modeling three-dimensional perfectly conducting objects.
Microwave and Optical Technology Letters, 17(6):349–352, 1998.

[31] Supriyo Dey and R. Mittra. A conformal finite-difference time-domain technique for
modeling cylindrical dielectric resonators. IEEE Transactions on Microwave Theory and
Techniques, 47(9):1737–1739, 1999.

[32] W. Yu and R. Mittra. A conformal fdtd software package modeling antennas and
microstrip circuit components. IEEE Antennas and Propagation Magazine, 42(5):28–39,
2000.

[33]

[34]

I.A. Zagorodnov, R. Schuhmann, and T. Weiland. A uniformly stable conformal
fdtd-method in cartesian grids. International Journal of Numerical Modelling: Electronic
Networks, Devices and Fields, 16(2):127–141, 2003.

Igor Zagorodnov, Rolf Schuhmann, and Thomas Weiland. Conformal fdtd-methods to
avoid time step reduction with and without cell enlargement. Journal of Computational
Physics, 225(2):1493–1507, 2007.

[35] Tian Xiao and Q.H. Liu. Enlarged cells for the conformal fdtd method to avoid the
time step reduction. IEEE Microwave and Wireless Components Letters, 14(12):551–553,
2004.

[36] Tian Xiao and Qing Huo Liu. A 3-d enlarged cell technique (ect) for the conformal

fdtd method. IEEE Transactions on Antennas and Propagation, 56(3):765–773, 2008.

[37] Collin S Meierbachtol, Andrew D Greenwood, John P Verboncoeur, and Balasub-
ramaniam Shanker. Conformal electromagnetic particle in cell: A review. IEEE
Transactions on Plasma Science, 43(11):3778–3793, 2015.

[38] Niel K. Madsen and Richard W. Ziolkowski. A three-dimensional modified finite
volume technique for maxwell’s equations. Electromagnetics, 10(1-2):147–161, 1990.

[39] Vĳaya Shankar, Alireza H. Mohammadian, and William F. Hall. A time-domain,
finite-volume treatment for the maxwell equations. Electromagnetics, 10(1-2):127–145,
1990.

[40] Alireza H. Mohammadian, Vĳaya Shankar, and William F. Hall. Computation
of electromagnetic scattering and radiation using a time-domain finite-volume
discretization procedure. Computer Physics Communications, 68(1):175–196, 1991.

[41] R. Holland, V.P. Cable, and L.C. Wilson. Finite-volume time-domain (fvtd) techniques
for em scattering. IEEE Transactions on Electromagnetic Compatibility, 33(4):281–294,
1991.

118

[42] S.D. Gedney and F. Lansing. A parallel discrete surface integral equation method for
the analysis of three-dimensional microwave circuit devices with planar symmetry.
In Proceedings of IEEE Antennas and Propagation Society International Symposium and
URSI National Radio Science Meeting, volume 3, pages 1778–1781 vol.3, 1994.
[43] Niel K. Madsen. Divergence preserving discrete surface integral methods for
Journal of

maxwell’s curl equations using non-orthogonal unstructured grids.
Computational Physics, 119(1):34–45, 1995.

[44] François Hermeline. A finite volume method for solving maxwell equations in inho-
mogeneous media on arbitrary meshes. Comptes Rendus Mathematique, 339(12):893–
898, 2004.

[45] Masami Matsumoto and Shigeo Kawata. Tripic: Triangular-mesh particle-in-cell

code. Journal of Computational Physics, 87(2):488–493, 1990.

[46] Alan M. Winslow. Numerical solution of the quasilinear poisson equation in a

nonuniform triangle mesh. Journal of Computational Physics, 135(2):128–138, 1997.

[47] G. Lapenta, F. Iinoya, and J.U. Brackbill. Particle-in-cell simulation of glow discharges
in complex geometries. IEEE Transactions on Plasma Science, 23(4):769–779, 1995.
[48] C-D Munz, Pascal Omnes, Rudolf Schneider, Eric Sonnendrücker, and Ursula Voss.
Divergence correction techniques for maxwell solvers based on a hyperbolic model.
Journal of Computational Physics, 161(2):484–511, 2000.

[49] C.-D. Munz, P. Omnes, R. Schneider, E. Sonnendrucker, E. Stein, U. Voss, and
T. Westermann. Kad12d-a particle-in-cell code based on finite-volume methods. In
12th International Conference on High-Power Particle Beams. BEAMS’98. Proceedings (Cat.
No.98EX103), volume 1, pages 541–544 vol.1, 1998.

[50] Claus-Dieter Munz, Rudolf Schneider, Eric Sonnendrücker, E. Stein, Ursula Voß, and
Thomas Westermann. A finite-volume particle-in-cell method for the numerical
treatment of maxwell-lorentz equations on boundary-fitted meshes. International
Journal for Numerical Methods in Engineering, 44:461–487, 1999.

[51] Nikolaos Gatsonis and Anton Spirkin. Unstructured 3D Simulations of Field Emission

Array Cathodes for Micropropulsion Applications. 2002.

[52] Anton Spirkin and Nikolaos Gatsonis. Unstructured 3D PIC Simulation of Plasma Flow

in a Segmented Microchannel. 2003.

[53] Anton Spirkin and Nikolaos A. Gatsonis. Unstructured 3d pic simulations of the flow
in a retarding potential analyzer. Computer Physics Communications, 164(1):383–389,
2004. Proceedings of the 18th International Conferene on the Numerical Simulation
of Plasmas.

[54]

J.S Hesthaven and T Warburton. Nodal high-order methods on unstructured grids:
I. time-domain solution of maxwell’s equations. Journal of Computational Physics,
181(1):186–221, 2002.

119

[55] Masoud Movahhedi, Abdolali Abdipour, Alexandre Nentchev, Mehdi Dehghan,
and Siegfried Selberherr. Alternating-direction implicit formulation of the finite-
element time-domain method. IEEE Transactions on Microwave Theory and Techniques,
55(6):1322–1331, 2007.

[56] Serge Piperno and Loula Fatima Fezoui. A centered Discontinuous Galerkin Finite
Volume scheme for the 3D heterogeneous Maxwell equations on unstructured meshes.
Technical Report RR-4733, INRIA, February 2003.

[57] Arne Taube, Michael Dumbser, Claus-Dieter Munz, and Rudolf Schneider. A high-
order discontinuous galerkin method with time-accurate local time stepping for the
maxwell equations. International Journal of Numerical Modelling: Electronic Networks,
Devices and Fields, 22(1):77–103, 2009.

[58] G.B. Jacobs and J.S. Hesthaven. High-order nodal discontinuous galerkin particle-in-
cell method on unstructured grids. Journal of Computational Physics, 214(1):96–121,
2006.

[59] Gustaaf Jacobs, Jan Hesthaven, and Giovanni Lapenta. Simulations of the Weibel
Instability with a High-Order Discontinous Galerkin Particle-In-Cell Solver. 2006.

[60] G.B. Jacobs and J.S. Hesthaven. Implicit–explicit time integration of a high-order
particle-in-cell method with hyperbolic divergence cleaning. Computer Physics
Communications, 180(10):1760–1767, 2009.

[61] Ross Evan Heath. Analysis of the discontinuous Galerkin method applied to collisionless

plasma physics. PhD thesis, University of Texas, Austin, January 2007.

[62]

James A. Rossmanith and David C. Seal. A positivity-preserving high-order semi-
lagrangian discontinuous galerkin scheme for the vlasov–poisson equations. Journal
of Computational Physics, 230(16):6203–6232, 2011.

[63] Blanca Ayuso, José A. Carrillo, and Chi-Wang Shu. Discontinuous galerkin methods
for the one-dimensional vlasov-poisson system. Kinetic and Related Models, 4(4):955–
989, 2011.

[64] Yingda Cheng, Irene M. Gamba, Fengyan Li, and Philip J. Morrison. Discontinuous
galerkin methods for the vlasov–maxwell equations. SIAM Journal on Numerical
Analysis, 52(2):1017–1049, 2014.

[65] Yingda Cheng, Andrew J. Christlieb, and Xinghui Zhong. Energy-conserving dis-
continuous galerkin methods for the vlasov–ampère system. Journal of Computational
Physics, 256:630–655, 2014.

[66]

Jian-Ming Jin. The finite element method in electromagnetics. John Wiley & Sons, 2015.

[67] Martin Campos Pinto, Sébastien Jund, Stéphanie Salmon, and Eric Sonnendrücker.
Charge-conserving fem–pic schemes on general grids. Comptes Rendus Mecanique,
342(10-11):570–582, 2014.

120

[68] Haksu Moon, Fernando L Teixeira, and Yuri A Omelchenko. Exact charge-conserving
scatter–gather algorithm for particle-in-cell simulations on unstructured grids: A
geometric perspective. Computer Physics Communications, 194:43–53, 2015.
Jonathan Squire, Hong Qin, and William M Tang. Geometric integration of the
vlasov-maxwell system with a variational particle-in-cell scheme. Physics of Plasmas,
19(8):084501, 2012.

[69]

[70] Haksu Moon, Fernando L Teixeira, and Yuri A Omelchenko. Exact charge-conserving
scatter–gather algorithm for particle-in-cell simulations on unstructured grids: A
geometric perspective. Computer Physics Communications, 194:43–53, 2015.

[71] S O’Connor, ZD Crawford, J Verboncoeur, J Lugisland, and B Shanker. A set
of benchmark tests for validation of 3d particle in cell methods. arXiv preprint
arXiv:2101.09299, 2021.

[72] Michael Kraus, Katharina Kormann, Philip J Morrison, and Eric Sonnendrücker.
Gempic: geometric electromagnetic particle-in-cell methods. Journal of Plasma Physics,
83(4), 2017.

[73] XIAO Jianyuan, QIN Hong, and LIU Jian. Structure-preserving geometric particle-in-
cell methods for vlasov-maxwell systems. Plasma Science and Technology, 20(11):110501,
2018.

[74] Olgierd Cecil Zienkiewicz. A new look at the newmark, houbolt and other time
stepping formulas. a weighted residual approach. Earthquake Engineering & Structural
Dynamics, 5(4):413–418, 1977.

[75] Zane D Crawford, Scott O’Connor, John Luginsland, and B Shanker. Rubrics for
charge conserving current mapping in finite element particle in cell methods. arXiv
preprint arXiv:2101.12128, 2021.

[76] Scott O’Connor, Zane D. Crawford, O. H. Ramachandran, John Luginsland, and
B. Shanker. Time integrator agnostic charge conserving finite element pic. Physics of
Plasmas, 28(9):092111, 2021.

[77] Scott O’Connor, Zane D. Crawford, O.H. Ramachandran, John Luginsland, and
B. Shanker. Quasi-helmholtz decomposition, gauss’ laws and charge conservation
for finite element particle-in-cell. Computer Physics Communications, 276:108345, 2022.
[78] Omkar H. Ramachandran, Zane D. Crawford, Scott O’Connor, John Luginsland, and

B. Shanker. An envelope tracking approach for particle in cell simulations, 2022.

[79] Zane D. Crawford, O. H. Ramachandran, Scott O’Connor, John Luginsland, and
B. Shanker. Higher order charge conserving electromagnetic finite element particle
in cell method, 2021.

[80] Zane D. Crawford, O. H. Ramachandran, Scott O’Connor, Daniel L. Dault, John
Luginsland, and B. Shanker. Domain decomposition framework for maxwell finite
element solvers and application to pic, 2022.

121

[81] Omkar H. Ramachandran, Leo C. Kempel, John Luginsland, and B. Shanker. A
charge conserving exponential predictor corrector fempic formulation for relativistic
particle simulations, 2022.

[82] A. Bossavit. Whitney forms: a class of finite elements for three-dimensional compu-
tations in electromagnetism. IEE Proceedings A - Physical Science, Measurement and
Instrumentation, Management and Education - Reviews, 135(8):493–500, Nov 1988.

[83] Zane Crawford, Jie Li, Andrew Christlieb, and Balasubramaniam Shanker. Uncon-
ditionally stable time stepping method for mixed finite element maxwell solvers.
Progress In Electromagnetics Research, 103:17–30, 2020.

[84] Man-Fai Wong, O. Picon, and V. Fouad Hanna. A finite element method based on
whitney forms to solve maxwell equations in the time domain. IEEE Transactions on
Magnetics, 31(3):1618–1621, May 1995.

[85] Bo He and F.L. Teixeira. Geometric finite element discretization of maxwell equations

in primal and dual spaces. Physics Letters A, 349(1–4):1 – 14, 2006.

[86] A. Bossavit and L. Kettunen. Yee-like schemes on a tetrahedral mesh, with diagonal
lumping. International Journal of Numerical Modelling: Electronic Networks, Devices and
Fields, 12(1-2):129–142, 1999.

[87] Scott O’Connor, Zane D Crawford, OH Ramachandran, John Luginsland, and
B Shanker. Quasi-helmholtz decomposition, gauss’ laws and charge conservation for
finite element particle-in-cell. arXiv preprint arXiv:2103.06737, 2021.

[88]

J. P. Boris. Relativistic plasma simulation-optimization of a hybrid code. Proceeding
of Fourth Conference on Numerical Simulations of Plasmas, November 1970.

[89] Hong Qin, Shuangxi Zhang, Jianyuan Xiao, Jian Liu, Yajuan Sun, and William M.
Tang. Why is boris algorithm so good? Physics of Plasmas, 20(8):084503, 2013.

[90] Seĳi Zenitani and Takayuki Umeda. On the boris solver in particle-in-cell simulation.

Physics of Plasmas, 25(11):112110, 2018.

[91] Fabio Bacchini, Jorge Amaya, and Giovanni Lapenta. The relativistic implicit particle-
in-cell method. Journal of Physics: Conference Series, 1225(1):012011, may 2019.

[92] Stefano Markidis and Giovanni Lapenta. The energy conserving particle-in-cell

method. Journal of Computational Physics, 230(18):7037–7052, 2011.

[93] Stefano Markidis. Development of Implicit Kinetic Simulation Methods, and their Ap-
plication to Ion Beam Propagation in Current and Future Neutralized Drift Compression
Experiments. ProQuest Dissertations Publishing, 2010.

[94] Andreas Glaser and Vladimir Rokhlin. A new class of highly accurate solvers for
ordinary differential equations. Journal of Scientific Computing, 38:368–399, 2009.

122

[95] Rui Wang, Douglas J Riley, and Jian-Ming Jin. Application of tree-cotree splitting
to the time-domain finite-element analysis of electromagnetic problems.
IEEE
transactions on antennas and propagation, 58(5):1590–1600, 2010.

[96] Neelakantam V Venkatarayalu and Jin-Fa Lee. Removal of spurious dc modes in
edge element solutions for modeling three-dimensional resonators. IEEE transactions
on microwave theory and techniques, 54(7):3019–3025, 2006.

[97] L. D. Landau and E. M. Lifshitz. The Classical Theory of Fields. Butterworth-Heinemann,

January 1980.

[98]

J.P. Verboncoeur, A.B. Langdon, and N.T. Gladd. An object-oriented electromagnetic
pic code. Computer Physics Communications, 87(1):199–211, 1995. Particle Simulation
Methods.

[99] Chung-Wen Ho, A. Ruehli, and P. Brennan. The modified nodal approach to network

analysis. IEEE Transactions on Circuits and Systems, 22(6):504–509, 1975.

[100] C.B. Wilsen, J.W. Luginsland, Yue Ying Lau, T.M. Antonsen, D.P. Chernin, P.M. Tchou,
M.W. Keyser, R.M. Gilgenbach, and L.D. Ludeking. A simulation study of beam
loading on a cavity. IEEE Transactions on Plasma Science, 30(3):1160–1168, 2002.

[101] Jianguo Wang, Zaigao Chen, Yue Wang, Dianhui Zhang, Chunliang Liu, Yongdong
Li, Hongguang Wang, Hailiang Qiao, Meiyan Fu, and Yuan Yuan. Three-dimensional
parallel unipic-3d code for simulations of high-power microwave devices. Physics of
Plasmas, 17(7):073107, 2010.

[102] J.P. Verboncoeur, A.B. Langdon, and N.T. Gladd. An object-oriented electromagnetic
pic code. Computer Physics Communications, 87(1):199–211, 1995. Particle Simulation
Methods.

[103] Nathan M Newmark. A method of computation for structural dynamics. Journal of

the engineering mechanics division, 85(3):67–94, 1959.

[104] K. Li, A. M. Tassoudji, S. Y. Poh, M. Tsuk, R. T. Shin, and J. A. Kong. Fd-td analysis
of electromagnetic radiation from modules-on-backplane configurations.
IEEE
Transactions on Electromagnetic Compatibility, 37(3):326–332, 1995.

[105] Er-Ping Li. Electrical Modeling and Design for 3D System Integration. John Wiley &

Sons, Ltd, 2012.

[106] Yui-Sheng Tsuei, A. C. Cangellaris, and J. L. Prince. Rigorous electromagnetic
IEEE Transactions on

modeling of chip-to-package (first-level) interconnections.
Components, Hybrids, and Manufacturing Technology, 16(8):876–883, 1993.

[107] K. Aygun, B. C. Fischer, Jun Meng, B. Shanker, and E. Michielssen. A fast hybrid
field-circuit simulator for transient analysis of microwave circuits. IEEE Transactions
on Microwave Theory and Techniques, 52(2):573–583, 2004.

123

[108] A. E. Yilmaz, Jian-Ming Jin, and E. Michielssen. A parallel fft accelerated tran-
sient field-circuit simulator. IEEE Transactions on Microwave Theory and Techniques,
53(9):2851–2865, 2005.

[109] R. Wang and J. Jin. A symmetric electromagnetic-circuit simulator based on the
extended time-domain finite element method. IEEE Transactions on Microwave Theory
and Techniques, 56(12):2875–2884, 2008.

[110] S. O’Connor, S. Hughey, D. Dault, A. J. Pray, J. M. Villa-Giron, and B. Shanker.
A novel port/network parameter extraction technique for coupling circuits with
full-wave time-domain integral equation solvers. IEEE Transactions on Microwave
Theory and Techniques, 67(2):553–564, 2019.

[111] Omkar H. Ramachandran, Scott O’Connor, Zane D. Crawford, Leo C. Kempel, and
Balasubramaniam Shanker. Port parameter extraction-based self-consistent coupled
em-circuit fem solvers. IEEE Transactions on Components, Packaging and Manufacturing
Technology, 12:1040–1048, 2021.

[112] Jianming Jin and Douglas J. Riley. Finite Element Analysis of Antennas and Arrays.

Wiley-IEEE Press, 2009.

[113] W.H. Kao, Chi-Yuan Lo, M. Basel, and R. Singh. Parasitic extraction: current state of

the art and future trends. Proceedings of the IEEE, 89(5):729–739, 2001.

[114] G. Antonini, A.C. Scogna, A. Orlandi, V. Ricchiuti, G. Selli, S. Luan, and J.L. Drewniak.
Validation of circuit extraction procedure by means of frequency and time domain
measurement. In 2005 International Symposium on Electromagnetic Compatibility, 2005.
EMC 2005., volume 1, pages 45–50, 2005.

[115] Yong Wang, D. Gope, V. Jandhyala, and C. . R. Shi. Generalized kirchoff’s current
and voltage law formulation for coupled circuit-electromagnetic simulation with
surface integral equations. IEEE Transactions on Microwave Theory and Techniques,
52(7):1673–1682, 2004.

[116] Paule Charland, Ellen El-Khatib, and John Wolters. The use of deconvolution and
total least squares in recovering a radiation detector line spread function. Medical
Physics, 25(2):152–160, 1998.

[117] Yan-Lin Li, Sheng Sun, Qi I Dai, and Weng Cho Chew. Finite element implementation
of the generalized-lorenz gauged a-\phi formulation for low-frequency circuit
modeling. IEEE Transactions on Antennas and Propagation, 64(10):4355–4364, 2016.

[118] Zane Crawford, Jie Li, Andrew Christlieb, and Balasubramaniam Shanker. Uncon-
ditionally stable time stepping method for mixed finite element maxwell solvers.
Progress In Electromagnetics Research C, 103:17–30, 01 2020.

[119] O. C. Zienkiewicz. A new look at the newmark, houbolt and other time stepping
formulas. a weighted residual approach. Earthquake Engineering & Structural Dynamics,
5(4):413–418, 1977.

124

[120] Jean-Pierre Berenger. A perfectly matched layer for the absorption of electromagnetic

waves. Journal of Computational Physics, 114(2):185–200, 1994.

[121] T. Rylander and Jian-Ming Jin. Perfectly matched layer in three dimensions for the
time-domain finite element method applied to radiation problems. IEEE Transactions
on Antennas and Propagation, 53(4):1489–1499, 2005.

[122] D.L. Scharfetter and H.K. Gummel. Large-signal analysis of a silicon read diode

oscillator. IEEE Transactions on Electron Devices, 16(1):64–77, 1969.

[123] M. Kurata. Design considerations of step recovery diodes with the aid of numerical
large-signal analysis. IEEE Transactions on Electron Devices, 19(11):1207–1215, 1972.

[124] Jun-quan Chen, Xing Chen, Chang-Jun Liu, Kama Huang, and Xiao-Bang Xu.
Analysis of temperature effect on p-i-n diode circuits by a multiphysics and circuit
cosimulation algorithm. IEEE Transactions on Electron Devices, 59(11):3069–3077, 2012.

[125] Hongzheng Zeng, Yuzhu Tang, Xin Duan, and Xing Chen. A physical model-based
FDTD field-circuit co-simulation method for schottky diode rectifiers. IEEE Access,
7:87265–87272, 2019.

[126] Charles K Birdsall and A Bruce Langdon. Plasma physics via computer simulation. CRC

press, 2018.

[127] S Laha, P Gupta, CE Simien, H Gao, J Castro, T Pohl, and TC Killian. Experimental
realization of an exact solution to the vlasov equations for an expanding plasma.
Physical review letters, 99(15):155001, 2007.

[128] Johan Carlsson, Alexander Khrabrov, Igor Kaganovich, Timothy Sommerer, and
David Keating. Validation and benchmarking of two particle-in-cell codes for a glow
discharge. Plasma Sources Science and Technology, 26(1):014003, 2016.

[129] M Surendra. Radiofrequency discharge benchmark model comparison. Plasma

Sources Science and Technology, 4(1):56, 1995.

[130] Miles M Turner, Aranka Derzsi, Zoltan Donko, Denis Eremin, Sean J Kelly, Trevor
Lafleur, and Thomas Mussenbrock. Simulation benchmarks for low-pressure plasmas:
Capacitive discharges. Physics of Plasmas, 20(1):013507, 2013.

[131] Miles M Turner. Verification of particle-in-cell simulations with monte carlo collisions.

Plasma Sources Science and Technology, 25(5):054007, 2016.

[132] Fabio Riva, Carrie F Beadle, and Paolo Ricci. A methodology for the rigorous
verification of particle-in-cell simulations. Physics of Plasmas, 24(5):055703, 2017.

[133] Su Yan and Jian-Ming Jin. A continuity-preserving and divergence-cleaning algorithm
based on purely and damped hyperbolic maxwell equations in inhomogeneous media.
Journal of computational physics, 334:392–418, 2017.

125

[134] Martin Reiser and Patrick O’Shea. Theory and design of charged particle beams, volume

312. Wiley Online Library, 1994.

[135] VF Kovalev and V Yu Bychenkov. Analytic solutions to the vlasov equations for

expanding plasmas. Physical review letters, 90(18):185004, 2003.

[136] Peter Monk. Finite element methods for Maxwell’s equations. Oxford University Press,

2003.

[137] David Smithe, Peter Stoltz, John Loverich, Chet Nieter, and Seth Veitzer. Development
and application of particle emission algorithms from cut-cell boundaries in the vorpal
em-fdtd-pic simulation tool. In 2008 IEEE International Vacuum Electronics Conference,
pages 217–218. IEEE, 2008.

[138] Chet Nieter, John R Cary, Gregory R Werner, David N Smithe, and Peter H Stoltz.
Application of dey–mittra conformal boundary algorithm to 3d electromagnetic
modeling. Journal of Computational Physics, 228(21):7902–7916, 2009.

[139] Alexander S Glasser and Hong Qin. The geometric theory of charge conservation in

particle-in-cell simulations. arXiv preprint arXiv:1910.12395, 2019.

[140] Guangye Chen, Luis Chacón, and Daniel C Barnes. An energy-and charge-conserving,
Journal of Computational Physics,

implicit, electrostatic particle-in-cell algorithm.
230(18):7018–7036, 2011.

[141] T Zh Esirkepov. Exact charge conservation scheme for particle-in-cell simulation
with an arbitrary form-factor. Computer Physics Communications, 135(2):144–153, 2001.

[142] Katharina Kormann and Eric Sonnendrücker. Energy-conserving time propaga-
tion for a structure-preserving particle-in-cell vlasov–maxwell solver. Journal of
Computational Physics, 425:109890, 2021.

[143] Jianyuan Xiao and Hong Qin. Explicit structure-preserving geometric particle-in-cell
algorithm in curvilinear orthogonal coordinate systems and its applications to whole-
device 6d kinetic simulations of tokamak physics. Plasma Science and Technology,
2021.

[144] Annalisa Buffa, Giancarlo Sangalli, and Rafael Vázquez. Isogeometric analysis in
electromagnetics: B-splines approximation. Computer Methods in Applied Mechanics
and Engineering, 199(17-20):1143–1152, 2010.

[145] Roberto D Graglia, Donald R Wilton, and Andrew F Peterson. Higher order
interpolatory vector bases for computational electromagnetics. IEEE transactions on
antennas and propagation, 45(3):329–342, 1997.

[146] Roberto D Graglia and Andrew F Peterson. Hierarchical divergence-conforming
nédélec elements for volumetric cells. IEEE transactions on antennas and propagation,
60(11):5215–5227, 2012.

126

[147] Ralf Hiptmair. Higher order whitney forms. Progress in Electromagnetics Research,

32:271–299, 2001.

[148] Robert J. Barker and Edl Schamiloglu. High-power microwave sources and technolo-

gies, 2001.

[149] J.J. Petillo, E.M. Nelson, J.F. DeFord, N.J. Dionne, and B. Levush. Recent developments
to the michelle 2-d/3-d electron gun and collector modeling code. IEEE Transactions
on Electron Devices, 52(5):742–748, 2005.

[150] A. Cangellaris, Chung-Chi Lin, and Kenneth Mei. Point-matched time domain finite
element methods for electromagnetic radiation and scattering. IEEE Transactions on
Antennas and Propagation, 35(10):1160–1173, 1987.

[151] A. E. Yilmaz, Jian-Ming Jin, and E. Michielssen. A parallel fft accelerated tran-
sient field-circuit simulator. IEEE Transactions on Microwave Theory and Techniques,
53(9):2851–2865, 2005.

[152] B. Toland, J. Lin, B. Houshmand, and T. Itoh. Fdtd analysis of an active antenna.

IEEE Microwave and Guided Wave Letters, 3(11):423–425, 1993.

[153] V. A. Thomas, M. E. Jones, M. Piket-May, A. Taflove, and E. Harrigan. The use of
spice lumped circuits as sub-grid models for fdtd analysis. IEEE Microwave and
Guided Wave Letters, 4(5):141–143, 1994.

[154] P. Ciampolini, P. Mezzanotte, L. Roselli, and R. Sorrentino. Accurate and efficient cir-
cuit simulation with lumped-element fdtd technique. IEEE Transactions on Microwave
Theory and Techniques, 44(12):2207–2215, 1996.

[155] Chien-Nan Kuo, B. Houshmand, and T. Itoh. Full-wave analysis of packaged
microwave circuits with active and nonlinear devices: an fdtd approach. IEEE
Transactions on Microwave Theory and Techniques, 45(5):819–826, 1997.

[156] P. Mezzanotte, M. Mongiardo, L. Roselli, R. Sorrentino, and W. Heinrich. Analysis
of packaged microwave integrated circuits by fdtd. IEEE Transactions on Microwave
Theory and Techniques, 42(9):1796–1801, 1994.

[157] Yong Wang, D. Gope, V. Jandhyala, and C. . R. Shi. Generalized kirchoff’s current
and voltage law formulation for coupled circuit-electromagnetic simulation with
surface integral equations. IEEE Transactions on Microwave Theory and Techniques,
52(7):1673–1682, 2004.

[158] Chien-Nan Kuo, V. A. Thomas, Siou Teck Chew, B. Houshmand, and T. Itoh. Small
signal analysis of active circuits using fdtd algorithm. IEEE Microwave and Guided
Wave Letters, 5(7):216–218, 1995.

[159] Garry Rodrigue and Daniel White. A vector finite element time-domain method
for solving maxwell’s equations on unstructured hexahedral grids. SIAM Journal on
Scientific Computing, 23(3):683–706, 2001.

127

[160] Zheng Lou and Jian-Ming Jin. Time-domain finite-element modeling and simulation
of broadband antennas. In 2006 IEEE Antennas and Propagation Society International
Symposium, pages 2809–2812, 2006.

[161] R. Subramanian, M. Swaminathan, and M. Behar. Parasitic extraction and perfor-
mance evaluation of a high i/o package. In Proceedings of 2nd Electronics Packaging
Technology Conference (Cat. No.98EX235), pages 99–106, 1998.

[162] Jianyuan XIAO, Hong QIN, and Jian LIU. Structure-preserving geometric particle-in-
cell methods for vlasov-maxwell systems. Plasma Science and Technology, 20(11):110501,
sep 2018.

[163] Martin Campos Pinto, Katharina Kormann, and Eric Sonnendrücker. Variational
framework for structure-preserving electromagnetic particle-in-cell methods. Journal
of Scientific Computing, 91(2):46, Mar 2022.

128