%.
..!.
31...:
I. .55..

‘1 i-§.o“u

:mLII .
1: Law:
2‘

1...;
.25.!

.75}...
.. :3: .c.

:
Iii-{3.5!
0.8.1.: :a

 

. ,
-i .
alnﬂku.

a

i l
’l’ A“;
‘ﬁﬁi‘ 5

.‘9

 

 

waists

_”

lllllllllU1llllll111151llllllllllllllllllllllllllllHUll

3 1293 01028 936

 

LIBRARY
Michigan State
Unlverslty

 

 

 

This is to certify that the

dissertation entitled

303V SIMULATION OF
ELECTRON CYCLOTRON RESONANCE PLASMAS

presented by

Venkatesh P. Gopinath

has been accepted towards fulﬁllment
of the requirements for

Ph. D. degree in Electrical Engineering

 

Mm
Ma' r professor

Date 8/15/94

MS U is an Afﬁrmative Action/Equal Opportunity Institution 0-12771

 

PLACE ll RETURN BOX to remove We checkout from your record.
TO AVOID FINES return on or before dete due.

DATE DUE DATE DUE DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

MSU II An Affirmative Adlai/Emil Opportunity Inﬂation

3D3V Simulation of Electron Cyclotron Resonance Plasmas

By

Venkatesh P. Goplnath

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY

Department of Electrical Engineering

1 994

Abstract

3D3V SIMULATION OF ELECTRON CYCLOTRON RESONANCE PLASMAS
By

Venkatesh P. Gopinath

The two and three-dimensional simulation of plasma machines used for
materials processing has been investigated. The issues arising when doing self-
consistent electromagnetic ﬁeld/ particle type simulations include numerical
instabilities, large computer memory requirements and long computer
simulation times No simulation models were developed to investigate these
issues.

For the ﬁrst model. a self-consistent 3D3V plasma-EM numerical model was
developed to simulate discharges inside a compact microwave plasma source.
The model coupled an electromagnetic field model and a particle plasma model
to simulate plasma production and heating in low pressure ECR plasmas. A grid
system was developed to improve numerical stability in cylindrical coordinate
system structures. The input parameters to the simulation included operating
pressure. gas type and magnetic ﬁeld pattern. Characteristics of the plasma
simulated included plasma temperature. charge density. plasma potential,
radial electric ﬁeld distribution near the walls and power absorption patterns.

For the second model, a 2D3V simulation of the downstream region of
plasma processing machines was conducted assuming 4) symmetry. The plasma

source was assumed to provide a constant or spatially variable ﬂux of particles

to the downstream region. The plasma potential and the charge densities across
the simulation volume were resolved. A non-uniform particle weighting scheme
was implemented to aid stability at the grid points close to the origin.

Simulation models were redesigned to achieve some measure of parallelism.
This was done by observing that the movement of particles in particle type
plasma simulations are essentially independent of each other once the governing
forces have been self-consistently determined. Implementations of 203V and
3D3V plasma simulations on a MasPar MP2 massively parallel processor were
achieved. Speed-up of over 13 times the fastest locally available serial machine
was achieved demonstrating the promise of parallel simulations. Further, the
simulation of problem sizes beyond the computing and storage capability of
desktop workstations was also demonstrated.

The concept of parallel simulations was modiﬁed to spread the computation
load across a set of networked desktop workstation machines in a distributed
manner. A distributed simulation using PVM was designed to be run across 2, 4
and 8 workstations. The two processor implementation was shown to provide
considerable speed up over a single desktop case. However. communication
bottlenecks reduced the computational speed-up considerably for the four

processor implementation.

This dissertation is dedicated to my parents
Smt. Ramamani and Sri P.S. Gopinath
for their understanding and support.

ACKNOWLEDGEMENTS

I would like to acknowledge my advisor Dr. Timothy Grotjohn for his support
and guidance throughout this dissertation. I also thank my committee members
Drs. Asmussen. Ledford and Reinhard for all their encouragement and for
serving on the committee. I would like to thank Dr. Diane Rover for her help in
conducting some advanced computations. Last, but in no way the least, I would

like to thank my dear wife. Shoba, for all her love and support.

Table of Contents

L15 1: of Tables ........................................... ix
1.18 t. of Figures .......................................... x
Chapter 1 Introduction ................................. 1
1.1 Motivation of 3D3V Simulations ....................... 1

1.2 Research Goals ................................... 5

1.3 Dissertation Outline ................................ 6
C31).&Elpter 2 Background .................................. 9
2.1 Plasma Description and Fundamental Equations .......... 9

2.2 Numerical Modeling of Plasmas ....................... 12

2.2. 1 Particle Solution Techniques ....................... 12

2.2. l .1 Overall Algorithm of Particle Simulation ........... 13

2.2.1 .2 Particle Movement ............................ 15

2.2.1.3 Collision Modeling using Monte Carlo Techniques . . . . 17

2.2.2 Fluid Models ................................... 22

2.2.3 Hybrid Models ................................. 24

2.2.4 Poisson’s and Maxwell's Equations Solution ........... 25

2.2.5 Geometrical Considerations ....................... 31

2.3 Analytical Modeling of Plasmas ....................... 35

2.3. l Schottky Diffusion Model ......................... 37

2.3.2 Free-Fall Diﬂusion Model ......................... 38

2.3.3 Ambipolar Diffusion Model ........................ 40

2.3.4 Limits of Analytical Models ........................ 42

2.4 Vector. Parallel and Distributed Processing Methods ....... 43

2.5 Review of Previous ECR Plasma Modeling Work ........... 47

Chapter 3 Description of Microwave Plasma Sources ........... 50

3.1 Introduction ...................................... 50
3.1.1 Downstream Plasma Chamber ..................... 50
3.1.2 Experimental Methods ........................... 53

3.1.2.1 Sample Preparation and Processing ............... 53
3.1.3 Experimental Results of Uniformity and Sputter Rate . . . . 53
3.1.4 Summary of Downstream Measurements ............. 65

3.2 Compact Ion Source ................................ 66
Chapter 4 A 3D3V Plasma Simulation System ................ 69
4.1 Introduction ...................................... 69

4.2 Design of the Plasma Simulation ...................... 69

4.2. 1 A Non-Terminated Grid Design ..................... 70

4.2.2 Poisson’s Equation Solution ....................... 74

4.2.3 Plasma Kinetic Model ............................ 76

4.2.3.1 Particle-Neutral Collisions ...................... 76

4.2.3.2 Charged Particle-Particle Collisions ............... 78

4.2.3.3 Electron-Neutral Ionizing Collisions ............... 79
4.2.4 Maxwell’s Equation Solution ....................... 81
4.2.5 Self-Consistent 3D3V Plasma Simulations ............ 82

4.2.5.1 Geometrical Requirements ..................... 82

4.2.5.2 Combined Plasma-Maxwell Simulation ............ 84

4.2.5.3 Power Absorption Calculations .................. 87

4.3 Efforts to Speed up the Simulation ..................... 88

4.3.1 Cycling the EM Program ................ - .......... 88

4.3.2 Implementation of a Parallel EM-Plasma Version ....... 93
c“l_ZlaI)ter 5 Downstream Plasma Simulations .................. 99
5. 1 Introduction ...................................... 99

5.2 2D3V Downstream Simulation ........................ 100

5.2.1 An Unequal Weighting Scheme ..................... 102

5.2.2 Poisson’s Equation Solution ....................... 106

5.2.3 Example Simulation ............................. 108

Chapter 6 Simulation of a Compact ECR Source .............. 120

6.1 Simulation of a Compact Ion Source .................... 120

6.2 Results ......................................... 123
Chapter 7 Implementation on the MasPar MP2 System ......... 131

7. 1 Introduction ...................................... 131

7.2 Implementation of a Two Dimensional Parallel Program ..... 134

7.2.1 Movement of Particles Across PEs ................... 134

7.2.2 Poisson’s Equation Solution ....................... 135

7.2.3 Calculation of Fields ............................. 135

7.2.4 Results of the Two Dimensional Simulation ........... 135

7.3 Implementation of a 3 Dimensional Parallel Program ....... 139

7.3.1 Implementation of Third Dimension in Memory ......... 139

7.3.2 Implementation of Third Dimension Across PEs ........ 142

7.3.2.1 Movement of Particles Across PEs ................ 143

7.3.2.2 Poisson’s Equation Solution .................... 145

7.3.2.3 Calculation of Fields .......................... 145

7.4 Implementation of 3D Maxwell’s Equations Solution ........ 147

7-5 Discussion of Results ............................... 148
Chapter 8 Distributed 3D3V Simulation ..................... 153
8- 1 Introduction ...................................... 153

8.2 Parallel Virtual Machine ............................. 153

8.3 A Distributed Poisson’s Equation Solver ................. 154

8.4 Distributed 3D3V Plasma Simulation Implementation ...... 160

8.5 Discussion of Results ............................... 163
C1lapter 9 Conclusion .................................. 1 70
9. 1 Summary of Research .............................. 170

9.2 Directions for Future Research ........................ l 73
References ........................................... 175

List of Tables

’I‘able 3.1: Sputtering rate and uniformity vs. microwave power ............... 56
Table 3.2: Sputtering rate and uniformity vs. pressure ............................ 56
Table 3.3: Sputtering rate and uniformity vs. RF induced bias ................. 57
’I‘able 3.4: Sputtering rate and uniformity vs. downstream distance ......... 57
Table 7.1: Specification of the three systems ........................................... 140
Table 7.2: Results for the 2D simulation .................................................. 140
Table 7.3: Results for the 3D simulation .................................................. 149
'I‘able 8.1: Specification of the two systems .............................................. 165
Table 8.2: Results of distributed simulation ............................................ 165
Table 8.3: Results on a sub-netted system .............................................. 167

Figure 1.1:
Figure 2.1:
Fingre 2.2:
Figure 2.3:
Figure 2.4:
Figure 2.5:
Figure 2.6:
Figure 2.7:
Figure 2.8:
Figure 3.1:
F 1glare 3.2:
Figure 3.3:
F 1.gl—lre 3.4:
I"“1.g‘l..lre 3.5:
Figllre 3.6:
Figllre 3.7:
Figllre 3.8:
Figllre 4.1:
Figllre 4.2:
Figllre 4.3:
Figure 4.4:
Flgqre 4.5:
Fig.4“, 4.6:
I‘\ig‘l1re 4.7:
I“igure 4.8:
I“igllre 4.9:

List of Figures

An example plasma processing system. ................. 2

Time evolution of particles during a simulation ............ 14
Selection of collision event by Monte Carlo method. ........ 19
Flow chart of a typical electrostatic plasma simulation. ..... 20
Distribution of particle charge to grid. .................. 28
Cylindrical grids with uniform grid spacing ............... 32
Cylindrical grids with non-uniform grid spacing. .......... 34
A 25 cm diameter ECR system. .................... - . . . 41
Example of symmetry in a plasma simulation. ............ 44
Comparison of theory with experiment for 9 cm system. . . . . 55
Oxide etched vs. power. ............................. 58
Oxide etched vs. pressure. ........................... 59
Oxide etched vs. RF induced bias. ..................... 60
Comparison of theory with experiment for downstream points. 61
Comparison of theory with experiment for 3 downstream points. 63
Comparison of theory with experiment for a 6 inch wafer. . . . 64
Compact ECR ion source. ........................... 67
Non-terminated grid scheme. ......................... 72
Elementary cell in non-terminated scheme. .............. 73
Example ionization cross section plot. .................. 80
Combined plasma-EM simulation ﬂowchart. ............. 86
Time evolution of ﬁelds for the serial and cycled cases. ..... 90
Energ' of particles with source cycled on and off. ......... 91
Number density of particles with source cycled on and oﬂ‘. . . 92
A parallel EM-plasma implementation. ................. 94
Time evolution of ﬁelds for the serial and parallel cases. . . . . 95

Figure 4.10: Evolution of number density for the serial and parallel cases. 96

Figure 4.11: Time evolution of energy for the serial and parallel cases. . . 97
Figure 5. l: Downstream simulation zone. ........................ 101
Figure 5.2: Unequal weighting implementation. .................... 104

Figure 5.3: Implementation of Von Neumann condition. ............. 107
Figure 5.4: Time evolution of number of particles. .................. 109
Figure 5.5: Plasma potential in the simulation region. ............... 111
Figure 5.6: Time evolution of particle energy ....................... 112
Figure 5.7: Electron density vs. 2. .............................. 1 13
Figure 5.8: Electron density vs. r. .............................. 114
Figure 5.9: Ion density vs. r. .................................. 115
Figure 5.10: Ion density vs. 2. ................................. 116
FigLire 5. 1 l: A Bessel function type ﬂux distribution. ............... 1 17
Fingre 5. 12: Electron density vs. 2 with non-uniform source ........... 118
Fig‘ure 6.1: B ﬁeld contours inside ion source. ..................... 121
Figure 6.2: Structure being simulated. .......................... 122
Figure 6.3: Time evolution of average particle energy. ............... 124
Figllre 6.4: Power absorption locations .......................... 125
F 1g‘l-lre 6.5: Time evolution of number of particles. .................. 126
F 1.gllre 6.6: Plasma potential in the rz plane. ...................... 128
Figure 6.7: Time average values of Er along the edge. ............... 129
FigLIre 6.8: Electron energy distribution function. .................. 130
Fig—Ire 7.1: Solving Poisson’s equation. .......................... 136
Figllre 7.2: Solving for electric ﬁeld .............................. 137
Figllre 7.3: Interpolation of x component of E-Field at particle location. . . 138
FIgIAIe 7 .4: Implementation of 3rd dimension in memory .............. 141
Figllre 7.5: Implementation of 3rd dimension across PEs. ............ 144
I“Iglnre 7.6: Motion of particles across PBS in the z and xy planes. ...... 146
Figure 8.1: Distribution of load across computers. ................. 156
Figllre 8.2: Shared planes across computers. ..................... 157
Figlire 8.3: A discarded partitioning scheme. ...................... 158
I“131—Ire 8.4: Flow chart of distributed Poisson’s equation solver. ........ 161
Figllre 8.5: Flow chart of distributed 3D3V plasma simulation. ........ 162
I“1gllre 8.6: Comprehensive view of results. ....................... 169

Chapter 1

Introduction

 

1 - 1 Motivation of 3D3V Simulations

Electron Cyclotron Resonance (ECR) plasmas have demonstrated excellent
potential for semiconductor processing because of their ability to create high
densities of charged species and free radicals without high plasma sheath
potentials and contamination ﬁ'om electrodes [1]. Unlike RF parallel plate
reactors, ECR plasma generation is isolated from wafer biasing so that ion ﬂux
and ion energ to wafer surfaces can be independently controlled. Since ECR
plasma reactors operate at lower pressures [2], the mean free path is very long
l'35-?lding to anisotropic ion transport across the sheath. High power absorption
densities and magnetic ﬁeld conﬁnement help to obtain large etching and
deposition rates at lower pressures by maintaining large charged particle
densities.

A typical plasma source detailing the plasma generation and downstream
I‘eglons is shown in Figure 1.1. The plasma is generated in the region under the
quartz dome, termed the source region, with the help of the ECR magnets. It
ﬂier, ﬂows into the processing area, also termed the downstream region.

Permanent magnets in the downstream region aid in the conﬁnement of charged

particles.

Cavity Walls

 

 

    

Center
Conductor

 

 

Microwave _ ,,
Cavity ‘3. _ r
\ \
Quartz*Dome
' T; . .. ECR
'. . - . Magnets
Downstream ‘
Region ———>
Source Region
Conﬁnement _
Magnets V
Wafer Holder

 

 

 

 

 

 

Figure 1.1: An example plasma processing system.

To gain a better understanding of microwave plasma source operation, the
kinetics of plasma excitation and charged particle transport need to be studied.
In earlier studies it has been demonstrated using simple models (Bl-[5]), that
the plasma sputter rate due to argon ions bombarding a surface is strongly
dependent on the incident ion density and energy of the ions when they arrive at
1:116 surface. The ambipolar model for charged particle behavior used in these
studies assumed a uniform ion charge density leaving the plasma source and
entering the downstream processing region, and an absence of downstream
conﬁnement magnets. Simulations to predict plasma properties in the case of
nonuniform source charge density, in the presence of conﬁnement magnets and
for more complex gases are of interest. The charge density inside a discharge is a
complex function of a number of factors such as reactor geometry, form of
aﬁcitation, pressure, type of gas, presence of magnetic conﬁnement, etc.
Tilerefore, in order to accurately predict the downstream behavior of a plasma
discharge, accurate two or three dimensional (3D) simulations which take the
a‘IDOFve factors into consideration will have to be done.

One of the many ways of creating and maintaining a plasma is by the ECR
eEf‘feczt. ECR plasmas are operated at low pressures (0. lmTorr-lOmTorr) so that
ﬁle electron mean free path is large enough for efﬁcient energy coupling. The
energy _ ﬁ'om a time varying electric ﬁeld can be coupled to an electron very
efficiently in the presence of an appropriate static magnetic ﬁeld [6] . For
%mple. a 2.45 GI-lz microwave electric ﬁeld transfers energy to an electron very

e=fﬂeiently in the presence of a static magnetic ﬁeld of strength 875 Gauss. In the
case of low pressure plasmas where collisions are few and volume recombination

is negligible, ECR can be used to create and maintain a high density discharge.

 

The static magnetic ﬁeld has the additional advantage of conﬁning the charged
particles inside the discharge leading to higher charge densities or alternatively
cooler electron temperatures. Three dimensional simulations of microwave
cavities used for plasma generation present the possibility of predicting the
effect of cavity shape. permanent magnet strength, size and orientation, choice
of gas and other operating parameters on the ﬁnal plasma properties without
having to build prototypes.

While 1D and 2D ([8]-l16l) simulations of discharges in the source and
downstream locations have been done extensively. it is only recently that these
techniques are being extended to 3D structures ([17]-[21]). Three-dimensional-
tllree-velocity (3D3V) particle simulations of plasmas account for the behavior of

Charged species in the three geometrical dimensions and the three velocity
directions of particle motion. The fundamental restriction on grid based particle
SiIIllilation of discharges is that the grid size should be approximately the Debye
length. Consider a 3D3V simulation of a small cylindrical discharge of radius 1 .5
cm and length 1.5 cm. Assuming a Debye length of 0.05 cm, a grid of
' approximately 30x2 10x30 (r, o, and 2 directions) nodes would be required since
ule Debye length requirement has to be satisﬁed for the grids on the
circumference. A simulation consisting of two types of charged particles and at
least eight particles per grid would store six ﬂoating point variables per charged
I)i‘ll'ticle (location and velocities in each of the 3 dimensions) and at least seven
ﬂoating point variables per grid point (electric and magnetic ﬁelds in the three
mmensions and charge density). A simple calculation shows that it would
reqmre computer memory in excess of 70 MBytes. Therefore 3D3V simulations

V"ill have to performed either by using computers of sufﬁcient memory and

 

speed or by modifying the problem to ﬁt the existing computer resources. The
latter approach can take many different forms, one of which is to take advantage

of any symmetry present in discharges. This is made easy by the fact that most
discharges show some form of (b independence or repeatable pattern. A second
approach is to identify parallel computing components of the simulation and
distribute the computation and hence the memory requirements amongst a
number of slower machines ([20]-[22)l. This approach has the advantage of

using available computing resources and distributing the load across several

low-cost machines.

1 - 2 Research Goals

'I‘lne objective of this study is to implement, test and verify state-of-the-art
plasma simulation techniques, software and methodologies for the modeling of
discharges in both source and downstream regions of low-pressure, high-density
ECR plasma reactors. The primary goal of this study is to predict the three
dimensional charged particle ﬂuxes, energies and distributions inside the
plEisrna reactor as a function of the input parameter space which includes
1111(:rowave power, static magnetic ﬁelds. gas composition. flow rate, pressure
and reactor geometry. An additional goal is to verify the simulations using
measured plasma parameters. Simulation software developed will use and
Qtend state-of-the-art plasma simulation techniques including complex
geOrnetries in three dimensions, self-consistent plasma-Maxwell equations

S'Olutions, and advanced computing techniques.
This study will develop the capability to simulate plasma discharges in

IIllorowave ion/ free radical source developed at Michigan State University [24)

 

 

 

using collision cross sections and ionization rates from the literature. Also, this
study will develop the simulation tools to model plasma behavior in downstream
processing regions of microwave plasma sources. Additionally, this research will
also develop software to simulate reactor structures in a number of different
coordinate systems including cartesian and cylindrical.
Since self-consistent plasma-Maxwell simulations are very compute and
memory intensive, it is necessary to seek computational methods and options
which lead to solutions in a reasonable amount of time. In the case of simulating

large geometries, parallel and distributed computing approaches are to be

Considered.

1 - 3 Dissertation Outline

'I‘his dissertation is divided into four subjects: literature review of plasma
Simmations; development of numerical plasma simulation model, comparison of
analytical models, numerical models and experimental data, and the extension
of these techniques to use massively parallel processor (MPP) computers and
distributed computing.

Chapter 2 is a multi-part review of plasma fundamentals and different
l-r1<>dels used in describing plasma behavior. The ﬁrst part introduces the
fundamental equations that describe and control the behavior of plasma. The
Second segment introduces the fundamentals of numerical modeling of plasmas
11"ltlluding different methods like. i.e.. ﬂuid and hybrid methods of plasma
$1Inulation. A review of the geometrical considerations involved and the role they

Play in plasma simulations is also introduced. The third part describes various

al‘lalytical models used in describing plasma behavior like the free-fall diffusion

 

and ambipolar diffusion models and their limitations. The fourtln section of this
chapter introduces the many different ways in which time consurrring
simulations can be speeded up with the help of advanced computing techniques.
Lastly. a review of previous ECR plasma modeling is introduced highlighting the
need for the research done in this dissertation.

Chapter 3 describes the two experimental systems studied and modeled.
Additionally, the application and limitation of a simple ambipolar diffusion
model to predict/ model the sputtering rate and uniformity of oxide removal from
silicon wafers is described. The aim of this exercise is (1) to correlate between the
analytical ambipolar model of charge density and the resulting sputter
characteristics under different operating conditions and two different wafer
sizes, and (2) to establish the limitations of simple analytical models and the
need for numerical models.

Chapter 4 introduces the 3D3V plasma simulation techniques developed in
this research. The particle type simulation details developed for this research are
explained. This chapter also illustrates the non-terminated cylindrical grid
methods developed in order to improve numerical stability. The different
collisions modelled and the methods used to model tlnese are also presented. The
use of different techniques to speed up the simulations are also discussed.

A two dimensional implementation of the particle model to the simulation of
downstream plasma dynamics is presented in Chapter 5. A comparison of the
results of this model with the ambipolar theory is done.

Chapter 6 applies the complete 3D3V plasma-Maxwell’s equations solution to
the simulations of an ion source used at Michigan State University. Results of

the simulations are compared with experimental characterizations of the source.

The use of massively parallel processors as a useful tool to speed up plasma
simulations is explained in Chapter 7. This chapter details the various
techniques used to speed up plasma and EM simulations in two and three
dimensions. This chapter also shows the possibility of simulating structures and
problem sizes beyond the capability of conventional desktop computers.

Chapter 8 demonstrates the possibility of using many conventional desktop
machines to achieve considerable speed-up over single machine simulations.
The concept of distributed applications and the methods used to achieve these
will also be presented.

Finally a summary of important results of this research and directions for
future research is presented in Chapter 9.

This work was performed under the guidance of Dr. T. A. Grotjohn, Associate
Professor in the Department of Electrical Engineering. Michigan State
University. The research in this dissertation has been published in part in
refereed journals and presented at international conferences ([5),[25],(26],[27]

and [74]).

Chapter 2

Background

 

2. 1 Plasma Description and Fundamental Equations

Plasmas are composed of particles including ions, electrons and neutrals
which are excited by the input power source through a variety of heating and
collision processes. In a plasma, the electrons and ions no longer behave as
separate entities but exhibit a collective behavior. Plasma behavior is governed
by plasma kinetics and the electric and magnetic ﬁelds. Therefore. plasma
kinetics must be solved in conjunction with either Maxwell’s equations or, under
electrostatic conditions with Poisson’s equation only. The plasma then interacts
with a material immersed in it leading to sputtering, etching or deposition [28)
as the case maybe.

The Maxwell’s equations are [29]

a

VxE = —ll-a—tH (2.1)
VxH= 1235+] (22)
a: ‘
VOB = 0 (203)
VOE = E (2.4)

where E is the electric ﬁeld strength. H is the magnetic ﬁeld strength, J is the

electric current density, B is the magnetic ﬂux density, 8 is the electric

10

permittivity of the medium, )1 is the magnetic permeability of the medium and p
is the electric charge density. Equation (2.3) can be solved separately to
calculate the static B ﬁeld from the permanent magnets encircling the cavity. In
the case of the source region, these magnets will provide the ECR zones and in
the downstream case, if present, will provide charge particle conﬁnement. In the
presence of just a static E ﬁeld, equation (2.4) can be modiﬁed to obtain
Poisson’s equation as
V2V = _E - (2.5)
where
= —VV (2.6)
and Vis the electrostatic potential.
In addition to the above equations, plasma kinech are described in the form
of a general equation that describes the temporal and spatial variation of

relevant macroscopic variables [6). This general equation (2.7) known as the

Boltzmann equation can be written for a species as follows

afa aft!
.37 +v-Vfa+a0V,fa = (371011 (2.7)

where fa is the distribution function in six dimensional phase space (locations
and velocities) for the particle type a, v is the velocity of the particle, a is the
acceleration, V fa and Vufa are the spatial and velocity gradients of the
distribution function, respectively. If the particle type under consideration is
charged, the acceleration is related to the electric and magnetic ﬁelds in

equations (2. 1) - (2.4) by the Lorentz force equation as follows

- 212.1
a— dt-m(qE+q(vXB)) (2.8)

11

where q is the charge on the particle and m is its mass. The general Boltzmann
equation can be modiﬁed to be a set of transport equations of average values for
physical variables X!!!) (e. g., velocity, energy). This is done by multiplying
equation (2.8) by x(v) and integrating over all velocity space. A general transport

equation in terms of average values can be derived to be

a
Bat-MAE“) + Ve (na (xv)a) —na(a e va>a = {x[$a]0011fv (2.9)

where nu is the total number of particles of type or [6]. Substituting various
variables for the general variable x(v) gives a number of useful relations, termed
the moments of the Boltzmann equation, like the continuity equation, momentum
transport equation, energy transport equation, etc. However, the set of transport
equations thus derived contain more variables than independent equations. For
example, the momentum transport equation contains the pressure dyad as an
unknown. Therefore certain simplifying assumptions have to be made in order to
truncate the hierarchy of the moments of the transport equation. The simplest of
these is the cold plasma model wherein only the ﬁrst two moments of the
Boltzmann equation, i.e. the density and. average velocity, are considered ‘by
assurrring that the pressure temperature dyad is equal to zero. If on the other
hand, the divergence of the heat ﬂux vector is assumed to be zero, in effect
assurrring zero thermal ﬂux, one can truncate the series after the third moment,
i.e., the energy transport equation, and obtain the warm plasma model. Further
discussion of simulating plasmas using equations derived in this fashion is

made in Section 2.2.2.

12

2.2 Numerical Modeling of Plasmas

2.2. 1 Particle Solution Techniques

Particle type simulation of plasmas attempt to simulate the behavior of a
large number (109/cm3 — 1011/ cm3) of electrons by considering a few
representative particles since it would be quite impossible to perform
simulations in a reasonable amount of time if all particles were considered.
These representative particles, called superparticles, carry the same mass as
that of the charged particle they are representing but contribute to the charge
density a factor greater than one. For example, if one were to simulate electron
' motion in a plasma of electron charge density 109/cm3 in a chamber of volume
1cm3 by 106 superparticles, each superparticle would carry a weiglnt of 1000
electron elementary charges.

The particles, electrons and one or more type of ions (both negative and
positive), are then moved under the inﬂuence of static and time varying E and B
ﬁelds. At the end of each timestep iteration, the contribution of the particles to
the charge and current densities on the backgound gids are computed and
new ﬁelds are calculated by solving both Poisson’s and Maxwell’s equations or
only Poisson’s equation as the case demands. Any particle going out of the
simulation region B considered lost. The loss of particles to the walls is normally
compensated by the generation of new particles. This generation term can take
one of two forms. In the case of the plasma excitation region, some electrons in
the ECR regions acquire energy higher than the ionization potential of the
backgound gas creating new electron-ion pairs. In the other case of

downstream plasma simulations, a charged particle ﬂux equal to the ion ﬂux

13

emanating from the source region is added as particles during each iteration. A
low pressure plasma simulation can be considered to have reached a steady
state solution when the rate of loss to the walls is compensated for by the rate of
generation as shown in Figure 2. 1. Simulations of high pressure plasma should

also include volume recombination and collisional (Joule) heating of particles.

2.2. 1. 1 Overall Algorithm of Particle Simulation

The structure to be simulated is broken up into gids in three dimensions
such that the gid size is of the order of the Debye length. The number of
particles (electrons and one or more type of ions) is chosen such that tlnere are at
least eight particles per gid in order to obtain a numerically stable solution [7).
The input parameters for a typical simulation include neutral gas pressure and
type, initial charge density, electron and ion temperatures and cavity
dimensions. The temperatures are used to initialize the particle velocities using
a Maxwell-Boltzmann distribution. The type of the neutral gas decides the
ionization potential and its density determines the particle—neutral collision
rates along with the particle densities. The initial charge densities along with the
cavity dimensions determine the charge weighting of the superparticles. The
location of the particles are randorrrly chosen in the cavity giving a ﬁnal uniform
initial distribution. The boundary conditions may be of the form of grounded
walls and some constant E ﬁeld.

Initial charge and current densities are calculated from the above initial
location and velocities respectively and are tlnen used to calculate the new E and

8 ﬁelds by solving Maxwell’s equations. The electrons and ions are moved under

Number (103)

20

 

l4

 

 

.—--o--‘---------------------------------‘-----‘

Electrons _

Ions - -

A J I A

 

 

5e-09 le-08 1.5e-08 2e-08 2.5e-O8 3c-08 3.5e-08 4e-08
Time (see)

Figure 2.1: Time evolution of particles during a simulation.

15

the inﬂuence of these E and B ﬁelds and their trajectories are checked to see if
they are lost to the walls. Monte Carlo techniques. described irn Section 2.2. 1.3
below, are used to determine the occurrence of a collision for a particle and
appropriate actions are taken. At the end of each iteration, new charge and
current densities are calculated and the whole process is repeated again.

In the case of the source region ECR plasma simulations, some electrons
acquire energy above the ionizing potential of the backg'ound gas and the
occurrence of ionization is determined by the ionization cross-section for that
gas. In downstream simulations, the input ﬂux compensates for the loss of
particles. The simulation is considered complete when the rate of loss is

compensated by the rate of gain of particles.

2.2. 1.2 Particle Movement

The kinetics of charged particle motion is done by numerically solving the

Lorentz force equation, given by equation (2.8), for the timestep At along with

a = v (2.10)

dt
where v is the velocity and r is the position of the particle.
In order to solve equations (2.8) and (2. 10) numerically and obtain a stable

solution, consider a single differential equation of the form

(11: '

— = , t .

dt ﬂx ) (2 1 1)
with the initial condition

x00) = x0 (2.12)

16

where ﬂx,t) is some single valued function of the variables x and t.

Equation (2.11) can be expanded [30) by Taylor series about the point t = ti
for the time it”. By neglecting higher order terms gives, Taylor series the forward
Euler algorithm as

x. = xi+hf(xi, t) (2.13)

1+1

where xi” is the position at t=ti+1 and h is the timestep (tm - ti).

This simple numerical technique turns out to be non-energy conserving. The
solution of these equations in a way that is energy conserving uses the leapfrog
metlnod. In this method the velocity and position are calculated on an
alternating half step division basis.

The charged particle motion using equations (2.8) and (2. 10) can be written

as

(2.14)

c
It
lo

('1
:-
+
r—\
a. e
+
X
5 I-e
m
\___/
a~
+
.5

i+1 m . 1
1+2
and

(2.15)

r r 1+hv
i+-— i+-

2 2
The velocity at time t,” /2 is replaced by the average of the velocity at t, and t,”

i+l'

and after deﬁning we as qB/m. equation (2.14) can be written in a matrix form

as 3 coupled equations

E

ni+§
vx,i +1 vywcz-vzwcy vywcz_v5wcy - x
v =qﬂE-l-tbvco —V(o +évw —v0J +v (216)
y,i+l m y.l+§ 2 2 ex 1: cz 2 2 at x c2 y ' '
vz,i+l E vchy-vywcx [+1 vchy—vywcx; v2,-

 

 

17

After some marnipulations. the above equation can be written as

—1 —1
_ h 4’2 h h
”5+1 " [I—iwc] ZEi+é+ [I—iwc] [1+iwc]vi (2.17)
where
l O O
O O 1
and
O wcz -wcy
we = —o)cz O (0“ . (2.19)
foe), 406x 0 _

 

 

2.2.1.3 Collision Modeling using Monte Carlo Techniques

The mathematical techniques used by the Monte Carlo method are based on
the selection of random numbers and hence are very well suited for problems
which show statistical distributions. In principle, Monte Carlo metlnods can be
considered as a very general mathematical tool for the solution of a great variety
of problems rangng from numerical integation to semiconductor device
simulation l31],[32).

Monte Carlo applications can be divided into two major goups, the ﬁrst of
which consists of simulation of systems which are statistical in nature. The
second group consists of methods designed for problems that have well deﬁned
mathematical equations. However, most problems are a combination of the two.
For example, transport problems in semiconductors and plasmas are statistical

in nature but also have well-deﬁned equations.

18

Consider an example illustrating the use of Monte Carlo methods to
determine the occurrence of a collision for a particle. Let rm be the scattering
rate and Ap the probability that the particle will experience a collision during the

interval At, then

Ap = FmAt (2.20)

and the probability of no collision is given by

pnot = l-Ap = l—rmA” (2.21)

This method can be extended to a system containing a number of scattering
events 7». such that 21 7», = F. Consider a set of three collision events whose rates
are A1, A2 and 13. The random number generated will deterrrrine the type of
collision undergone by the particle as shown in Figure 2.2. Here the value of a
random number, r, lying between 0 and 1 is used to choose one of the four types
of collisions. For example, a value of 0.3 will result in a collision event
corresponding to M. The height of each strip in Figure 2.2 is proportional to the
relative probabilities of the events. The accuracy of this method depends on the
fact that if it is done for a sufﬁciently large number of particles, the random
numbers generated will follow the correct statistical distribution. Figure 2.3
shows a typical electrostatic plasma particle algorithm 111 which Monte Carlo
methods are extensively used. In this algorithm, Monte Carlo determines the
type of collision and the state of the particle after collision.

Monte Carlo techniques can also be applied to generate distributions
according to a given function. In cases where the distribution of variables or

their properties are known or are expected beforehand, it may save simulation

19

1.0

 

 

 

 

(13,- r)/ rm = 0.1
0.9
A3/ Fm = 0.2
0.7
l'
1.2/ rm = 0.5
0.2
7.1/1‘m = 0.2
0.0

 

 

 

Figure 2.2: Selection of collision event by Monte Carlo method.

20

 

Initialize electron and ion initial

sitions and random veloci
nitialize start-u backgrou
potential and E eld.

 

 

 

 

‘
I
I
I
I
I
I
I
I
I

\

 

Loopl over each valid charge

pamcl e.

Move the char e cle under
the inﬂuence 0% 1333' ﬁe Ids.

 

 

 

 

 

 

 

  

'-

Particle Lost?

  

 

 

 

 

 

 

 

 

 

 

Use Monte Carlo to determine
type of collision and action.
I ..........
, Calculate new chargeden si
solve Poisson’s e nation
calculate new E elds.
Yes Ou t Simulation results.
Steadvsmne? En Emma...

 

 

 

Figure 2.3: Flow chart of a typical electrostatic plasma simulation.

21

time to initialize the particles according to that function. For example, the start-
up velocities of electron and ions can be distributed randomly according to a
Maxwell-Boltzmann energy distribution. The same approach can be applied to
distribute the random locations in space of the irnitial set of electrons and ions.
In this case, for example, one might follow the ambipolar Bessel function type
distribution in space.

The application of the rejection technique to the generation of random
numbers [32] with a given distribution ﬂx) begins by letting C be a positive
number such that

C _>.f(x) (2.22)

in the interval (ab), and let r1 and r'1 be two random numbers with uniform
distributions in the range (0,1). Then

x1 = a+ (b—a) r1 (2.23)
and

f1 r'lC (2.24)

are two random numbers obtained with a ﬂat distribution in (ab) and (0.0),
respectively. If

f1 Sf(x1) ' (225)

then x, is retained as the choice of x else it is rejected and another pair (r1, r'1 )
is generated and the above process is repeated. The result of the process. is a
distribution that follows ﬁx).

Monte Carlo methods have been applied extensively to the study of plasma
simulations. B. E Thompson et al. [33] used tlnese methods to simulate ion

transport in RF sheaths. In this work, they used a Monte Carlo technique to

22

select the type of ion-molecule collision (hard sphere, ﬁeld interaction or charge
exchange). Further, the Monte Carlo method was also used to deterrrrine the
ﬁnal velocity of ions after elastic collisions. G. R. Govindraju et. al. [34] used
Monte Carlo techniques to simulate electron swarm behavior in uniform E x B
ﬁelds. Here again the method was used to choose one of the many types of
collisions. Monte Carlo methods have also been used in plasma etching and
sputtering studies. Ignacio et al. [35) applied Monte Carlo techniques to simulate
2D etching of silicon. In this study, angular ion distribution frmctions were
obtained by using Monte Carlo methods to solve the transport of ions across the
plasma sheath which were then applied to calculate etch rates. Biersack and
‘ Eckstein [36) applied Monte Carlo methods to simulate sputtering yields. energy
and angular distributions of sputtered particles in collisional sputtering

processes and the simulations ageed with experimental results very closely.

2.2.2 Fluid Models

In higher pressure plasmas (> few mTorr), where the characteristic lengtln of
the system is larger than the mean free patln of the electrons, i.e., chamber
dimensions are several mean free paths long, one can treat the plasma as a
ﬂuid. The collective behavior of the plasma components can then be simulated
by solving the moments of the Boltzmann equations for each of the species.
Fluid model simulations make some simplifying assumptions to terrrrinate the
sequence of moments of the Boltzmann equation. For example, a simulation of a
one-dimensional plasma with the cold plasma approximation for electrons and

ions would solve the following equations:

23

an, life

5; +$ = kionnnne—arnine (2.25)

an, a)".

5; +37 = kionnnne—arnine (2.27)
av 3"

L = ”ﬂea-Deaf (2.28)

. av 3"-
1: = _“ini$ ”Drag, (229)
along with the Poisson’s equation in one dimension
2
a V _ 4
5:2— — —E (ni—ne) (2.30)

where kbn is the ionization rate, a, is the recombination rate. nn is the neutral
gas density, "1(a) is the ion (electron) mobility and Dile) is the ion (electron)
diffusion coefﬁcient, We) is the ion (electron) charge density, jug) is the ion
(electron) current density and V is the potential. It should be noted that of the
above set of equations, only equations (2.26), (2.27) and (2.30) are mutually
independent since equations (2.28) and (2.29) can be substituted in (2.26) and
(2.27) to form a set of three independent differential equations. Since these
equations vary in time and space, initial conditions for all three independent
variables are speciﬁed in addition to boundary conditions which are speciﬁed for
every. timestep at the two boundaries.

Graves (37] simulated plasmas using ﬂuid models where both electrons and
ions were treated as continuous ﬂuids. In this model electron and ion motion

was assumed to be collisionally dominated. In another example, Barnes et al.

24

[38] simulated a set of equations similar to (2.26)- (2.30) for low-pressure (50-

500 mTorr) RF glow discharges.

2.2.3 Hybrid Models

When electrons and ions are acted upon by the same electric ﬁeld, electrons,
due to their lower mass, traverse a longer distance than the ions 111 the same
timestep. This difference is enhanced in the simulation of low pressure plasmas
where each species has a different temperature and hence a different random
velocity. For example, irn a simple microwave argon plasma at low pressures
(0. 1-10mTorr), the electron temperature is generally between 4-15 eV while the
ion temperature is in the range of 0.3-1 eV. This difference in energies presents
difﬁculties while simulating plasmas. In order to conserve energy and for the
simulation to make sound physical sense, the distance travelled by an electron
per timestep in particle type simulations should be less than a gid size, which is
of the order of the Debye length. However, if ion motion is considered in the
same timestep, it would travel less than a thousandth of the gid for the above
example. This problem is magriﬁed in magnetically conﬁned plasmas where an
electron can typically spend a long time trapped along the magnetic ﬁeld lines
while the ions are uneffected. In order to reduce unnecessary computations,
methods have to be developed which de-couple time-steps for electrons and ions
and one such method is the ﬂuid-particle hybrid model. Further, Monte Carlo
techniques introduced earlier, can turn out to be computationally expensive or
difﬁcult when applied to low ﬁeld regions, slow ion motion and prediction of

particle recombination and generation. In these cases, a hybrid model such as

25

proposed by Bandopadhyay et. al. [39] to couple drift-diffusion models with
Monte Carlo metlnods have been considered.

Hybrid models typically treat one of the particles as a ﬂuid while simulating
the other as a particle. For example, a simulation may consider the electrons as
a ﬂuid and treat the ions as particles and use a timestep based on ion motion.
Electron behavior is simulated by using the moments of the Boltzmann equation
similar to the method described in Section 2.2.2.

Lee and Birdsall [40] used hybrid - models to simulate beam-plasma
interactions. Surendra and Graves [42] used hybrid models to do 1D simulations
of DC glow-discharges used in diamond ﬁlm deposition. In their work, hydrogen
discharges were simulated at pressures of 20-30 Torr by treating energetic
electrons in the cathode sheath as particles and both electrons and ions in low
ﬁeld region as ﬂuids. The sheath electrons were coupled to the bulk ﬂuid model

as a factor in Poisson’s equation.

2.2.4 Poisson’s and Maxwell’s Equations Solution

One way of modeling the electromagnetic excitation of plasma source regions
is to directly solve Maxwell’s curl equations which govern time dependence of the
electromagnetic ﬁelds. In addition to tlnese. Poisson’s equation is solved
numerically to self consistently resolve local static electric ﬁelds and charge
density. The B ﬁeld generated by the permanent magnets can be solved in
advance since they do not change for the duration of the simulation [8].

However, the case of downstream behavior simulation is essentially
electrostatic, since most of the electromagnetic ﬁelds are absorbed by the

plasma in the generation region, the electromagnetic ﬁelds can be neglected in

26

the downstream region. In this case, only Poisson’s equation needs to be solved
and the charged particles are moved under the inﬂuence of static electric ﬁelds
only.

Poisson’s Equation can be discretized in rectangular coordinates as follows

 

 

 

(Vj-l—zvj+vj+l)k,1 (Vk—1_2Vk+vk+l)j,1
+ (2.31)
sz Ay2
(VI-1T2Vl'l' Vl+1)j,k _ _Pj,k,1
A22 8

where j, k and lare the gid indices and Ax, Ag and A2 are the gid spacing in the
in the x, y and 2 directions, respectively. Sirrrilarly the E ﬁeld in the three
directions is obtained by solving equation (2.6). For example, Ex, the component

of the electric ﬁeld in the x direction is obtained at gid point {i.kJ) as follows

(Vi—I‘Vj+1)u
(Ex)j,k,l =

2 Ax (2.32)

 

and components in the y and 2 directions are calculated similarly.

The solutions to Maxwell’s curl equations for plasma discharges has been
studied extensively by Tan [41] and the reader is referred to his dissertation for
details. The method utilized in his study was the ﬁnite-difference time-domain
(FDTD) method.

Another important part of numerical calculation of plasmas is the particle
and force weiglnting, i.e., the relation between gid and particle quantities. The
simplest form of weighting is called zero-order weighting also known as nearest-
grid-point (NGP) weighting. In this case, all particles in the volume AxAyAz
around the gid point (i,k,l) are assigned to that point. This procedure is

computationally very fast since only one gid look up is done and charge density

27

is the number of particles divided by the volume. However, this form of weighting
leads to relatively noisy charge density and electric ﬁelds.

First order weighting, also called cloud-in-cell (CIC) [43] smoothens the above
noise but adds the additional computation of looping over the nearest eight gid
poirnts in the three dimensional case. In the case of nonuniform gid spacing or
gid points on the boundary, the volume around the grid points has to be
calculated for each gid point. Consider an internal grid point (i,k,l) forming the
origin of a cube of one gid spacing length in each of the three directions, and
consider a charged particle located at (x, y,z) such that the nearest gid point is
(i,k,l) with location ()9, yk,zl) as shown in Figure 2.4. The x location of the particle
irnside the cube is (x-Jg), and the y and 2 locations are calculated similarly. Then
the fraction of charge that gets assigned to each of the gid points is done as
follows

413M: (l-X)x(l-Y)x(1—Z)
(4.1),, = (X)x<1-Y)>< (1-2)
q,,,..,,, = (I-X)x(Y)x (1-2)
qj,,.,,.,=(1—X)><(1-Y)x(21
qj.l,,..1,, = (X) x (Y) x (1—2)
4,..1,,.,,,,= (X)x(1-Y) x (Z)

qj,k+1,1+1= (l-X)X(Y) X (Z)
qj-I-1,k+1,l+1= (X) x (Y) X (Z)

(2.33)

where ‘1J.k.l is the contribution of the charge at point (Mel) from the particle
under consideration, X is the fractional distance of the particle in the x direction

from the gid poirnt (i,k.l) and is deﬁned as

x -x-
X = $1 (2.34)

28

 

 

(”(+11”) (i+1,k+1,l+1)
6M”) (1+1,k,l+1)|
(x1, yk, zl+Az) (x. y,z)
‘Q
H'erii'.’ (yryk)

 

 

 

T (i,k+1,l) 1 (2—21)
2

 

 

 

y/( T (1+1 ,k-I-I ,U
(xj+Ax, y k+Ay, zl)

(i,k,l) —>x (1+1,k,l)

(nykvzrl (xfl'Axvyk’Zl)

‘ “Figure 2.4: Distribution of particle charge to gid.

29

and y and z are deﬁned similarly. The CIC method has the advantage of
weighting the charge on a grid point in accordance to its relative distance from
the particle. This form of weighting results in a smoother effective particle shape
as compared to NGP. Higher order weighting in the form of quadratic and cubic
splines can also be considered at the cost of extra computation.

An inverse form of the above weighting is done while calculating the effective
electric and magnetic ﬁelds acting on a particle at a fractional distance (X,Y,a
from the grid point (i,k.U. In this case the ﬁeld at the particle is a sum of the
weighted ﬁeld contributions from the eight neighboring grids points. It is
important to note here that the same type of weighting should be done for both
the charge and the ﬁelds in order to avoid a setf-force, i.e, the particle
accelerating itself.

The effect of particle motion appears in the form of current density J in the
right hand side of equation (2.2). Equations (2.1) and (2.2) are then iteratively
solved to obtain self-consistent E and H ﬁelds. Current density due to particle
motion is calculated by distributing the particle velocities in each of the three
dimensions in a manner similar to equation (2.33). Then, the current density in

the x direction at the grid point (i,k.U, J, is calculated as follows

JxUJcJ) = va'JcJ) xpj.k.l (2.35)

where um“, is the contribution of particle velocities to gid point (1.16.1) and p1.“

is the charge density at that point. Using current density calculated in this

manner only approximately satisﬁes the continuity equation

. ap -
V J+-8—t —- . (2.36)

30

This could lead to errors in the charge and the divergence of the electric ﬁeld.
One approach is to introduce some corrections to the right hand term [44] in
equation (2.31) in order to satisfy the charge continuity equation (2.36). First the

electric ﬁeld E is modiﬁed as follows

E"+1 = ‘0’E"+1—Vv (2.37)

where En” is the electric ﬁeld calculated at (n+1)‘h timestep by solving equations
(2.1) and (2.2) and ME“ is the modiﬁed electric ﬁeld used to provide a
correction factor to Poisson’s equation. The source term in Poisson’s equation is

then modiﬁed as follows

n+1
VZV = Vo‘O’E"*‘—E—e—. - (2.38)

This approach requires explicit solving of Poisson's equation in order to enforce
Gauss’ law.

Another approach to this numerical problem is to introduce an error
correcting term to the current J [45], [46].

First, a function Hx,t) is deﬁned as

F(x, t) = VOD—p (2.39)
where

D = eE. (2.40)

The correction term. dVF. termed as Marder’s correction. added to J is

computed as follows

J = J— dVF (2.41)

31

where dis a correction factor bound by the constraint

sz

d<m

(2 .42)

and Ax and At are the increments in space and time respectively.

2.2.5 Geometrical Considerations

Geometrical considerations play a very important role in plasma simulations.
While 2D and 3D rectangular structures can be simulated with simple
discretizations, simulating 3D cylindrical structures is more complicated.
Firstly. in order to satisﬁr the Debye length criterion in all dimensions, there will
have to be 21:: more grid points on the circumference than there are in the radial
direction. This could be very expensive in terms of computer memory
requirements and advanced computer techniques, discussed in Section 2.4. will
have to be considered. Secondly, uniform grid spacing in cylindrical coordinates
has another problem. Consider a cylinder of radius R divided uniformly into 10
grid points each of length R/lO as shown in Figure 2.5. The ratio of the area
between the ﬁrst annular region, numbered 1 on the ﬁgure, and the last region,
numbered 10, is 1: (R/lO)2/1c(R2-(O.9R)2) and is approximately 1:20. i.e. 5%. If
the charge were initially distributed to the grid uniformly. there would be
substantially fewer particles in the inner rings than the outer. This will lead to
instability of the program at grid points closer to the origin. This problem can be
solved in a couple of different ways, one of which is to create nonuniform grids
in an attempt to make the outer and inner grid areas the same order of

magnitude. In the second method. grids could be created such that there are

 

 

 

 

 

 

““““““““

 

 

: lindrical grids with uniform grid spacing.

 

 

 

33

fewer grids in the ¢ direction for radial grids closer to the origin as shown in
Figure 2.6. In this case the ratio of the areas of two grid elements is 1:64, i.e.
15%. There is still a substantial disparity in areas which indicates that ﬁner q>
grid spacing is needed for the outer grids. In this case the numerical method
used to solve Poisson’s equation, for example. Gauss-Seidel method. will have to
be modiﬁed to reﬂect the irregularity in one or more dimensions. This
structuring has the advantage of reducing the amount of memory used for the
structure, however. additional computational time will be required to resolve the
nonuniform grid spacing. This approach has been extended and implemented in
this dissertation. In either case the objective is to make the ratio of the areas
closer to unity.

A combined cartesian-cylindrical coordinate approach can also be used in
certain cases. Heee and Zutter [47] simulated a rectangular waveguide with a
cylindrical bend in it by discretizing Maxwell’s equation in cartesian coordinates
for the rectangular part and in cylindrical coordinates for the bend. They
developed a transformation relating the values of variables at the boundary with
the values in the neighboring grid points in the two regions.

Another approach to simulating regions with curved boundaries is to use
non-orthogonal unstructured coordinate systems. In this case the coordinates
representing the three dimensions need not necessarily be mutually
perpendicular. It should be noted here that conventional vector geometry
assumes orthogonal coordinate systems and, in this case, vector equations like .
the Lorentz force equation will have to be solved with care.

Matsumoto and Kawata [48] have proposed approximating curved

boundaries using triangular meshes to solve particle motion in magnetostatic

 

 

35

ﬁelds. Assuming small timesteps they could locate the new position of the
particle by searching through a small number of grids. Sonnendrucker et al. [49)
presented a ﬁnite element PIC (particle in cell) model for unstructured meshes.
Brandon et a1. [18] presented a 3D EM plasma simulation using non-orthogonal
unstructured grids. Their algorithm reduced to a standard Finite Difference

Time Domain (FDTD) method when orthogonal grids were used.

2.3 Analytical Modeling of Plasmas

In very low density plasmas. electron motion can diffuse through a
background neutral gas without any substantial effect due to the ions. This type
of diffusion where the coulomb forces can be neglected is termed as free
diﬂirsion. In higher density plasmas however, the diffusion rate of electrons and
ions is controlled by the strong coulomb attraction which tends to maintain
overall space charge neutrality. These theories can be derived from the general
Boltzmann equation [51) after making several simplifying assumptions.

Consider the conservation equation for electrons given by

an

ate+V'("e(”e>) = nevi—anZ—vAne—neze (2.43)

where n, is the electron charge density. <ve> is the average value of the electron
velocity, vi is the ionization frequency, VA is the frequency of attaching collisions.
at is the recombination co-efﬁcient and Ze is the generalized source frequency.
Similar equations can be written for ions and neutrals. Under steady state
conditions and sinusoidal wave forcing ﬁelds, each macroscopic quantity can be

written as

36

ken

(ve) = (009) +(v12)2(w'-

. (2.44)
€1(mt- k 0 r)

"e = nOe + nIe
where <vOe> and <v1,_,> are the average values of the time invariant and time
varying components of the electron velocity respectively and nae and n18 are the
time invariant and time varying components of the electron charge density
respectively. Assuming quasi-static approximation where the wavelength A —> ea
or alternatively k —-) 0 leads to em”— '0 r) -) aim.

The momentum conservation for the electrons can be written as

§7(neme(ve)) + V0(neme(veve)) (2.45)

+neq(E+ ((ve) xB)) = neme(ve)vm

where vm is the collision frequency for momentum transfer.

Time varying E and B ﬁelds are also deﬁned in a manner similar to equation
(2.44). After collecting only the time invariant terms from equations (2.43) and
(2.45) and after considerable simpliﬁcations involving the discarding of second
order terms. an equation for the collision dominated plasma without volume

recombination and attachment can be derived as

DaV2n(r,¢,z) +nv,. = 0 (2.46)

where vi is the ionization frequency, n is the plasma density and Da is the

ambipolar diffusion coefﬁcient given by

 

_ l"'a'l)e""‘lel)r'

Da

(2.47)

37

where We) is the ion (electron) mobility and Dug) is the ion (electron) diffusion
coefﬁcient.
Some of the analytical diffusion theories which are derived after simplifying

assumptions to equation (2.46) are explained in the following subsections.

2.3.1 Schottky Diffusion Model

A simple model of diffusion in plasmas is the Schottky diffusion model
proposed by Schottky [50) which assumes that volume collisions and space—
charge effects govern charged particle diffusion and control the discharge
density distribution. This model is used to solve the general diffusion equation
(2.46) which is the continuity equation assuming no externally applied electric
ﬁelds and a cold plasma model. Assuming no static B ﬁeld, single stage
ionizations only and perfectly absorbing boundaries, equation (2.46) can be
solved for a ﬁnite-length cylinder using a separation of variables technique to
give

n (r, z) = No.10(2°k£r)cos(nfz) (2.48)

where No is the charge density at 2:0 and r=0 and Jo is the zeroth order Bessel
function. The equation relating geometrical variables to the diffusion constant
and ionization frequency is

vi_ 12_ 2.4052 152
5' - (A) - (—R ) +0 (249)

a

where R is the radius and L is the length of 3 ¢ symmetric cylindrical discharge
chamber. The separation constant, A, is deﬁned as the characteristic diffusion

length for Schottky diffusion. In low pressure plasmas, where electron

38

temperatures are much higher than ion temperatures, equation (2.47) can be
simpliﬁed to a modiﬁed form of the diffusion coefﬁcient as

.D ‘ .k T
pa = Le = is: (2.50)
(J.e q
where Te is the electron temperature, kb, is the Boltzmann constant and q is the
electron charge. Using equation (2.50) and after extensive derivations [23]

equation (2.49) can be expressed in the form

(kae) 5/2
co 8
18,805 (e) exp(—kae)d€

where P is the pressure of the neutral gas, a is the energy, 61(8) is the ionization

(CPA) 2 = (2.51)

 

 

cross section, a, is the ionization potential and C is a constant expressed in units
of (’I‘orr-cm)‘1 when the pressure is expressed in Torr and the length in cm. It
should be noted that in order to arrive at equation (2.51), it was assumed that
ion mobility varies inversely with pressure. Equation (2.51) provides a simple
relationship between geometrical factors, represented by A, neutral gas pressure
P and electron temperature Te and hence can be used to predict scaling relations

in low pressure cylinder type plasmas.

2.3.2 Free-Fall Diffusion Model

Another analytical model of low pressure, uniform discharges is the Free-Fall
diffusion theory. This theory assumes that the mean free path is greater than
the cavity dimensions and that the transport equations are dominated by
ionization collision terms. Ions and electrons are assumed to freely fall to the

discharge walls and quasi-neutrality is assumed everywhere except in the

39

sheath. This theory assumes that the only loss mechanism of ions and electrons
is due to the recombination at the walls and hence derives a particle balance
equation between loss to the walls and the gain due to ionization. Assuming
uniform ion ﬂux and charged particle density, the balance equation can be

written for a cylindrical structure as described in Section 2.3.1 as

PA 11 = ”iVinis

1W0

(2.52)
12.21): (RL +R2) = niovinRzL

where AM is the surface area of the wall, I“, is the ion ﬂux to the walls, ths is
the chamber volume and nu, is the constant ion density in the chamber. A

general diffusion expression for free-fall diffusion can be found [51) as

(19,792
.. a
L 80,. (e) exp(—kae)d8
where C is a constant expressed in 1 /(cm3Torr) when the pressure P is

CPa = (2.53)

 

 

expressed in Torr, o, is the ionization collision cross section expressed in cm2,

the length terms are expressed in cm and

_ RL )

a_(R+L' (2.54)
Mahoney [23] compared experimental measurements on an ECR plasma

reactor with both the free-fall and Schottky diffusion theories and showed that

for his reactor which was 8.9cm in discharge diameter, the plasma characteristic

makes a transition from Schottky type behavior to a free-fall type behavior below

a pressure-diffusion length of 0.015 Torr-cm and an electron temperature of

36,000 K. He explained this was due to the fact that at smaller volumes and

40

lower pressure, the mean free path of ions exceeds chamber dimensions and
charge density can be considered essentially uniform, i.e., the assumptions of

the free-fall diffusion are more valid in this region.

2.3.3 Ambipolar Diffusion Model

Experimental plasma reactors of the type shown in Figure 2.7 are not of a
simple cylindrical form assumed in the earlier sections. In fact, they consist of a
source region which expands into a larger downstream processing region and
equation (2.48) no longer accurately predicts the variation in charge density
within the chamber. An extension of the Schottky diffusion model to a more
complex geometry was done by Hopwood et. a1. [3]. This work used an ambipolar
diffusion model to describe the downstream behavior of the plasma. It was
assumed that the electrons and ions follow ambipolar diffusion in which a space
charge region slows the electron motion and speeds up the ions such that both
move at the same diffusion rate given by equation (2.47). This theory also
assumes equal charged particle densities and ﬂuxes in the plasma.

Solving the ambipolar diffusion equation (2.46) for the processing region with
a uniform source discharge radius b and a chamber radius a as shown in Figure

2.7 gives an analytic expression for ion density as [3)

.. 12
n(r,z) =No(3.’?) 2 (1)Jl((xn)a)10((xn)£)exp(—kz) (2.55)

(n=1) x" 130:.)

k (I)
_ .a. -5. (2.56)

41

Quartz Discharge

Chamber Microwave
Input Probe

ECR Surface —\ /_ Sliding Short

Water / .5 Resonant Cavity

..
I
,.
G I,
as "
If
,
L
r
.
I
,; {ﬂ
,1
x
5‘ \ ‘. . A; /
\“ ‘\
‘ ..\_.
t i, -. ‘ v.‘
. ‘; \ ,‘_- az‘. a“ \
. . -r. s
-’.'I.‘,.' .,
I.‘,_ 7‘ .'{ .I_ I
I,"I,.: ’51
‘ 1., ’ .'l. I.
, ‘I vl‘ ‘
1., . ..., 1..
I,"I’,. ,. ,I.‘I, , ‘/
{I " I "‘
. ~p 1...! .
- .' -. ~. , n‘ r
' ‘ »‘ [A
I I I
,

,,,,,,,,,,,,,,,,,,,,,,,,,,,,

     
  
   

Magnet

.
/
,
I
/
z
I
I.‘
, 1 ,' I ' l.”.
r‘ ‘.
p.,.
.0, , .
‘l.' .’/' .
' . r. r , .
3,31 ,‘1/ , .‘z , ,’.
‘7. .":\“".~ ‘
5; \~.‘
\'. \
\ a
a\ \ ‘
‘.\v .A,
.\ _ ‘.\ ‘.\'.\".\“.\\‘.\’.‘
‘ v. v. .v. '. ~. '.\
“ r. ‘ \-\‘.\"“.\\‘ \ i
. ‘.‘\...\.~.\.
\
\
‘I\\

    
 

, '1/0» 2'
yI/‘M (Ar/40‘ ,
Vol/M 0"”

3 Plasma z .__ 0 cm

 

““““

 

 

 

Wafer

‘ Load Lock
To Pumps

Vacuum Chamber

Movable Chuck

 

To Chiller ‘—

 

Figure 2.7: A 25 cm diameter ECR system.

42

and N0 is the charge density at 2:0 and r=0, Jn is the nth order Bessel function
of the ﬁrst kind and x" is the nth zero of Jo. In solving the equation angular
symmetry is assumed, hence there is no ¢ dependence. As observed in equation
(2.55), the density distribution has a Bessel function dependence on the radial
variable, r, and a decaying exponential dependence on the vertical variable, z.
Equation (2.55) has shown good agreement in some studies with
experimental data. In one such study, the sputtering uniformity versus position
on a wafer was used as a measure of the ion density variation. This was done by
Salbert [4] on a 9 cm system by measuring both the ion density and the
sputtering rate versus radial position at a position 8 cm downstream from the
plasma source. The density of argon ions was measured using a double
Langmuir probe and it was seen that the plasma density followed the ambipolar
diffusion model closely, and the amount of oxide sputtered away was
proportional to the ion density. Hence plasma density given by equation (2.55)
can be used to predict the uniformity and relative sputter rate of argon
discharges. Equation (2.55) indicates that the uniformity of the plasma density
‘ and hence the sputtering rate across a given diameter is improved by using a

larger diameter plasma source.

2.3.4 Limits of Analytical Models

The analytical model derived in the above section assumed a constant source
region ion density and thus fails to accurately predict a downstream discharge
in the case of a nonuniform source. This theory was also derived under the
assumption of 4) symmetry and hence would have to be re—derived for cases

where c symmetry is not present. Furthermore, this theory needs to be modiﬁed

43

for the case of downstream magnetic conﬁnement where the condition of
absorbing walls is no longer true.

The application of simple analytical models is helpful in plasma reactor
scaling. More accurate models which would reduce the design and optimization
time for plasma sources and processes will need to utilize numerical techniques

as described in the earlier sections of this chapter.

2.4 Vector, Parallel and Distributed Processing Methods -

Plasma simulations, whether they be purely particle type or hybrid, typically
solve Poisson’s equation in order to self consistently resolve charge density and
electric ﬁeld. This requires discretization of Poisson’s equation over the three
dimensional grid points. The charge and current density calculations have to be
done using the particle-in-cell (PIC) approach to distribute the effect of the
charged particles over the entire grid structure. Hence, the effect of larger
geometries is twofold as it increases the demand on the computer’s memory and
the time taken per iteration due to the larger grid size and increased number of
particles.

The effects of the above problems can be reduced by using irmovative
computer techniques. Depending on the symmetry of the structure under
simulation, one can accurately predict the whole structure by simulating a slice
of the same. For example, in an ECR system with four magnets providing the
ECR ﬁelds, one can predict the whole structure by simulating only 90 degrees
and then reﬂecting the results around the circle as shown in Figure 2.8. This

approach can save up to three-quarters of the grid points required for

 

 

Simulation

 

Figure 2.8: Example of symmetry in a plasma simulation.

45

simulations. This will also reduce the number of particles required to conduct
an accurate simulation.

Another approach to this problem is to take the supercomputer route which
uses large memory and high-speed resources. This method has its advantages
when combined with the “slice” approach and special programming techniques
which are unique to the choice of the supercomputer. Lohner and Ambrosiano
[21] have developed vectorized particle location tracers for application in vector
computers. They reduced the search time for particles by using the last known
position as the initial guess and looping over adjacent grids to locate the current
position. This allowed them to form vector loops and their results showed a
Speedup of 14 times when used on vector machines.

The third approach is to use a massively parallel processor (MPP) and
distribute the processing across the nodes (CPUs) of the machine. MPPs are
designed to reduce interprocessor communication overhead. This approach is
extremely desirable in cases where the application is inherently parallel such as
in the particle simulation of plasmas. In such simulations, each processing
element can be considered to represent one or more grid points and an
incremental volume around these grid points is assigned to it. In an extreme
case, each superparticle being simulated can be considered a processor.
However, a trade-off between the processing speed-up and communication
overhead between the processors exists. Lin et. al. [20) have developed parallel
PIC codes for a massively parallel processor.

A variation of the above approach would be to use many slower computers in
parallel to speed up the program. For example, in the case of a cylindrical cavity,

one can conceive of a situation where the cavity is broken up into eight pieces,

46

each slice of 90 degees and half of the original 2 length. Each of these pieces
could then be “distributed” to eight different workstations of similar
speciﬁcations to be simulated. Any data that has to be transferred, for example,
a particle leaving one slice and entering another, would require that the details
of the particle be transferred from one machine to the appropriate one. This and
the potential on the faces “shared” by two machines can be transmitted between
the appropriate machines at the end of each “loop” so that all machines are
consistent and in synchronization. This approach has the advantage of using
cheaper and more widely available workstations to perform simulations.
Further, one' can conceivably simulate large asymmetrical structures using
numerous workstations. This approach also distributes the memory
requirement across the participating computers.

A message passing package which caters to such applications is Parallel
Virtual Machine (PVM) [52]. PVM has been applied extensively to molecular
dynamics problems [53], [54] using Lennard-Jones type potentials to calculate
force and transmitting data of any molecules that transfer across. This problem
is simpler than plasma discharge simulations since it does not involve
calculating any kind of Poisson’s equation which requires more data to be
shared between processors. However, the parallel approach has its pitfalls, one
of which is stated in the famous “Amdahl’s law” which states that no matter
what one tries, there will always be some non-parallel part of the code which will
partially offset any gains made by the parallel part. Also the program may spend
more time in the communication part thus offsetting any gains due to

parallelization.

47

Use of parallel processing methods generally lead to more complicated code
which tend to be very difﬁcult to debug. Therefore new code development and
debugging tools, for example xab [55], which are speciﬁcally developed for
parallel programming. will have to be used. The distribution of code across
several machines makes it more difﬁcult to pinpoint the problem and would
require complex intermediate data storage to make them reliable and
debuggable.

Ewing et a1. [56) have used PVM to successfully distribute the computation of
elastic wave propagation to simulate a displacement seismogram. This
implementation was done using a host/ node approach where the host program
performs I/O and decides the domain decomposition of the nodes. The
, communication of interprocessor boundary solutions synchronizes the nodes.
’Ihey demonstrated a speedup of nearly 5.4 for a six processor implementation.

Liewer et al. [57] have studied concurrent PIC codes for plasma simulations
and Brandon et al. [18] have studied PIC models for multiprocessors.
Matsushita et al. [22] have applied parallel techniques to the simulation of

nonlinear MHD (Magneto-Hydro Dynamic) simulations.

2.5 Review of Previous ECR Plasma Modeling Work

Grotjohn [8) did a 2D3V simulation of a compact ECR source assuming axial
symmetry. The time varying electric and magnetic ﬁelds were solved by using
time-domain ﬁnite-difference solutions of Maxwell’s equation. The electron and
ion behaviors were modeled using Monte Carlo particle techniques.

Kuckes [58] computed the propagation and absorption of electromagnetic

waves in magnetized plasmas. He predicted complete wave absorption above a

48

critical plasma density. Trost and Shohet [59] simulated electron behavior
during ECR heating in a stellarator. They used Monte Carlo methods to simulate
electron distribution functions. The simulation results agreed well with
experiments. Sprott [60] developed analytical models to predict ECR heating in
toroidal octuples. A computer simulated static B ﬁeld of the octuple was used in
the study. His model predicted strong heating in places where VHB = 0.
Experimental measurements veriﬁed the predicted heating locations. Lieberman
and Lichtenberg [61] treated the theory of ECR heating in a magnetic mirror
both analytically and numerically. Hussein and Emmert [62] in their simulation
of the ECR source region eliminated the need to solve Poisson’s equation by
assuming quasi-neutrality and inferring the local plasma potential from the
electron Boltzmann equation. It has been shown by Self and Ewald [63) that the
hybrid approach can be applied even in the case of low pressure free-fall
conditions.

Porteus and Graves [641 simulated two dimensional low pressure
magnetically conﬁned plasmas by using the hybrid approach. They treated the
electrons as a ﬂuid and the ions as particles and Poisson’s equation was
explicitly solved to obtain the potential. Their simulation of ECR heating
assumed either a ﬂat power per electron or a Bessel function type of spatial
power deposition. They did not couple Maxwell’s equation to self consistently
simulate energy.

In their more recent work, Hussein and Emmert [65] studied the ion energy
spectrum in divergent static magnetic ﬁelds. Ions were treated as particles and

ion-neutral collisions were treated using Monte Carlo techniques.

49

Weng and Kushner [66) investigated the electron energy distribution in the
ECR zone using Monte Carlo simulations and expliciﬂy considered electron-
electron collisions. The transfer of energy from the electron gas to the
background gas for downstream processing was also investigated.

Graves, Wu and Porteus [67) developed 2D hybrid simulations of high density
ECR plasmas. The hybrid mode simulation treated electrons as a ﬂuid and ions
as particles. The ECR reactor being simulated was not distinguished into a
source and downstream region, instead. the substrate was located in the source
region itself. Power deposition proﬁles were assumed such that the ECR zone
was near the substrate surface. The effect of radial microwave power deposition
proﬁle on plasma parameters like density, potential and electron temperature
was investigated.

Yasaka et al. [68] have done 2D modeling of ECR plasma production
assuming the cold plasma theory in which the thermal motions of charge
particles was neglected. Their simulations developed proﬁles of EM wave and
plasma density evolution in a bounded cylindrical system. Further, they showed
that the location of the microwave antenna had considerable effect on plasma

production.

Chapter 3

Description of Microwave Plasma Sources

 

3. 1 Introduction

The operating characteristics and description of the two sources used in this
research are described in this chapter. A 25 cm diameter MPDR 325 system and
a 9 cm diameter plasma discharge system were used for performing some
downstream measurements and simulations. A smaller MPDR 610 compact ion
source was used for plasma source simulations. This chapter also applies the
ambipolar diffusion model of Section 2.3.3 to the problem of modeling the
downstream plasma behavior in MPDR 325. An evaluation of the accuracy and
limitations of this model are presented and the need for numerical models is

discussed.

3.1.1 Downstream Plasma Chamber

The microwave plasma source [3] shown in Figure 2.7 consists of a
cylindrical cavity which can be adjusted to a resonant mode by means of the
variable length microwave input probe (antenna) and the adjustable sliding
short. The lower termination of the microwave plasma source, referred to as the
baseplate, is made of several individual parts. A circular quartz discharge
chamber, which serves as the containment for the plasma, is surrounded by

several rare earth magnets. These magnets are imbedded in the base plate with

50

51

alternating poles facing the discharge. Process gas ﬂows into the chamber from
holes just under the quartz dome. Inside the vacuum chamber is the anodized
aluminum substrate holder that supports the silicon wafer. The RF biased
substrate chuck is movable, allowing adjustment of the wafer position with
respect to the plasma source region. Experiments were conducted on two ECR
systems with 9 cm and 25 cm diameter discharges.

Silicon forms a 4-10 Angstroms thick native oxide within minutes of
exposure to air, therefore an in situ, low damage, method to remove small
amounts of oxide and other contaminants prior to metallization and other
processing steps is desirable. One way to achieve this is by a “soft sputter etch”
of the surface using an argon plasma in which Ar“ ions are accelerated to the
surface with low kinetic energy. In some cases, this procedure can be done in
the same processing chamber used for subsequent deposition or gowth of
desired layers.

A prior investigation utilizing an electron cyclotron resonance (ECR) argon
plasma exited by 2.45 GHz microwave power for sputter etching was reported by
Salirrrian et. al. [69). In their divergent ﬁeld ECR system, which used two
electromagnets, the substrate was biased with 13.56 MHz RF power inducing a
negative substrate bias. This allowed separate control of plasma density and ion
energ' as opposed to a conventional parallel plate RF plasma system in which it
is difﬁcult to achieve high densities without high ion energies. Salimian et. al.
investigated the variation in sputter rate versus microwave power, bias voltage
and downstream distance. Potential problems associated with sputter cleaning
such as substrate backscattering and redeposition have been reviewed by

Vossen [70). Recent work by Raynaud and Pomot [71] on the cleaning of silicon

52

surfaces by argon ECR plasmas with low ion energies (< 100 eV) at a pressure of
0.5 mTorr has shown it is feasible to obtain a silicon surface exempt from
carbon and oxygen contaminants.

In this chapter experimental sputter oxide removal is investigated by
studying both the sputter rate and the uniformity of 8102 removal from oxidized
silicon wafers using a multipolar ECR argon discharge. Additionally. this
chapter correlates the sputter results to plasma measurements. ’leo permanent
magnet based ECR systems with different diameter discharges were used. The
oxide thickness versus position was measured with an ellipsometer both before
and after the sputter removal process in order to assess uniformity and
sputtering rate of 3 and 6 inch wafers. The sputtering rate and uniformity was
studied across a parameter space consisting of variations in microwave power,
pressure, induced substrate bias and substrate position.

The oxide removal results are correlated with plasma properties including ion
densities measured with Langmuir probes. The uniformity results are compared
with the ambipolar diffusion model proposed by Hopwood [3]. This analytical
ambipolar model introduced in Section 2.3.3 provides a simple method of
predicting charge density variation in the radial and axial directions. This model
is applied to the prediction of sputter etch uniformityusing an argon plasma
with the underlying assumption that sputter rate and uniformity are dictated by
the plasma charge density at the surface. The amount and uniformity of Si02

etched is then compared with the ambipolar model.

53

3.1.2 Experimental Methods

3.1.2.1 Sample Preparation and Processing

Thermally oxidized silicon wafers with oxide thickness of 500 - 700
Angstroms were scanned by a Gaertner wafer-scanning ellipsometer, providing a
map of refractive index and oxide thickness across the wafer surface before and
after sputtering. In the case of the 9 cm diameter discharge system ([4],[5),[8]),
sputtering was done at a pressure of 1.0 mTorr, a microwave power of 200 W, an
argon ﬂow of 7.5 sccm, and a DC bias of -50 V with the wafer located 8 cm
downstream from the bottom of the base plate. Sputtering runs were conducted
on the 25 cm diameter system by varying plasma conditions over the parameter
space of input power, chamber pressure and RF bias voltage. The nominal
conditions were 20 sccm argon ﬂow, 1 mTorr pressure, -50 V induced DC bias
and 500 W microwave power. Runs were conducted from 0.6 mTorr to 2 mTorr
pressure, from ~30 V to -80 V RF induced DC bias, from 200 W to 800 W
microwave power and with 2 locations of 6.3 cm, 8.3 cm, 9.8 cm and 11.8 cm

downstream.

3. 1 .3 Experimental Results of Uniformity and Sputter Rate

The sputtering uniformity versus) position on the wafer is primarily
determined by the incident ion density and energy across the wafer. For the
results reported here, the uniformity is determined by the ion density variation.
This was veriﬁed on the 9 cm system by Salbert [4] by measuring both the ion
density and the sputtering rate versus radial position at a position 8 cm

downstream from the baseplate. The density of argon ions was measured using

a double Langmuir probe. This result is repeated in Figure 3.1 for continuity
with the rest of this chapter. Figure 3.1 shows both the ion density and the oxide
removed for a 3 inch diameter wafer. The discharge conditions were -50 V
induced DC bias, 1 mTorr pressure and 200 W microwave power. Also shown in
Figure 3.1 is the plasma density versus radial position as calculated using the
ambipolar diffusion model, with a = 9.8 cm and b = 3.75 cm (see Figure 2.7),
given in Section 2.3.3. It is noted that a and b approximately represent the sizes
in the 9 cm system, i.e., their values are set to give the best ﬁt to experimental
data. As can be seen in this ﬁgure, the plasma density follows closely the
ambipolar diffusion model, and the amount of oxide sputtered away is
proportional to the ion density. Hence plasma density given by equation (2.55)
can be used to predict the uniformity and relative sputter rate of argon
discharges. For purposes of comparison, in this report, uniformity is expressed
as a standard deviation percentage, which is the standard deviation of the oxide
removed divided by the average amount of oxide removed. For the data shown in
Figure 3. 1, the uniformity was 9%. Equation (2.55) indicates that the uniformity
of the plasma density and hence the sputtering rate across a given diameter is
improved by using a larger diameter plasma source.

In the case of the larger 25 cm diameter discharge system, the results for the
3 inch diameter wafers versus variation in microwave power, pressure, RF
induced bias and downstream position are shown in Tables 3.1, 3.2. 3.3 and
3.4, respectively. Figures 3.2, 3.3, 3.4 and 3.5 show the plots of sputtering rate
vs. microwave power, pressure, RF induced DC bias and downstream distance
respectively. In all cases the standard deviation percentage uniformity for 3 inch

wafers was 2-4%. Of particular note is the improved uniformity with increase in

55

Comparison of Oxide Sputtered vs. Simulated and Measured Ion Density
200 . . . -

 

     

 

 

 

7 ' Sputtered Thickness .H
180 » : Measured Ion Density . ‘
[D
E I Ambipolar Diffusion -—-
o 160 P I
g .
g) 140 ' I. : 41
I
5 120 - : l
U 5
E 100 . : .
5. 80 . :
m 3
2% 60 , :
O 40 I :
I
20 . g ,
o - . . . 3 . . .
-40 ~30 -20 - 10 0 1 O 20 3O 4O
Radial Distance in mm

Figure 3. 1: Comparison of theory with experiment for 9 cm system.

56

Table 3.1: Sputtering rate and uniformity vs. microwave power

 

Power

Sputtering Rate

Watts Angstroms / min %
200 20.78 4.00

Uniformity

 

500

42.64

3.2

 

800

 

 

50.3

 

3.19

 

 

Table 3.2: Sputtering rate and uniformity vs. pressure

 

 

 

 

 

 

Pressure Sputtering rate Uniformity
mTorr Angstroms / min %
0.6 39.15 2.8
1.0 31 .76 3.46
2 23.15 3.6

 

 

57

Table 3.3; Sputtering rate and uniformity vs. RF induced bias

 

 

 

 

 

 

 

RF induced bias Sputtering Rate Uniformity
in Volts Angstroms / min %
~30 16.42 3.89
~40 3 1 .76 3.46
~50 42.64 3.2
~65 54. 14 3.2 1
~80 79.22 3.36

 

 

 

 

 

Table 3.4: Sputtering rate and uniformity vs. downstream distance

Downstream Sputtering Rate Uniformity
distance in cms Angstroms / min %

 

 

 

 

6.3 56.5 2.87
8.3 47.7 2.36
9.8 41 .35 2.12
l l .8 39.93 2. 12

 

 

 

 

 

Oxide Etched in A

55

50

45

40

35

30

25

2O

15

58

 

 

A j l I A

 

200 300 400 500 600
Power in Watts

Figure 3.2: Oxide etched vs. power.

700 800

 

Oxide Etched in A

59

 

40-

35-

f

3O

25)

 

It
.1

b

 

0.4

0.6 0.8 l 1.2 1.4 1.6 1.8

Pressure in mTorr

Figure 3.3: Oxide etched vs. pressure.

 

Oxide Etched in A

80

70

50

3O

20

10

60

 

t

 

 

RF induced bias in Volts

Figure 3.4: Oxide etched vs. RF induced bias.

 

Relative Etch Rate

61

 

0.9 -
0.8 .
e'l‘z —
0-7 ' Experimental points .

0.6 -
0.5 .

0.4 >

 

0.2 -

 

)-
p
)-
p
)-

 

 

0.1 -
0 2 4 6 8 10 12

Downstream Distance in cms

Figure 3.5: Comparison of theory with experiment for downstream points.

62

downstream distance. The improved uniformity is accompanied by a reduced
sputtering rate further downstream. The sputtering rate decrease for position
further downstream is predicted approximately by the plasma density given by
equation (2.55) and as shown in Figure 3.5. The uniformity also improves as
pressure is reduced as given in Table 3.2. This is as expected since the lowering
of pressure increases the mean free path between collisions resulting in
improved uniformity.

Table 3. 1 shows that the sputtering rate increases with microwave power.
This occurs because the plasma density increases as the microwave power is
increased [72]. Lastly, the sputtering rate decreases as the pressure is
increased. This occurs because the plasma potential decreases as the pressure
increases [73]. In particular it should be noted that the energy of the ions hitting
the surface is approximately given by the sum of the plasma potential and the
induced DC bias.

The comparison of sputtering uniformity vs. radial distance with the
ambipolar model given in Equation (2.55) is shown in Figure 3.6 for 3 inch
wafers. It can be seen that the sputter rates follow the ambipolar predictions
closely both in the radial and downstream directions. For the 6 inch diameter
wafers the variation in uniformity across the wafer versus radial distance is
shown in Figure 3.7. The most oxide was removed in the center with smaller
amounts removed near the edge as predicted by equation (2.55). The oxide

removal pattern is it: symmetric and the uniformity was approximately 5.4%.

63

 

 

 

 

 

2: 6.3 o
260 h z: 7.3 +
Z: 9.8 o
w 240 -
E
b v v 1" A 0 ° —¢
ED 220 -
E
U 200 »
93
8 ..... * ...... * ...... f"”‘" ..§.. .3... s ‘5}
S. 180 -
a)
O
'c --------- a. ---------- a ---------- B ---------- D ~~~~~~~~~~ ﬂ ---------- ram-g
g 160 .
140 ‘ ‘ ‘ ‘
0 0 5 1 1 5 2 2 5 3 3 5
Radial Distance in cms

Figure 3.6: Comparison of theory with experiment for 3 downstream points.

 

 

 

 

 

Ambipolar Diffusion—
] - : : - . _ Experimental Points- '
7., 0.8) ‘
8
E
0.1 0.6 r ‘
o
‘5’
o 0.4 -
5
.5!
o
m 0.2]
O A a n n
0 1 2 3 4 5 6

Distance in cms

Figure 3.7: Comparison of theory with experiment for a 6 inch wafer.

65

3.1.4 Summary of Downstream Measurements

The sputter proﬁle approximately follows the measured plasma density and
the ambipolar diffusion calculated proﬁles. For a 9 cm diameter ECR discharge,
the standard deviation percentage uniformity for 3 inch wafers was
approximately 9%. For the 25 cm discharge system the uniformity across 3 inch
wafers was 2-4% and less than 6% for 6 inch wafers. At the nominal conditions
of 500 W microwave power, 9.8 cm downstream, ~50 V RF induced bias, 1 mTorr
pressure and 20 sccm argon ﬂow the sputtering rate was about 40 Angstroms/
min. The sputter rate increased, as expected, with increasing DC bias and
increasing microwave power. The sputter rate decreases with increasing
pressure, which may seem counter intuitive since plasma density increases with
pressure. However, the rate decrease is attributed to the fact that plasma
potential decreases with pressure thereby reducing the ion energies and sputter
rate. The experimental sputter rates agreed approximately with the predictions
of the ambipolar diffusion model in the radial and the downstream directions for
both 3 inch and 6 inch wafers. Some discrepancy between the experimental data
and the ambipolar diffusion model towards the edge of the wafers exists.

In conclusion, the sputter removal of oxide from silicon wafers was
demonstrated with useful sputter rates and good uniformity using a 25 cm
diameter ECR argon discharge. The ambipolar diffusion model of the plasma
processing chamber proved useful for predicting the scaling properties of plasma
reactors. However, the ambipolar diffusion model is limited in that it does not
predict the plasma potential (i.e., ion energy) and it assumes a uniform plasma

density existing at the source which is known to have non-uniformities.

66

It should also be noted here that the ambipolar model applied in this chapter
is useful in estimating the discharge characteristics of downstream regions in
the case of radial symmetry and the absence of conﬁning magnets. In studies
involving source regions and/ or in the presence of magnets, a more complicated

and detailed simulation detailed in subsequent chapters will have to be done.

3.2 Compact Ion Source

The compact ECR source shown in Figure 3.8 has been extensively
characterized in [24]. This microwave cavity has an outer wall diameter of 5.8 cm
and a height of 12.4 cm feeding an ECR plasma generation region of 3.6 cm
internal diameter and approximately 3 cm height. The ECR zones are created by
3 annular ring magnets of pole strength 0.25 Tesla. Experimental
measurements of a helium discharge [74] showed charge densities ranging
between 3.8x109 - 6.2x109/cm3 for pressures between 0.07 - 0.2 mTorr, input
power between 90 ~ 115 Watts, and electron temperatures between 8 ~10 eV.

Field pattern measurements of the absolute magnitude and spatial variation
of the impressed electric ﬁeld have been conducted on this source using an
array of holes drilled in its wall [75]. Measurements verify that a standing wave
is created in the coaxial section of the cavity. The ﬁeld is measured using a
calibrated micro-coaxial probe which reads power from which the electric ﬁeld is
calculated. It has been shown that a maximum electric ﬁeld of 40 kV/m can be
obtained at 130 W input power. Investigations of this plasma source and another
small diameter microwave plasma source by Mahoney [23], [76], have shown
that these sources operate (in the pressure range (0.1 ~ 10 mTorr)) with a free fall

diffusion behavior at lower pressures and with a Schottky diffusion behavior at

67

60.58 GB mom 53:50 ”we. 959m

Edzoo
52— 2.
925235
.555 025%
mmmzéo <2m<d
msmzo<2 azmz<2~ma mob:ozo%zzE.z<

mmPZm—U

w...
.__ t

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

West) (H

 

 

 

 

OZ—AOOU ~=<

HE

 

WZS w<0
Qmmm m<O 1"

NUMDOm
MES/Om EOE
X<OU 230 On

68

higher pressures. The understanding and optimized design of these small
microwave plasma sources has in the past been accomplished using knowledge
and scaling techniques derived from the free-fall diffusion and Schottky
diffusion models. As these sources are further studied and optimized, the limits
of the application of these analytical models are reached and more

comprehensive numerical models are needed.

Chapter 4

A 3D3V Plasma Simulation System

 

4. 1 Introduction

This chapter introduces the model developed in this research to simulate
plasma generation sources in three dimensions. The implementation of non-
terrninated grids to improve numerical stability is presented. The methodology
used in developing a complete self-consistent EM-plasma model and the details
of combining the two is also presented. Methods to take advantage of the

inherent parallel nature in the simulations are also presented.

4.2 Design of the Plasma Simulation

Particle type plasma simulation are by nature very compute and memory
intensive. The fundamental constraint of numerical simulation of plasmas is
that the grid spacing in each of the simulated dimensions should be less than

the Debye length Ad (in cm) given by

Te
Ad = 743 l-v—e . (4.1)

where Te is the electron temperature in electron volts and Ne is the background
charge density in cm'3. This restriction presents fundamental problems on the

size of plasmas that can be simulated on conventional desktop computers. For

7O

example, a plasma at an electron temperature of 7 eV and charge density of
7x109cm'3 corresponds to a Debye length of approximately 2x10'2 cm. In order
to simulate a small cavity of 1.8 cm radius and 3 cm length, one would require
nearly 80 grids in the radial and 140 grids in the axial direction. With these
restrictions, one cannot simulate the complete cylindrical structure on most
conventional computers. Therefore, a simulation has to take advantage of the
inherent symmetry of the structure as discussed in Section 2.4.

The iteration timestep of a plasma simulation is required to be < 21t/(Dpe.

where rope is the plasma frequency in rad/ sec given by

1

2 2
neq
we=[ ] =56.5n
P "1880

q is the electron charge and me is the mass of the electron. Further, the timestep

. (4.2)

 

N NIH

should be small enough that no particle should be able to traverse a distance
larger than the Debye length. This ensures that Debye shielding phenomenon is

maintained in the plasma being simulated.

4.2. l A Non-Terminated Grid Design

The example given in the Section 4.2 leads to another inherent problem
present in simulation in cylindrical co-ordinates. A large number of gids in the
radial direction directly leads to a large difference in the volumes enclosed by the
grids in the center to the ones further out in the radial direction. This can lead to
large instabilities in the solution of plasmas in general and Poisson’s equation in

particular. In order to reduce such instabilities, a different grid scheme has to be

71

developed. This section introduces one such grid scheme developed for this
research.

The main aim behind the development of this grid scheme is to ensure that
the volumes enclosed by the grid points are of the same order of magnitude. This
is accomplished by having different number of grid points in the it) directions at
different radial points as can be seen in Figure 4.1. It can be seen that as the
distance increases in the radial direction, the number of q) grids also increase
leading to the equal volume design requirement. This sort of grid design presents
new challenges in determining the grid point closest to the present location of a
particle and in locating the closest grid points while performing CIC type
calculations. Further, the elementary volumes for these grids also have to be
calculated carefully so as to prevent incorrect charge density calculations.

In order to achieve this end, a linked list type neighbor scheme was designed
in which each point maintains a list of all the indices of its neighbors in the
three directions. In addition to these, the grid points in the radial region where
there is a discontinuity in the (p direction also maintain the indices of the two
closest grid points in the lower radial direction. The grid point that is closest to
the particle in all three dimensions is considered the operating point for
calculating the contribution of the particle to the grid and vice-versa. This
calculation is the same as the conventional CIC calculation when the particle is
in a cell between radial points with the same number of it: grids. However, when
the particle straddles non-terminated grid boundaries as seen in Figure 4.2.
extra computation has to be performed to calculate CIC contributions. In this
case. the elementary cell contains 10 grid points instead of the conventional 8

and the location of the particle in the c direction determines which half of the

 

 

 

 

 

 

l: Non-terminated grid scheme.

73

 

 

 

5
2 4
/
/ 3
/‘ .
/ ;
Particle g,
3‘
f 3
i
7 ‘ 9
/
8

 

Figure 4.2: Elementary cell in non-terminated scheme.

74

elementary volume the particle is considered to be located as can be seen in
Figure 4.2. In the ﬁgure, the location of the particle determines that the particle
belongs to the volume created by the grid points 2, 4, 5, 7, 9,10 and the pseudo
points x’ and y'. The value of the potentials at the two pseudo grids points are
interpolated from their respective ¢ neighbors.

This grid design proved to be very successful is solving serious problems of
stability of the simulation at the center. In addition to this, this type of grid
structure reduced the overall number of gids required to simulate the structure
by over 50%. This allowed for the design of larger structures, or alternatively,
higher density plasmas (smaller Debye length). However, it should be noted here
that these advantages came at some computational overhead required in

locating and maintaining the grid points.

4.2.2 Poisson’s Equation Solution

Two dimensional rip and rz ﬁnite-difference equations developed in [7] for
Poisson’s equation (2.5) can be extended into three dimensional cylindrical

coordinates as follows

 

 

 

2r. 1
-pj,k,l 1+5
2 (Ar2).Ar (Vi+1.kI‘Vj.k.t)"
J j+1
2
2r 1 2 (
j—- (V. — V. +V. 4.3)
2 (V'kz‘V'-1k,l)+ ,,k,1+1 1,21 1.2.1-1)+
(Ar2)J-Ar.1 1.. I ~ (A22),
1.—
2
2Ar

(Ar2)j (ACP) ﬁr}. WM” 1’1— 2V1} k. l + V“- 1. 1)

75

where V3.“ is the potential at grid point (i,k,l) corresponding to the three
directions (no.2), (A¢)k is the angle between grids (i,k+1,l) and (i,k,l), r} is the radial

distance to the fh grid point, 5+1” = 1/2 (0+ 1+0).

Ar. 1 = er—rj (4.4)
1+-
2
and
(M),- = r? l-r? 1 ”-5)
1+5 1’5

with the condition j > 1. Numerical solution of Poisson’s equation in cylindrical
coordinates gives rise to a peculiar problem at 1 = 1 corresponding to r = 0. In
this case the k points in the c direction are essentially the same point. Again

Birdsall’s [7] 2-D version can be extended, with (A12)0 = r1/22, as

 

 

2r
1
“90,21 _ 2 V V + (V0,k,l+1_2V0,k,I+V0,k,l—l)
e — (Arz) Ar ( l'k'l— 0'“) (A22)
0 1 z
2 (4.6)
V1 lat-V0 kl
Q0 = , Ar ’ ’ r1A¢
k l 2
2

with the second part of the equation being a summation of all charge at origin.
This equation is equivalent to the statement that all it) points at the origin are the
same. E, and E, the electric ﬁelds in the r and 2 directions, respectively are

calculated similar to equation (2.32). However E, is calculated as

(E) = (VWH'VW)

 

(4. 7)

since rfii) is the grid spacing in the (t: direction. These equations are modiﬁed
slightly for non-terminated grid points by assuming a pseudo point in the lower

radial direction as described earlier.

76

4.2.3 Plasma Kinetic Model

A realistic simulation of plasma dynamics should include the collisions
undergone by the different particles in the system. These collisions which can be
broadly classiﬁed into elastic and inelastic types, lead to a change in the ﬁnal
state of the particle being considered. The different collision types that are
considered in this simulation are:

l. Electron-neutral ionization collisions,

2. Electron-neutral momentum transfer collisions,
3. Electron-electron collisions,

4. Ion-neutral momentum transfer collisions, and
5. Ion-ion collisions.

Of the above listed collisions, all are considered elastic except for (I) which is
inelastic leading to formation of new particles. Monte-Carlo methods introduced
in Section 2.3. 1.2 are used extensively to determine the occurrence of a collision
and the ﬁnal state of the particles after collision. The details of implementation

of the collisions are explained in the sections that follow.

4.2.3. 1 Particle-Neutral Collisions

One of the many collisions that can occur between a charged particle and
neutrals is an elastic momentum transfer collision. This is a momentum
randomizing event leading to the overall change of the particle velocity from a
directed to a less directed state. This type of collision is very important in higher
pressure plasmas wherein they play a major role in sustaining the plasma by

collisional or joule heating. The momentum randomizing nature of these

77

collisions lead to a change in phase between the particle motion and the driving
microwave ﬁeld thus enabling the particle to gain energy from the ﬁeld.
The general momentum transfer collision frequency is given by the formula

v = "n<°mV) (4.8)

me

where om is the cross section for momentum transfer and v is the velocity and
<omv> denotes the average value of the product over all energies. Plots of cross
section vs. energy for ion-neutral and electron-neutral collisions for argon and
helium can be found in (77). Equation (4.8) is multiplied by the iteration
timestep, At, to obtain the number of each type of collisions per iteration. These
numbers are used in the Monte-Carlo method to determine the occurrence and
type of collision as described earlier.

The determination of post-collision velocity components of the particle also
makes extensive use of random numbers. Since this is an elastic collision, the
total energr of the particle does not change, however, the particle loses some of
its directionality. Let vx, vy and v2 be the velocity components of the particle
before any collision and let r1 and r2 be random numbers between 0 and l. The

scattering angles in the azimuth and the elevation are then calculated as follows

it = 21:er
(4.9)
9 = acos(1—2r2)

and then the new components of the velocity, vx’, vy’ and vz’, are calculated as

follows
vx’ = Vmgxcosqix sine
vy’ = Vmgx sintp x sine, (4.10)
vz’ = Vmagxcose

78

where VM = (v: “:3 + v: is the absolute value of the velocity before and after the

8

collision.

4.2.3.2 Charged Particle-Particle Collisions

Charged particle collisions of the type (3) and (5) are governed by long range
coulombic interactions and cross sections for momentum transfer for collisions
of this form are given in [6]. The calculation of the ﬁnal states from the initial
states of the two particles is done similar to the method detailed by Jacoboni and
Luigli [32]. The calculation of the ﬁnal states P and 2'0 from the initial states I
and 7:0 are done by ﬁrst calculating the initial and ﬁnal relative wave vectors, g

and g' , respectively as follows

2 = t—l‘co
2' = P-Po.

(4.11)

The azimuthal angle between 3' and g is taken at random between 0 and 21:.

the elevation angle is calculated as

cosO = 2r (4.12)

1+g2(1—r)/2.a,2

 

where r is a random number between 0 and l, g = m = )ng and id is the Debye

length. Finally the new states, P and PO , of the particles are calculated as

7"0 = £0415 (F-P) .

(4.13)
I? = t+0.5(g'—g)

The collision cross-sections and frequencies for these type of collisions are

tabulated in [781.[79].

79

4.2.3.3 Electron-Neutral Ionizing Collisions

Unlike the collisions discussed earlier, ionizing collisions are inelastic. After
an ionizing collision, the original electron involved in the collision loses the
amount of energy required for ionization, termed as ionization potential Em, and
results in the creation of a new electron-ion pair. This type of collision is the
source of fresh charged particles to compensate for those lost due to different
mechanisms. The values of the ionization rates depends on the background
neutral density and uniquely deﬁnes the qualities of the plasma with respect to
the power required to maintain a certain density. A typical ionization cross
section curve is shown in Figure 4.3. As can be seen in the ﬁgure, the magnitude
of the cross section changes with the energy of the electron showing an initial
increase and then a decrease with respect to energy. The magnitude of Em for
argon and helium are 15.76 eV and 24.588 eV, respectively. The collision

frequency is calculated by numerically integrating the equation

vi = Q; o(E)F(E)EdE (4.14)
me

8

where E is the energy 0(E) is the collision cross section as a function of energy
and F(E) is the distribution function. In particle type simulations, the number of

ionizing collisions occurring per iteration is calculated as follows

v. = nno(E)V

mag

(4. 15)

where nﬂ is the neutral density, Vmag is the magnitude of the velocity of an
electron with energy greater than Eth, At is the iteration timestep and Nae, is the

number of ionizing collisions per iteration. If this number is less than 1. a

80

 

 

 

 

le~16_ - - -ﬁ----- - e------- - r------
I t
L .
' i
16-17 : l
or I I
E t .
o _ .
E
b D
P I
le~19 - - . . u
0.1 l 10 100
E! Energyin eV

Figure 4.3: Example ionization cross section plot.

81

random number is generated and an ionization is assumed to occur if its value is
less than Nita. If the value Nae, is greater than 1, an integral number of new
electron-ion pairs are created and the remaining fraction is treated using the
Monte Carlo method.

Ionizing collisions result in the formation of a new electron-ion pair and in
the reduction of the total energy by the amount equal to the ionization potential
of the background gas. The total energy of the three particles after the collision,
Etot, is given by

Etot = oid’Eth “-13

where Edd is the pre-collision energr of the electron causing the ionization.
Since the average energy of electrons is much higher than that of the ions and
the neutral gas, it is assumed that the newly created ion of the electron-ion pair
is essentially at rest, and Em is distributed randomly between the two electrons

as follows

E1=Et

0

E2 = Etot(1"’1)

x r
t 1
(4.1 7)
where E1 and E2 are the post-collision energies of the two electrons and r, is a
random number between (0.1). The velocities of the two electrons in each of the

three directions is calculated similar to equations (4.9) and (4.10).

4.2.4 Maxwell’s Equation Solution

A complete 3D3V simulation of microwave plasmas involves the self-
consistent coupling of Maxwell’s equation with particle motion. The microwave E

ﬁeld driving the plasma is calculated using a Finite Difference Time Domain

82

technique (FDTD) to calculate the ﬁelds at the next iteration timestep. This
involves solving equations (2.1) and (2.2) on the simulation grid structure. The
plasma acts as a load providing the current in equation (2.2). The design and
details of the Maxwell’s equation solver is discussed in considerable detail in

[41].

4.2.5 Self-Consistent 3D3V Plasma Simulations

A self-consistent 3D3V plasma simulation involves solving plasma dynamics
along with Maxwell’s equation. While individually the plasma and Maxwell
equations simulation can be done with reasonable effort, the combined
simulation provides new problems. In some cases, the solution of Maxwell’s
equation and Poisson’s equation have contradicting requirements and as a
result the combination of the two can prove challenging. The following sections
detail the methods to successfully accomplish a self—consistent plasma

simulation.

4.2.5.1 Geometrical Requirements

The fundamental requirement of a Maxwell’s equation solver is that the
values of the grid spacing and the timestep be such that the EM wave being
solved does not advance more than a grid distance per iteration. Mathematically,

this requirement for a solution in rectangular co-ordinates is given by
l

.[ 2.2.1.]
sz Ay2 A22

A: < (4.18)

 

 

83

where c is the speed of the EM wave, At is the iteration timestep and Ax, Ag and
A2 are the grid spacing in the x, y and 2 directions. respectively. A further
restriction on At is that it should be less than the value of the EM wave’s time
period.

These requirements dictate that a stable Maxwell’s equations solver have a
very small timestep and a relatively larger grid spacing as compared to the
plasma simulation requirement. Simulations wherein Maxwell’s equation
timestep is a thousandth of the plasma timestep and its grid spacing is four
times the plasma grid spacing are quite common. This mismatch becomes worse
when simulating a higher density (smaller Debye length] plasma.

Further, the Maxwell’s equation solver designed by [41] solved the equations
in regular cylindrical coordinates whereas the plasma simulation used the
staggered grids introduced in Section 4.2.1. The use of different grid structures
leads to the requirement of interpolation and extrapolation of the data between
the two programs.

In addition to all these requirements, there is a hidden instability in the
discrete solution of Maxwell’s equations. For example. the solution for E,, the

radial component of the electric ﬁeld, at the grid point (i+1/2J,k) is given by

E"+l(i+l,j,k)=a" (i+1,j,k)+ A’ (4.19)
r 2 r 2 (.1 . )
£1+§,J,k

 

1 l
n+- n+-
2 . l. l ) 2(. l. I )
x HZ (l+§,j+i,k -HZ l+§,j-§,k -
r 1A¢
1+5
n+1 n+1

2 . l. I) 2(. l. I)
H¢ (I+§,],k+§ -H¢ i+§,],k—i j"(i+.1.jk)
Az r 2’ ’

 

where Hz and H4, are the z and it) components of the magnetic ﬁeld and J, is the
radial component of the current. In the above equation the current term can be

modiﬁed by using an equivalent conductivity of the form

120+ 51k) = of(i+%,j,k)xef" (i+%,j,k). (4.20)
to give a modiﬁed form of equation (4.19) as follows

n+1 1

E (i+-,j,k) = E: (14%.);ij 14:045.]; k)X-(——3Al’ . (4.21)
r 2 8 i+§,j,k

1 l
n+- n+-
2(. 1. l ) 2(. l. l )
A! HZ l+§,j+§,k “-112 l+§,j-§,k

x _
. l . r A¢
e(¢+§,},k) ”é

 

n+1 n+1

2. l. I) 2(. l. 1)
H4, (l+§,],k+i -H¢ I+§,j,k-i
Az

 

Since the instantaneous current in the plasma region is calculated using
equation (2.35), a momentary noise in the current calculated using
of(i+%,j,k)x (Ar)/(e(i+%,j,k)), leads to a value greater than unity which
can produce instabilities. This requirement also dictates that the Maxwell’s

equation timestep be as small as possible to aid stability of the simulations.

4.2.5.2 Combined Plasma-Maxwell Simulation

In view of the conﬂicting requirements of the two parts of the simulation, it
was decided to keep the two programs separate and to communicate only the
relevant details between the two programs. The plasma program calculates the

instantaneous currents at each grid point and interpolates them onto the

Ma

9
cc

*7
d1

ix:

01

I?

(r)

85

Maxwell's equation grid system and the Maxwell’s equation solver calculates the
new E and H ﬁelds using the current and communicates the new values to the
plasma simulation as can be seen by the ﬂow chart in Figure 4.4. The vertical
dotted line divides the plasma and the EM subprograms. The horizontal dashed
arrows signify the movement of data between the two subsystems. The different
timestep requirements of the two programs has another effect on the overall time
required for the simulation. As can be seen in Figure 4.4. the EM program does
a number of iterations for every iteration of the plasma program. Therefore an
optimum trade-off between speed of convergence and stability has to be located
very carefully. Further, if there is a large difference in the magritudes of the two
timesteps, a small noise in current introduced to the EM program tends to get
ampliﬁed over the iterations as explained in the previous chapter.

The grid currents are calculated using Marder’s correction as explained in
Section 2.3.4. PVM routines were used to synchronize and communicate data
between the two programs. It should be noted here that the nature of the
program is strictly serial, i.e., the two programs alternate in their execution and
PVM is used here only as a data communication tool. The two programs use no-
blocking receive to ensure strict serial fashion of execution ﬂow and
communicate using ﬁle I/O. The EM program writes to a mutually agreed ﬁle
and sends a prearranged message to the plasma program which reads the ﬁle

upon receipt of this message and vice-versa.

86

EM Plasma load

 

 

 

Initialize Geometry etc. Initialize Geometry etc.

 

 

 

 

 

 

l

Calculate Charge,
Current densities,
b‘ackground Potential ‘—
e c.

I
Send:J,. J¢, .1z
4"- ------------- . ---------------
v 1 _______ ,1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

,.
1
Receive Current I '
. Densities 0f the | : Solve Poisson’s E n.
load. | g for the Static E ﬁe d
I i ,
l I l j
I 1
I i
Solve for new EM ﬁelds ,
in Maxw 11’ E . | Calculate Total
“S g e s qn , i E Field
I
' 1
_______ .1 |
Send new EP:E¢, Ez {
i
l
I Move Charged Particles
: under the new E Field -——*
i
I
l
l
l
I
I

Figure 4.4: Combined plasma-EM simulation ﬂowchart.

87

4.2.5.3 Power Absorption Calculations

Power absorbed by the plasma can be calculated in many different ways. One
way is to calculate the value of the Poynting integral at a plane immediately
above the quartz plane. The Poynting integral is written as

Pin = fawn) 0ds. (4.22)

where P is the power in watts and d5 is the area in the rqi plane immediately
above the quartz chamber.

The instantaneous power absorbed by the plasma can also be calculated by
summing the product of current and ﬁeld strength at every grid point. This is

equivalent to the equation

Pa... = I (J-E)dv = ”2* (10.1; k) -E(i.j. k)) xAvm. k) (4.23)

where Av(iJ,k) is the incremental volume at the grid point (i,i,k).
Another measure of the power required to sustain a plasma is the loss of
energy from the electron gas due to ionizations and loss of particles to the walls.

This can be written in the form of an equation as

E10” = zenergy-i- 2 El}. (4.24)
1

as! ionizations

where E0, is the ionization threshold and E1055 is the energy lost in Joules. The

power lost is calculated by averaging the energi loss over a microwave cycle. In

stead?

power.

4.3 1

limit
solutii
pragrz
[n We

slmul

4.3

ch

pr.

 

88

steady state the loss of power from the plasma is compensated by the input

power.

4.3 Efforts to Speed up the Simulation

The combined 3D3V simulations tend to be very CPU intensive and the serial
nature of the progam shown in Figure 4.4 adds the time taken to converge to a
solution. The smaller EM timesteps needed to ensure stability lead to the EM
progam spending considerably more time executing than the plasma program.
In view of these problems, it is imperative that efforts be made to speed up the

simulations. This section presents two such methods devised in this research.

4.3.1 Cycling the EM Program

It was theorized that since a stable plasma solution does not show rapid
changes in behavior but instead shows a steady move towards a solution,
considerable speed gains could be made by “turning” the EM progam on and
off. The EM program was let to run a certain number of integral microwave
cycles and during this time the plasma program stored all components of the
electric ﬁeld for every plasma iteration. The EM program was then paused while
the plasma program used the stored values of the ﬁelds to move particles around
for the next few cycles. Then the EM program was restarted with the new values
of the current. It was thought that the new values of the current calculated
would allow the EM program to get back in synchronization with the plasma

program.

89

One of the major aspects of this implementation is the decision on the
optimal number of cycles required to keep the results the same as the serial
version and yet obtain the maximum speed-up possible. Initially it was thought
that cycling the EM program every microwave cycle would be the ideal choice
since this allows the EM program to be in relatively more communication with
the plasma program. The accuracy of the implementation was decided by
comparing the time evolution of particle and ﬁeld characteristics of the cycled
programs with their serial counterpart. In particular. the time evolution of
particle number density, particle energy and EM ﬁeld strength were considered
as indicators of accuracy.

The time evolution of the absolute E ﬁeld strength ( (53+E:+Ei ), averaged
on all the grids at every iteration timestep, for the cases when the EM program
was cycled on and off every cycle and every 3 cycles was identical as can be seen
in Figure 4.5 for the plasma source described in Section 3.2. Note that the three
plotted lines in the ﬁgure lie on top of each other and are indistinguishable.
However, the time evolution of particle quantities present a different picture as
can be seen from Figures 4.6 and 4.7. It can be clearly seen that the choice of
cycling off and on every cycle is clearly the inaccurate one since it leads to
considerable error in electron (most easily effected) number and energy.
However, the choice of cycling the EM source every 3 cycles gives results that

closely track the serial version. Clearly there is an optimal number between too

many and too few cycles to ensure accuracy.

IE2|

90

3e+09

 

Note that the three platted liners lie on top of each other: Cycle = 1 _

Cycle=3—

2.5e+09 531131 “" .

U

2e+09 - -

1 .5e+09

Ie+09

 

 

5e+08

A

0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
Time (10'9 seconds)

J I I

 

Figure 4.5: Time evolution of ﬁelds for the serial and cycled cases.

Energy in eV

12

10

91

 

r

   

Cycle = l, Electrons —
Cycle = l. Ions -—
Serial Electrons ——

Electrons

 

 

 

 

Serial Ions ------
' Cycle -- 3. Electrons - -.
Cycle a 3. Ions ....... d
I I . "ﬁs [“W I
Ll." v Q: . '
Ions
0 0.2 0.4 0.6 0.8 l 1.2 1.4 1.6

Time (10'8 seconds)

Figure 4.6: Energt of particles with source cycled on and off.

1.8

Number [103’

92

 

 

 

    

 

 

 

140 U I I I' Y t i I
Ions Cycle in l, Ions —
Cycle = l, Electrons —
—-—- Serial Ions ~-—-
135 >21. Serial Electrons --- ‘
2... CYCIC . 3. Ions ........
\ Cycle = 3. Electrons ------- '
130 - -
125 _ Electrons '
120 L i
115 l 5' ‘ ‘ ‘ ‘ ‘ ‘ ‘
0 0.2 0.4 0.6 0.8 l 1.2 1.4 1.6 1.8

Time (10'8 seconds)

Figure 4.7 : Number density of particles with source cycled on and off.

93

4.3.2 Implementation of a Parallel EM-Plasma Version

The ﬂowchart shown in Figure 4.4 is strictly serial and it was theorized that
some speed-up may be obtained by modifying the ﬂowchart as shown in Figure
4.8. Instead of following the serial fashion of the complete simulation, this
version theorizes that a phase difference of one timestep may correct itself over
the complete run of the simulation. The ﬂowchart in Figure 4.8 is identical to the
one in Figure '44 except for the coupling between the two programs. Instead of
following a strictly serial fashion where the EM program calculates the new
ﬁelds, then sends the ﬁelds to the plasma program which moves the particles
and calculates the new current components and so on, this version forces the
EM code and the plasma code to execute in parallel. The two programs use
initial values and continue their execution and at the end of each iteration,
exchange data. This leads to the EM program generating the new ﬁelds from the
load current values calculated for the ﬁelds generated in the previous step.
Similarly, the plasma program uses the ﬁelds generated for the earlier timestep
to move the particles: This leads to the plasma -and EM program to'be out of
phase by one timestep. The electric ﬁelds generated by both these versions were
identical as can be seen in Figure 4.9. Further, the time evolution of particle
number density and particle emery closely track the serial case as can be seen
in Figures 4.10 and 4.11, respectively.

It is interesting to note here that even though the two programs appear to
execute in parallel, the speed gain is dictated by the slower program. For
example, if the EM program iterates 1000 times for each plasma timestep, the

time taken by the EM program dictates the overall speed. The plasma program

EM

 

 

Initialize Geometry etc.

 

 

 

 

 

Receive Current and
Charge Densities of the:

 

 

load.

 

 

Solve for new EM ﬁelds
using Maxwell’s Eqn.

-§ '— _ _

 

 

 

Send new EP E», E2

 

94

o---‘--‘---------‘-+---------‘---~.------‘----+---------‘-‘-‘----‘----‘-‘c

Plasma load

 

Initialize Geometry etc.

 

 

l

 

 

Calculate Initial Charge
and Current densities,
Background Potential
etc.

 

 

Send Jr, J¢, Jz l

 

 

Solve Poisson’s n.
for the Static E ﬁe d

 

 

 

 

Move Charged Particl
under the new E Field

 

Figure 4.8: A parallel EM-plasma implementation.

 

 

IE2I

3e+09

2.5e+09

2e+09

Ie+09

95

 

 

 

 

 

 

 

Note that the two plotted lines lie or; top of each other. Parallel ...
A Serial _
0 0.2 0.4 0.6 0.8 l 1.2 1.4 1.6 1.8
Time (10’9 seconds)

Figure 4.9: Time evolution of ﬁelds for the serial and parallel cases.

 

Number [103)

96

 

 

 

  
  
 

 

 

 

140 . . .
Ions
Note that the two plotted lines lie on top of each other.
135 Parallel Ions —
Parallel Electrons —
Serial Ions ---
Serial Electrons ---- ‘
130 .
Electrons
I25 -
Note that the two plotted lines lie on top of each other.
120 - .
L J . A m
0 0.5 l 1.5 2 2.5

Time (10'8 sec)

Figure 4. 10: Evolution of number density for the serial and parallel cases.

Energy in eV

97

 

I Y t I

Note that the two plotted lines lie on top of each other.

  
    

Electrons

Parallel Electrons ~ -
Parallel Ions —— .

Serial Electrons —
Serial Ions - ~ - .

Ions
' \ Note that the two plotted lines lie on top of each other. '

 

 

 

I A j I

0 0.5 l 1.5 2 2.5
Time (10’8 seconds)

 

Figure 4.11: Time evolution of enery for the serial and parallel cases.

98

would have ﬁnished one iteration and would wait for data from the EM program.
The two sub-systems have to be carefully designed to take full advantage of the

parallel nature of this algorithm.

Chapter 5

Downstream Plasma Simulations

 

5. 1 Introduction

A subset of the plasma-EM simulation model of the previous chapter is
applied to the modeling of the downstream plasma processing region in this
chapter. This chapter will demonstrate the techniques developed in Chapter 4 as
they apply to the modeling problem described earlier in Chapters 2 and 3.

The downstream region of plasma processing chambers of the type shown in
Figure l. 1 is essentially microwave ﬁeld free and is controlled by space charge
ﬁelds. The particles generated in the source region ﬂow into the large processing
region and expand to ﬁll the volume. In the process some particles are lost to the
walls leading to a build up of a plasma sheath. Any material placed in this region
will interact electrically and chemically with the plasma leading to sputtering
and deposition as the case may be.

This region when occupied by a low pressure plasma can be simulated using
a particle type method. In the absence of non-uniform 8,» ﬁelds, this region can
be simulated in two dimensions. This chapter introduces a 2D3V sirnulatiom of
the downstream region of a typical plasma processing chamber. The use of
different weighting factors for the charged particles in different regions to assist

stability is also detailed.

100

These types of simulations can prove useful in predicting the ﬁnal processing
characteristics by providing an estimation of the spatial variation and
magnitude of the different plasma variables like charged particle ﬂux, densities

and temperatures.

5.2 2D3V Downstream Simulation

The downstream region is considered uniform in the it) coordinate and the
simulations are done in the rz coordinate system. The area being simulated can
be seen in Figure 5.1. It is assumed that there is a fresh inﬂux of charged
particles from the source region that compensates for the loss of particles to the
walls. The particles are added to the sirnulatiom at a rate calculated ﬁ'om the ion

ﬂux density given by the equation [80]

 

_ kae m
r, .. 0.61N0 M (5.1)

l

where I“, is the average ion ﬂux, No is the constant ion density at the quartz
opening to the downstream region, and M, is the mass of the ion. Usage of
equation (5. 1) assumes that ﬂux exiting the opening in the quartz tube reaches
the walls in the downstream region and that the electron ﬂux is mual to the ion
ﬂux. If the area of the quartz opening is A and the timestep is At then the
average number of particles to be added per iteration is given by 1‘1AAt. Since
the number calculated in this fashion can be a fraction, the Monte-Carlo method
is used to determine the number of particles added per iteration. Each “addition”
indicates the creation of an electron-ion pair. These pairs are given random

locations in space within the area of the source opening and a Maxwellian

essse‘

101

I'llllllflfllllllllllll

Source Region

: Dome
A .
‘ F ,/ T

r
I

 

Von N eumamn 2

Condition

Downstream Region

 

 

Figure 5.1: Downstream simulation zone.

102

distribution in emery is assigied based on the assumed distribution
temperature. Further, the source boundary is considered reﬂecting such that
any particle going into the source region is reﬂected back. The boundary
conditions assumed can also be seen in the ﬁgure. The walls are assumed to be
conducting and grounded. The Poisson equation boundary condition at the

particle entrance boundary is a normal electric ﬁeld of zero.

5.2. 1 An Unequal Weighting Scheme

The non-terminated grid design for cylindrical structures introduced in
Section 4.2.1 uses unequal number of grids in successive 4) locations to aid
Poisson’s equation stability in the center. In the 2D case, the absence of the O
dimension makes it impossible to apply non-terminated grids. However, the
problem of the successively larger areas for outer grid points remains. Further,
an initially uniform distribution of particles gives considerably fewer particles in
the grid points closer to the center. These two problems lead to instability while

solving Poisson’s equation.

Plasma simulations use a weighting scheme to determine the amount of

Charge carried by a superparticle. The simplest weighting scheme involves

Providing equal weights for all superparticles as given by

Volx Ne

_ (5.2)
N tot

Where W is the weight of the superparticle, Ne is the charge density. Ntot is the

t-0tal number of superparticles and Vol is the total volume of the structure being

Silnulated.

103

This uniform weighting scheme leads to instabilities at grid points closer to
the center since there are fewer particles in this region and the movement of
some particles out of the region leads to large ﬂuctuations in charge density and
hence in the solution of Poisson’s equation.

In order to reduce these problems a non-uniform weighting scheme [81] has
been implemented in this research. This scheme involves dividing the simulation
area into a number of strips of areas that have similar size as can be seen in
Figure 5.2. In cylindrical co-ordinates. this leads to strips closer to the center
having more grid points. The weights of the particles are dictated by the region
in which they lie. This scheme leads to particles being located in the strips closer
to the center carrying correspondingly smaller weights. In this case the particle

weights are calculated as

Vol(i) xNe

W“) = New

(5.3)

where Wli) is the weight of the superparticle in strip t. Ne is the charge density,
Nmtli) is the total number of superparticles in the strip and VoIItl is the volume of
the strip i. Particles are distributed uniformly in space in each of the strips. This
(ﬁltration reduces to the simple equation (5.2) when the weights are all equal.
This weighting scheme has the advantage of reducing the instabilities in the
grids closer to the center. However, the computational overhead of the
inlplementatiom increases considerably due to the discontinuous nature of the
sCheme. Particles crossing strip boundaries have to be compensated for
Correctly. Monte-Carlo methods are used extensively to determine the status of

Particles crossing strip boundaries. Particles crossing from a lower weighting

104

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

’ ‘ > +—-—-> H
Strip 1 Strip 2 Strip 3 Strip 4

 

Figure 5.2: Unequal weighting implementation.

stip

run
the
cre.

[TIE

C0

105

strip to a higher weighting strip are annihilated if the generated random number
(between 0 and l) is greater than the ratio of the two weights. The reverse case of
a heavier weighted particle moving into lesser weighted zone is treated as
follows. The original particle is retained with the same emery, an integral
number of particles equal to the ratio of the two weights minus one is created at
the same location with the same emery. The fraction left over, if any, leads to the
creation of a new particle depending on the random number generated. This
method has to be followed since the different weights need not be exact
multiples of each other.

The unequal weighting scheme also increases substantially the number of
computations required to calculate charge density. The calculation of the
contribution of charge onto the background grids for each particle requires
determination of the weighting factor corresponding to the region of particle
location. The calculation of average particle enery is also more complicated
since the total average enery is now a weighted average of the individual enery
and particle weight and is given by

2(2150) )Wm
. j . _.

‘=‘ " "" ' “ (5.4)

Eavg 2( j NU))W(i)

 

l

where W(i) is the weight of the superparticle in strip i, E0) is the enery of the
particle in the strip i, N0) is the number of particles in strip i and Em,g is the

weighted average enery of the particles.

106

5.2.2 Poisson’s Equation Solution

The solution to Poisson’s equation is done in two dimensional rz coordinate
system similar to the method detailed in Section 4.2.2. As can be seen in Figure
5.1, all external boundaries are considered grounded and hence held at constant
zero potential. However, the plasma source region corresponding to the area
under the quartz dome, cannot be treated in this fashion. The electrons of the
freshly generated electron-ion pairs move away from the ions due to their higher
enery. This leads to an ambipolar ﬁeld being built which attempts to slow down
the electrons and speed up the ions. The Poisson’s equation solver should be
able to correctly resolve this ﬁeld.

In order to correctly simulate this effect, the grid points in the source region
were considered to follow the Von Neumann boundary condition wherein the
derivative of the potential in the z direction is considered zero. This is done by
assuming a pseudo-grid point outside the simulation region as shown in Figure
5.3. The part of Poisson’s equation pertaining to the z dimension becomes

2 (9‘1) —(-’—’)
3 V 82 1,1 82 1,—1

57 = 212 + - + ‘5-5’

 

where Az is the grid spacing in the z direction, ($91 1 is the variation of the

a_v

82
the pseudo grid point. After applying the Von Neumann boundary condition

($91.1 = —(g_:)l,-l' - (5'6)

equation [5.5) can be ﬁnally written in terms of internal points as

voltage at the higher 2 location and ( )1 1 is the variation of the voltage at a at

107

(19-1)
r #

z I “PA: '1"

(1.1)

Pseudo Point

 

 

 

 

 

 

 

 

 

 

 

 

Figure 5.3: Implementation of Von Neumann condition.

108

2
M: V(1,1)—V(1,0) (57)
322 A22

 

where WI, 1) is the potential at grid (1,1) and V(1,0) is the potential at the desired
location in the source region.

These modiﬁcations to the Poisson’s equation solver in cylindrical
coordinates increase the numerical complexity of the program. Further, the grid
point at the origin is a special case since it is part of the central charge column
and the source region. Proﬁles of the code running this simulation show that up
to 60% of the time spent per loop is spent in the Poisson’s equation solver.

Charge densities in the region of the source can be substantially higher than
the average charge density of the structure. The Debye length condition has to
be satisﬁed both globally and locally for overall stability. Therefore, the number
of grids per unit length in the r direction under the quartz dome has to be more
than those required for the rest of the volume. The increased number of grids

also increases the time required to solve Poisson’s equation.

5.2.3 Example Simulation

The 2D3V simulation system developed in this research was. applied to an
example system of cavity radius 9 mm, length 9 mm and quartz dome radius of
3 mm. The simulations were conducted assuming an electron temperature of 7
eV and charge density of 1 .0x109 /cm3.

The time evolution of particle density is shown in Figure 5.4. It can be seen
that after a certain time, the simulation comes to a steady state with the rate of

loss of particles being compensated for by the rate of gain from the source area.

Number ( 1 02)

109

 

32* . - . . - - .

' Electrons —

 

 

 

 

20 - - - - - - -
o 0.5 1 1.5 2 2.5 3 3.5 4 4.5

Time ( 10‘8 seconds)

Figure 5.4: Time evolution of number of particles.

110

The plasma potential in the simulation area is shown in Figure 5.5 and it can be
seen that the sheath around the grounded walls and the potential at the source
region are clearly resolved. The time evolution of enery of the particles can be
seen in Figure 5.6. From the ﬁgure it can be observed that the enery of the
electrons drops rapidly and then reaches a steady state after some time. The
spatial variation of electron density vs. 2 at a number of different r points can be
seen in Figure 5.7. The electron density is higher in the source region, constant
in the middle and drops off at the grounded edges. The variation of electron
density vs. r for a number of different 2 points is shown in Figure 5.8. The
variation of ion density vs. r and z is shown in Figures 5.9 and 5.10,
respectively. It can be seen that the charge densities in the grids near the quartz
dome are higher than the densities in the rest of the simulation area.

The simulations can also be performed assuming non-uniform source region
ﬂux. The results of electron charge density vs. 2 for a case where the input ﬂux
density showed a Bessel function dependence as seen in Figure 5.1 l, is shown
in Figure 5.12. It can be seen that the charge density follows the ﬂux density
variation in the source region and is highest in the center reducing towards the
edge of the generation zone. The selection of the Bessel function dependence was
arbitrary, any source region spatial variation could be the input to this
downstream simulation model.

The simulations presented in this chapter demonstrate the application of
numerical modeling techniques to plasma processing. The example developed in
this chapter was limited to a small simulation region size due to memory

limitations on the single CPU workstation used. The techniques described later

Volts

lll

 
 

6
4
01 Abnm}

r (mm)

Figure 5.5: Plasma potential in the simulation region.

Enery in eV

112

 

 

..
---------’----
--------——--------------
A J l A

Electrons ——
Ions - -

-----‘--‘l

 

 

 

O 50 100 150 200 250 300

Iteration

Figure 5.6: Time evolution of particle enery.

350 400

lll

 

 

 

 

 

 

 

 

 

 

 

20.
15
10-
.4
g s-
0. 78
6
_- 5
4
- - 23/
0123;A---12(mm
5 6 7 3‘0

Figure 5.5: Plasma potential in the simulation region.

Enery in eV

112

 

 

 

Electrons —
Ions - -

 

 

Iteration

Figure 5.6: Time evolution of particle enery.

Charge Density (cm'3)

113

~5e+08-

~1e+09.

 

-ln5e+09

~2e+09

 

Figure 5.7: Electron density vs. 2.

Charge Density (cm’3)

113

“564-08 )-

~le+09+

 

~l.5e+09

~2e+09

 

Figure 5.7: Electron density vs. 2.

‘

Charge Density (cm‘3)

~5e+08
~5.5e+08
~8.3e+08
~ 1 . le+09
l .38e+09
~ 1 .66e+09 ,
~ 1 .9e+09
~2.2e+09
~2.5e+09

114

 

Figure 5.8: Electron density vs. r.

Charge Density tom's)

115

 

Figure 5.9: Ion density vs. r.

Charge Density (cm’3)

3e+09

O

 

0
1

116

2.5e+09 -
2e+09 -
l.5e+09 -
le+09 .
5e+08 \M ANA
\ ‘ ‘ _ \ \

k

Figure 5. 10: Ion density vs. z.

Relative ﬂux density

117

 

 

 

0 n n n n n

 

 

0 0.5 0.1 1.5 2 2.5

Radial distance in mm

Figure 5.11: A Bessel function type ﬂux distribution.

Charge Density (cm‘a)

0
~ie+10
~2e+10
~3e+10
~4e+10
~5e+10
~6e+10

O

118

source region

 

: x - 1"”‘Qlll

, ”A

v/Ii"‘
‘t.

    

1
2 3

2 (mm) 8 1 r (m)

Figure 5. 12: Electron density vs. 2 with non-uniform source.

119

in Chapters 7 and 8 allow the methods of this chapter to be extended to larger

geometry structures.

Chapter 6

Simulation of a Compact ECR Source

 

6.1 Simulation of a Compact Ion Source

An example of doing a 3D3V self-consistent electromagnetic ﬁeld-plasma
simulation is presented in this chapter. A compact ion source which was
described earlier in Section 3.2 is simulated. The permanent magnet static B
ﬁeld in the plasma region is obtained by solving equation (2.3) beforehand [8].
Figure 6.1 shows a contour plot of the magietic ﬁeld lines in the rz plane. Of
particular interest is the 875 Gauss ﬁeld line around which ECR power coupling
to the electrons is expected to occur. For this particular magnet conﬁguration,
the 875 Gauss line is located close to the edge of the quartz.

The structure being simulated is shown in detail in Figure 6.2. The axial
coordinate of the cavity program is denoted z’ and the axial coordinate of the
plasma program is denoted 2 as shown in the ﬁgure. The plasma source
geometry and operating conditions are selected to permit the simulation to run
on a single CPU workstation. Simulations of an argon discharge were conducted
for this source assuming an initial background charge density of 1 .0x109/ cm3 at
a pressure of l .0 mTorr and electron temperature of 5 eV. Under these
conditions the Debye length is 5.25x10’2 cm. Hence, to simulate the plasma

generation region, 35 grids in the radial direction and 59 grids in the axial

120

0.025

f
2 0.015
0.01

0.005

121

875 Gauss

 

 

 

 

..... \ \. 33:: I . , J
~\. ‘ I '
''''' I, ' / 1.26103 —
’ , ~~~~ ' '~ I . 1e+03 ——
_ x ; f 875 ~—
0' d I I 700 .........
\ (I 600 , - _,
1 1‘ \ ‘ ‘ :‘l' / /3,’ 388 ..
. “““ ; \. " g .I g’ ( 150 ll.
”fur", s“ 1, \ “5’8 "..
1 . ; ' . ‘1.)i\
0 0.005 0.01 0.015
r ——->

Figure 6. l: B ﬁeld contours inside ion source.

12.4 cm

 

122

 

 

 

 

 

Center z‘
Conducto
g Cavity Walls
N
5.4 cm
< Microwave
’ Cavity

 

 

 

ECR
Magnets

 

 

 

 

 

<——3.0 cm—>

 

 

 

Plasma Region

Figure 6.2: Structure being simulated.

123

direction are required. A non-terminated grid for a 45 degree slice in the ((1
direction was setup.

The grid structure for the EM program in the plasma region had 18 grids in
the radial direction, 30 grids in the axial direction and 5 grids in the it) direction.
This leads to the EM program containing less than half as many total grid points
as the plasma program for the same region and so the data acquired from the
EM program was interpolated to the extra grid points in the plasma region. In
order to ensure stability, the EM timestep was set at 5x10”l4 seconds and the
plasma timestep was set at 2x10'll seconds. This leads to communication
between the two programs every 400 iterations as shown in Figure 4.8.

Simulations on a HP 735 took an average of three days to reach steady state.

6.2 Results

The time evolution of average enery for the structure is given in Figure 6.3.
This example simulation shows that the structure settles down to a steady state
electron temperature of a little over 4 eV. The power absorption of the structure
can be seen in Figure 6.4 as a greyscale map. The location of maximum power
absorption is given by the whitest area and can be seen to correspond to the
location of the 875 Gauss line. The simulation predicts that even though the 875
Gauss line is present all along the quartz, the conditions favorable for ECR
heating, where the E ﬁeld is perpendicular to the B ﬁeld, exist only in the region
closest to the microwave input. The time evolution of particle numbers is given
in Figure 6.5. It can be seen that there is an initial outﬂow of the more energetic

electrons but later, the combination of built up plasma sheaths and ionization

Temperature in eV

1.5

0.5

124

 

 

Electrons —
Ions ~ -

 

.---------------‘---------------------‘--‘----‘

 

 

 

Time (10'8 seconds)

Figure 6.3: Time evolution of average particle enery.

125

 

Figure 6.4: Power absorption locations

Number (103)

126

 

110

d
------"

Ions - -
108

106 -

104

102

100

98-

 

96

 

 

 

0 5 10 l 5 20 25 30 35

Time (nsec)

Figure 6.5: Time evolution of number of particles.

127

brings the number of electrons to a steady state. The plasma potential in the
simulation structure can be seen in Figure 6.6. The time averaged values of E,
along the edge of the cavity (maximum r location) is shown in Figure 6.7. The
electron enery distribution function for this example is given in Figure 6.8. The
enery shows a peak at approximately 0.5 eV and drops off rapidly at higher
energies. The self-consistent electromagnetic ﬁeld-plasma simulation model
calculates several important plasma source characteristics including plasma
potential, plasma sheath thickness, electromagietic ﬁled strengths, microwave
power absorption characteristics, and particle energies. The general techniques
of doing the EM-plasma simulation developed in Chapter 4 have demonstrated
themselves for the compact ion source simulated in this chapter. The next two
chapters present techniques that allow the extension of these techniques to
larger plasma sources and higher plasma densities where larger grid structures
and more superparticles are needed. These extensions involve the use of

advanced computing resources.

Potential (V olts)

128

  
 
 
 
 

-- .'.“ . - - ‘
v '1. - - ’ ' c - O- .
‘ >9 ' -‘ r ' ' '
_ . b ' -‘--- -“
-‘4 o - ‘ -‘
"v.’ ‘I- . ” < O. n O
- _ - v.‘ ‘
l . ‘ ’O‘.— O '..‘. .. ~ \ - v v
'I 5 e
C. . ‘.
- - ' .:'1 véa‘.’ --

“‘.:

\‘\-\ ‘\‘ \ \

1111111111“‘1‘11\\\\\m.

 
 
  

   

 

 

Figure 6.6: Plasma potential in the rz plane.

E, in kV/m

129

 

1.5 -

1.0 .

 

O 4 n n n n .

 

 

0.02 0.04 0.06 0.08 0. 1 0.12
’ Axial distance in m

ego

Figure 6.7: Time average values of E, along the edge.

0.14

Number (I 02)

 

130

 

 

 

 

12 14 16

o 2 4 6 8 10
Enery in eV

Figure 6.8: Electron enery distribution function.

18

Chapter 7

Implementation on the MasPar MP2 System

 

7. 1 Introduction

The movement of particles in particle type plasma simulations are essentially
independent of each other once the governing forces are resolved. This gives rise
to the possibility of implementing a simulation wherein the motion of each
particle is resolved and computed independent of the others. Potentially, this
type of simulation can give considerable speed-up over serial implementations.
In this research, a Massively Parallel Processor (MPP) containing a number of
Processing Elements (PEs) has been used to implement this idea. Each PE has a
certain volume assigned to it and performs computations on the particles that
lie in this volume.

The system chosen to implement a parallel version of a plasma simulation
was the MasParTM MP2 system. Systems of the type MP2 are termed as data
parallel or SIMD (single instruction multiple data) machines. In SIMD machines,
several relatively small processors execute the same program instruction
simultanedusly. Massively parallel SIMD units consist of one instruction fetch
unit and at least 1000 data processors. The instruction stream is processed

serially by a control unit and dispatched to the processing elements (PEs) that

execute them in parallel. These are also called data parallel since the only

131

132

distinguishing feature of each processor is the data it operates upon. A typical
MP2 consists of a front end and a data parallel unit (DPU).

The front end is a modiﬁed workstation that runs a ﬂavor of the popular
UNIX operating system. The DPU consists of an Array Control Unit (ACU), an
array of at least 1024 PBS and communication mechanisms. The ACU is a
processor with its own registers and data and instruction memory. Certain types
of serial instructions can be executed on the ACU. The ACU can be considered
as the slave master which drives the PEs in lock step and sends data and
instructions to each PE simultaneously.

The DPU is where all the parallel processing is done. Each PE consists of a
32 bit load/ store arithmetic processing element with dedicated registers and 64
KBytes of local RAM. Each RISC (Reduced Instruction Set Computer) PE unit is
rated at 4.15 MIPS and 0.38 MFLOPS [82]. There is no ﬂoating point chip in a
MasPar system, though there is some special hardware to improve ﬂoating point
capability. The ﬂoating point operations are rrricrocoded and hence take several
clock cycles. SPEC benchmarks are not strictly applicable to parallel systems
and are hence unavailable on the MP2. These PEs are distributed physically as a
two dimensional array which can be lK~16K in size and are scalable between the
limits in powers of two. Each non-overlapping square matrix of 16 PBS in a PE
array is termed as a PE cluster. For example, a 1K PE array is made up of 64 4x4
PE clusters. ‘

One of the major aspects, and a source of bottleneck, of a massively parallel
system is its communication subsystem. Communication between PEs and the
ACU occurs over a special bus. Communication between two PEs consists of

Xnet and global router [83] communications.

133

Direct communication by any PE to any other PE which is in a straight line in
the eight directions, i.e., North. South, East, West, North-East, North-West,
South-East and South-West, can be achieved by a special communication
protocol called Xnet. Xnet can provide scalable performance of up to 23 GBytes/
sec with variations that can provide wrap around on edges and pipelined
versions for long distance communication.

The router communication system provides a method of communication
between any PE to any other PE which may not necessarily satisfy the Xnet
criterion. This form of communication utilizes a three-stage, circuit switched
crossbar with a peak bandwidth of 1 .3 GBytes / sec.

The target system available at the Scalable Computing Laboratory at Arnes
Laboratory at Iowa State University consists of a 64x64 MP2 array with each PE
having 64 KBytes of local RAM. The front end is a DEC workstation running
ULTRDI operating system.

In addition to all of the hardware, the MasPar systems have software
programming and debugging tools. The language used to implement the parallel
version of the application was the Maspar Programming Language (MPL). This
language is an extension of standard ANSI C with constructs to provide for serial
and distributed variables. Debugging support is provided by a visual debugger,
WPE, [84] which allows setting break points and viewmg the data and state of
processors in the PE array. The data can be viewed either as numbers or
graphically visualized as a 2D graph with the points representing the value of
the variable being probed. Further, graphical routines are available which can be
used to build graphical display routines for the required data. Experiences with

visualization tools are given in Section 7.5.

134

7.2 Implementation of a he Dimensional Parallel Program

The processors on the Maspar MP2 are arranged in the form of a two
dimensional array which leads to the possibility of a direct implementation of a
two dimensional particle-type plasma simulation. With this in nrind, a program
simulating a 2D3V (two spatial dimensions and three velocity dimensions)
plasma particle simulation on a 32x32 grid was directly implemented on the
MP2 by considering each PE as representing one grid point. The MasPar
command mpswopt [85] was used to make the ﬁnal program run on a 32x32
portion of the PE array. The PEs present on the edges were considered as
boundary elements and were assigned grounded boundary values for the
potential. Each PE was assigned eight electrons and ions initially. The initial
locations of these particles were randomly distributed in the PE’s domain and
their initial energies were distributed according to a Maxwellian distribution
across all the processing elements. Some of the major issues involved in
parallelizing this application involved the movement of particles which leave a
PE’s domain. solving Poisson’s equation and the calculation of the local electric

and magnetic ﬁelds, all of which require extensive use of communications.

7 .2. 1 Movement of Particles Across PEs

A particle leaving the PE is removed from the current location and moved to
the correct PE so as to preserve the total particle count- This involved locating
the direction of particle motion in terms of the eight directions and transferring
the total details of the particle, including its location in space and velocity to that
neighbor using Xnet. The particles that left the boundary PEs beyond the

simulation region were considered lost to the grounded walls.

135

7.2.2 Poisson’s Equation Solution

The solution of Poisson's equation was done very simply by using the
potential values from the current PE’s north, south, east and west neighbors as
shown in Figure 7.1. The technique used to solve Poisson’s equation was the
Gauss-Seidel iterative method applied to a ﬁnite difference formulation. In this
case, since the communication is always with an immediate neighbor, Xnet

communications were used to enormous speed advantage.

7.2.3 Calculation of Fields

Local electric ﬁelds at each PE are calculated by using the potential values,
found from solving Poisson’s equation, at its immediate neighbors. For example,
Ex, the electric ﬁeld in the x direction is calculated using the difference in the
value of potential from the eastern and western neighbors. Similarly the y
component of the ﬁeld is calculated using the northern and southern neighbors
as shown in Figure 7.2. The problem of calculating the value of electric and
magnetic ﬁelds acting upon a particle involves computing the contribution of the
electric ﬁelds from the neighboring four PEs as shown in Figure 7.3. Therefore,

signiﬁcant communication overhead can be involved in this part of the program.

7 .2.4 Results of the Two Dimensional Simulation

Implementation of a two dimensional program was quite successful and the
program was not communication bound as was initially feared. Proﬁling the
parallel code showed that the PEs spent most of the time in moving the particles

(solving the Lorentz force equation) than in communicating any of the variables

 

 

Xneth l ] .poten

 

 

136

 

XnetN] l ] .poten

 

 

 

 

H poten H

 

 

 

 

XnetSl l ] .poten

 

 

 

If (dx = dy) then poten =

 

 

XnetEl l ] .poten

 

(XnetN[l].poten + XnetS[l].poten +XnetEll].poten + XnetW] l].poten) / 4

Figure 7 . 1: Solving Poisson’s equation.

 

 

Xneth l ] .poten

 

 

 

137

 

XnetNl 1].poten

 

 

 

 

 

 

 

w
dy
dx dx
dy

 

XmetSl 1].poten

 

 

 

 

 

XnetEl l ] .poten

 

Ey = (XnetN[I].poten - XnetS]l].poten) /(2"dy)
Ex = (XnetW]l].poten ~ XmetEll].poten)/(2‘dx)

Figure 7.2: Solving for electric ﬁeld.

 

138

Ax l-Ax
-Il -------- iI-- -Il ----------------------- I” XnetEll].Ex

 

AY
4...

.g'}....

.OOOJ...

l ~Ay

 

 

 

XnetSll].Ex XnetSEll].Ex

Ex. = (r:x *(l-Ax)*(1-Ay))
+ (XnetElll- Ex ‘ (Ax) * (l-Ayll
+ (XnetSElll.I?3x * (Ax) " (Ay) )

+ (XnetS[ll.Ex . (l-Ax) .. (Ay))

Figure 7.3: Interpolation of x component of E-Field at particle location.

139

discussed earlier. Table 7.1 tabulates the speciﬁcations of the three platforms
[86], [87] and Table 7.2 tabulates the time taken per iteration on these three
platforms for a 32x32 grid.

It can be seen that the MP2 was not much faster than the HP 735 and over
three times as fast as the Sparc 10. The next test conducted was to check the
scalability of the problem. While scaling the problem up to a 64x64 array size did
not slow the MP2 program by much, it made the HP 735 program nearly 10
times slower. This was due to the fact that increasing the number of grid points
also increases the overall number of particles required for the simulation in a
serial program while leaving the parallel version uneffected. These results
provided encouragement to implement a complete three dimensional parallel

simulation.

7.3 Implementation of a 3 Dimensional Parallel Program

7 .3. 1 Implementation of Third Dimension in Memory

Since the MasPar PE array is in the form of a two dimensional array, it does
not lend itself in a simple way to scale a 2D program into three dimensions. One
of the simplest solutions was to simulate the third dimension in the local
memory of each PE. In this case the local variables such as potential and electric
and magnetic ﬁeld vectors are no longer single variables as in the 2D case but
are vectors of length required to simulate the z dimension as shown in Figure
7.4. This approach uses a large amount of local memory since in addition to the

local variable vectors, the details of many more particles have to be stored in

140

Table 7 . 1: Speciﬁcation of the three systems

 

  

 

 
   

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(4096 PEs)

 

 

 

 

Platform £11322; 1:211); M12213? SPECint92 SPECfp92

Sun Sparc 10 j 205— 45.2— 54

HP 735 99 MHz 125 45 109.0 168.0

MasPar MP2 12 MHz 4.15 0.38 NA NA

per PE

Table 7.2: Results for the 2D simulation
Time/
Platform Size iterailrtlion SspueIec/larg) Nggéneall-iﬁgd Efﬁitoency
seconds

Sun Sparc 10 32x32 2.18 - - -
HP 735 32x32 0.8662 - — -
MasPar MP2 32x32 0.629 3.47/ 1.38 187.2/ 159.3 18/16
(1024 PEs)
Sun Sparc 10 64x64 23.93 - - -
HP 735 64x64 6.9773 - — —
MasPar MP2 64x64 0.69086 3468/ 10.1 1 1871 .0/ 1 166.9 46/28

 

 

 

 

141

 

 

 

 

l l

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

l-(l

 

 

 

 

ll-Illl

 

 

 

 

 

 

[lllll

 

 

Figure 7 .4: Implementation of 3"d dimension in memory.

 

 

 

 

 

 

 

 

 

 

 

 

 

lU-ll

 

 

 

 

 

 

l-Il.

 

 

 

 

 

 

 

Illlll-l

 

 

[lllll-ll

 

 

 

142

local memory. A 32x32x32 problem was implemented on a 32x32 PE array with
the size of local variable vectors being 32.

The motion of particles across PE boundaries was identical to the 2D case
although in this case the presence of considerably more particles per PE led to
more computation and communication. The solution of Poisson’s equation was
similar to the 2D case but involved a loop on the z dimension and hence also led
to more computation and communication. The interpolation of electric ﬁelds was
found to be time consuming since the computation of the component of the
ﬁelds at a particle involved twice as many communications as the 2D case. This
coupled with the fact that there are more overall particles increased the time
spent in communication to 8% of the total time spent per iteration, up from less
than 1% in the two dimensional case. It was further noted that the system was
heavily compute bound since most of the time was spent in moving the particles.
In spite of this, the parallel 3D3V program was found to be nearly 4 times faster
than a HP 735 serial version and nearly 13 times faster than a Sparc 10. Due to
these encouraging results it was presumed that further speed-up could be
obtained by spreading the load of the third dimension across processors as

described in the next section.

7 .3.2 Implementation of Third Dimension Across PEs

It was noticed that implementing the third dimension in the PEs local
memory disadvantaged the PE array in terms of loading them down
computationally. So the next design considered simulating a 32x32x32 three

dimensional grid across the whole 64hr64 processor array. The third dimension

143

was simulated by dividing it across a group of four processors (called a quad),
two each in the two physical directions of the PE array as shown in Figure 7.5,
with 32 such groups in the x and y directions. It was theorized that the increase
in the complexity and amount of communication involved would be more than
offset by the increase in speed due to reduced computation overhead. This
implementation involved reworking communication processes derived in the 2D

and simple 3D cases.

7.3.2. 1 Movement of Particles Across PEs

The PBS in each quad were numbered as an ordered pair (i,j) with (0,0)
simulating the top part of the z dimension and (1,1) simulating the bottom part
of the z dimension as shown in Figure 7.5. The movement in the z dimension
when the particle leaves the 0th gid on [0,0) or the last grid on (1,1) is
considered as loss to the grounded walls. Motion in the xy plane such that there
is no 2 motion across processors is handled simply by moving the particle two
steps in one of the eight elementary directions instead of the simple one step in
the earlier cases. Further. motion of a particle in the 2 plane alone, not involving
motion crossing xy boundaries, is handled simply by moving the particle to the
correct neighbor one step away. For example, a particle moving from (0,0) to
(1,0) in the quad would be moved one step east and one that moves from (1,0) to
(0, 1) involves a move one step south-west.

The motion of particles involving movement in which their ﬁnal location
demands a motion across PBS in the xy plane as well as the 2 plane is much

more complicated. The motion of particles can be reduced to simple combination

144

 

 

 

 

‘

Figure 7.5: Implementation of 3"d dimension across PEs.

145

of directional functions. Suppose the motion in the xy plane requires the particle
to move two steps NE in the simulated space. while the motion in the 2 plane
demands that the particle be moved to a PE one step deeper (east) of PE (0.1).
Since two steps in the simulated NE direction is equivalent to a step (~2,2) in the
y and x directions and E is equivalent to (0,1), the ﬁnal destination can be
considered as NE(E] resulting in directional position of (~2,3) as can be
graphically seen from Figure 7.6. The same logic can be applied to the many

different combinations of xy and z motions.

7.3.2.2 Poisson’s Equation Solution

The solutions to Poisson’s equation is slightly more complicated than the
earlier cases since calculation of potential requires communication for some
points lying on the 2 plane also. For example, calculating the potential at the 0th
grid point of the PE (1,0) in the quad would require the potential at the last grid
point of (0,0). A similar situation arises at the last grid point requiring the
potential at the 0th grid point of the PE (0,1). The location of the four neighbors
in the xy plane is the same as in earlier cases except that they are two steps
away in this case. Since there are one-fourth as many gid points at each PE as
in the earlier case, there is less total communication and computation than for

the earlier simpler 3D version.

7.3.2.3 Calculation of Fields

Calculation of local electric ﬁelds is similar to the earlier cases except that
neighbors in the xy plane are two steps away. The calculation of ﬁelds in the 2

plane requires communication with PEs downstream in the z direction in a

146

,.
I
I
l
I
l
I
l
I
I
l
I
l
s

   

Motion in the xy plane

_---“

Motion in the 2 plane

——> Final location

Figure 7.6: Motion of particles across PBS in the z and xy planes.

147

manner similar to the case discussed in Section 7.3.2.2. The interpolation of
ﬁelds on to the location of a particle is also similar to the earlier cases except for

the peculiarities discussed in earlier sections.

7.4 Implementation of 3D Maxwell’s Equations Solution

Since a complete plasma simulation includes the coupling of Maxwell’s
equations with particle motion, it was decided to implement a Finite-Difference-
Time-Domain (FDTD) parallel implementation of a simple rectangular microwave
cavity with perfectly conducting walls to test massively parallel solutions.
Equations (2.1) and (2.2) can be expanded into a set of six linear interrelated

equations, for example, Ex can be written as

 

E:+1(i+%,j,k) = E’:(i+%,j,k)+ A’

 

 

 

 

f +1/2(. I. 1 ) +1/2(. l. 1 ) l
H: 1+2,1+2,k —H: 1+2,_]—§,k
X M (7.1)
H"+l/2(i+-l-,j,k+l)-H“l/2(i+l,j,k—-l-)
r 2 2 y 2 2 4110+; .k)
K Az x 2’1’ )

It should be noted here that this set of equations is very well suited for
parallel implementation and requires communication only between immediate
neighbors. The third dimension for this simple case was done in the PE’s
memory. It was seen that the parallel version was over eleven times faster than

the serial HP 735 code.

148

7 .5 Discussion of Results

A baseline for comparison between serial and parallel code would be a single
processor workstation built around a MasPar PE. Alternatively, it was decided
that comparison would be made between the parallel MasPar code and the
fastest, most optimized serial code on two locally available workstations. The
speed-up for the 2D and 3D cases on the three platforms under different
conditions are listed in Tables 7 .2 and 7.3, respectively. It should be noted here
that the serial HP 735 code was compiled with optimization options. The
comparison of time taken to run was done by using the time command on serial
versions and gettimeofday on the MP2.

It should also be noted that the random number generators provided with the
MP2 system generated an unacceptably large number of repeated numbers and
hence could not be used for Monte Carlo simulation. An alternative third party
random number generator developed by Aluru et al. [89] showed proper behavior
and was used instead.

The parallel MasPar program did not incorporate a number of optimizations
which were available to the serial programs. Attempts to use SOR (successive
over relaxation) led to instability and non-convergence of Poisson’s equation.
Further. many MasPar variables had to be declared double precision since
declaring them single precision led to unacceptably large errors produced by
numerical calculations. These two main differences also contributed in slowing
down the parallel progams. Once SOR was disabled and double precision
implemented in certain critical sections of the code, accurate and correct

solutions were obtained on the MP2. However communication was not a major

149

Table 7.3: Results for the 3D simulation

 

 

 

 

 

 

 

 

Time/
Normalized Efﬁciency
Platform Size iterflition 88216:}ng Speed-up %
Sun/ HP Sun/ HP
seconds

Sun Sparc 10 32x32x32 122.92 - - -
HP 735 32x32x32 41.26 — - —
MasPar MP2 32x32x32 10.61 1 1.59/ 625.28/ 61/44
in memory 3.89 448.98
(1024 PEs)
MasPar MP2 32x32x32 3.065 40. 1 / 2163.4/ 53/ 38
across PEs 13.46 1553.6
(4096 PEs)
MasPar MP2 32x32x32 3.849 31 .96/ 1724.2/ 42/ 31
across PEs 10.72 1237.3
(frequent data
output to ioram)
(4096 PEs)
MasPar MP2 32x32x32 3. 1053 39.58/ 2135.3/ 52/37
across PEs 13.29 1533.9
(graphiml display)
(4096 PEs)
MasPar MP2 64x64x64 7.4116 NA NA NA
across PEs
(graphical display)

(16384 PEs)

 

 

 

 

 

 

 

 

150

bottleneck as was initially feared since the problem was more compute bound
than communication bound and the communication was less than 8% in all
cases.

The argument that similar portions in serial code be also made double
precision and that SOR be disabled was discarded; it was concluded that the
above two problems were actually a drawback of the MP2 and hence penalizing
the serial machines for their better behavior was not correct. It should be further
noted here that the 3D MP2 code did not output data to disk at the same
frequency as the serial program. Versions of MP2 code writing the same amount
of data to a fast toram or, alternatively to graphical displays, did not exhibit
signiﬁcant deterioration in performance as can be seen in Table 7.3.

The results in Tables 7.2 and 7.3 show that speed-up of up to 13 times over
the HP 735 and of over 40 times of a Sparc 10 can be achieved by properly
designing the 3D parallel computation. It would also be interesting to compare
the performance of serial and parallel machines equipped with similar
processors. It is encouraging to note that even though the MFLOPS rating of the
MP2 PE was much less than the Sparc 10 and considerably less than the HP
735’s, the inherent parallelism of the application provides for tremendous speed-
up over the serial case. The last row of Table 7.3 also shows the scalability of the
problem. It can be seen that a plasma prOCessing chamber eight times the size
can be simulated with a slow-down of little over a factor of two. It should be
noted here that a problem of this size could not be simulated in either of the
serial machines due to unavailability of sufﬁcient local memory.

For a workload W, the speed-up 8,,(W) with n processors to one processor is

deﬁned as

151

T
S (W) = —— (7.2)

where TM is the time required to complete W amount of work on iprocessors.
The speed-up presented here is the so-called absolute speed-up and is deﬁned as
the ratio of time required on a workstation (HP 735 or SUN Sparc) to the time
required on a 1K or 4K MasPar. Because the processing power of a single
processor of the MasPar is much lower than the HP or SUN workstation, the
absolute speed-up is not a clear measure of the performance achieved by the
MasPar parallel system. It is not possible to run the program on a single
processor of the MasPar because of the memory limitation, and thus the serial
program was run on a workstation. For the purposes here, the normalized
speed-up is deﬁned to be S,,(W) " (speed of workstation / speed of a MasPar PE).
The workload W in this application is mostly ﬂoating point computation, so the
manufacturers’ MFLOPS ratings are used as the speed of interest.

The physical meaning of the normalized speed-up is the speed-up the
parallel machine can achieve if its processors have the same processing power
as the workstation. In an ideal case of linear speed-up, the normalized speed-up
should equal the number of processors used (to be called ideal speed-up). In real
applications, it is impossible to avoid the communication overhead among
processors. The efﬁciency of a parallel system then is deﬁned as the ratio of the
normalized speed-up to the ideal speed-up. In Tables 7.2 and 7.3, the following
numbers are reported: speed-up of the MasPar program with respect to the Sun
and HP programs (separated by ‘/’ in this column and the next two columns),
the normalized speed-up, and the efﬁciency (as a percentage, where 100% is

ideal). The numbers are for various programs on different numbers of PEs. While

 

152

the efﬁciency numbers indicate there may be some room for additional
optimization, they are sufﬁciently high to judge the MasPar implementation as
successful in terms of performance improvement.

The results of the simulations were viewed graphically using MPDDL and
gnuplot. While MPDDL was easy to use, it lacked ﬂexibility in display options and
was not very useful for the programmers involved in this project. Color plots
were generated locally from workstation versions of the code using NCAR.
MATLAB and gnuplot. The availability of a fast disk allowed the design of a work-
around for the parallel version. This involved writing data to disk while
simultaneously an independent program, running on‘the front-end used gnuplot
to provide run—time graphics. Overall, this chapter demonstrated the success

with which particle-type plasma simulations can be parallelized.

"'~".‘." mat-twirﬁ—

Chapter 8

Distributed 3D3V Simulation

 

8. 1 Introduction

The previous chapter demonstrated the implementation of plasma 2D3V and
3D3V simulations on a massively parallel processor computer. Since very few
computing facilities have access to such specialized and expensive machines, it
was theorized that a similar version of the simulation could be implemented
across a group of single-processor machines. Further, since many of the
laboratory machines undergo extended periods of idleness, it was thought that
one could increase the overall usage of machines.

This chapter details the implementation of a distributed simulation across a
group of Sun and HP workstations. This chapter also introduces a set of
software routines named Parallel Virtual Machine (PVM) which was used in the
implementation of the distributed application. The results of this study are also

tabulated.

8.2 Parallel Virtual Machine

PVM is a software system that permits a network of heterogenous Unix
computers to be used as a single large parallel computer [90]. Thus,
theoretically, large computational problems can be solved using the aggregate

power of many computers. Under PVM, a user deﬁned collection of serial,

153

 

154

parallel, and vector computers appear as one large distributed memory machine.
PVM provides a user interface and routines to start up, control, synchronize and
communicate with tasks. Applications can be written in either Fortran or C, and
PVM uses message passing constructs to communicate information amongst
cooperating applications to solve a large problem in parallel.

PVM provides routines for packing and sending messages between tasks that
make up the distributed application. Messages can be tagged and can be either
task to task or global broadcasts. Similarly, tasks can await the receipt of
messages by the message tags in a blocking on non-blocking manner. Further,
there exists the ability for a receiving task to time out in a blocking receive and
thus help break some potential deadlocks.

It should be cautioned here that PVM provides only very primitive
communication and synchronization constructs and by itself does very little to
aid the actual design of the distributed application. The actual design, control
and fragmentation of the application has to be done independent of the message
passing constructs used. Therefore, a well designed distributed application can
be implemented using any one of the many distributed computing aids available

of which PVM is one.

8.3 A Distributed Poisson’s Equation Solver

The prerequisite to distributing a complete plasma simulation is the ability
to design and implement a distributed solver for Poisson’s equation. Designing a
distributed Poisson’s equation solver requires a very different perspective than
the one used in the MPP case. In the implementation in the previous chapter,

explicit synchronization between the processors was unnecessary since the

155

instructions were executed in lock-step across the many processors. However,
distributing the application across many different laboratory machines
presented an entirely different challenge. Even though it was decided to use a
homogeneous set of machines with the same instruction speed and memory
limitation for the application, the instantaneous load at any given time was
unpredictable. Further, the initial and boundary conditions of the portion of the
plasma being simulated can lead to different computational load on each of the
machines. In addition to these challenges, the communication delays between
the computers have also to be factored in. This leads to the conclusion that even
under ideal conditions, explicit synchronization of the different portions of the
distributed application is necessary. These requirements make the design of a
distributed application more challenging than the MPP one.

It was decided that the application would be divided across many computers
in the order of powers of 2. For example, the serial 32x32x32 grid structure
could be broken up among two machines along the z axis leading to two
32x32xl7 grids on each of the machines. The scheme used by this approach to
distribute applications is shown in Figure 8.1. The extra grid in the direction of
the distribution is shared by the two computers in question as shown in Figure
8.2. The application was fragmented in this fashion since this maximized the
number of faces which were held at some boundary condition thus reducing
communications between machines. Figure 8.3 shows one of the many potential
partitioning schemes which was discarded. In this version, the middle computer
simulated a volume which had its upper and lower 2 planes shared with other
machines. This would have lead to considerably more communication and

possible instabilities.

 

 

 

 

 

"""O'O"'

 

 

 

 

2 Computers

156

 

 

 

 

 

 

1
; ______ o ______ _
. l
3/ :
i~ -: ~~~~~~~~~~~~~
/§ :
; J ............ ,—
11:,” 2
T ............... _-
a", 3

 

 

 

4 Computers

 

 

1°

 

 

\.
...~..

.\~

\,

 

 

 

 

 

 

 

8 Computers

Figure 8.1: Distribution of load across computers.

 

157

 

 

 

 

 

 

 

 

l6 1 0 16 15

Figure 8.2: Shared planes across computers.

158

 

 

 

h- --

t
\s.
I 'IVVO

Pseudo-Boundaries
/

 

 

 

 

 

 

 

Figure 8.3: A discarded partitioning scheme.

159

It can be seen from Figure 8.1 that as the number of computers in the
distribution increases, so does the amount of data being communicated. In the
case of 8 computers. each processor simulates a volume that shares three faces
with three different machines. The effect of this is to increase the overall
communication to and from each individual machine. Further distributions were
not implemented due to lack of many more similar machines and because it was
considered that the sizeable communication overhead would offset any
computational gains made.

Figure 8.2 shows the implementation of the Poisson’s solver across two
machines. The program iterates over only the internal points and considers all
grids points on the extremities to be boundary points. However in this case, the
last 2 plane for machine 1 and the ﬁrst 2 plane for machine 0 are “pseudo-
boundaries” and are shared between the two. Therefore, at the end of each
Poisson solver iteration, machine 0 sends its lat plane to machine I which
receives this plane on its 16th plane and similarly, machine 1 sends its 15th
plane to machine 0 which receives it on its 0th plane. This process is continued
until both machines converge to a solution.

In order to implement scalability of this application, a master-slave type
implementation was designed. The master program running on one of the
computers is given all the input data and spawns the number of slaves required
on as many different computers. The slaves do the actual task of solving the

Poisson’s equation. (It is‘possible that one of the many slaves may come to a local
convergence ahead of the 9th}; _ slaves. In order to ensure synchronization and
Prevent deadlock, all slaves communicate their convergence status to the master

and continue with the task in hand. Once the master receives convergence

160

status from all the slaves, it broadcasts this to all slaves which exit from the
solver routine as shown in the ﬂow chart in Figure 8.4. The actual
implementation of this solver was slightly more complicated since it required
some modiﬁcations to prevent possible deadlock. The Poisson’s equation solver
was successfully implemented for the 2, 4 and 8 machine case providing a basis

for building a distributed plasma application.

8.4 Distributed 3D3V Plasma Simulation Implementation

3D3V plasma simulations are extremely compute intensive as discussed in
the previous chapter and most of the time is spent in movement of particles. It is
theorized that by spreading this load across many machines, a deﬁnite
improvement in speed can be achieved. However, in addition to the distributed
Poisson’s equation solver, the motion of particles across machines has to be
accounted for and transferred. The loss of particles to the boundaries is simple
and dealt with as before. However, the motion of particles across “pseudo-
boundaries” involves transmitting considerable data involving the particle state
across machines. In the case of two machines, the decision of the ﬁnal location
is very simple and any particle the crosses the pseudo-boundary is sent to the
only other computer. However, in the four processor case, this becomes more
complicated since each computer shares a face with two other computers and
shares an edge with the remaining one. This adds to the already increased
communication load due to the Poisson’s solver. Figure 8.5 shows the ﬂow chart
of a distributed 3D3V plasma simulation. It can be see that the Poisson’s

equation solver forms an essential component of the overall simulation. At the

 

Master

 

Read Input Data and Spawn
the Slave Processes

 

 

 

 

    
 

All Converged

 

Broadcast Flag to All Slaves

 

 

Exit

 

161

 

 

Slaves

 

Initialize Geometry
and Location Relative
to other Processors

1

Solve Poisson’s Equation
on Internal Points

1

Transmit and Receive
Shared Faces

 

 

 

 

 

 

 

 

 

 

 

     

Yes
Convergence?

Yes
Exit

Figure 8.4: Flow chart of distributed Poisson’s equation solver.

 

 

Initialize Geometry,
Location of Neighbors

l

Solve Poisson’s eqn.
using distributed solver

1

Calculate Local Fields

 

 

 

 

 

 

 

 

 

 

 

1

Move Particles using
Lorentz Force eqn.

 

 

 

 

 

162

No

Yes

 

 

. Discard

 

 

Loop over ‘____._.
Particles

 

 

 

Locate Appropriate
Neighbor and Store
for Transfer later

 

 

 

i

 

 

Send and Receive Particles
Crossing Boundaries

 

 

Figure 8.5: Flow chart of distributed 3D3V plasma simulation.

 

163

end of each iteration, the details of the particles crossing processor boundaries
are transmitted to the corresponding computer. This process is continued till the
whole system achieves a steady state. The eight processor implementation is
much more complicated because in this case, the particles can move to one of

the ﬁve potential destinations, three faces and two edges.

8.5 Discussion of Results

Time usage simulations for the distributed Poisson’s equation solver was not
done since it was expected to be slower that the one processor case due to its
communication bound nature. In the case of the distributed plasma simulation
it should be noted that the nature of the application provides considerable
difﬁculty in obtaining exact timing details of the programs. In this case, one
cannot use actual CPU usage as an indicator since during the time spent in
waiting for communication, the UNIX kernel may decide to provide the CPU to)
some other unrelated process. This wait time that does not show up as CPU
usage, should also be added to the actual CPU usage of the distributed
applications since communication over-heads are a essential part of the timing
analysis. The nature of this application makes it impossible to differentiate
between communication induced delays and legitimate UNIX swap-out delays.
However, one can get a measure of the communication to computation ratios by
measuring the wall clock time used by the master program and comparing this
with the CPU usage reported for the slaves. In the cases where communication
tends to be the bottleneck, the slaves spend a large percentage of their time

idling. Crow] [9 1] has discussed these problems in extensive detail. Further, he

164

described different methods to concisely present results of distributed
computations.

3D3V simulations were conducted using 32x32x32 grids for a single
processor version, 32x32x17 grids for the two processor version and 32x1 7x17
grids for the four processor version. In order to minimize interruptions by
unrelated processes, the application was run on Sums and HPs late at night in
order to ensure as light a computational and network load as possible. The
speciﬁcations of the Suns and HPs are given in Table 8.1. The results of the
computations for two and four processors are given in Table 8.2. It can be seen
that the two processor distributed version is nearly twice as fast for both the
Suns and the HPs. However, contrary to what was hoped, the four processor
version is much slower than the two processor version for the HPs and
marginally slower for the Suns. This is due to the fact that while the processing
speed of the machines have increased tremendously over the past few years, the
communication backbone has remained virtually the same. In the case of the
laboratory machines used in this research, the background network trafﬁc is
typically high. Further, it should be noted that these machines are connected in
a serial fashion and therefore, communication between machines that are not
physically one hop away tend to be more disadvantaged. All this is substantiated
by the fact that actual CPU usage of each of the slave tasks was less than a
quarter of the total wall clock time used leading to the conclusion that the
processes are waiting for communication.

In view of the above results, it can be concluded that distribution of
computation across several computers has a deﬁnite advantage, however, this

can be negated due to the lack of sufﬁcient communication bandwidth.

165

Table 8.1: Speciﬁcation of the two systems

 

Platform SPECint92 SPECfp92

 

Sun Sparc 10
HP 715 80 MHz ~ ~ 83.5 120.9

 

 

 

 

 

 

Table 8.2: Results of distributed simulation

 

  
 

Secs / iteration Secs / iteration Secs / iteration
(1 processor) (2 processors) (4 processors)

      
   

Sun Sparc 10
HP 715/75 29.07 14.53 24.23

 

 

 

 

 

 

166

Therefore, it was decided to conduct the same set of simulation runs on a set of
systems connected by a high speed network.

The set of systems chosen for this purpose were a set of Sparc 105 connected
on a high speed Asynchronous Transfer Mode (ATM) network available at the
High Speed Network Processing laboratory in the Computer Science Department
at Michigan State University. One further advantage that this set of machines
had was that they were connected on a sub-net thus shielding them from the
background communication trafﬁc on the main network. Therefore, it was
considered instructive to run the set of simulations on one. two and four sets of
sub-netted machines using the regular communication channels. The result for
this set of simulations is given in the ﬁrst row of Table 8.3 and it can be clearly
seen that there is a deﬁnite improvement in the 4 machine case over the two
machine case. However, these results do not seem to justify the extra number of
machines needed for the speed-up.

The same set of simulations were done using the high speed ATM network for
communicating data between the processes. The results of this set of simulation
runs is given in the second row of Table 8.3. However, the results were quite
disappointing because the speedup provided by the high speed network was
marginal for this particular problem. This could be due to the small size of the
messages broadcast by the distributed application and hence this application
does not take advantage of the high throughput capability of the ATM network. A
comprehensive comparison for solving matrix multiplication and partial
differential equations using a number of different Application Programming
Interfaces (APIs) has been done by Lin et al [92]. In their work, they compared

the distributed implementations of the two applications using Fore System’s

167

Table 8.3: Results on a sub-netted system

 

          

Secs/ iteration Secs / iteration
(1 processor) (2 processors)

Secs / iteration

Platform (4 processors)

     

  
 

 

Sun Sparc 10
(low speed network)

 

Sun Sparc 10 — 24.38 17.38
(ATM network)

 

 

 

 

 

 

168

ATM API, BSD socket programming interface, Sun’s Remote Procedure calls and
PVM. Their performance parameters included maximum achievable throughput
and round trip timing and the results clearly show that for small packet sizes (<
2 KBytes), there was no signiﬁcant difference in the performance of the four
interfaces. However, a choice of larger packet sizes led to higher throughput and
lower round trip timing for the ATM interface.

For the example case implemented using PVM, the maximum message size
was less than 2 KBytes in all cases. Further, the size of the message decreases
as the number of processors increases but the number of messages increases
with an increase in the number of processors leading to more communication
bottleneck. In view of these results. an eight processor distributed
implementation was not considered. A comprehensive overview of the results
can be seen in Figure 8.6. It is clearly shown that the communication overhead
effects a faster system to a larger extent than a slower system. PVM also allows
for tasks to set up direct TCP links between each other and this option has been
used [92] to achieve better performance. This option could be enabled only on
the HPs and the results are shown in Figure 8.6 under the legend “HP_DIR”. It
can be seen that there is a deﬁnite improvement over the default case which
uses the PVM daemon as a communication interface. It can be concluded that
this option can lead to higher performance in sub-netted systems. It is possible
to reduce the communication overhead involved in solving Poisson’s equation by
reducing the number of messages by grouping them together to form one large
packet. However, this will involve redesigning the algorithm completely to take

advantage of throughput.

169

.3393 .5 265 gxmconoeaﬁoo Hod Paw—m

£88005 .5 52852

N

 

I

 

 

14 gonzow
ax. cocoa... 25m
:3. avenged: 75m
in ED 2: i
I? unease-o: a:

_

 

 

o—

oo—

spuooos m acumen/emu

Chapter 9

Conclusion

 

9.1 Summary of Research

Simulation techniques for the modeling of microwave plasma sources have
been investigated and improved in this dissertation. The difﬁculties associated
with two and three dimensional plasma source simulations which self-
consistently solve the plasma behavior and electromagnetic / electrostatic ﬁelds
were identiﬁed and solutions to the difﬁculties tested. The difﬁculties
investigated included simulation instabilities in cylindrical coordinates, large
computer memory requirements and long computer simulation time. No
simulation programs (2D3V and 3D3V particle models) were developed to
investigate these difﬁculties and their solution. The simulation programs were
applied to two different simulation examples including a compact ECR ion/free
radical source and the downstream plasma processing region of a ECR plasma
etching machine.

A self-consistent 3D3V plasma-EM numerical model has been developed to
simulate discharges inside the microwave plasma source. The software couples
an electromagnetic ﬁeld model and a particle plasma model to simulate low
pressure ECR plasmas. The primary particle collision mechanisms were
included to realistically simulate plasma dynamics. The plasma source

investigated was a 3cm inner diameter compact ion/free radical source used in

170

 

171

MBE machines. The microwave power absorption pattern was simulated and
demonstrated to be at the expected ECR zones. A staggered grid system was
developed to improve numerical stability in cylindrical structures. The model
was applied to argon plasma simulation. although other discharge types can
also be simulated. The input parameters to the simulation included operating
pressure and magietic ﬁeld pattern. Characteristics of the plasma including
plasma temperature, charge density, plasma potential. radial electric ﬁeld
distribution near the walls and power absorption patterns were also
investigated. Further, simulations were modiﬁed to take advantage of parallelism
between the plasma and EM programs to produce considerable speedup.

A 2D3V simulation of the downstream region of a plasma processing
machine was conducted assuming ¢ symmetry. The plasma source was assumed
to provide a constant ﬂux of particles to the downstream region. The plasma
potential and the charge densities across the simulation volume were resolved. A
non-uniform particle weighting scheme was implemented to aid stability in the
grid points closer to the origin and was shown to be successful.

Both the 3D3V compact ion source and the 2D3V downstream plasma
processing region simulations were limited in their ability to model only small
volumes at reasonably low densities because of the memory limitation of the
single CPU workstation used. In order to extend the 3D3V and 203V
simulations to large plasma machines operating at higher densities, parallel
computing and distributed computing techniques were investigated.

The movements of particles in particle type plasma simulations are
essentially independent of each other once the governing forces have been self-

consistently determined. Therefore, implementations of 2D3V and 3D3V plasma

172

simulations on a MasPar MP2 massively parallel processor were performed.
Speed-up of over 13 times the fastest locally available serial machine was
achieved, demonstrating the possibility of parallel computing simulations.
Further. simulation of problems sizes beyond the computing and storage
capability of desktop machines was also demonstrated. Different methods of
simulating the third dimension in order to achieve maximum speed-up were also
demonstrated.

The concept of parallel simulations was modiﬁed to spread the computation
load across a set of networked desktop machines. Since very few institutions can
afford access to a MPP, it was postulated that considerable speed-up could be
gained by dividing an application across a number of similar workstations. It
was also thought that the speed-up gains would offset the loss due to increased
interprocessor communications. A distributed simulation using PVM was
designed to run across 2. 4 and 8 workstations. The two processor
implementation was shown to provide considerable speed-up over a single
desktop case. However, communication bottlenecks reduced the gain
considerably for the four processor implementation.

Overall, the objective of performing 2D3V and 3D3V simulations of
microwave plasma ECR sources and machines was achieved. Contributions
were made in improving the stability of cylindrical co-ordinate simulations, and
handling computer memory and CPU time requirements by using parallel and

distributed computing.

173

9.2 Directions for Future Research

The complete plasma-Maxwell equations show considerable promise in
simulating microwave processing cavities used in the laboratory. An example
simulation was applied to a relatively low density argon plasma. Computer speed
and memory limitations prevent the application of this model to higher densities
and larger cavities on a single CPU workstation. It is expected that the
availability of faster and larger computer resources in the near future will allow
the simulations of larger plasma machines.

Further, the simulation includes only the most basic collisions, gas types and
two charged particles. This should be extended to include more complex
collisions, many ions and a mixture of gas types. This will enable prediction of
various plasma component densities and ﬂuxes.

The downstream simulation developed in this research was also applied to a
small example system at lower plasma densities. Here again, the lack of
computer resources hampered the simulation of a larger system. This system is
well suited to be implemented on an Massively Parallel Processor. Further,
implementations on an MPP will allow simulations beyond the range and
capacity of conventional desktop machines.

The 3D3V MPP simulations successfully demonstrated in this research were
for regular rectangular structures. Extensions of this idea to simulate complex
structures and geometries hold considerable promise. The PVM version of
parallel simulations have been shown to aid parallelization of plasma-EM
simulations. This concept should be extended to distribute both components of

the simulation across multiple workstations. Further research needs to be done

 

174

regarding communication bottlenecks and methods to reduce these in
distributed plasma implementations.

Lastly. the utility of these simulation programs can be harnessed in the
future for the design and optimization of microwave plasma machines and

processes.

List of References

 

l1]
l2]

[3]

l4]

[5]

l6]

l7]

l8]

[9]

[10]

[11]

List of References

J. Asmussen, J. Vac. Sci. Technol. A, 7, 883 (1989).

D. M. Manos and D. L. Flamm, “Plasma Etching, An Introduction.”
Academic Press, 1989.

J. Hopwood, M. Dahimene, D. K. Reinhard, and J. Asmussen, J. Vac. Sci.
Technol. B 6, 268, (1988).

G. T. Salbert, “Anodic Growth And Cathodic Removal of Silicon dioxide
layers utilizing an Electron Cyclotron Resonant Microwave Plasma Disk
Reactor,” Ph.D. Dissertation, Michigan State University, 1992.

Gopinath, V., G. T. Salbert, T. A. Grotjohn, and D. K. Reinhard, “ECR
Sputter Removal of Si02 on Silicon Wafers,” J. Vac. Sci. Technol. B 1 1(6)
Nov[ Dec 1993.

J. A. Bittencourt, “Fundamentals of Plasma Physics,” Pergamon Press,
1988.

C. K. Birdsall and A B. Langdon, “Plasma Physics via Computer
Simulation,” McGraw Hill, N. York, 1981.

T. A. Grotjohn, “Numerical modeling of a compact ECR ion source,” Rev.
Sci. Instrum., 63 (4). Apr. 1992. PP. 2535-2537.

V. Vahedi., R. A. Stewart, and M. A. Lieberman, “Analytic Model of the,
Angular Distribution in a Collisional Sheath,” 39th National AVS
Symposium and Topical Conference, Chicago, Nov. 1992.

H. X Vu, Brackbill J. U., “CELESTI D: an implicit fully kinetic model for
low-frequency, electromagnetic plasma simulation,” Comp. Phys.
Communication, Vol. 69, pp. 253-76, Mar-Apr 1992.

J. U. Brackbill, and D. W. Forslund, “An Implicit Method for
Electromagnetic Plasma Simulation in No Dimensions,” J. Comp. Phys,
46, (1982), pp. 271-308.

175

 

[12]

[13]

[14]

[15]

[16]

[17]

[181

[19]

[20]

I21]

[22]

176

Trombley H. W., Terry F. L. Jr., Elta M. E., “A Self-Consistent Particle
Model For the Simulation of RF Glow Discharges,” IEEE Transactions on
Plasma Science, Vol. 19, No 2, Oct. 1991, pp. 158-162.

Vender D., Boswell R. W., “Numerical Modelling of Low-Pressure RF
Plasmas,” IEEE Transactions on Plasma Science, Vol. 18, No 4, Aug.
1990, pp. 725-732.

Sprott J. C. and Edmonds P. H., “Computer Calculations of Electron
Cyclotron Heating in a Nonuniform Magnetic Field,” The Physics of
Fluids, Dec. 1971, pp. 2703-2707.

Sadeghi N., Nakano T., Trevor D. J ., and Gottscho R A., “Ion Transport in
an Electron Cyclotron Plasma,” J. Appl. Phys., 1991.

M. Hussein and Emmert G. A., “Numerical Simulation of ECR microwave
plasma stream,” International Conference on Plasma Science, pp. 147,
1989.

Shagirov E. A., “Computation of three-dimensional magnetic ﬁelds and
particle trajectories in the presence of ferromagnets,” Computational
Mathemach and Modeling, Vol. 2 pp. 458-61, Oct—Dec. 1991.

S. Brandon, D. J. Larson, N. Madsen, D. E. Nielsen Jr., and P. Weidhaas,
“3-D Electromagnetic Plasma Simulation Using Non-Orthogonal
Unstructured Grids,” presented at the IEEE conference on Plasmas,
1993.

W. Tan and T. A. Grotjohn, “Modeling The Electromagnetic Excitation of
Microwave Resonant Cavities Used For Plasma Discharge Generation,”
Submitted to IEEE Transactions on Plasma Science.

C. S. Lin, A. L. Thring, Koga J ., and E. J. Seiler, “A parallel particle-in-cell
model for the Massively Parallel Processor,” Journal of Parallel and
Distributed Computing, Vol. 8 pp. 196-9. Feb. 1990.

R. Lohner and J. Ambrosiano, “A Vectorized Particle Tracer for
Unstructured Grids,” J. Comp. Phy., 91, pp. 22-31 (1990).

S. Matsushita, Narusawa M., Kurita G. et. al., “Parallelization of nonlinear
MHD plasma simulation,” Transactions of Information Processing Society
of Japan, Vol. 33, p 360-8.

[23]

[24]

[25]

[26]

[27]

[28]

[29]

[30]

[31]
I32]

[33]

[34]

177

L. J. Mahoney, “The Design and Testing of a Compact Electron Cyclotron
Resonant Microwave-Cavity Ion Source,” M. S. Thesis, Michigan State
University, 1989.

A. K. Srivastava, M. Dahimene, T. Grotjohn, and J. Asmussen,
“Experimental Characterization of a compact ECR ion source,” Rev. Sci.
Instrum. 63(4). PP. 2556-2558, Apr. 1992.

Gopinath V., and T. A. Grotjohn, “ECR Ion/Free-Radical Plasma Source
Simulation in 'leo and Three Dimensions,” presented at the 1994 IEEE
International Conference on Plasma Science, Santa-Fe, June 1994.

V. Gopinath, T. A. Grotjohn, D. Rover, and Y-K Chu, “Parallelization and
Performance of Three-Dimensional Plasma Simulation,” submitted to
Frontiers ‘95. '

V. Gopinath, T. A. Grotjohn, D. Rover, and Y-K Chu, “Three-Dimensional
Plasma Simulations using a Massively Parallel Processor computer,” to
appear in the Proceedings of the Seventh SIAM Conference on Parallel
Processing for Scientiﬁc Computing.

D. W. Hess, “Plasma Material Interactions,” J. Vac. Sci. Technol. A (8), pp.
1677-1685, May/June 1990.

S. Ramo, J. R Whinnery, and T. VanDuzer, “Fields and Waves in
Communications and Electromagnetics,” John Wiley and Sons, N. Y. 1984.

T. Tajima, “Computational Plasma Physics: with applications to ﬁssion and
astrophysics”. Addison-Wesley, 1989

Hammersley J. M., “Monte Carlo Methods,” Metheun, London, 1964.

C. Jacoboni and P. Luigli, “The Monte Carlo Method for Semiconductor
Simulations,” Springer-Verlag, New York, 1989.

Thompson, B.E., Swain H. H., Fisher D. A., “Monte Carlo simulation of
ion transport through rf glow-discharge sheaths,” J. Appl. Phys., 63 (7),
Apr. 1988.

Govindaraju G. R., “Monte Carlo Simulation of Electron Swarms in
Nitrogen in Uniform E x B ﬁelds.” IEEE Transactions on Plasma Science,
Vol. 18. No 5, Oct. 1990, pp. 819-825.

[35]

[361

[37]

[38]

[39]

[40]

[41]

[42!

[43]

[44)

[45]

[46]

[47]

178

J. lgiacio, Ulacia F. and James P. McVittie, “A two-dimensional computer
simulation for dry etching using Monte Carlo techniques,” J. Appl. Phys.
65(4). Feb. 1989.

J .P. Biersack and W. Eckstein, “Sputtering Studies with the Monte Carlo
Progam TRIM.SP,” Appl. Phys. A 34, 73-94 (1984).

D. B. Graves, “Fluid Model Simulations of a 13.56 MHz rf discharge: time
and space dependence rates of electron impact excitation,” J. Appl. Phys.
62(1), pp. 88-94, July 1987.

M. S. Barnes, T. J. Colter, and M. E. Elta, “Large-signal time-domain
modeling of low-pressure rf glow discharges,” J. Appl. Phys. 61 (1), Jan.
1987.

Bandyopadhyay S. et. al., “A Rigorous Technique to Couple Monte Carlo
and Drift-Diffusion Models for Computationally Efﬁcient Device
Simulation,” IEEE Transactions on Electron Devices, Vol. ED-34, No 2,
Feb 1987.

J. K. Lee, C. K. Birdsall, “Particle-ﬂuid hybrid simulations for weak beam-
plasma interactions,” Comp. Phys. Comm., 64 (1991) pp. 214-220.

W. Tan, Ph. D. Dissertation, Michigan State University, 1994.

M. Surendra, D. R. Graves, and L. S. Plano, “Self-consistent dc glow-
discharge simulations applied to diamond ﬁlm deposition reactors,” J.
Appl. Phys. 71 (10), May 1992.

Birdsall D. K., and D. Fuss, “Cloud-in-clouds, clouds-in-cells physics for
many body simulations”, J. Comput. Phys., 3, pp. 494-51 1, Apr. 1969.

A. B. Langdon, “On Enforcing Gauss’s law in Electromagretic Particle-in-
cell codes,” Computer Physics Communications, 70 (1992) 447-450.

B. Marder, “A Method for Incorporating Gauss’s Law into Electromagnetic
PIC Codes,” J. Comp. Phys., 68. PP. 48-55 (1987).

D. E. Nielsen, Jr. and A. D. Drobot, ”An Analysis and Optimization of the
Pseudo-current Method,” J. Comp. Phys., 89, pp. 31-40 (1990).

Hese J. V., and Zutter D. D., “Modeling of Discontinuities in General
Coaxial Waveguide Structures by the FDTD Method,” IEEE Transactions
on Microwave Theory and Techniques, Vol. 40, No 3,.Mar. 1992

[48]

[49]

[50]
[51]
[52]

[53]

I54]

I55]

[56]

[57]

[58]

[59]

179

A. Matsumoto, S. Kawata, “TRIPIC: Triangular-Mesh Particle-in-Cell
Code,” J. Comp. Phy., 87, pp. 488-493 (1990).

E. Sonnendrucker, J. Ambrosiano, S. Brandon, “A Finite Element
Formulation of the Darwin Electromagnetic PIC Model for Use on
Unstructured Meshes” presented at the IEEE conference on Plasmas,
1993.

Schottky. W. Phys. Z., 25, 635 (1924) as cited by [23].
J. Asmussen, “EE850 Class notes”.

G. A. Geist, V. S. Sunderam, “Network-based Concurrent computing on
the PVM system.” Concurrency: Practice and Experience. Vol: 4, pp. 293-
3 1 l.

S. Y. Liem, Brown D. and Clarke J. H. R., “Molecular Dynamics on
Distributed memory machines,” Comp. Phy. Comm., 67 (1991) pp. 261-
267.

F. Muller-Plathe, “Parallelizing a molecular dynamics algorithm on a
multi-processor work station,” Comp. Phy. Comm., 61 (1990) pp. 285-
293.

Beguelin, A. et. al., “Graphical Development tools for network-based
concurrent supercomputing,” Proceedings, Supercomputing ‘91. pp. 435-
44.

R. E. Ewing, R. C. Sharpley, D. Mitchum, P. O’Leary and J. S. Sochaki.
“Distributed Computation of Wave Propagation Using PVM,” IEEE Parallel
and Distributed Technology, Systems and Applications, Spring 1994. pp.
26-3 1 .

P. C. Liewer and V. K. Decyk, “A general concurrent algorithm for plasma
particle-in-cell simulation codes,” J. Comp. Phys., Vol. 85, pp. 302-22,
Dec 1989.

A. F. Kuckes, “Resonant Absorption of Electromagnetic Waves in a Non-
Uniformly Magnetized Plasma,” Plasma Physics Vol. 10, pp. 367-380,
1968.

P. K Trost and J. L. Shohet, “Electron behavior during electron cyclotron
resonance heating in a stellarator.” Phys. Fluids B 1 (6), June 1989.

[60]

[61]

[62]

[63]

[64]

[65]

[66]

[67]

[68]

[69]

[70]

[71]

180

J. C. Sprott, “Electron Cyclotron Heating in Toroidal Octuples,” Phys.
Fluids, Vol. 14, 8, Aug 1971.

M. A Lieberman and A. J. Lichtenberg, “Theory of Electron Cyclotron
Resonance Heating-II. Long Time and Stochastic Effects,” Plasma
Physics, Vol. 15, pp. 125 - 150, 1973.

M. Hussein and G. A Emmert, “Collisional effects on plasma flow along
the divergent magnetic ﬁeld of an ECR plasma stream source,” J. Vac. Sci.
Technol., Vol. 8A. no 3, pp. 2913-2918, 1990.

S. A. Self and H. N. Ewald, “Static theory of a discharge column at
intermediate pressures,” Phys. Fluids, Vol. 9, no 12, pp. 2486-2492,
1966.

Porteus R. K., and Graves D. B., “Modeling and Simulation of
Magnetically conﬁned Low-Pressure Plasmas in Two Dimensions,” IEEE
Transactions on Plasma Science, Vol. 19, No 2, Apr. 1991, pp. 204-213.

M. Hussein and G. A Emmert, “Modeling of ion energr spectrum in
electron cyclotron resonance plasmas,” IEEE Conf. on Plasma Science,
1991.

Weng. Y. and Kushner Y. J ., “Simulation of remote plasma activated
processing using electron cyclotron excited discharges” IEEE Conf. on
Plasma Science, 1990.

Graves D. B., Wu H. and Porteus—R. K., “Modeling and Simulation of High
Density Plasmas,” Jpn. J. Appl. Phys. Vol. 32 (1993) pp. 2999-3006.

Y. Yasaka, A Fukuyama, A Hatta and R. Itatani, “TWO-dimensional
modeling of electron cyclotron resonance plasma production,” J. Appl.
Phys. 72 (7) Oct. 1992, pp. 2652-2658.

S. Salimian, C. B. Cooper, 111, and A Ellingboe, Appl. Phys. Lett., 56,
1311, (1990).

J. L. Vossen, “Physics of Thin Films: Contemporary Preparative
Techniques,” edited by M. H. Francombe and J. L. Vossen (Academic
Press, Boston, 1989), pp. 201-240.

P. Raymaud and C. Pomot, J. Vac. Sci. Technol. B. Vol. 1 1, pp. 699 (1993)

I72]

[73]

[74]

[75]

[76]

[77]

[78]

[79]

[80]

[81]

[82]
[83]

[84]

[85]

181

P. Mak, G. King, T. A. Grotjohn, and J. Asmussen, J. Vac. Sci. Technol. A
10(4), 1281, (1992).

G. King, F. C. Sze, P. Mak, T. A. Grotjohn, and J. Asmussen, J. Vac. Sci.
Technol. A 10(4), 1265, (1992).

Grotjohn T. A, V. Gopinath, A. K. Srivastava, and J. Asmussen,
“Modeling and Characterization of Hydrogen and Helium Discharges in a
Compact ECR Ion/ Free Radical Source,” Fifth International Conference
on Ion Sources, Beijing, August 1993.

A K. Srivastava, J. Asmussen, “Measurements of the Impressed Electric
Field inside a Coaxial Electron Cyclotron Resonance Source,” to be
published.

F-C. Sze, Ph.D. Dissertation, Michigan State University, 1993.
Y. P. Raizer, “Gas Discharge Physics,” Springer-Verlag, 1991.

R. K. Janev, W. D. Langer, K. Evans Jr., and D. E. Post, Jr., “Elementary
Processes in H ydrogen-Helium Plasmas, Cross Sections and Rate
Coeﬂicients,” Springer-Verlag, Berlin, 1987.

B. E. Cherrinton, “Gaseous Electronics and Gas Lasers,” Pergamon press,
Oxford, 1979.

B. Chapman, “Glow Discharge Processes: Sputtering and Plasma Etching,”
John Wiley and Sons, New York, 1970.

C.-H. Wu, X. Wu, J.-H. Tsai, F. F. Young and C. Li, “’I‘wo-Dimensional
Kinetic and Fluid Models for Parallel-Plate RF Glow Discharges.”
presented at the 1994 IEEE International Conference on Plasma Science,
Santa-Fe, June 1994.

Tim Busse, MasPar Computer Corporation, Personal Communications.

MPL reference manual, Software version 3.2, Rev. A4, May 1993. MasPar
Computer Corporation.

MPPE user guide, Software version 3.0, Rev. A8, July 1992, MasPar
Computer Corporation.

Online MasPar commands reference man pages.

[86]

[87]
[88]
[89]

[90]
[91]

[92]

[93]

182

Hewlett-Packard product catalogue, <HREF =http://www.wsg.hp.com/
wsg/Global/Products/S700Workstations/HighEnd/Model735.html>

Sun product catalogue, Rev. 5/ 1 I/ 93. pp. 12.
MasPar Data Parallel Programming Guide and miscellaneous handouts.

S. Aluru, G. M. Prabhu, and J. Gustafson, “A Random Number Generator
for Parallel Computers,” Parallel Computing, Vol. 18, Aug 1992.

PVM 3 User’s Guide and Reference Manual, A Geist et. al., May 1994.

Lawrence A Crowl, “How to Measure, Present, and Compare Parallel
Performance,” IEEE Parallel and Distributed Technology, Systems and
Applications, Spring 1994. pp. 9-25.

M. Lin, J. Hsieh, D. H. C. Du, J. P. Thomas, and J. A. MacDonald,
“Distributed Network Computing over Local ATM Networks,” to appear in
IEEE Journal on Selected Areas in Communications Special Issue of ATM
IANs: Implementations and Experiences with an Emerging Technology
(Early ‘95).

Srivastava A, Private communications, Ion density vs. Position
measurements for [24], not yet published.

HICHIGRN smTE UNIV. LIBRQRIES
llHIHill“)!ill‘l'llll‘lllilllmlli(I)lMHllWllWHl
31293010289365