STUDIES OF CHARGE NEUTRAL FCC LATTICE GAS WITH YUKAWA INTERACTION AND ACCELERATED CARTESIAN EXPANSION METHOD By He Huang A DISSERTATION Submitted to Michigan State University for the degree of DOCTOR OF PHILOSOPHY Physics and Astronomy 2011 ABSTRACT STUDIES OF CHARGE NEUTRAL FCC LATTICE GAS WITH YUKAWA INTERACTION AND ACCELERATED CARTESIAN EXPANSION METHOD By He Huang In this thesis, I present the results of studies of the structural properties and phase transition of a charge neutral FCC Lattice Gas with Yukawa Interaction and discuss a novel fast calculation algorithm -- Accelerated Cartesian Expansion (ACE) method. In the first part of my thesis, I discuss the results of Monte Carlo simulations carried out to understand the finite temperature (phase transition) properties and the ground state structure of a Yukawa Lattice Gas (YLG) model. In this model the ions interact via the potential qi q j exp( −κ r ij ) / r ij where qi,j are the charges of the ions located at the lattice sites i and j with position vectors Ri and Rj; r ij =Ri-Rj, κ is a measure of the range of the interaction and is called the screening parameter. This model approximates an interesting quaternary system of great current thermoelectric interest called LAST-m, AgSbPbmTem+2. I have also developed rapid calculation methods for the potential energy calculation in a lattice gas system with periodic boundary condition bases on the Ewald summation method and coded the algorithm to compute the energies in MC simulation. Some of the interesting results of the MC simulations are: (i) how the nature and strength of the phase transition depend on the range of interaction (Yukawa screening parameter κ) (ii) what is the degeneracy of the ground state for different values of the concentration of charges, and (iii) what is the nature of two-stage disordering transition seen for certain values of x. In addition, based on the analysis of the surface energy of different nano- clusters formed near the transition temperature, the solidification process and the rate of production of these nano-clusters have been studied. In the second part of my thesis, we have developed two methods for rapidly computing -ν potentials of the form R . Both these methods are founded on addition theorems based on Taylor expansions. Taylor’s series has a couple of inherent advantages: (i) it forms a natural framework for developing addition theorem based computational schemes for a range of potentials; (ii) only Cartesian tensors (or products of Cartesian quantities) are used as opposed to -ν special functions. This makes creating a fast scheme possible for potential of the form R . Indeed, it is also possible to generalize the proposed methods to several potentials that are important in mathematical physics. An interesting consequence of the approach has been the demonstration of the equivalence of FMMs that are based on traceless Cartesian tensors to those based on spherical expansions for ν = 1. Two methods are introduced; the first relies on exact translation of the origin of the multipole whereas the second relies on cascaded Taylor’s approximations. Finally, we have shown the application of this methodology to computing Coulombic, Lennard-Jones, Yukawa potentials and etc. We have also demonstrated the efficacy of this scheme for other (non-integer) potential functions. To my mother, my father and my family member iv    ACKNOWLEDGEMENT Looking back at the progress of my Ph.D degree; I am grateful to many people -- my advisors, my guidance committee members, my office and class mates, my family and others. This thesis could not have been possible without their strong support and encouragement; I want to make a full use of this part of my thesis to thank them. First of all, I thank my advisors Prof. S.D. Mahanti and Prof. Shanker Balasubramaniam. They have done so much that excellent advisors do for their student. They initiated many fruitful discussions with me and showed me how to communicate more efficiently with research colleagues. They showed me how important it is to make a successful presentation. They supported me to attend domestic and international conferences whenever possible. They also put a lot of concern about my life and future success. As a physics student, during these years, I have also learned from Prof. Mahanti’s great passion for scientific research and his unique way to inspire me to think about every physics problem. He is my physics library from whom I could learn and obtain any knowledge of physics. He looks as my family elder; I would say that in physics, I would not have grown up as fast without his continuous support and guidance. Prof. Balasubramaniam brought me to a brand new world, the research of fast calculation methods. He taught me everything about fast calculation methods. He gave me the idea to overcome my research bottleneck. He taught me how to code and how to optimize codes. He treated me as his family member. I want to show my gratitude to the members of my Guidance Committee: Professor Pawel Danielewicz, Professor Michael Moore and Professor Chong-Yu Ruan. v    I was a teaching assistant of Prof. Pawel Danielewicz in the fall semester 2007. At that time, he taught me a lot of skills to solve physics problems in an easy way. As my committee member, he taught me his research experience and gave me his paper that was very useful to understand more about Cartesian Harmonics, especially in my fast calculation method research. He was very patient when I asked him questions and he also encouraged me to practice for successful presentation. When I think about Prof. Danielewicz, the picture in my mind will always be his kind and smile face. I had been a teaching assistant of Prof. Michael Moore. I would say that Prof. Moore is one of the best professors. In his class, one could learn not only the fundamental knowledge of modern quantum mechanics but also how to solve the problems, especially using numerical methods. Actually, from discussions with him and from his notes that he gave me, I have benefitted very much. I am also benefitted very much from Professor Chong-Yu Ruan. In my research, he taught me a lot. He treated me as his student, and he was always very patient to answer my questions. Actually, his teaching helped me to understand a lot of fundamental physics problems, such as details about solidification process in a lattice gas. I would give a special thank to Prof. Jack Baldwin. It is Prof. Baldwin who brought me to Michigan State University. When I was a student in the Astronomy group, he helped me very much not only in my studies but also in my life. Just as my family elder, it is he who gave me the greatest support in 2002-2003. I want to tell Prof. Baldwin that I will appreciate him all of my life. vi    Thanks for Prof. Thomas Kaplan. He helped me to avoid “frustrated feeling ” when I tried to understand “frustration in FCC lattice system”. He helped me to understand the physical meaning of “competing interaction”. I appreciate his help. I thank my officemates, Jun Gao, Chuan Lu, Gregory Kobidze, Ozgur Tuncer, Naveen Aneesha Nair, Vikram Melapudi and Jun Yuan, when I was in the ECE department. I thank my officemates, Dat Do, Mal Soon Lee, Andrew Baczewski, Mitchell Wood, Khang Hoang and Zsolt Rák when I worked with them in the physics department. I thank all of above officemates for many scientific discussions and beautiful memories during my MSU life. I thank my Chinese classmates, He Zhang, Teng Yang, Jiwu Liu, Jianghao Yu, Xin Liu, Yiming Deng, Zhen Zhu, Yang Liu and Jianjun Chen for their help. Also I thank them for their constant encouragement. I appreciate the efforts of our graduate secretary, Debbie Barratt. She is extremely nice, helpful and kind. Finally I thank all of my family members. vii    TABLE of CONTENTS LIST OF TABLES ........................................................................................................................ xii LIST OF FIGURES ..................................................................................................................... xiii PART A ................................................................................................................................1 CHAPTER 1 Lattice Gas Models with Frustration .........................................................................1 1.1 Introduction and Review ............................................................................................................1 1.2 Geometrical Frustration and Frustration from Competing Interaction ......................................2 1.2.1 1-D Anti-ferromagnetic Next Nearest-Neighbor Ising Model .....................................8 1.2.2 Heisenberg Model ........................................................................................................9 A.Geometrical Frustration in a Simple Nearest-Neighbor Classical Heisenberg Model .......................................................................................................9 B.Frustration in a Heisenberg Model with Competing Interaction ...............................11 1.2.3 Degeneracy Caused by Frustration.............................................................................16 A. Triangle Model .........................................................................................................17 B. Tetrahedron Model ...................................................................................................18 1.3 Real Lattice Gas and Its Ising Model Mapping .......................................................................19 1.3.1 Real Lattice Gas .........................................................................................................19 1.3.2 Screening Effect .........................................................................................................21 1.3.3 Ionic Model for LAST-m, Yukawa Lattice Gas (YLG) Model .................................23 1.4 Monte Carlo Simulation (MC) .................................................................................................26 1.4.1 Monte Carlo Algorithm and Simulation Process .......................................................27 1.4.2 Limitations of MC Simulation ...................................................................................29 1.4.3 Application of MC Simulation ...................................................................................31 1.5 Total Energy Calculation in the YLG Model ..........................................................................32 1.5.1 Ewald Summation for Yukawa Interactions ..............................................................32 1.5.2 Ewald Summation Parameters....................................................................................37 1.5.3 Fast Calculation Method for Ewald Summation ........................................................39 1.5.4 Evaluation of Madelung Constant – a Check .............................................................40   CHAPTER 2 YLG Model on a FCC Lattice .................................................................................42 viii    2.1 Introduction ..............................................................................................................................42 2.2 Study of YLG Model on a FCC Lattice ...................................................................................48 2.2.1 First Order Phase Transition, Heat Capacity and Ground State Structure .................49 2.2.2 Effect of Screening Parameter (κ) and Concentration on the Phase Transition ..................................................................................................................54 A.Critical Temperature, Change in Entropy across the Transition for Different x, κ=0 .........................................................................................................55 B. Melting Transition in the YLG Model, κ Dependence for a Fixed x .......................56 2.2.3 Critical Temperature Vs. Change in Energy and Entropy at the 1st Order Phase Transition........................................................................................................58 2.2.4 Critical Temperature and Screening Parameter κ.......................................................60 2.2.5 Unusual Degeneracy for x =0.5 ..................................................................................63 2.3 Nature of Two Phase Transitions for x > 0.5 ...........................................................................65 2.4 Nucleation of Nano “Tubular” and “Platelet” Structures ........................................................72 2.4.2 Nucleation Theory of Freezing – Basics ....................................................................72 2.4.3 Different Types of Clusters and the Droplet Nucleation Model ................................75 A.Driving Force for Solidification in the YLG Model .................................................76 B.Energy Barrier and Generating Rate of Different Structure in the Solidification Process................................................................................................78 2.5 Concentration and melting point..............................................................................................81 2.6 Summary ..............................................................................................................................83 PART B ...............................................................................................................................85 CHAPTER 3 Accelerated Cartesian Expansion (ACE) Based on Fast Multipole Method (FMM) ......................................................................................................85 3.1 Introduction ..............................................................................................................................85 3.1.1 Divide and Conquer Strategy .....................................................................................90 3.1.2 General Statement of the Problem..............................................................................91 3.2 Method 1: Cartesian Expansions with Totally Symmetric Tensors.........................................93 3.2.1 Multipole Expansion (Theorem 3.2.1) .......................................................................93 3.2.2 Multipole to Multipole Expansion (Theorem 3.2.2) ..................................................94 3.2.3 Multipole to Local Translation (Theorem 3.2.3) ........................................................95 3.2.4 Local to Local Expansion (Theorem 3.2.4) ................................................................97 3.3 Method 2: Cartesian Expansions with Cascaded Taylor's Series ............................................98 ix    3.3.1 Traceless Multipole (Lemma 3.3.1) ...........................................................................99 3.3.2 Traceless Multipole to Multipole (Lemma 3.3.2) ....................................................100 3.3.3 Traceless Multipole to Local (Lemma 3.3.3) ...........................................................102 3.3.4 Traceless Local to Local (Lemma 3.3.4) ..................................................................104 3.4 Error Bounds ..........................................................................................................................105 3.5 Computational Methodology and Cost ..................................................................................109 3.5.1 Near Field Evaluation...............................................................................................110 3.5.2 Far Field Evaluation .................................................................................................110 A. Multipole Expansion ..............................................................................................110 B. Multipole to Multipole............................................................................................111 C. Multipole to Local Translation ...............................................................................112 D. Local to Local Translation......................................................................................113 E. Local to Observer....................................................................................................113 3.6 Implementation Subtleties .....................................................................................................114   CHAPTER 4 Applications of Accelerated Cartesian Expansion Methods .................................116 4.1 Preliminaries ..........................................................................................................................116 4.1.1 Generalized Maxwell Expansion (GME) .................................................................116 4.1.2 New Expansion of Modified Bessel K (New K) ......................................................119 4.2 The Error in Numerical Simulation Based on ACE...............................................................122 4.3 Numerical Simulation Results ...............................................................................................122 4.4 Yukawa (Screened Coulomb) Potential Calculation Based on ACE.....................................131 4.4.1 General Statement of the Problem............................................................................132 4.4.2 Cartesian Expansions with Totally Symmetric Tensors...........................................134 Theorem 4.4.2.1 (Multipole Expansion) .....................................................................134 Theorem 4.4.2.2 (Multipole to Multipole Expansion) ................................................134 Theorem 4.4.2.3 (Multipole to Local Translation) ......................................................135 Theorem 4.4.2.4 (Local to Local Expansion) ..............................................................138 4.4.3 Calculation Cost Based on ACE ..............................................................................139 4.4.4 Error Bounds ............................................................................................................140 4.4.5 Simulation Results....................................................................................................141 4.5 Summary ............................................................................................................................146   x    APPENDIX .............................................................................................................................148 1. Tensors .............................................................................................................................149 2. Tensors Contraction .................................................................................................................149 Theorem A2.1 ....................................................................................................................150 Lemma A2.2 ......................................................................................................................151 3. Homogeneous Polynomials .....................................................................................................151 Lemma A2.3 ......................................................................................................................152 Theorem A2.4 ....................................................................................................................152 Theorem A2.5 ....................................................................................................................152 4. Detracer .............................................................................................................................153 Theorem A2.6 ....................................................................................................................154 Theorem A2.7 ....................................................................................................................154 Corollary A2.8 ...................................................................................................................154 Corollary A2.9 ...................................................................................................................155 5. Maxwell Cartesian Tensors......................................................................................................155 Corollary A2.10 .................................................................................................................157 6. Gegenbauer Expansions (Addition Theorem A2.8).................................................................159 REFERENCES ..........................................................................................................................160 xi    LIST OF TABLES Table 2.2.5.1 Topological equivalence (charge weighted connectivity) of the layered and tubular structures for x=0.5. After analyzing the structure up to 1146th neighbor in detail, we found that the two structures have exactly the same energy in the YLG model. ..........................................................64 Table 4.3.1 The variation of errors in multipole to multipole and local to local operations with fixed translation error for ν = 2.2 ...............................................123 Table 4.3.2 Variation of errors in multipole to multipole and local to local operations with fixed translation error for computing the lattice gas potential function ................................................................................................................124 Table 4.3.3 Errors in Coulomb potential (ν = 1) computed using the proposed scheme and the directly timing data for the two methods ................................................126 3.3 Table 4.3.4 Errors in Coulomb potential (1/R ) computed using the proposed scheme and the directly timing data for the two methods ...................................127 3.3 Table 4.3.5 Errors in Coulomb potential (R ) computed using the proposed scheme and the directly timing data for the two methods ...................................128 Table 4.3.6 Error in the Lennard-Jones potential; the computations are performed using one tree .......................................................................................................129 Table 4.4.5.1 Variation of errors in multipole to multipole and local to local operations with fixed translation error .................................................................142 Table 4.4.5.2 Errors in Yukawa potential κ = -0.17, computed using the proposed scheme and the directly timing data for the two methods ...................................143 Table 4.4.5.3 Errors in Yukawa potential κ = -1.0, computed using the proposed scheme and the directly timing data for the two methods ...................................144 Table 4.4.5.4 Comparison with "Plane Wave Representation (PWR)" under same error. Errors in case κ =1 at level 3, RXXX is the relative calculation speed, RACE = Tdir/TACE, RPWR = Tdir/TPWR ...................................................145 Table 4.4.5.5 ACE Errors in Helmholtz potential calculation in low frequency case with the largest box size = λ/2 .............................................................................146 xii    LIST OF FIGURES Figure 1.1.1 Face centered cubic lattice, where a is lattice constant ...............................................2 Figure 1.2.1 Effect of frustration on a square ..................................................................................4 Figure 1.2.2 Effect of frustration on a triangle ................................................................................5 Figure 1.2.3 One of the configurations of 1-D ANNNI model ........................................................6 Figure 1.2.4 Frustration in a square lattice ......................................................................................7 Figure 1.2.1.1 One of the configurations of 1-D ANNNI model .....................................................9 Figure 1.2.1.2 One of the configurations of 1-D ANNNI model .....................................................9 Figure 1.2.2.1 Classical spins orientation along the easy axes of a tetrahedron ............................10 Figure 1.2.2.2 One-dimensional classical Heisenberg model with competing interactions. (a) Helical ordering (b) Phase diagram for the model with a nearest-neighbor exchange J1 and a next-nearest-neighbor exchange J2 ............................................................................................................................12 Figure 1.2.2.3(a) The elementary cubes.........................................................................................14 Figure 1.2.2.3(b) Tetrahedron which is characterized ...................................................................15 Figure 1.2.2.3(c) Particular GS configuration ...............................................................................15 Figure 1.2.3.1 Spins on a triangle with anti-ferromagnetic interaction .........................................17 Figure 1.2.3.2 Spins with anti-ferromagnetic interaction on a tetrahedron ...................................18 Figure 1.3.1 A. A high resolution transmission electron microscopy image of AgSbPb18Te20 sample. B. An extended region of a AgSbPb10Te12 specimen ................................................................................................................19 Figure 1.3.2.1. Example of ionic mixing on the Pb sublattice of the NaCl type structure of PbTe. The average basic structure of LAST-m is that of a NaCl lattice ............................................................................................................20 Figure 1.3.2.2 NaCl structure.........................................................................................................21 Figure 1.3.2.3 Mapping NaCl structure (left) to FCC Lattice(right) .............................................24 Figure 1.3.2.4 Frustration in FCC Lattice ......................................................................................25 Figure 1.4.1.1 Metropolis algorithm ..............................................................................................27 Figure 2.1.1 One cube of the FCC lattice showing the three possible configurations (a) the type-III structure, (b) the type-IIb structure and (c) the type-II structure..................................................................................................................44 xiii    Figure 2.1.2 Concentration (x) versus Temperature (T), phase diagram constructed from the loci of the heat capacity maxima .............................................................45 Figure 2.1.3 Tubular Structure. Connected balls are for Ag/ Sb, unconnected balls are for Pb; Te sub-lattice is not shown. Checkerboard pattern formed by AgSbTe2 and Pb2Te2 .............................................................................................47 Figure 2.2.1.1 Average total energy as a function of T for x =0.25; both heating and cooling runs are shown. Also shown are the charge ordered layered structures and how they melt when heated ............................................................49 Figure 2.2.1.2 Heat capacity and Phase transition in the 2D Ising model. The inset gives the T dependence of the average energy ) .............................................51 Figure 2.2.1.3 Heat capacity and Phase transition in 3D YLG model on FCC lattice ..................53 Figure 2.2.2.1 Melting (phase) transition of YLG model with κ=0 for different concentrations x ....................................................................................................54 Figure 2.2.2.2 Melting process of the YLG model for different κ and fixed x =0.25 ...................56 Figure 2.2.2.3 Melting process of the YLG model for different κ and fixed x =0.375. st For a small κ, phase transition is the 1 order. Energy Change at Tc decreases with κ .....................................................................................................57 Figure 2.2.2.4 Melting process of the YLG model for different κ and fixed x =0.5. For st a small κ, phase transition is the 1 order. Energy Change at Tc decreases with κ .....................................................................................................58 Figure 2.2.3.1 Energy change at Tc vs. Tc when the screening parameter κ is changed from 0.0~6.4 for different values of x ....................................................................59 Figure 2.2.4.1 The numerical results of Screen Parameter vs. Criticle Temperature ....................61 Figure 2.2.4.2 Simulation results for Tc vs. κ for two different x (x =0.250 and x =0.375 ....................................................................................................................62 Figure 2.2.5.1 Layered Structure ...................................................................................................63 Figure 2.2.5.2 Tubular Structure ....................................................................................................63 Figure 2.3.1 Total energy of super-lattice and heat capacity versus temperature for κ = 0.4 (YLG) at x = 0.75 .............................................................................................66 Figure 2.3.2 Melting process (“Solid”-“Liquid”) transition for κ =0 (CLG) at x =0.75. Before melting, “Solid”-structure (T=0.095) is built up by two parts. One part is of x =1 and other one is of x =0.5 ........................................................67 Figure 2.3.3 “Liquid”-“Gas” transition for κ=0 (CLG) at x =0.75 ................................................68 Figure 2.3.4 Melting process (“Solid”-“Liquid”) transition for κ =0.4 (CLG) at x =0.75. Before melting, “Solid”-structure (T =0.015) is built up by two parts. One part is of x =1 and other one is of x =0.5 ..............................................69 Figure 2.3.5 “Liquid”-“Gas” transition for κ = 0.4 (YLG) at x = 0.75 ..........................................70 xiv    Figure 2.3.6, the GS structure for x =0.75 .....................................................................................71 Figure 2.4.3.1 Gibb’s free energy vs. Temperature for two phases, “liquid” and “solid” ....................................................................................................................76 Figure 2.3.3.2 Energy barrier in the solidification process ............................................................77 Figure 2.4.3.3 Basic cells of Layer and Tubular structures and crystal growing processes ................................................................................................................78 Figure 2.4.3.4 Surface between two domains ................................................................................80 Figure 2.5.1 Ground structure for different concentrations ...........................................................81 Figure 3.4.1 Analytically derived upper bound for the error for different values of ν for source/observation domains of radius a whose centers are separated by 4a.....................................................................................................................108 Figure 4.3.2 Cost scaling of the direct method and the fast algorithm (ACE) for 1.5 different values of the precision P for computing the potential R ..................130 xv    PART A CHAPTER 1 Lattice Gas Models with Frustration 1.1 Introduction and Review Lattice gas models (LGM) have been extensively used over last several decades to understand the thermodynamic and dynamic properties of structural ordering of atoms or ions in solids. A lattice gas is a collection of atoms whose position s can take on only discrete values. Each lattice site can be occupied by at most one atom. We neglect the kinetic energy of an atom in LGM. The potential energy of the system is equivalent to that of a gas in which the atoms are located only on lattice sites and interact through a two-body potential V (| ri − r j |) . For example, two types of long-range models have been studied. One is Coulomb lattice gas (1/R), or CLG, and the other one is lattice Coulomb gas (LnR), or LCG. Studies of various models of one[1-4] and two  [4-6] dimensional CLG and LCG using different methods have shown the existence of multiple phase transitions, complexity in phase diagrams and their practical applications to real materials, for example, KCu7−xS4[1-4], Ni1−xAlx(OH)2(CO3)x/2·yH2O[4]. In LGM the space is divided into cells (usually with a given lattice structure) in which the occupation number (n) of an atom can be either 0 or 1. Due to strong short range repulsion between atoms, n=2 is energetically unfavorable, i.e., two atoms do not occupy the same cell. The LGM can be also mapped onto an Ising model where the Ising variable takes appropriate values. Once we map a physical problem onto a lattice gas model, the next thing to consider is the structure of the underlying lattice (simple cubic, BCC, FCC etc) and the nature of the interaction between the particles. It is known that the BCC lattice can be represented in the form of two mutually penetrating simple cubic sub-lattices. In accordance with this, if atoms of one kind, e.g. 1    atoms, occupy the sites of one sub-lattice, and B atoms, of the other sub-lattice, a superstructure with the AB stoichiometric composition arises. This only superstructure (at such a division of the BCC lattice) is called by Selissky  [7] a first rank superstructure. Then, each of the two cubic sublattices with a period can be represented by means of two FCC sub-lattices with 2a period. Thus, the initial BCC lattice turns out to be represented in the form of four FCC sub-lattices. For example, in a simple binary alloy (AB) problem on a BCC lattice where two different atoms A and B can occupy the BCC lattice sites, we have ni =1. If there are some vacant sites (defects) ni =0 for that site. However to distinguish between the two atoms at a given site one constructs an Ising model where the Ising variable σ i = ±1 , corresponds to A and B, respectively. The nature of the exchange interaction in the Ising model depends on the physical interaction between AA, BB, and AB pairs. If A and B atoms like to be nearest neighbors (nn) and the interaction is short ranged then the model reduces to nn antiferromagnetic Ising model. Furthermore if the number of A and B atoms is the same then ∑ σ i = 0 . 1.2 Geometrical Frustration and Frustration from Competing Interaction (Fig.1.1.1 Face centered cubic lattice, where a is lattice constant) 2    Physicists have made great efforts to understand the basic mechanisms responsible for spontaneous ordering as well as the nature of phase transition in many kinds of systems. For example, the study of order-disorder phase transition is a fundamental problem in equilibrium statistical mechanics. In particular, during the last several decades, much attention has been paid to “frustration” in the lattice systems. The idea of “frustration” is well-known in Ising spin models. The word "frustration" was introduced by Toulouse and Vannimenus[8, 9]  in 1977 to describe the situation where a spin (or a number of spins) in the ground state of a system cannot find an orientation that “fully satisfies” simultaneously interactions with all other spins. Here, “ fully satisfies” means that each interaction is at its minimum possible value. Early work on this subject includes a study of the Ising model on a triangular lattice with nearest-neighbor spins coupled anti-ferromagnetically by Wannier[10]  in 1950. In a parallel development interacting spins with competing interactions (for example when nearest and next nearest neighbor interactions lead to conflicting spin orientations) also resulted in frustration. Also, unlike Ising spins, for vector spins which can assume all possible orientations, non collinear spin configurations due to competing interactions were discovered independently in 1959 by Kaplan[11], Yoshimori[12] and Villain[13]. Here again there is frustration because all spin interactions are not at their minimum in the ground state. In general, “frustration” is caused either by the lattice structure as in the case of triangular and face-centered cubic (FCC) (See Fig. 1.1.1) lattices with anti-ferromagnetic nearest-neighbor (nn) interaction or by competing interactions (See the later section of this chapter for a detailed description of competing interaction). Effects of frustration on physical properties of a system such as the ground state, its degeneracy and thermodynamic properties are rich and often unexpected. In addition, real 23    magnetic materials are often frustrated due to several kinds of interactions. Also frustrated spin systems have their own fundamental significance in statistical mechanics. Recent studies show that many established methods and theories have encountered difficulties in dealing with frustrated systems[8, 9] [14, 15]. A well-known example is the pyrochlore lattice, where frustration arises from the fact that the spins occupy the vertices of a network of linked tetrahedra, and one cannot find a configuration of the pyrochlore lattice that all interactions have their minimum possible values simultaneously[8, 9]. In some sense, frustrated systems are excellent candidates to test approximations and to improve theoretical methods of statistical mechanics. Therefore it is important to find out and analyze the effect of frustration by simulating a spin system. The results of simulation can not only help us to understand qualitatively the behavior of real systems which are in general much more complicated, but also test the accuracy of approximate theoretical methods. Next, we discuss some of the basic concepts of “frustration”, resulting from geometrical structure of the lattice (with only nearest neighbor interaction) and competing interaction. (a) (b) (Fig. 1.2.1 Effects of frustration on a square[8, 9]) 4    Consider a system with pair-wise nearest neighbor (nn) interaction energy given by ( ) E = J σ iσ j , where J is usually called the exchange constant and (σ i, j = ±1) are the spins at sites i and j. If J is ferromagnetic ( J < 0) then the minimum of E is -|J| which corresponds to the configuration in which σ i is parallel to σ j . On the other hand, if J is anti-ferromagnetic ( J > 0) , the minimum of E corresponds to the configuration where σ i is anti-parallel to σ j . (a) (b) (Fig. 1.2.2 Effect of frustration on a triangle[8, 9]) It is easy to see that in a spin system with nn ferromagnetic interaction, the ground state (GS) corresponds to a spin configuration where all the spins are parallel: the interaction of every pair of spins is fully satisfied. This is true for any lattice structure and there is no frustration. Now let us consider when “frustration” occurs. If J is anti-ferromagnetic, the spin configuration of the GS depends on the lattice structure: • For lattices containing no elementary triangles, i.e. bipartite lattices (such as square lattice, simple cubic lattices and body centered cubic lattice) (see Fig. 1.2.1a) the GS 5    configuration is such that each spin is anti-parallel to its neighbors, i.e. every bond is fully satisfied. The spin “question mark” is spin-up, so there is no “frustration”. • For lattices containing elementary triangles such as the triangular lattice, the FCC lattice, one cannot construct a GS where all the (nn) bonds are fully satisfied (see the triangle at Fig. 1.2.2a and Fig. 1.2.2b). The GS does not correspond to the minimum of the interaction of every spin pair. In this case, we can say that the system is frustrated – this is referred to Geometrical Frustration. Based on the above analyses, it appears that geometrical “frustration” cannot occur when the interactions are purely ferromagnetic (i.e., J < 0 ). Now we consider another situation where the spin system can be “frustrated” but for a different reason: this is the case where different kinds of conflicting (or competing) interactions are present in energy expression (or Hamiltonian) and the GS does not correspond to the minimum of energy interaction. For example, consider a chain of spins where the nearestneighbor (nn) interaction (J1) is ferromagnetic while the next nn (nnn) interaction (J2) can be either ferromagnetic or anti-ferromagnetic. The Hamiltonian is given by H = J1 ∑σiσ j + J2 ∑σiσ j , where ( J1, 2 ) are the nn and nnn exchange constants. nn nnn (Fig. 1.2.3 One of the configurations of 1-D ANNNI model[16]) When J2 is ferromagnetic, obviously the system is ferromagnetic (see Fig. 1.2.3). When J2 is anti-ferromagnetic, as long as J 2 << J1 , the GS configuration will be same as Fig. 1.2.3 and the system GS is ferromagnetic: every nn bond is then satisfied but the nnn bonds are not. 6    According to the definition of “frustration”, we know that nnn bonds are frustrated (such as 1-3, 2-4 bonds and etc.). When J 2 exceeds a critical value, the GS becomes quite complex and all the nn and nnn bonds are not fully satisfied. For this situation, we can say that the system is fully frustrated – this is referred to as Competing Interaction. (Fig. 1.2.4 Frustration in a square lattice [17]) Let us look at (Fig. 1.2.4); there is no frustration between nn pair wise spins (antiferromagnetic J1) but there are frustrations for the nnn pair wise spins (antiferromagnetic J2). Obviously, frustration is affected by the range of interaction in this lattice model. For example, if there is only nn interaction (J2=0), there will be no frustration for the square lattice system. In summary, we can say that a spin system is “frustrated” when one cannot find a ground state configuration of the spins that fully satisfies the interaction between every pair of spins. 7    Based on the discussions above, we know that there can be two kinds of “frustration” associated with a system: 1) “Geometrical frustration” which changes with the system geometry but with a fixed Hamiltonian. 2) “Frustration” from competing interaction which changes with the Hamiltonian of the system but with a fixed geometry. Now we would like to discuss in more detail some of the well studied models with “frustration”. 1.2.1 1-D Anti-ferromagnetic Next Nearest-Neighbor Ising Model The anisotropic (or axial) next-nearest-neighbor Ising model, usually known as the ANNNI model (discussed above) is a variant of the standard nn Ising model. In this model competing ferromagnetic and anti-ferromagnetic exchange interactions couple spins at nearest and nextnearest neighbor sites. The model is used to describe complex spatially modulated magnetic superstructures in many crystals. The model was introduced in 1961 by Elliott[18], but the name “ANNNI model” was given by Fisher and Selke[19] in 1980. It provides a theoretical basis for understanding numerous experimental observations on commensurate and incommensurate structures, as well as accompanying phase transitions in magnets, alloys and other solid state systems. Now let us consider the 1-D ANNNI model; the Hamiltonian is given by H = J1 ∑σiσi+1 + J2 ∑σiσi+2 i (1.2.1.1) i where ( J1,2 ) are the exchange constants, (σ i = ±1) are the i-site spins. Let us consider the case ( J1 > 0, J 2 ≥ 0) to see how frustration works. When J 2 = 0 , there is no frustration and the ground state will be 8    (Fig. 1.2.1.1 One of the configurations of 1-D ANNNI model[18]) However there is frustration when J 2 ≠ 0 . When J 2 << J1 , the GS will be still given by Fig. 1.2.1.1. Any spin will be anti parallel with its nearest neighbor and satisfies its’ nn bond but will be parallel and not satisfy with its it’s nnn bond. There is frustration in the bonds between next nearest neighbor spins and this frustration increases with increasing J2/J1 due to a more competitive J2 interaction. (Fig. 1.2.1.2 One of the configurations of 1-D ANNNI model[18]) When J 2 >> J1 , the GS is Fig. 1.2.1.2; a given spin is anti-parallel to its next nearestneighbor spin and satisfies the nnn bonds; but some spins are parallel to their nearest neighbors, i.e., there is frustration between some nearest neighbor spins and this frustration increases with increasing J1/J2 due to a more competitive J1 interaction. 1.2.2 Heisenberg Model A. Geometrical Frustration in a Simple Nearest-Neighbor Classical Heisenberg Model 9    (Fig. 1.2.2.1 Classical spins orientation along the easy axes of a tetrahedron [11, 12]) Until now we were considering Ising spins which can be either “up” or “down”. Now let us look at a system of classical vector spins where the spins have fixed lengths and can take all possible orientations in space. The spins are called “Heisenberg spins”. The simple nearestneighbor Heisenberg anti-ferromagnetic model can be described by the Hamiltonian ⎡ J ⎢⎛ ⎜ H =J Si ⋅ S j = 2 ⎢⎜ (i , j ) ⎢⎝ ⎣ ∑ ∑ i 2 ⎞ Si ⎟ − ⎟ ⎠ ∑ i ⎤ Si − 2 Si ⋅ S j ⎥ ⎥ (i >< j ) ⎥ ⎦ 2 ∑ (1.2.2.1a) Where J is the exchange constant ( J > 0) , the sum (i, j) means the nearest- neighbor (nn) interaction, the sum (i> 4|J2|, only ferromagnetism or anti-ferromagnetism (depending on the sign of J1) is allowed, this forms the most part of phase diagram in Fig. 1.2.2.2 (b). If |J1| < 4|J2|, we have to be more careful. For J2 < 0, the sign of J1 determines the spin order, and J2 substantiates this order since next-nearest-neighboring spins are always parallel no matter it is ferromagnetic or anti-ferromagnetic. For J2 > 0, the next-nearest-neighboring interaction dominates and we will expect helical magnetic order, as shown in the triangle area of Fig. 1.2.2.2(b). For a relevance and later comparison to my study on the Yukawa lattice gas system (although my system is Ising-like and 3-dimensional) we consider here the case where all the interactions are anti-ferromagnetic ( J1,2 > 0 ), namely, the upper right part of the phase diagram in Fig.1.2.2.2(b). We summarize some essential points for this 1-dimensional system below: 13    1) When J 2 << J1 , the GS will be as shown in Fig. 1.2.1.1, one spin will be anti-parallel to its nearest neighbor but will be parallel to its next nearest neighbor, i.e., there is frustration between the next nearest neighbor spins. 2) When J 2 >> J1 , one spin will have a right angle with respect to its nearest-neighbor, as shown in Fig. 1.2.2.2(b). Therefore, this spin will be anti-parallel to its next nearestneighbor. 3) The critical point occurs at J1 = 4J2, as we have mentioned above, and it separates helical order from usual anti-ferromagnetism (Fig. 1.2.2.3(a) The elementary cubes [23]) The non-collinear ground state spin alignment caused by competing nn and nnn interactions is found not only in 1-dimensinoal system, but also in many well-known high-dimensional systems such as the anti-ferromagnetic FCC lattice[22], fully frustrated simple cubic (SC) lattice[21] and so on. For the sake of illustration we choose here the SC lattice system. For this 14    simple cubic lattice (see Fig. 1.2.2.3a ), the classical ground state can be determined numerically by an iterative procedure, minimizing the local energies until the internal energy is stabilized[21, 24].   (Fig. 1.2.2.3(b) Tetrahedron which is characterized [25])   (Fig. 1.2.2.3(c) Particular GS configuration [23]) 15    It was found that, similar to 1-dimensional frustrated spin system, SC also has a J1 = 4J2 boundary in the phase diagram[26]. For 0 < J 2 < 0.25 J1 , the classical GS is the collinear antiferromagnetic structure. For J 2 > 0.25 J1 , the classical GS has a large frustration. The elementary cubes containing two tetrahedra formed by the nnn sites and stacked as in Fig. 1.2.2.3(a). The spin configurations (black site or white site) a, b, c, and d of the nnn tetrahedra are those of the elementary tetrahedra in the GS of the FCC anti-ferromagnets[25]: each nnn tetrahedron is characterized by two angles, θ , formed by two up spins (or two down spins) and φ , as shown in Fig. 1.2.2.3(b), formed by the plane containing the two up spins and the plane containing the two down spins[25]. For collinear spin alignment, the three configurations (one line up, one line down) are particular GS configurations in this range of parameters (see Fig. 1.2.2.3(c)). As we will see later for the frustrated Ising YLG on a FCC lattice, some of the ground state configurations are very similar to the collinear spin structures seen in the frustrated Heisenberg models. Even if the spins can orient to any direction, frustration forces them to be collinear.   1.2.3 Degeneracy Caused by Frustration For the Ising model the system entropy SG associated with the ground state can be written as SG ∝ ln D , where D is the degeneracy, D = α N [10], N is the number of spins in the lattice and 1 ≤ α ≤ 2  [10]. One fundamental question is whether there is any relationship between frustration and degeneracy (or system entropy)? For certain anti-ferromagnetic systems, we know that all pair interaction energies cannot be simultaneously minimized due to either “ geometric frustration” or “competing interaction”. This can sometimes lead to a large macroscopic (thermodynamically significant) degeneracy of the ground state. We would like to discuss the 16    nature of degeneracy caused by frustration in simple models consisting of a small number of spins. Let us consider an anti-ferromagnetic Ising Model H = J1 ∑ σ iσ j + J 2 ∑ σ iσ j nn (1.2.3.1) nnn where (J1,2 > 0) is constant and (J1>>J2), (σi,j = ±1) is spin value. (Fig. 1.2.3.1 Spins on a triangle with anti-ferromagnetic interaction [11, 12]) A. Triangle Model A simple 2D example is given in Figure 1.2.3.1. Now let us look at the “spin or magnetic ordering”. Three spins reside on the corners of a triangle with anti-ferromagnetic interactions between them, i.e., the energy is minimized when each spin is aligned opposite to its neighbors. Once the first two spins align anti-parallel, the third one is frustrated because its two possible orientations, up and down, give the same energy. The third spin cannot simultaneously minimize 17    its interactions with both of the other two. Thus, the ground state is twofold degenerate, for this 3-spins system with fixed 2-spins. B. Tetrahedron Model (Fig. 1.2.3.2 Spins on a tetrahedron with anti-ferromagnetic interaction [11, 12]) Similarly in three dimensions, four spins arranged in a tetrahedron (Fig. 1.2.3.2) also experience Geometric frustration. If there is an anti-ferromagnetic interaction between spins, then it is not possible to arrange all the spins so that interactions between spins are anti-parallel. There are six nearest-neighbor interactions, four of which are anti-parallel and thus satisfied, but two of which (between 1 and 2, and between 3 and 4 in the fig. 1.2.3.2) are not satisfied. It is impossible to have all the interactions satisfied simultaneously, and the system is frustrated. The ground state is twofold degenerate with fixed 2-spins. This idea of frustration caused degeneracy of a finite cluster (triangle or a tetrahedron) generalizes to a macroscopic system. However the calculation of D for a macroscopic system is highly nontrivial. A classic example is the case of a 2-dimensional triangular Ising model with nn 18    AF interaction. For this model, Wannier has calculated the degeneracy of the ground state and N finds D = α , where α = 25/12 . 1.3 Real Lattice Gas and Its Ising Model Mapping 1.3.1 Real Lattice Gas Recently, the quaternary thermoelectric compound (AgSbTe2)x(PbTe)2(1−x), or equivalently AgSbPbmTem+2 (also called “LAST-m”, LAST stands for Lead, Antimony, Silver and Tellurium) [27] has emerged as a material for potential use in efficient thermoelectric power generation. The concentration x above is related to the quantity m by the relation x = 2 . (m + 2) (Figure1.3.1 A. A high resolution transmission electron microscopy image of AgSbPb18Te20 sample. B. An extended region of a AgSbPb10Te12 specimen [27]). In Fig. 1.3.1A, a high resolution transmission electron microscopy image of AgSbPb18Te20 sample shows nano-sized region (a “nano-dot” shown in the enclosed area) of the crystal that is Ag-Sb–rich in composition. The surrounding structure, which is epitaxially related to this feature, 19    is Ag-Sb–poor in composition with a unit cell parameter of 6.44 Å, close to that of PbTe. Fig. 1.3.1B shows compositional modulations over an extended region of a AgSbPb10Te12 specimen. The spacing between the bands is ~20 to 30 nm. In essence, the observed compositional modulation is conceptually similar to the one found in the artificial PbSe/PbTe super-lattices which also show excellent thermoelectric properties[27]. In the latter system, the compositional modulation exists along the stacking direction. (Fig. 1.3.2.1. Example of ionic mixing on the Pb fcc sublattice of the NaCl type structure of PbTe [28]. In LAST-m this sublattice is occupied by Pb (blue balls), Ag (yellow balls) and Sb (green balls). However the average structure of LAST-m is that of a NaCl lattice (See Fig. 1.3.2.2) ) PbTe has a number of derivatives that have shown some of the most promising results to date among bulk materials[28]. Early studies of AgPbmSbTe2+m (LAST-m) system reported it to be a solid solution between PbTe and AgSbTe2 (both rock salt NaCl type structures) with p-type 20    properties and an unusually low lattice thermal conductivity. X-ray diffraction studies of the compound AgSbPbmTem+2 (See Fig. 1.3.2.1 obtained by Sootsman, Chung and Kanatzidis[28]) shows that the average lattice structure is NaCl-type (Fig. 1.3.2.2), where the cation (Na) sites are occupied by Ag, Sb, and Pb and the anion (Cl) sites are occupied by Te. The lattice constant changes nearly linearly with Pb/Te ratio (See Fig.1.3.2.2). One can consider a lattice gas model when the Na-sub-lattice of NaCl-type structure, which is a FCC lattice, is occupied by three types of ions, Ag, Sb and Pb. (Fig. 1.3.2.2 NaCl structure) 1.3.2 Screening Effect Consider a positive charge placed in the electron gas and the charge’s position is fixed, Poisson’s equation tell us that −∇2φ (r) = 4πρ (r) , (1.3.2.1) 21    where φ (r) is the physical potential and ρ (r) = ρ ext (r) + ρ ind (r) is the total charge density, where ρ ind (r) is the charged density induced in the electron gas due to the presence of the external particle and ρ ext (r) is the charge density associated with this external particle. Suppose that φ (r) and φ ext (r) are related linearly; we will then have ∫ φ ext (r ) = d ρ ' ε (r, r ')φ (r ') (1.3.2.2) Fourier transforms give ∫ ε (q) = dre −iq⋅rε (r ) ε (r ) = 1 (2π )3 ∫ dq e iq⋅r (1.3.2.3) ε (q) which implies that in Fourier space, we have φ (q) = 1 ext φ (q) . ε (q) (1.3.2.4) This equation shows that the potential in Fourier space is reduced by a factor 1 . Actually, the ε (q) quantity ε (q) was studies by Thomas-Fermi and is well known as the TF dielectric constant: ε (q ) = 1 + κ2 q2 . (1.3.2.5) To understand the physical meaning of the parameter κ , we can consider the following case: φ ext (r ) = Q r (1.3.2.6) and it’s Fourier space form: 22    φ ext (q ) = 4π Q (1.3.2.7) q2 So the total potential takes the form 4π Q φ (q) = (1.3.2.8) 2 q +κ2 The potential in real space is the inverted Fourier space form, which is dq ∫ (2π ) φ (r ) = eiq⋅r 3 4π Q q2 + κ 2 =Q e −κ r r (1.3.2.9) This equation means that total potential is of Coulombic form times an exponential damping −κ r factor e e −κ r . Here κ is called the screening parameter and is called the screened potential. r In PbTe system, Pb and Te ions attract each other through Coulomb potential in the presence of itinerant electrons or holes resulting from doping which is necessary for energy and charge transport. In the region of higher doping the Coulomb interaction is weaker due to the screening effect that naturally affects the internal field. 1.3.3 Ionic Model for LAST-m, Yukawa Lattice Gas (YLG) Model Now consider the following interaction model for the system: E= ∑ f (| Ri − R j |)QiQ j (1.3.2.1) where R i and Qi are, respectively, the position and the charge of an atom at site i of the lattice. As mentioned before, we consider the NaCl-type structure made of two interpenetrating FCC lattices with mixtures of different atomic species at the Na sites, i.e. j = {Na(i.e. Ag, Sb, Pb), Cl(i.e., Te)}. Ionic mixing occurs on the Na sites (not on the red sites, see Fig. 1.3.2.1). For PbTe, 23    2 if we look at “Periodic Table”, we can know that Pb is of structure [Hg]6p and Te is of structure 10 2 4 [Kr]4d 5s 5p . For the combination of PbTe, because Pb prefers to donate 2 electrons and Te prefers to accept 2 electrons, the combination of PbTe will be stable. So in an ionic model for + - + + LAST-m, we can assume the Pb ion 2 , Te ion is 2 , Ag ion is 1 and Sb ion is 3 . So the charge of the Na site will be 2 + Δ q , where Δq = {−1, 0, +1} . The total energy can be written as E= ∑ f ( R i − R j )Qi Q j = ∑ f (rm − rn )Δqm Δqn + EPbTe = Ec + EPbTe m≠ n (1.3.2.2) where E PbTe is the total Madelung energy of the ideal PbTe lattice. Obviously, contribution to Ec comes from the occupation of a FCC lattice only (i.e., Na site, see Figure 1.3.2.3). (Fig. 1.3.2.3 Mapping NaCl structure (left) to FCC Lattice(right) ) 24    (Fig. 1.3.2.4 Frustration in FCC Lattice) As we will show below Ec will be mapped onto an Ising model on a FCC lattice with antiferromagnetic interaction whose range can be tuned. It is well known that short range antiferromagnetic Ising models on a FCC lattice involve “frustration” [11, 12] (see Fig. 1.3.2.4). So, it is of considerable interest to see the effects of frustration in Yukawa systems when tuning the range of interaction. 1 , we get the Coulomb Lattice Gas (CLG) model. | Ri − R j | • If f (R i − R j ) = • i j e If f (Ri − R j ) = , we get the Yukawa Lattice Gas (YLG) model. | Ri − R j | −κ |R −R | Now we can map the above lattice gas model to the following Ising spin model Ec = ∑ J ( rm − rn )σ mσ n (1.3.2.3) 25    where (σ m = ± 1, 0) , +1 for Sb, -1 for Ag and 0 for Pb. Since the number of Ag and Sb ions are ⎛ the same we have ⎜ ⎜ ⎝ ∑ ⎞ σ m = 0 ⎟ . In AgSbPbmTem+2 system, the content of Ag and Sb ions is m ⎟ ⎠ defined as the concentration x, which is given by x = 2 / (m + 2) = (1/ N ) ∑σ i2 . i The CLG model was investigated by Hoang, Desai and Mahanti (HDM)  [29]. In the next chapter, we will introduce their work. Note that YLG model becomes CLG model when the Yukawa screening parameter κ =0. 1.4 Monte Carlo Simulation (MC) In order to study the physical properties of the ionic lattice gas system using YLG model (for example, its structural and thermodynamic properties), we have to calculate the Free energy (partition function) corresponding to the Hamiltonian (Eq.1.3.2.3) for an infinite system. There are several approaches available: (i) exact solution which is very difficult; (ii) mean field approximation which works well if we know something about the structural order for different concentrations x (works well when there is no frustration) and (iii) simulation of finite but large systems. Because of the complex nature of our system, such as both geometrical frustration and frustration caused by competing interactions, we will use the third, Monte Carlo (MC) simulation approach. MC simulations have been extremely useful in understanding the physical properties of classical spin systems  [14, 30]. MC simulations, in fact all simulation studies, are usually limited by the memory size and the central processor of computers. The size of the system (N = number of lattice sites = number of spins) one can simulate depends on the number of configurations. For a 2-state Ising model the number of configuration is 2N. Also the size of the system one can simulate depends on the nature of the interaction between the spins. For example, 26    for a simple nearest neighbor ferro or anti-ferro interaction, one can simulate systems up to 6 N=10 or more. On the other hand for long range interactions it becomes difficult to simulate 3 systems N>10 . In MC simulation, the main idea is to generate a large number of highly probable configurations randomly and then compute averages of physical quantities such as energy, magnetization etc over these generated configurations. For systems in thermodynamic equilibrium the configurations are generated with Boltzman distribution. The details of the MC algorithm used in our simulation are discussed next. 1.4.1 Monte Carlo Algorithm and Simulation Process (Fig. 1.4.1.1 Metropolis algorithm) As discussed above, to study phase transitions in the FCC YLG model by computing different thermodynamic averages we use Monte Carlo simulations. The straightforward application of the basic Monte Carlo method is not however the best way to compute these average values, because it samples all configurations uniformly. Instead, we generate random configurations which are highly probable which amounts to carrying out importance sampling. A better approach is to use Boltzmann factors themselves as a guide during the generation of a subset of sampled states. Metropolis algorithm generates a sequence of configurations that have 27    approximately the required probability distribution [31]. This is how the Metropolis algorithm works: 1) One starts with a completely random configuration α 2) To generate the next configuration β in the sequence one computes the energy change ΔE in going from state α to state β. If ΔE is zero or negative, one accepts the new configuration (i.e., we always accept changes that lower the total energy). If ΔE is positive, one accepts the new configuration with probability P(ΔE) = exp(−ΔE/kBT), where T is the desired temperature. In practice, one generates a uniformly distributed random number x between 0 and 1, and accepts the new configuration if P(ΔE) > x. This constitutes a single MC step within Metropolis algorithm. The repeated Metropolis steps are also referred to as a Metropolis sweep. This Metropolis step is carried out a large number of times. The updated configurations are used in the calculation of averages as they become available. 3) One carries out MC sweeps until thermal-equilibrium is achieved, i.e., when the averages of different physical quantities remain constant within a prescribed range. The range is chosen to obtain a balance between accuracy and the number of MC simulation (or equivalently the computer time of the simulation). These are referred to as equilibrium runs. Once the system approaches “thermodynamic equilibrium” one takes averages of physical quantities over an additional number of MC steps and uses these averages to represent thermodynamic averages. 4) Typically, the Metropolis–Monte Carlo estimate of the ensemble average of a macroscopic quantity (say O) is given by 28    O = 1 M M ∑ ∑ k =1 (all accepted) < O >Config (1.4.1.1) < O >Config is the value of O for a particular configuration and the summation is over the accepted configurations. It is usually good to omit a certain number of configurations at the beginning of the simulation; these are not distributed with the proper probabilities, because the system has not yet reached thermal equilibrium. How long the system takes to thermalize depends on several factors such as the size of the system, the nature of the interaction, and the temperature of the system. The length of the required warm-up period is usually estimated empirically before starting the simulation. 1.4.2 Limitations of the MC simulation Although MC simulation is a powerful and useful approach to investigate the equilibrium properties of complex physical systems, we still need to be aware of its limitations. For example, if we are studying a second order phase transition, then in a finite system there is a problem because there is no sharp transition (either as a function of the temperature or some other intensive parameter) between zero and non-zero magnetization or a sharp peak in the specific heat. As a result, the critical point associated with a second order or continuous phase transition cannot be located accurately (Binney, Dowrick, Fisher and Newman) [32]. Furthermore, the absence of a sharp peak (or singularities) in the specific heat hinders in understanding the nature of its singularity at the critical point. The limitations of the Monte Carlo simulations are discussed nicely by Binder and Luijten[15]. We summarize some of these limitations below. 1) Only in the limit M (the number of simulations) → ∞ one can expect to obtain exact results, while for finite M, there is a “statistical error”. The estimation of this error is a 29    nontrivial matter, since it depends sensitively on the choice of M and subsequently generated states are more or less correlated [30]. 2) If the total number of MC steps is not large enough, then there will be some “memory” of the (arbitrary) initial state from which MC simulation was started [33, 34]. 3) In order to carry out a MC sweep, pseudo-random numbers are used both for constructing a trial state from a given state and for the decision whether or not to accept the trial state as a new state. For instance, in the Metropolis algorithm (see last sub-section), this is done if the transition probability exceeds a random number that is uniformly distributed in the interval from 0 to 1 (Binder)[14]. It is therefore necessary to carefully test the quality of the random numbers generated by the algorithm used since bad random numbers indeed cause systematic errors. However, this is again a nontrivial matter, since there is no unique way of testing random – number generators, and there is no absolute guarantee that a random – number generator that has passed all the standard tests does not yield true random numbers leading to systematic errors in a particular application ( Knuth)  [35]. 4) Monte Carlo methods apply to system of finite size only, and the results of calculations near a phase transition are affected by the finite size and by the boundaries. The finite size of the simulated lattice, typically a hyper-cubic lattice of linear dimension (L) with periodic boundary conditions in all lattice directions, causes a systematic rounding and shifting of the critical singularities. This is because singularities of the free energy can only develop in the thermodynamic limit L → ∞. This remark is particularly obvious for the correlation length ξ, which cannot diverge in a finite simulation box, so that serious finite – size effects must be expected when the correlation length ξ has grown to a size 30    comparable to L (Kim, Souza and Landau)[36]. Therefore, the results obtained by the Monte Carlo simulations depend sensitively on the numbers of Monte Carlo steps, the linear dimension L of the simulation box with periodic boundary conditions, the nonequilibrium relaxation time of the system and the quality of the random – number generators, etc. Finally, as the correlation length diverges at the critical point of a second order phase transition, physical fluctuations in the magnetization become very large[37]. These fluctuations cannot be entirely suppressed by the importance sampling and averaging over reasonable number of MC steps. In order to obtain a reliable average, MC simulations should be run for an inordinately long time (Butera and Comi)[37], and to reach the exact solution, unfortunately, for infinite time. Most importantly, any approximate method cannot provide exact information at or near the critical point, since whenever the thermodynamic functions have an essential singularity it is difficult to perform any computation by successive approximation because the convergence of approximation in such cases is notoriously slow (Onsager) [38]. 1.4.3 Application of MC Simulation Nowadays, calculations of the Ising models have been performed on lattice sizes of L = 256 ~ 5888[39]. Up to date, most of 3D simulations with the lattice sizes lager than L=4800 are short runs, which can not produce well – equilibrated configurations at the critical point (Stauffer and Knecht)[39]. In a normal case, increasing the lattice sizes lowers the estimates of the critical point (Gupta and Tamayo)[40] (Binney, Dowrick, Fisher and Newman)[32], this would push the values of the critical temperatures collected in the Binder and Luijten’s review article towards the putative exact solution ( Binder and Luijten)[15]. For the 2D Ising model however, the MC simulations give much better results (Binder and Luijten) [15]. 31    1.5 Total Energy Calculation in the YLG model For the YLG system on a lattice with periodic boundary condition (PB) the energy is given by E= 1 2 ∑∑ n I ≠J QI QJ e−κ |R I −R J +nL| | R I − R J + nL | (1.5.1) where κ is the screening parameter, L is the size of the cubic super-cell lattice, the outer sum is taken over all integer vectors nœ(nx, ny, nz), QI is the charge at the I th site RI. Note that simulation box is cubic but the charges occupy the FCC lattice sites embedded in the cubic cell. Obviously, E reduces to that for a CLG when κ~ 0. Periodic boundary conditions are used to best simulate an infinite system. In principle, one should take larger and larger periodic cubic cells (size L) and then extrapolate L to infinity to extract results in the thermodynamic limit. To carry out simulations in systems with periodic boundary conditions (modeling an infinite systems) one uses Ewald’s method to carry out the summation over n in Eq. 1.5.1. This method will be discussed in the next section. However, if we are interested in seeing how charge ordering takes place in finite size systems such as PbTe nanoparticles doped with Ag and Sb, one has to carry out simulation without imposing periodic boundary conditions. 1.5.1 Ewald summation for Yukawa interactions In fact, the speed of direct calculation of (Eq. 1.5.1) is very slow. One can use famous Ewald summation methods to accelerate the calculation speed. The key for Ewald summation is to divide the calculation of (Eq. 1.5.1) into two parts; one calculation runs on Fourier space and other runs on real space. In this section, we will show how to obtain the equation of Ewald summation for the YLG model. 32    We denote V a cube of side L and assume periodic boundary conditions. The Yukawa potential in real space will be defined as the Green’s function ψ V (r) of the Helmoltz equation (∇2 − κ 2 )ψ V (r) = −4π δV (r) (1.5.1.1) where δV (r) = ∑δ (r + nL) = ∑{[δ (r + nL) − λρ (r + nL)] + λρ (r + nL)} n n ⎛α2 where ρ (r ) = ⎜ ⎜ π ⎝ ⎞ ⎟ ⎟ ⎠ 2/3 2 exp( −α 2 r ) is a normalized Gaussian charge density and the vector n has three components (nx, ny, nz) along the three cubic axes, suppose that ψ V (r ) = ψ D (r ) + ψ F (r ) and (∇2 − κ 2 )ψ D (r) = −4π ∑{[δ (r + nL) − λρ (r + nL)]} (1.5.1.2) ∑{[λρ (r + nL)]} (1.5.1.3) n (∇2 − κ 2 )ψ F (r) = −4π n For eq. (1.5.1.3), we already have solution, ψ F (r ) = 4πλ 3 L where k = − e |k|2 4α 2 ∑ k 2 + κ 2 eik·r (1.5.1.4) k 2π n , is a reciprocal lattice vector associated with the cubic super cell of side L. L Suppose ψ D (r ) = ∑[φ1(r + nL) − λφ2 (r + nL)] (1.5.1.5) n then we have two equations 33    (∇ 2 − κ 2 )φ1 (r + nL) = −4πδ (r + nL) (1.5.1.6) (∇ 2 − κ 2 )φ2 (r + nL) = −4πρ (r + nL) (1.5.1.7) For equation (1.5.1.6), we have solutions that satisfy e −κ |r +nL| φ1 (r + nL ) ~ | r + nL | For equation (1.5.1.7), replace φ2 (r + n L ) by φ (r ) , ρ ( r + n L ) by ρ (r ) and consider the following processes ⎛α2 ρ (r ) = ⎜ ⎜ π ⎝ ⎞ ⎟ ⎟ ⎠ 2/3 2 exp( −α 2 r ) e −κ |r '−r| ⎛ α 2 ⎞ =⎜ φ (r ) = d r ′ρ (r ′) ⎟ | r '− r | ⎜ π ⎟ ⎝ ⎠ ∫ 3/2 3 ∫ d 3r′e−α 2 r' 2 e −κ |r '−r| | r '− r | Defining τ = r' - r , τ = τ and r = r , we get, 3/2 ⎛α2 ⎞ φ (r ) = ⎜ ⎟ ⎜ π ⎟ ⎝ ⎠ ∫ 2 2 2 e d 3τ e−α (τ + r + 2 τ⋅r ) 3/2 2π π ∞ ⎛α2 ⎞ =⎜ ⎜ π ⎟ ⎟ ⎝ ⎠ = eκ 2 /4α 2 2r ∫ ∫∫ −κτ τ 2 2 2 e−α (τ + r +2τ r cosθ )−κτ τ dτ sin θ dθ dϕ (1.5.1.8) 0 00 ⎡ κr ⎛ ⎤ κ ⎞ −κ r ⎛ κ ⎞ ⎢e erf ⎜ α r + 2α ⎟ + e erf ⎜ α r − 2α ⎟ − 2sinh(κ r ) ⎥ ⎝ ⎠ ⎝ ⎠ ⎣ ⎦ 2 2 2 erf (α | r′ − r |) 1 → α , choosing = eκ /4α and replacing φ (r ) by φ2 (r + n L ) , | r′ − r | λ r′→r π Using lim then the equation (1.5.1.5 ) becomes 34    ψ D (r + nL) = ∑ n ⎛ e −κ r +nL 1 ⎛ κ r + nL κ ⎞ ⎛ ⎜ erf ⎜ α r + nL + − ⎜e ⎟ ⎜ r + nL 2r ⎝ 2α ⎠ ⎝ ⎝ +e = ∑ n −κ r +nL κ ⎞ ⎞⎞ ⎛ erf ⎜ α r + nL − ⎟ − 2sinh(κ r + nL ) ⎟ ⎟ 2α ⎠ ⎝ ⎠⎠ r nL ⎛ eκ r +nL κ ⎞ e−κ + κ ⎞⎞ ⎛ ⎛ ⎜ erfc ⎜ α r + nL + erfc ⎜ α r + nL − + ⎟ ⎟⎟ ⎜ 2 r + nL 2α ⎠ 2r 2α ⎠ ⎟ ⎝ ⎝ ⎝ ⎠ (1.5.1.9) Usually, for a particle system with periodic boundary conditions, Ewald summation rewrites the interaction as a sum of two terms, the short-range term that sums quickly in real space and the long-range term that sums quickly in the Fourier space. Φ (r ) = Φ Fourier (r ) + Φ Real (r ) where, 4π Φ Fourier (r ) = Ω ∑ ∑ QI I G ≠0 2 2 2 e − (|G| +κ )/4α eiG⋅(r − R I ) | G |2 +κ 2 (1.5.1.10) 3 where (Ω=L ) is the volume of the super lattice cell. Φ Real (r ) = ∑ I ⎡ κ ⎞ κ |r −R I | κ ⎞ −κ |r −R I | ⎤ ⎛ ⎛ + erfc ⎜ α | r − R I | − ⎟e ⎢erfc ⎜ α | r − R I | + 2α ⎟ e ⎥ 2α ⎠ ⎝ ⎠ ⎝ ⎣ ⎦ QI 2 | r − RI | (1.5.1.11) In the final expression for the energy of a super-lattice cell, we need to exclude the contribution to the energy of the true Yukawa potential due to I evaluated at RI, the so called 35    self-term to avoid infinites. In fact, this arises from the term exp( −κ r − R I ) part of the real r − RI space summations, which must be dropped. Then we have: ⎡ ΦSelf (r ) = lim ⎢Φ Real (r ) − r →R I ⎢ ⎣ ⎧ ∑ QI I e −κ |r −R I | ⎤ ⎥ | r − RI | ⎥ ⎦ 1 ∑ QI ⎨− 2 | r − R I | r →R I ⎩ = lim I ⎡ ⎢erf ⎣ + κ ⎛ ⎜α | r − RI | + 2α ⎝ ⎞ κ |r −R I | + erf ⎟e ⎠ κ ⎛ ⎜α | r − RI | − 2α ⎝ ⎞ −κ |r −R I | ⎤ ⎟e ⎥ ⎠ ⎦ ⎫ 1 sinh (κ | r − R I |) ⎬ | r − RI | ⎭ Using L’Hospital Rule, we get ⎡ ⎤ κ2 − ⎢ 2α 2 ⎛ κ ⎞⎥ e 4α + κ erfc ⎜ ΦSelf (r ) = ⎢ − ⎟⎥ π ⎝ 2α ⎠ ⎥ ⎢ ⎣ ⎦ ∑ QI (1.5.1.12) I Obviously, the self part is independent of r. For the charge neutral Yukawa Lattice Gas on a FCC Lattice System with PB conditions, based on above discussion, we have the total Yukawa energy in a the super lattice system given by 36    EEwald = EFourier + ESelf + EReal 2π = Ω ∑ ∑ QIQJ 2 2 2 e−(|G| +κ )/4α eiG.( RI -R J ) I,J G ≠0 | G |2 +κ 2 ⎡ ⎤ κ2 − ⎢ α 2 κ ⎛ κ ⎞⎥ + ⎢− e 4α + erfc ⎜ ⎟⎥ 2 π ⎝ 2α ⎠ ⎥ ⎢ ⎣ ⎦ + 1 2 ∑ QI2 I QQ ∑∑ 2 | RI - IR JJ + nL | ⋅ n I≠ J κ ⎞ κ |R I -R J +nL| ⎡ ⎛ ⎢erfc ⎜ α | R I - R J + nL | + 2α ⎟ e ⎝ ⎠ ⎣ κ ⎞ −κ |R I -R J +nL| ⎤ ⎛ +erfc ⎜ α | R I - R J + nL | − ⎟e ⎥ 2α ⎠ ⎝ ⎦ (1.5.1.13) This is the main equation used in our MC simulation. The choice of the Ewald summation parameter, α and G, depends on the lattice system. We will discuss it next. 1.5.2 Ewald Summation Parameters There are several parameters that control the convergence of Ewald summation for the YLG model. It is useful to discuss the choice of these parameters in the CLG model because the range of interaction is larger for the lattice model. If for certain choice of parameters the CLG model converges then it is obvious that for the same choice of parameters the YLG model will also converge. Independent of the model considered the choice of Ewald parameters should be based on several considerations: 1) |nmax| is an integer which defines the range of the real-space sum and controls its maximum number of vectors. 37    2) Similarly |Gmax| is an integer defining the summation range in the reciprocal-space and its number of vectors. 3) α, appearing in the Gaussian function, is the Ewald convergence parameter, which determines the relative rate of convergence between the real and reciprocal sums. Note that a large value of α, i.e. a narrow Gaussian distribution, makes the real-space sum converge faster. This means that a small number of n-vectors (i.e. |nmax| is small) is sufficient for a rapid convergence. On the other hand, a small α causes the reciprocalspace sum to converge faster, i.e. a small |Gmax| will suffice. 4) For the calculation in the real-space sum, we introduce a cut-off radius (Rcutoff) which is the size in real-space we accept for a properly relative error ∆ in the calculation. Usually, the calculation of the cut-off radius will be determined by the supper-cell size L. The larger of the super-cell size and the cut-off radius we choose, the more accurate results we get. For the calculation of Columbic interaction potential with periodic boundary(PB) condition, Toukmaji and Board[41] found that the reciprocal sum was calculated more efficiently than the real sum, hence α is generally chosen to minimize the real sum and thus it dictates the value of truncation number in the Fourier space sums. The choice of the Ewald sum parameters for Columbic interaction with PB is system dependent and is subject to trade-offs between accuracy and speed which in turn is influenced by the algorithmic implementation. For a relative error ∆ that one accepts Perram and Petersen[42] have suggested using α ~ 2 − ln Δ or α~3.5/ Rcutoff . 38    1.5.3 Fast Calculation Method for Ewald Summation As discussed in the last section, for the YLG model on a FCC lattice, the total energy is the sum (Ewald sum) of three parts, namely, the real (direct) space sum, the reciprocal (imaginary, or Fourier) sum, and the constant term, known as the self-term. If we fix |nmax| and |Gmax|, the calculation costs are the following: 3 (1) the reciprocal (imaginary, or Fourier) sum: O(|nmax||Gmax| ) (2) the self-term sum: O(|nmax|) (3) the real (direct) space sum: O(|nmax| ) 2 From above cost estimation, we know, for a large lattice system with huge numbers of ions, the real (direct) space sum will take most CPU time. As we know, the error function erfc(x) ~ 0.0 when the argument x > 2.5, so, for the YLG model the real (direct) space sum becomes a sum of Columbic potential in a small finite domain depended on a fixed cutoff radius, Rcutoff. Meanwhile, if we consider the expansion of erfc(x) erfc( x) = 2⎛ 1 3 e− x ⎜1 − 2 + 4 + π 4x ⎝ 2x 1 ⎞ ⎟ ⎠ (1.5.3.1) the error function erfc(x) ~ 0.0000 when x > 2.5. For a relative error ∆ that we accept, we use (following Perram and Petersen[42]) α ~ 2 − ln Δ and α~3.5/ Rcutoff . Based on the above analysis, we can use the criterion | RI − RJ + nL | κ Rcut-off 2.5 − > Rcut-off 2 3.5 (1.5.3.2) to decide if the calculation can be stopped. 39    We don’t need to carry out total energy calculation directly for every MC because we will have an optimized method to get the energy difference between two different configurations if we just randomly switch two particles to get a new configuration of the system. When we interchange the position (or charge) of two particles, the energy change in the real (direct) space sum will be: N ΔEReal = ∑ i ≠ m ,n ( qn − qm ) qi erfc(α Rmi ) + Rmi N ∑ ( qm − qn ) q j j ≠ m ,n Rnj erfc(α Rnj ) (1.5.3.3) There is no contribution to the change in energy from the self parts (See Eq. 1.5.1.13). Because the calculation cost for the Fourier space part is almost O(|nmax|), we can calculate the energy of the Fourier part (Efourier) directly for a new configuration of system and hence obtain the energy difference ( ΔEFourier ) between old configuration and new configuration. Therefore the change of total energy in a single MC move is given by Δ E = Δ EFourier + ΔEReal . 1.5.4 Evaluation of Madelung Constant – a Check In order to test our code for computing the energy for the YLG (and CLG) system we first calculate the well known Madelung constant for the NaCl lattice. This corresponds to our case x=1 (see eq. 1.3.2.4), which corresponds to no neutral sites (i.e., all sites have a charge, positive or negative). As we know, for NaCl, in the CLG model, the Madelung constant M can be obtain by using ECLG = − Nq 2 M 4πε 0 R0 (1.5.4.1) 40    where any site is charged positive (q) or negative (-q), ECLG is the total potential energy of the lattice, N is the total number of charges and R0 is the nearest-neighbor distance. Madelung  [43] calculated this constant exactly by using ∞ M= ∑ (−1) j + k +l 2 2 2 1/2 j ,k ,l =−∞ ( j + k + l ) ≈ −1.7475 (1.5.4.2) where j = k = l = 0 is to be left out. In our code testing, we use some of the ideas of choosing the parameters appearing in the Ewald sum (See Section 1.5.2 – 1.5.3) to accelerate the calculation and to reduce the calculation cost. The following results were obtained based on our simulations for the NaCl lattice when we chose α = 0.8 and |Gmax| = (5, 10), 1) For 10x10x10 lattice, M = -1.7476, -1.7476 2) For 20x20x20 lattice, M = (-1.7479, -1.7476) These results agree very well with Madelung’s result. We will use these parameters in our MC simulations. 41    CHAPTER 2 YLG Model on a FCC Lattice 2.1 Introduction As we have discussed in Chapter 1, the YLG model for the LAST-m system maps onto an anti-ferromagnetic spin-1 (3-state) Ising model. By changing the Yukawa coupling constant κ one can go from the long range CLG model to a short range model where only the nearest neighbor interactions dominate. Anti-ferromagnetic interaction between Ising spins on a FCC lattice has both geometrical and competing interaction induced frustration. Also when the concentration x → 1 , the 3-state Ising model goes to 2-state Ising model. Frustrated 2-state Ising systems with short range interactions have been known to have unusual properties such as large ground state degeneracy, large fluctuations and successive phase transitions as a function of temperature with complicated structures[44] [45] [46]. A simple model to understand these frustration dominated features is one where only the exchange interaction between nearest and next nearest neighbor spins are taken into account. The Hamiltonian is given by H = J ∑σ iσ j − α J ∑σ iσ j + nn nnn − h∑σ i (2.1.1) where ( J > 0) is the exchange constant and (σ i = ±1) . The nearest neighbor (nn) interaction is anti-ferromagnetic and leads to “geometric frustration” because of the FCC lattice (see Fig.2.1.1a). The next nearest neighbor (nnn) interaction can be changed from ferromagnetic to anti-ferromagnetic by changing the sign of α. For the YLG system, α<0. The external field h controls the average magnetization (or the average charge for the CLG or YLG). For the neutral ionic lattice gas system in which we are interested, the total charge (also magnetization) is zero and therefore we will consider the case h=0. There is a major difference between this 2 state model and our 3 state (spin-1) model. As mentioned before, the latter goes over to the 2-state 42    Ising model for x=1, which corresponds to the system AgSbTe2. This is the limit when the effect of frustration is maximum. When x is less than 1, some of the sites have no spins (or ions). This can help in reducing the effect of frustration by suitably choosing the neutral or σi = 0 sites as the nearest neighbors of charged σi = +1,-1. The ground state of Eq. (2.1.1) was first studied by Cahn and Allen  [45]. For −0.5 < α < 0 , they found that the GS was 6-fold degenerate with the energy E0 = −2 − α . (To describe the structure we will use the magnetic or spin description. For the case of ions, one has to replace the spins by corresponding charged ions.) The structure is called type-III anti-ferromagnet and is shown in Fig. 2.1.1a. For α < −0.5 , there are two types of ground state structures called type II (Fig. 2.1.1c) and IIb (Fig. 2.1.1b) anti-ferromagentic structures with energy E0 = 3α and degeneracy 8. For α = 0 and α = -0.5 where the GS structure changes from one type to the other, L the GS degeneracy is macroscopic (~b , b>1. L is the linear dimension of the system). Slawny [46] later showed that although type II and IIb structures were degenerate for α<-0.5, the type-IIb structure (Fig. 2.1.1b) dominated at low temperatures. As we see, some of these GS structures caused by frustration are similar to what was found for the Heisenberg model in 3-dimensions with competing AF interactions (See section 1.2.2). The finite temperature properties of this model was studied in detail by Phani et al,[44]. Their Monte Carlo (MC) simulations showed that there was a 1 st order phase transition from an ordered to disordered state for -1≤ α ≤ 0, whereas for α ≤ -1, the transition appeared continuous. 43    (a) (b) (c) (Fig. 2.1.1 One cube of the FCC lattice showing the three possible configurations (a) the type-III structure, (b) the type-IIb structure and (c) the type-II structure) 44    As far as we are aware the effect of introducing neutral sites on these properties i.e. looking at the 3-state Ising model on a FCC lattice with only nn and nnn AF interaction has not been done. (Fig 2.1.2 Concentration (x) versus Temperature (T), phase diagram constructed from the loci of the heat capacity maxima [29] for the CLG model.) Our interest here is to see how the introduction of longer range interaction such as the YLG, changes the phase transition (its order and the number of transitions, ground state degeneracy etc) for all values of x. By increasing the strength of the Yukawa coupling we can make some contact with these earlier nn, nnn model calculations. The initial attempt to address some of these issues for long range interaction was made by Desai, Hoang, Mahanti (DHM) [29] who carried out limited MC study of the CLG model for the entire range 0 < x ≤1. The phase diagram obtained by DHM is shown in the Fig.2.1.2. 45    st For the CLG, as a function of T, they found one (1 order) phase transition for x ≤ 0.5 and st two phase transitions (one at lower T which is 1 order and the other at higher T which can be st either 1 order or continuous depending on the value of x) for x > 0.5. The critical temperature st Tc for the lower T 1 order transition was found to be a weak function of x except for very low and very high concentrations (see Fig. 2.1.2). Tc for this transition is 0 for x = 0 and x = 1. The higher T transition for x > 0.5 is strongly x-dependent. For x = 1, this transition was found to be st 1 order, in agreement with the results of Phani et al. [44], HDM [29] found that generic low T structures of the CLG model consisted of anti-ferromagnetic layers (sheets) (σi =+1,-1) separated by layers with σi=0. In the charge picture this corresponds to layers with ordered positive and negative charges (corresponding to sheets of AgSbTe2 for the actual physical system), separated by neutral layers (corresponding to Pb2Te2 layers). In addition to these layered structures, they also discovered a tubular structures for x=0.5, shown in Fig.2.1.3. We will discuss about this tubular structure in more detail later. In the present work we generalize the result of DHM [29] to the YLG model. We will mainly focus on the region x ≤ 0.75 and not discuss results for 0.75< x ≤1 because of computational difficulties we faced. In DHM [29] paper, they did not elaborate on the differences in the structural properties in regions referred to as “solid”, “liquid” and “gas” for x > 0.5. We will discuss the differences between these regions for x=0.75 and also point out the differences in the nature of melting between “solid” and “liquid” and between “liquid” and “gas” regions. 46    (Figure 2.1.3 Tubular Structure [29]. Connected balls are for Ag/Sb, unconnected balls are for Pb; Te sub-lattice is not shown. Checker board pattern formed by AgSbTe2 and Pb2Te2.) First, we would like to give a brief summary of the MC simulation method. For each MC step, we pick a pair of sites randomly inside a super-cell of size L × L × L and then carry out the standard Metropolis algorithm. For each sweep, the change in energy in going from an initial state (Ei) to a final state (Ef) is computed. We accept the new state (Ef) if Ef < Ei. If Ef > Ei then we accept the state (Ef) if exp( −Δ E / k B T ) > r , where r is a random number in the interval [0,1], ΔE =Ef - Ei is the energy difference, and T is the temperature. The results we present in this thesis (if not specifically mentioned) were calculated with 3 system size L=8 (i.e., 8 simple cubic FCC cells in one direction, 4x8 FCC lattices sites in total) 47    8 with periodic boundaries. To achieve thermal equilibrium at each temperature, at least 2.048x10 6 MC sweeps were simulated followed by another 10 sweeps to calculate the averages of physical quantities like energy, heat capacity (energy fluctuation) etc. In the present simulation the unit is chosen such that Boltzmann constant kB =1, the charge QI=0, ±1, and the distance between the nearest neighbors = (0.5) 1/2 . The error (SN) in the present calculation comes from the standard 6 deviation of the averages from the final 10 MC sweeps: SN = 1 N N ∑ ( Ei − E )2 (2.1.1) i =1 By the way, in the rest of this thesis, energy calculation is based on E= e2 2ε a J ∑ QiQ j f ( Ri − R j ) ≡ 2 ∑σ iσ j f (Ri − R j ) (2.1.2) where ε is the static dielectric constant, a is lattice constant, Ri and Qi are, respectively, the position and charge. 2 The unit of energy in our calculations is J = e / ε a where a defines a characteristic length (unit cell length of the FCC lattice) and J / kB defines the unit of temperature. 2.2 Study of YLG Model on a FCC Lattice For the YLG model, we analyzed the ground state structure and calculated different physical quantities as a function of T and x and the screening parameter κ. Heat capacity can be calculated in two different ways. One, is to look at the T derivative of average energy and the other 48    using the fluctuation dissipation theorem for a system in Canonical ensemble CV = E2 − E k BT 2 2 . We also looked at different structures (charge ordering) as we changed T for different x values. 2.2.1 First Order Phase Transition, Heat Capacity and Ground State Structure (Fig. 2.2.1.1 Average total energy as a function of T for x=0.25; both heating and cooling runs are shown. Also shown are the charge ordered layered structures and how they melt when heated) Figure 2.2.1.1 gives a typical result obtained in our MC simulation. It gives as a function of T for κ = 0 (CLG model) and x=0.25, both for the cooling and the heating runs. We 49    st clearly see a 1 order transition in agreement with earlier simulation results of DHM [29]. There st is a hysteresis associated with this transition confirming its 1 order nature. Below the melting temperature, charges are confined within layers because the thermal energy given to the charges are less than the binding energy. Except for the lower concentration x << 0.5, our MC simulations (slow cooling from high T) shows that the ordered layered structures (2-d sheets) dominate the ground state structure for all values of the Yukawa screening parameter κ. This is consistent with earlier MC results of DHM for κ=0. This can be understood using the following simple argument. If we only consider the nearest neighbor interaction between the spins (or charged particles), we easily see that the ordered layered structure is nothing but an anti-ferromagnetic Ising model on a square lattice without any frustration in the nn bonds. However the nnn bonds are frustrated but the strength of these nnn bonds are 1 2 of the nn bonds and the number of nn and nnn bonds are the same. As a result the ground state is the AF or (+,-) charge ordered state. For x < 0.5, the charged sheets are sandwiched between neutral sheets. Since the neighboring layers have no charge there is no frustration in the nn bonds. Physically, when x < 0.5 the system removes frustration and lowers its energy by sandwiching neutral sheets between charge ordered layers with total charge zero. There is a small attractive interaction between these charged ordered layers coming from the nnn interaction if a positive charge from one layer is just above the negative charge of the other. Above the transition temperature Tc, the layered structure suddenly breaks up, charges leave st their original layer and move into the neighboring neutral layers. One can understand the 1 order nature of the transition qualitatively using the following argument. The binding energy of 50    the charges is sufficiently strong so that a charged particle cannot escape from its layer for T < Tc. Above Tc, a charge moves into the neighboring layer by creating a local defect. At Tc many such defects are created thereby precipitating a 1st order phase transition. (Figure 2.2.1.2 Heat capacity and Phase transition in the 2D Ising model. The inset gives the T dependence of the average energy ) Now let us look at the heat capacity associated with the above transition. We know that the heat capacity is related to the fluctuation in energy through fluctuation dissipation (FD) theorem, given by 51    E2 − E 2 = ( E − E) 2 = k BT 2 ∂ E = k BT 2CV ∂T (2.2.1.1) To find the mean square fluctuation of energy, we did a series MC simulation where we recorded the average energy and its standard deviation for every temperature. Before carrying out the calculations for the 3D YLG model we checked our calculations for the well known 2D nn Ising model. A 200x200 square lattice system with periodic boundary condition was 6 considered. To achieve thermal equilibrium, we simulated 10 MC sweeps followed by another 5 10 sweeps to get the average energy. In figure 2.2.1.2, we show the average energy and the heat capacity calculated using both the derivative of with respect to T, and the FD theorem. We found a phase transition at T= 2.25 (here we choose the unit k B = 1 ) and the results of two ways of calculating the heat capacity are in close agreement. The T dependence of and CV show nd the well known 2 order phase transition with a logarithmic heat capacity. To check the simulations further we analyzed the change in entropy associated with this transition. We know that the entropy/spin S → ln Ω , where Ω is the number of accessible states. At very low temperature Ω=1 and at very high temperature Ω=2; so the change in entropy/spin in going from infinite to zero T will be T =∞ ∫ T =0 CV dT = T T =∞ ∫ dS = ln 2 − ln1 ≈ 0.693 . (2.2.1.2) T =0 T =∞ Our simulation results using either of the two methods gave ∫ T =0 with the result of Eq. 2.2.1.2. 52    dS ≈ 0.7 , in good agreement Once we were convinced that our simulation time steps and the methods were adequate to reproduce rather accurately the well known 2D Ising model results, we proceeded to study the phase transition in the YLG model. The heat capacity results for κ=0 and x=0.25 are shown in Fig. 2.2.1.3. The ( vs. T) st curve strongly suggests a 1 order transition. (Figure 2.2.1.3 Heat capacity and Phase transition in 3D YLG model on FCC lattice) The value of heat capacity will be different because of different process and measurement methods. To understand the difference, let us look at the figure (Fig. 2.2.1.3). In our simulation, 53    because there is hysteresis associated with this transition, we can observe two transition points. In simulation, we also observed a transition temperature associated with the peak in heat capacity (CV) obtained using the FD theorem, which is located between freezing and melting points. In the rest of thesis, the critical temperature used in this thesis is obtained based on the FD theorem. 2.2.2 Effect of Screening Parameter (κ) and Concentration on the Phase Transition (Fig. 2.2.2.1 Melting (phase) transition of YLG with κ=0 for different concentrations x.) 54    One of our main motivations for our MC simulation is to see what is the effect on the transition when we change interaction range by tuning the screening parameter κ. In our simulations, 1/a defines a characteristic unit of κ. Based on our investigation, we found that for the CLG model[29], the critical temperature is a weak function of concentration. Then we have a question, does this result work for the YLG? We run MC simulation to answer this. A. Critical Temperature, Change in Entropy across the Transition for Different x, κ=0 In order to understand the nature of melting and freezing transitions starting from the low T layered structure, we slowly heated the system from a low T ordered structure (which was obtained from earlier cooling runs) until a very high T and then gradually cooled it back to low T. In Fig. 2.2.2.1 we give the heating results for κ=0 at four different concentrations. The first order melting transition can be easily seen. For x ≤ 0.5, the transitions occur almost at the same temperature. For x > 0.5, the Tc appears to be slightly higher. Also for x > 0.5, a second order (continuous) transition is seen at a higher temperature, in agreement with the previous work of DHM [29]. For the first order phase transitions in CLG model, the energy change ΔE at the critical temperature depends on the concentration x. Since at a 1st order transition, Δ E ∝ T Δ S , the entropy change ΔS at the transition also depends on x. For smaller x, the entropy change is larger, due to the larger accessible phase space. We know that CLG model corresponds to a long range interaction (κ→0) model, so if we consider the limit of short range interaction with κ→∞ which means there is no interaction between charged particles, there should be no transition. Now let us see how this 1st order melting transition changes when we change the range of interaction by increasing the value of the screening parameter κ. 55    B. Melting Transition in the YLG Model, κ Dependence for a Fixed x To study the phase transition in the YLG model and to see how the transition depends on the screening parameter κ for a fixed concentration, we have done a series MC simulation for different values of concentration x. In Fig. 2.2.2.2, we show the vs. T for x=0.25 and several values of 0 < κ < 6.4. A value of κ = 6.4 corresponds to strong screening; for this value of κ the ratio of nnn to nn coupling strength is ~0.1. Note that for κ = 0 this ratio is 1 2 = 0.707 . (Fig. 2.2.2.2 Melting process of the YLG model for different κ and fixed x=0.25) 56    (Fig. 2.2.2.3 Melting process of the YLG model for different κ and fixed x=0.375. For a small κ, st phase transition is the 1 order. Energy Change at Tc decreases with κ.) Now let us look at Fig. 2.2.2.2 (x = 0.25), Fig. 2.2.2.3(x = 0.375) and Fig. 2.2.2.4 (x = 0.5). st We can find that the GS energy rises when screening parameter κ increases and for the 1 order transition, the critical temperature Tc decreases when κ increases. From our simulation results, we also find that, for a fixed x, the relationships, GS energy vs. screening parameter κ and critical temperature Tc vs. κ, are not liner. (We will discuss this in detail in the section 2.2.4) .We 57    st find that there is a 1 order melting transition in the YLG model for small κ, but the transition appears to become continuous for large values of κ. (Fig. 2.2.2.4 Melting process of the YLG model for different κ and fixed x=0.5. For a small κ, st phase transition is the 1 order. Energy Change at Tc decreases with κ) 2.2.3 Critical Temperature vs. Change in Energy and Entropy at the 1 Transition 58    st Order Phase st To understand the change in entropy driving the 1 order melting phase transition, we have examined the dependence of the critical temperature (Tc) and the energy change per charged particle at the transition (ΔE) on x (See Fig 2.2.3.1) ( Fig. 2.2.3.1 Energy change at Tc vs. Tc when the screening parameter κ is changed from 0.0~6.4 for different values of x) 59    In Figure 2.2.3.1, for a given x, we plot ΔE as a function of Tc when the screening parameter κ is changed from 0.0~6.4 which in turn changes Tc. We can see that ΔE varies linearly with Tc. st This relationship implies that for a fixed x the entropy change at 1 order transition remains constant, independent of Tc and hence κ. ΔE Tc = ΔS x (2.2.3.1) x The change in entropy at the melting transition is independent of the range of the interaction between the charge particles. It does however depend on the concentration of the charge particles. Thus we have ∂ (ΔS ) x ∂κ =0 (2.2.3.2) The above results suggest that Tc can be approximated as a product of two independent functions and so can the energy change, ΔE at Tc, i.e., Tc ∝ fTc ( x ) g (κ ) (2.2.3.3) ΔE ∝ f E ( x, Tc ) g (κ ) where fTc ( x ) and f E ( x , Tc ) are functions of concentration and g (κ ) is a function of κ only. 2.2.4 Critical Temperature and Screening Parameter κ We found that Tc is a weak function of x for different values of κ, in agreement with the previous work of DHM[29] for κ=0 (CLG model, see Fig. 2.1.2). We have also investigated the 60    effect of κ on the melting transition. We present the simulation result showing the dependence of Critical Temperature Tc as a function of κ for x=0.25, 0.375, 0.5, and 0.75 in Figure 2.2.4.1. 0.1 0.08 0.06 0.04 0.02 0 0 1 2 3 4 5 6 Screen Parameter κ (Fig. 2.2.4.1 The numerical results of Screen Parameter vs. Criticle Temperature) Further analysis indicated that when κ is close to 0.0, Tc or g (κ ) (See Eq. 2.2.3.3) is almost a constant and when κ becomes large, Tc or g (κ ) is close to 0.0 and decreases exponentially with κ. The equation for Tc we have obtained to fit the simulation results is ⎡ 1 κ ⎤ Tc ∝ f ( x ) ⎢ 2 + ⎥ e−κ Rc Rc ⎥ ⎢ Rc ⎣ ⎦ (2.2.4.1) 61    where f(x) is a weak function of x and Rc is a measure of the average distance between the charges. In Eq. 2.2.4.1, we can see: when κ→0, Tc ∝ f ( x ) 1 Rc 2 , that means critical temperature depends ( only on concentration for CLG model; when κ is large enough, Tc → f ( x ) κ Rc eκ Rc ) −1 . Critical Temperature 0.10 0.08 x=0.250 x=0.375 0.06 0.04 0.02 0.00 0 1 2 3 4 5 6 Screen Parameter κ (Fig. 2.2.4.2 Simulation results for Tc vs. κ for two different x (x=0.250 and x=0.375). The continuous curve is Eq. 2.2.4.1) Form Fig.2.2.4.2, we can see that the fitting equation is extremely good, except for very low and very high concentrations. Eq. 2.2.4.1 is useful for predicting the values of critical temperature for a YLG on a FCC lattice. 62    2.2.5 Unusual Degeneracy for x =0.5 In this section, we will show another important property of the YLG on a FCC lattice that we have discovered. In our simulations, we usually cool the system slowly (small change in the temperature and long equilibration times) to find different ground state and low T configurations. For each cooling process, we start from a random (disorder) configuration as the initial state, corresponding to very high T. For the YLG model, our simulations showed that the ordered layered structure dominated the ground state structures for x ≤ 0.5. (Fig. 2.2.5.1 Layered Structure) (Fig. 2.2.5.2 Tubular Structure) But for x = 0.5, we found that there are two ordered ground state structures with exactly the same energy, one layered (Fig.2.2.5.1) and the tubular (Fig.2.2.5.2), but with completely different charge orderings. In fact the tubular structure was already seen in the CLG simulation studies of DHM[23]. But we found that these two widely different structures have the same energy for all values of κ studied. That is this is an exact result for the YLG, at least for the values of κ we studied. We conjecture that this might be true for all 0 ≤ κ ≤ ∞. Concept of degeneracy is not new, since we know there should be degeneracy for competing interaction models on a FCC lattice because of the frustration (see our discussions in Chapter 1). 63    The interesting thing is that the degeneracy we have found is not macroscopic but it is independent of the nature of the interaction (long range or short range), rather it comes from unusual topologies of the two ordered structures. The interesting question is how to understand how these two totally different structures lead to the same energy, independent of the range of the interaction between the charges. Layer Tubular Index of  Distance Neighbor Total  #(+) #(‐) #(0) Total  #(+) #(‐) #(0) Charge Charge 1 0.70711 0 4 8 ‐4 1 5 6 ‐4 2 1 4 2 0 2 2 0 4 2 3 1.22475 8 0 16 8 10 2 12 8 4 1.41422 4 8 0 ‐4 0 4 8 ‐4 …… …… …… …… …… …… …… …… 599 18.0416 64 0 128 64 80 16 96 64 600 18.0555 88 176 0 ‐88 0 88 176 ‐88 601 18.0693 0 72 144 ‐72 18 90 108 ‐72 …… …… …… …… …… …… …… 1145 24.9700 80 0 160 80 100 20 120 80 1146 24.9900 0 160 320 ‐160 40 200 240 ‐160 (Table 2.2.5.1, Topological equivalence (charge weighted connectivity) of the layered and tubular structures for x=0.5. After analyzing the structure up to 1146th neighbor in detail, we found that the two structures have exactly the same energy in the YLG model.) 64    To understand this unusual result further, we wrote a special computer code to trace the difference between these two structures and discovered something very interesting. Let us look at the charge distribution shown in the Table 2.2.5.1. If we start from a lattice site (chosen as the origin) with a positive ion we find that for any neighbor shell (all lattice points equidistant from the origin) the total charge is identical for the two different structures. For example look at the th 599 neighbor shell. There are 192 lattice sites in this shell. In the “layered structure”, there are 64 positive charges, 0 negative charges, and 128 neutrals, giving a total charge of +64 in this shell. On the other hand, in the tubular structure, out of the same 192 lattice sites, there are 80 positive charges, 16 negative charges, and 96 neutrals, again giving the total charge in the shell as +64. As a result the energy of the positive charge at the origin has exactly the same energy in both the structures. This is true for all the shells at different distances. Similarly, starting from any negative charge at the origin we have the same conclusion. This observation explains why the degeneracy between the two structures is true for any κ. So our observation of this unusual symmetry of the charge ordering implies that for these two different structures, if the pair-wise interaction depends only on the distance between two charges, the energies of two structures are going to be exactly the same. 2.3 Nature of Two Phase Transitions for x > 0.5 In the study of the CLG model on a FCC lattice by DHM[23] (See Fig. 2.1.2) it was found that for 0.5< x <1, there were two phase transitions. At x = 0.5, the two transition temperatures merged and at x = 1, only one, the high temperature one, remained. DHM designated the three phases seen for x > 0.5 as lattice “solid”, “liquid” and “gas”. The precise natures of these three phases were not explored in detail in these earlier studies. 65    (Fig 2.3.1 Total energy of super-lattice and heat capacity versus temperature for κ = 0.4 (YLG) at x = 0.75.) In this thesis, we have explored the nature of these three phases and the transition between them (“solid”-“liquid” and “liquid”-“gas”) in some detail. We have only explored one concentration, x = 0.75 and two values of κ = 0 and κ = 0.4. 66    (Fig 2.3.2 Melting process (“Solid”-“Liquid”) transition for κ=0 (CLG) at x=0.75. Before melting, “Solid”-structure (T=0.095) is built up by two parts. One part is of x =1 and other one is of x =0.5) 67    (Fig 2.3.3 “Liquid”-“Gas” transition for κ=0 (CLG) at x=0.75) 68    Look at Fig. 2.3.1, for κ = 0.4 at x = 0.75, phase transitions occur at T=0.02 and 0.06 which are 1 st nd order and 2 order transitions, respectively (See the figure “Heat Capacity vs. Temperature” in Fig. 2.3.1). (Fig 2.3.4 Melting process (“Solid”-“Liquid”) transition for κ=0.4 (CLG) at x=0.75. Before melting, “Solid”-structure (T=0.015) is built up by two parts. One part is of x =1 and other one is of x =0.5) 69    st With increasing κ, the 1 melting point (Tc) becomes lower and lower. This evidence can be st easily found in our series numerical simulations. Compare Fig. 2.3.2 with Fig. 2.3.4, for the 1 order transition at concentration x=0.75, we can see that, in going from κ =0.0 to κ =0.4, the critical temperature changes from Tc =0.10 to Tc =0.02 (Fig 2.3.5 “Liquid”-“Gas” transition for κ = 0.4 (YLG) at x = 0.75) 70    (Fig 2.3.6, the GS structure for x=0.75) The GS structure for the case x = 0.75 is shown in Fig. 2.3.6. We can see that the GS structure is made up of two types of structures, layered structure (x =0.5) and condensed layered structure (x =1.0). For melting process at x=0.75, our main observations are summarized below: st 1) 1 order transition occurs in the region (x =0.5) because bond energy between layers is less than in the region (x =1.0). (See Fig. 2.3.2 and Fig. 2.3.4). This melting process corresponds to “Solid”-“Liquid” transition. The long range order associates the solid state structure is lost at this transition. 2) If one keeps heating the lattice further the bond in the hybrid structures (the Structure for κ =0.0 at T =0.19 the structure κ =0.4 at T =0.06 are broken as second transition is found (See Fig. 2.3.3 and Fig. 2.3.5), which can be thought of a “liquid” to “gas” transition. 71    2.4 Nucleation of Nano “Tubular” and “Platelet” Structures st As discussed earlier, the 1 order phase transition temperature Tc from “solid” to “gas” for x ≤ 0.5 and from “solid” to “liquid” for x > 0.5 is nearly independent of x. In fact, the transition temperatures for the layered and the tubular structures for x = 0.5 are nearly same. Also we found that the average energy is almost T independent until one reaches Tc. This implies that as we heat the system, a critical thermal energy is needed to create local defects which then st proliferate leading to a 1 order melting (or evaporation) process. From the simulation of the cooling process, we found that initial size and distribution of charged clusters (or droplets) were quite similar for different x, and for two different structures with same x, even if our MC simulation started with different random configurations. In order to understand the distribution of different types of clusters seen in our simulation we use nucleation theory, discussed in the following section. 2.4.2 Nucleation Theory of Freezing - Basics According to the theory of freezing proposed by Ramakrishnan-Yussouff (RY)[47], the free energy of the system (either liquid or solid) is a functional of the density. When the liquid transforms to a solid two things happen. The density changes and the crystalline order develops. The density ρs of the solid can be written as ⎡ ρ s = ρl ⎢ (1 + η ) + ⎢ ⎣ ∑ G ⎤ μG eiG⋅r ⎥ (2.4.2.1) ⎥ ⎦ 72    where ρl is the density of the liquid, η is the fractional density change upon freezing, {μG} are the Fourier coefficients characterizing the periodic density of the solid, and {G} are the reciprocal-lattice vectors of the crystalline solid. To study the phase transition RY use the quantities {μG} and η as order parameters and expand the free energy difference between the liquid and the solid in terms of the quantity [ρs - ρl] using a density-functional theory. Later Oxtoby, Haymet and Harrowell[48-50] improved upon the RY theory by adding gradient corrections and their expression of the free energy difference is given by ⎡ 1 F (η , μ , T ) = dr ⎢ f (η , μ , T ) + K 0 (∇η ) 2 + 2 ⎢ ⎣ ∫ ∑ G ⎤ 3 2 K G g ⋅ ∇μG ⎥ 2 ⎥ ⎦ (2.4.2.2) where f is the bulk free energy density difference calculated by RY [47], K0 and {KG} are temperature dependent constants, and g is the unit vector in the G direction. The nucleation process consists of formation of droplets or clusters of the low T phase above the phase transition. These clusters form spontaneously as thermal fluctuations and if T is far above the transition temperature they decay rather quickly. But when T approaches Tc and goes below it (super-cooled metastable state) these droplets become more stable, form in large st numbers and percolate giving rise to a sudden (precipitous or 1 order) transition to the low T ordered structure. Consider a given volume of liquid at a temperature ∆T below Tc with a free energy G1. If some of the liquids transform to a small sphere (bulk) of solid, the free energy of the system will change to G2. 73    L G1 = (VS + VL )GV (2.4.2.3) S L G2 = VS GV + VLGV + ASLσ Where: VS is the volume of the solid sphere; VL is the volume of the liquid; ASL is the S L solid/liquid interfacial area; GV and GV are free energies per unit volume of solid and liquid, respectively; σ is the solid/liquid interfacial free energy. Let the difference between the bulk free energy densities be given by L S ΔGb = GV − GV (2.4.2.4) In this model the free energy change associated with the nucleation of a spherical “solid” droplet of radius R is given by ΔG = 4π R2σ − 4π 3 R ΔGb 3 (2.4.2.5) The first term gives the energy increase due to the surface tension of the new interface and the second term shows the energy gain by creating a cluster of the ordered (solid) phase (only * when T is below Tc). When Δ Gb > 0 , the critical nucleation size R is determined by ∂ΔG = 0, ∂R R=R* it gives R*=2σ/ΔGb . The change in free energy associated with the formation of this cluster is given by ΔG ( R* ) = 16π σ 3 3 ( ΔGb ) 2 (2.4.2.6) which is positive. Based on a field-theoretic approach to nucleation, Langer[51]  showed that the nucleation rate I of a cluster of size R* is given by 74    *)/ k T B I = γ Ω0e−ΔG( R (2.4.2.7) where, the parameter γ gives the initial growth rate for a critical cluster and depends on the dynamics, the quantity Ω0 gives the amount of phase space accessible for a critical droplet fluctuation and is proportional to the volume. 2.4.3 Different Types of Clusters and the Droplet Nucleation Model Now we will see whether the nucleation of cluster idea can be used to explain some of our simulation data about the rate of nucleation of different ordered structures. For the YLG model, our slow cooling simulations showed that ordered layered structures dominates the ground state for x ≤ 0.5. For x = 0.5, due to the frustration of FCC lattice, there are two different structures (Fig. 2.2.5.1 and Fig 2.2.5.2), the layered structure and the tubular structure, which have the same energy at T = 0, they are degenerate. The degeneracy between two completely different structures led us to probe further the thermodynamic properties of these two structures, their melting and freezing characteristics. In particular, we wanted to see whether the melting mechanism depended on the range of interaction. When the system is cooled from a high T disordered phase one sees nucleation of “layer” and “tubular” nanostructures as one approaches the transition. These are precursors of the “layered” and “tubular” ground state structures. In addition we see structures which are different and we call them as “metastable” structures. We estimated the rate of generations of the first two structures in the following way. Let N = NMetastable + NLayer + NTubular be the total number of the cooling simulations. NLayer is the number of simulations when we saw layered structures, NTubular is the number of simulations when we saw tubular structures and NMeta-stable is the number simulations when we saw structures other 75    than Layered and Tubular structures (less than ~2%). We found NLayer / NTubular 0.35 for the long range interaction (κ=0.0) and NLayer / NTubular 0.28 for the short range interaction κ=2.6. The natural question is why these two different types of clusters appear with different probabilities even if their ground state energies are identical. A. Driving Force for Solidification in the YLG Model (Fig. 2.4.3.1 Gibb’s free energy Vs. Temperature for two phases, “liquid” and “solid”) Usually when we deal with a phase transition, we are often concerned with the difference in free energy between the two phases at temperatures away from the transition temperature. For the YLG model, if a disordered structure is under-cooled by ΔT below Tm, the solidification will be accompanied by a decrease in free energy ΔG as shown in Fig. 2.4.3.1. 76    (Fig. 2.4.3.2 Energy barrier in the solidification process) Consider a small piece of the solid at temperature T. The Gibb’s free energies/particle of the disordered (liquid) and the ordered (solid) phases are given by b GL = H L − TS L (2.4.3.1) b GS = H S − TS S Where H and S are the enthalpy and entropy functions and T is the temperature. The bulk b free energy difference ΔG is Δ G b = Δ H − T Δ S , where Δ H = H L − H S and Δ S = S L − S S . For the lattice gas models, dV=0. Then from H = U + pV , we know ΔH = ΔU . At the melting b temperature Tm, ΔG =0, ΔS = ΔH L = , where L is the latent heat at the transition. For T Tm Tm close to Tm, ΔGb L −T L L ΔT = (Tm − T ) = L Tm Tm Tm (2.4.3.2) 77    One can use this bulk free energy difference in the expression for the nucleation rate to estimate the size and shape of the clusters provided one knows the surface tension (Eq. 2.4.2.5) B. Energy Barrier and Generating Rate of Different Structure in the Solidification Process (Fig. 2.4.3.3 Basic cells of Layer and Tubular structures and crystal growing processes) b For the YLG system, we put the bulk energy ΔG (Eq. 2.4.3.2) into the formula (2.4.2.5) then we have 2 16π ⎛ Tm ⎞ 3 σ ΔG = 3 ⎜ L ΔT ⎟ ⎝ ⎠ * (2.4.3.3) Therefore, we know that the energy barrier to nucleate a droplet or a cluster is proportional 3 to σ . Based on Langer[51], the generation rate of the clusters in the solidification process is 78    given by I const · Ne − Δ G */ k BT , where N is the number of particles in the volume. It shows that the solidification process is proportional to the number of particles and depends on the energy barrier ∆G which depends on the basic bulks (clusters) of solid. Fig. 2.4.3.3 shows the basic clusters of the two structures (layered and tubular) and how the crystal grows from these clusters. Now let us estimate the surface energy of the basic cells (clusters) of layered and tubular structures. For (m+n) particles system in domain (Ω1+Ω2)(See Fig. 2.4..3.4), the total energy of system can be described by E = ∑ Ei, j , where Ei, j comes from interactions between charge i i≠ j and charge j. Suppose there are n charges in sub-domain Ω2 (outside part of Ω1) with index (1, 2, 3, …, n) and there are m charges in domain Ω1 with index (n+1, n+2, …, n+m), so the total energy of system can be described by m+ n E= ∑ Ei, j = EΩ1 + EΩ2 + Esurface n = m+ n ∑ Ei, j + ∑ (2.4.3.4) i , j = n+1, i ≠ j n E i, j + m+ n ∑ ∑ Ei, j i =1 j = n+1 79    (Fig. 2.4.3.4 Surface between two domains) From Eq. 2.4.3.4, we know that the surface energy is contributed by n Esurface = m+n ∑∑ (2.4.3.5) Ei , j i =1 j = n +1 For a basic cell of layered or tubular structure, starting from central particle, from Eq. ⎛ ∞ ⎞ E1i ⎟ . Suppose the surface 2.4.3.4, the surface energy of basic cell is given by ES = ⎜ ⎜ ⎟ ⎝ i=2 ⎠Bacis Cell ∑ E area is A, then we obtain the surface energy density σ = S A . Our simulation results show that although the layer and tubular structures have the same ground state energy, it is more difficult to generate clusters of “layered” structure compared to that of “tubular” structure. Based on the basic nucleation theory of clusters, we can roughly estimate the surface energy density assuming that only the nearest-neighbor interactions contribute. 80    For example, when κ = 2.6, the surface energy of the layer structure is about σL~0.27 and that of the tube structure σT~0.19. Putting the values of σL and σT into Eq. (2.3.2.7), we obtain that the ratio of the rate of generation of the two structures is 0.31. This roughly matches the simulation result NLayer / NTubular 0.28 . 2.5 Concentration and melting point (Fig. 2.5.1 Ground structure for different concentrations) In our numerical simulations, we found that for a fixed concentration, no matter what is the value of the screening parameter κ, the configuration of the ground state is the same. For x=0.5, 81    we found two ground state structures (Layered and Tubular). For x<0.5, there are two co-existing structures that form the ground state configuration (See Fig. 2.5.1). One part is layered with x=0.5, the rest is neutral with x=0. For x>0.5, there are also two co-existing structures which the ground state configuration, one part is layered with x=0.5 and another part is full of charges (x=1). In heating process, our simulations show that critical temperatures of 1 st transition are st nearly independent of concentration x and 1 transition always occurs in the region x=0.5, i.e., when the temperature is lower than the melting temperature, charged particles are confined to almost within layers. This might be understood as follows: • For charged particles, the binding energy in the area x<0.5 is less than the energy in the st area x=0.5, this is the reason why 1 transition occurs in the area x<0.5 • For the case x<0.5, when the temperature is below the melting point, the charged particles are confined within layers because the thermal energy obtained by particles is less than its’ binding energy. In fact, no matter what the global concentration is, for a particle within the layers, its local area is always of concentration 50%, this observation may explain st why the 1 transition occurs almost at the same critical temperature, i.e., critical temperature is nearly independent of concentration. • For the case x>0.5, for a particle confined within the layers, because of the contribution of extra-binding energy from x=1, the particle need to obtain more thermal energy to overcome the binding energy, so the melting temperature should be a little bit higher than the temperature in the case x<0.5. 82    2.6 Summary Monte Carlo simulation has been used to study the finite temperature (phase transition) properties and the ground state of Yukawa Lattice Gas model (YLG) on a FCC lattice, a system which is characterized by competing interaction and frustration. This model maps on to an antiferromagnetic 3-state Ising model (σ = -1, 0, +1) on a FCC lattice with tunable range of interaction. We discussed the energy calculation method for the YLG model. We have explored how the nature and strength of the phase transition depend on the range of competing interaction by tuning the Yukawa screening parameter κ. We also investigated the degeneracy of the ground state for different concentrations and range of the interaction. We discussed the phase diagram and what the role frustration plays in YLG FCC lattice model. Monte Carlo simulation results show that our analyses can explain how the first order transition occurs. For (AgSbTe2)x(PbTe2)(1−x) systems, the concentration x plays a very important role, it determines st the energy change and the change of entropy at the 1 order phase transition. The concentration also determines what kind of the phase transition can be observed. We have given an empirical st formula which gives the x and κ dependence of the 1 order transition from “solid” to “gas” or “liquid”. For x = 0.5, we found two special structures which have identical ground state energy if the interaction between the charged particles is only a function of the distance between the two particles.. Our results show how frustration affects the degeneracy of the YLG lattice system. Based on an analysis of the surface energy density around the critical temperature, we studied the solidification process and the generating rate of clusters or droplets of different shapes and structures and used nucleation theory to explain our MC simulation results qualitatively. 83    In our MC simulations we used Ewald summation to calculate the total energy of the YLG 3 model. The size of the system we simulated contains 4x8 lattice sites. For CLG model, consider a site closed to center of lattice, we can calculate the energy (E) of the site via Ewald Summation. In fact, if the lattice size is large enough, we can directly calculate (E) of the site in a finite domain. For a precision of 8 decimal figures, our calculation result indicates roughly that the size 3 of a finite domain should contain, at least, 4x(8x8) lattice site for concentration (x=0.25). For a st better understanding of the details of the 1 order and continuous transitions we need to simulate 1.5 much larger systems. Usually, the calculation cost of Ewald summation is about O(N ). For simulation studies of large systems we can use our Accelerated Cartesian Expansions (ACE)[52] method, an O(N) methods, to compute energy (See Chapter 3 & Chapter 4). This method is discussed in the second part (Part B) of this thesis. Most of my ACE work has been published in 2 papers and several conference proceedings. Reference to these works will be given later when necessary. Part B of my thesis is based primarily on these published papers. 84    PART B CHAPTER 3 Accelerated Cartesian Expansion (ACE) Based on Fast Multipole Method (FMM) 3.1 Introduction The calculation of pair-wise interaction (~R-ν) (for instance, Columbic or Yukawa interactions, London potentials, or Van der Waal's potential etc.) is important in numerous research areas that are as diverse as biophysics, physics, computation chemistry, astrophysics, and electrical engineering to name a few. For example, ν = 1,5, 6,10 corresponds to Columbic interactions/gravitational potentials, London dispersion potentials, Van der Waals potentials, and H-bonds, respectively. However, it is well known that these potentials’ calculation requires prohibitive computation resources, both in terms of CPU cycles and memory. These costs are exacerbated when such computation is required either in a molecular dynamics or a Monte Carlo setting. It is well known that the CPU cost of computing mutual interactions between N particles distributed in ℜ3 2 scales as O( N ) . Consequently, this has engendered the need for computational methodologies that are efficient both in terms of memory and CPU time. Some of these include cutoff techniques[53], particle mesh algorithms[54-56], Ewald summation (based on an assumption of periodicity)[57, 58], tree-codes  [59-61] and fast multipole methods (FMM)  [62‐65]. Tree codes (and FMM) are based on subdividing the computational domain into hierarchical subdomains, and computing the influence between subdomains that are sufficiently separated using multipole/local expansions. The fundamental differences between FMM and tree codes notwithstanding, these methods have revolutionized analysis in application domains ranging from astrophysics to biophysics to engineering sciences. 85    At this point, we note that there is rampant confusion in terminology; in fact, the terms Fast Multipole Methods (FMM) and tree codes are used interchangeably in the literature that we have come across. This is not surprising as the two techniques are closely related. The differences between these two methods, albeit subtle, are significant. As was elucidated in[66], tree codes compute interactions between source pairs using one of three methods: 1) directly, 2) evaluating fields at each observation point using multipole expansion due to a cluster of sources, or 3) using local expansion at observation clusters to find fields. The decision on the operation used depends on which one is computationally efficient. On the other hand, the algorithmic structure of FMM enables the computation of potentials in an optimal manner[66]. Two addition operations that permit this are aggregation and disaggregation functions. These permit the computation of information at coarser (or finer) using information at finer (or coarser) levels. It so happens that for ν = 1 , tree codes typically rely on cartesian expansions, whereas FMM is based on spherical harmonics. In this part of my thesis, we develop theorems and operations necessary for constructing FMM methods for all ν using cartesian tensors. Tree codes for computing the Lennard-Jones potential was developed as early as 1996[67]. More refined methods for computing the same were proposed by [68-70]. All three methods essentially use Taylor's expansion in the cartesian coordinates, and some of these use Gegenbauer polynomials based recursion techniques to accelerate translation of multipole expansion to fields at the observer. In 2005, Chowdhury and Jandhyala[70] proposed a variation 86    of these schemes using Taylor's series expansion in a different coordinate systems. Other techniques that have been proposed rely on pre-corrected Fast Fourier Transform and using a singular value decomposition. Developing FMM-like techniques using special function has proven difficult [70] as functions using Gegenbauer polynomials in the spherical coordinate system are not separable. However, it is well know that Gegenbauer polynomials can be written in terms of Legendre polynomials. Sarin used this fact to develop a tree code; since Legendre polynomials are used, FMM scheme may be readily derived from these expressions as well. Also, [71] developed operators necessary to extend their scheme to a multilevel setting. However, using Gegenbauer polynomials either for recursion of for developing tree codes has a singular disadvantage; these polynomials are defined for ν ≥ −1/ 2 . Consequently, methods that rely explicitly on these cannot be generalized to non-oscillatory potentials (like the lattice gas potential) nor can one prove convergence ∀ν ∈ℜ . The motivations for this work are four fold: 1) To formulate a fast method for 1 / Rν in terms of totally symmetric tensors, and exploit these to reduce the costs. We also prove convergence ∀ν ∈ℜ . 2) To introduce new theorems that enable exact traversal up and down the tree. This implies that one does not accrue error as the height of the tree increases. 3) To prove convergence ∀ν ∈ℜ . 4) To demonstrate the intimate connection between the classical FMM introduced by Greengard[62] and its Cartesian counterparts using traceless Maxwell Cartesian tensors. This connection will show that properly constructed Cartesian FMM schemes have the same computational complexity as classical FMM. Other advantages of the proposed schemes are outlined in later in this section. 87    Fast methods for computing Columbic interactions have progressed along two fronts: (i) using spherical harmonics, and (ii) using Taylor's expansions. The latter was introduced at approximately the same time as the Greengard's first paper on three dimensional FMM[72], and has found extensive application in tree-codes. Recent, FMM codes based on Cartesian expansions have used recurrence relations to avoid derivatives[73]. Furthermore, asymptotic computational cost of schemes based on Cartesian expansions is higher. This is because spherical harmonics are optimal for representing harmonic functions in three dimensions. For a system with N mutually interacting particles, it can be readily proven that for a one-level implementation, the computational complexity of FMM scales as O ( p 4 N ) the latter O ( p 6 N ) , where p is the number of harmonics used in the computation. This cost can be reduced by choosing the number of particles at the lowest level in the tree in an optimal manner. Our interest in revisiting these schemes is motivated by the following observations: (i) Taylor's series expansions provides a natural framework for developing addition theorems[74]; (ii) Taylor's expansion involves representing the fields in terms of Cartesian Tensors. These connections are well known, and have been explored extensively (as early as Maxwell!); see [75-77] and references therein. In fact, the following statements hold true: (i) components of a traceless tensor of rank n serve as constant coefficients in a spherical harmonic of degree n, and (ii) there is a class of traceless tensors of rank n whose components are n-degree spherical harmonics functions of x, y, z. These connections imply that there should be an intimate connection between the two seemingly disparate methodologies, and one should be able to obtain a similar cost structure for both methods. As we will show, the recurrence relationship that were conjectured for translating multipole expansions [73] can be rigorously derived using traceless tensors. This implies that one should be able to derive a computational scheme using Cartesian tensors that are optimal in the 88    sense of FMM. The method presented herein can be readily generalized to analyze potentials for all ν , or for that matter, to any potential function whose power series converges rapidly  [78], and more importantly, without the need to use special functions! Thus, this chapter will focus on the use of Cartesian tensors to derive fast computational schemes for all ν . Similar to FMMs, the methodologies developed herein will rely on a divide and conquer computational strategy. This is facilitated by a hierarchical partitioning of the computational domain through the construction of an oct-tree data structure. The underlying mathematics for two different computational methods will be derived; in the first, operators will be derived for traversing up, down and across the tree. This technique will rely on the using totally symmetric tensors. The salient feature of this method is that the traversal up and down the tree (or shifting the origin of the multipole/local expansion) is exact. The second method produces optimal technique, in the sense of FMM, for ν = 1 . This optimality is achieved using traceless totally symmetric tensors. For ν ≠ 1 , it yields the same computational complexity as the first albeit a different error bound. One of the most interesting features of both algorithms, and which separates the proposed technique from the ones presented earlier  [66, 68-71, 79], is the fact that the operators derived for traversing up, and down the tree are independent of ν , and the ν dependance comes in only when traversing across the tree. This is a powerful feature, as one can use a single traversal up and down the tree to compute the combined effects of different potentials! This also implies that this technique can be readily generalized to create a similar fast method for any potential function that is sufficiently smooth in R . In fact, one of the examples presented will demonstrate the simulation of the lattice gas potential ( LnR ) that is used extensively in electronic structure calculations. Finally, the algorithm presented does not involve 89    any explicit (or numerical computation of) derivatives and quantities are expressed in terms of (products of) traceless (or totally symmetric) tensors. 3.1.1 Divide and Conquer Strategy Typically, potentials are evaluated between source and observation pairs that are randomly 3 distributed in a domain Ω∈ℜ . The computational scheme developed here will follow those typically used in FMM. To this end, the entire domain is embedded in a fictitious cube that is then divided into eight sub-cubes, and so on. This process continues recursively until the desired level of refinement is reached; an N l -level scheme implies N l − 1 recursive divisions of the domain. At any level, the domain that is being partitioned is called the parent of all the eight children that it is being partitioned into. At the lowest level, all source/observers are mapped onto the smallest boxes. This hierarchical partitioning of the domain is referred to as a regular oct-tree data structure. The interactions between all source and observation points are now computed using traversal up and down the tree structure. This is done using the following rule: at any level in the tree, all boxes/domains are classified as being either in the near or far field of each other using the following dictum: two subdomains are classified as being in the far field of each other if the distance between the centers is at least twice the side-length of the domain and their parents are in the near field of each other. Once, the interaction list have been built for all levels, the computation proceeds as follows; at the lowest level, the interactions two sub-domains are classified as being in the far field of each other if the distance between the centers is at least twice the side length of the domain, and their parents are in the near field of each other between the elements of boxes that are in the nearfield of each other is computed directly, i.e., using 1 / Rν . All other interactions are computed using a three stage algorithm: 90    1) compute multipoles of a clusters of sources that reside in each box; 2) convert these to local expansion at all boxes that are in its far field; 3) from the local expansion, compute the field at each observer. It is apparent that one can gain more efficiency by embedding this scheme within itself. That is, if two domains that interact with each other are far away, then these clusters may be combined to form larger clusters that then interact with each other at a higher level and so on. As will be shown later, this computational strategy considerably mitigates the overall cost. To accomplish these sequence of tasks, it is necessary to develop theorems that enable the following: 1) computation of multipoles at leaf (or smallest boxes); 2) theorems to shift the origins of multipole so that effects of small clusters can be grouped together to form larger clusters; 3) translate multipole expansion to local expansion; 4) move the origin of local expansion so that expansions at the origin of the parent may be disaggregated to those of its children; 5) finally, aggregate the local expansions in a box to compute the field at all the observers. These sequence of tasks are generally referred to as moving up, across, and down the tree, and is facilitated by the theorems developed next. 3.1.2 General Statement of the Problem 3 3 Consider a domain Ω s ∈ℜ that is populated with k sources and a domain Ωo ∈ℜ that contains k observers. With no loss of generality, assume that these domains are spherical and of radius a. These spherical domains completely enclose one of the cubical subdomains generated 91    earlier. The location of these points is random, however we will assume that the distribution in the domain is sufficiently dense and relatively uniform. Centers of Ω s and Ωo are denoted by rs and ro . It is assumed that Ωs ⊂ Ωs and Ωo ⊂ Ωo and Ωs ∩Ωo = ∅ . In what follows, the domains Ω s and Ωo will be called parents of Ω s and Ωo , respectively. As before, the parent domains will be assumed to be spherical of size 2a , and their center are denoted by rsp and rop , respectively, and rsp ∉Ωs and rop ∉Ωo . The potential due to sources ∀ri ∈ Ω s , ∀ri ∈ Ω o observed at r ∈ Ω o is given by φ (r ) = k q i ∑ | r − ri |ν i =1 k = ∞ q 1 q 1 ∑∑ (−1)n ni! rin ⋅ n ⋅∇n rν (3.1.1.1) i =1 n =0 ∞ = k ∑∑ (−1)n ni! rin ⋅ n ⋅∇n rν n =0 i =1 This equation is derived using Theorem 2.10, and qi are values of the sources at locations ri. The exchange of the summation indices is permissible as the summation converges. Unless otherwise stated, the operator ∇ operates on r. In what follows, we will prescribe the means that will enable that rapid computation of equation (3.1.1.1). The presentation will be in two stages: 1) We will develop expressions for computing these using totally symmetric tensors. In this algorithm, the operations for traversing up and down the tree are ``exact,'' i.e., once the multipole or local tensor is known at an origin, shifting the origin is exact. 92    2) We will develop another method that relies on cascaded Taylor's series expansion. However, in our thesis, we will specialize this technique for ν = 1 so as to cast it in terms of traceless tensors. Consequently, the resulting algorithm is optimal in the number of operations. We also give insights into how this method can be implemented. Parenthetically, we note that in both methods, the ν dependence exists only when traversing across the tree. 3.2 Method 1: Cartesian Expansions with Totally Symmetric Tensors 3.2.1 Multipole Expansion (Theorem 3.2.1) The total potential at any point r ∈ Ω o due to k sources qi, i = 1, , k located at points ri ∈ Ω s is given φ (r ) = ∞ 1 ∑ M(n) ⋅ n ⋅∇n rν n =0 M ( n) = k (3.2.1.1) q ∑ (−1)n ni! rin i =1 where M ( n ) is the totally symmetric multipole tensor about the origin rs = {0, 0, 0} Proof. See equation (3.1.1.1) In the above equations, the tensor M ( n ) is of full rank and totally symmetric. This implies that the number of independent components is (n + 1)(n + 2) / 2 . As we will show in the next subsection, if this tensor is contracted with another tensor that is traceless, then it is possible to use the traceless form of the multipole moment. As a traceless tensor contains only 2n + 1 independent components it results in an method with lowered computational cost. 93    Next, we present the first addition theorem that enables shifting the origin of multipole expansions centered around rs to that centered around rsp . This is facilitated by the following theorem. Variations of this theorem have been developed in other contexts [43]. 3.2.2 Multipole to Multipole Expansion (Theorem 3.2.2) Given a multipole expansion of k sources about the rs = {0, 0, 0} O(n) = k q ∑ (−1)n ni! rin (3.2.2.1) i =1 then the multipole expansion about the point rsp can be expressed in terms of above equation as M (n) = k ∑ i =1 ( −1) n q! (ri − rsp ) n = n! n ∑ ∑ m = 0 P ( m ,n ) ( ) m! p rs n! n−m O(m) (3.2.2.2) where P(n, m) is the permutation of all partitions of n into sets n-m and m. Proof: (n) Using equation (3.2.2.1), Theorem A2.1, Theorem A2.10 and noting that the tensors O ∇ n r −ν are totally symmetric, results in 94    ∞ ⎧ k ⎪ n qi φ (r ) = ri − rsp ⎨ (−1) n! n=0 ⎪ i =1 ⎩ ( ∑∑ n⎫ ⎪ ) ⎬ ⋅ n ⋅∇n | r − 1sp |ν r ⎪ ⎭ n ⎧ k ⎪ n qi = (−1) n−m rsp ⎨ (−1) n! n=0 ⎪ i =1 m =0 P ( n, m ) ⎩ ∞ ∑∑ ⎧ n m! p ⎪ = rs ⎨ ⎪ m =0 P ( n,m ) n ! n =0 ⎩ ∞ ∑ ∑ ∑ ( ) ⎧ n m! p ⎪ rs ⎨ n! n =0 ⎪ m =0 P ( m,n ) ⎩ ∞ = ∑( ) ∑ ∑ ∑ ∑ ( ) n−m n−m n−m m ⎫ ⎪ rs ⎬ ⋅ n ⋅∇ n ⎪ ⎭ 1 | r − rsp |ν ⎡ k ⎤⎫ q 1 ⎪ ⎢ (−1) m i rsm ⎥ ⎬ ⋅ n ⋅∇ n m! ⎥ ⎪ | r − rsp |ν ⎢ i =1 ⎣ ⎦⎭ ∑ ∞ ⎫ 1 1 ⎪ = O( m) ⎬ ⋅ n ⋅∇ n M ( n) ⋅ n ⋅∇ n | r − rsp |ν n=0 | r − rsp |ν ⎪ ⎭ ∑ (3.2.2.3) Next, we prescribe the means to translate the multipole expansion that exists about rsp to a local expansion about rop . There are two ways of deriving this translation operator; (i) using reciprocity and (ii) using a Taylor's series expansion. The theorem that is presented next uses the first approach whereas the second approach is used in presenting a similar operation for method 2. As is expected, both result in identical expressions. 3.2.3 Multipole to Local Translation (Theorem 3.2.3) Assume that the domains Ω s and Ωo are sufficiently separated, and the distance between { } { } p p their centers ros =| ros |=| rop − rsp | is greater than diam Ωs and diam Ωo . If a multipole 95    (n) (n) is located at rsp , then another expansion L expansion M that produces the same field ∀r ∈Ωo is given by φ (r) = ∞ ∑ ρn ⋅ n ⋅ L(n) n=0 L( n) = ∞ (3.2.3.1) 1 ∑ n!M(m−n) ⋅ (m − n) ⋅∇ m m= n 1 p ( ros ) ν where ρ = r − rop . Proof: The potential φ (r)∀r ∈Ωo is given by equation (3.2.1.1). It follows by reciprocity that if (n) the multipole M were located at r it would produce the same potential at rsp . In other words, the potential at all points r ∈ Ω o can be computed by placing the multipole moments at r and evaluating φ (r ) = ∞ ∑ M (n) ⋅ n ⋅∇ n n =0 At rsp . Here, 1 (3.2.3.2) | r − rsp |ν ∇ is the derivative with respect to rop . Since this valid at all points r ∈ Ω , the (n) multipole tensor M at r may be translated to the center rop using the symmetric translation. (n) Denoting the multipole tensor at rsp by O and ρ = r − rop , using the multipole expansion theorem, we obtain the potential 96    φ (r ) = ∞ ∑ O(n) ⋅ n ⋅∇ n =0 1 n p ( ros ) ν ∞ ⎛ n 1 m (n−m) ⎞ n 1 = ∑⎜ ∑ ρ M ⎟ ⋅ n ⋅∇ ⎜ ⎟ p ν n =0 ⎝ m =0 m ! ⎠ ros ( ) ⎛ ⎞ ⎜ 1 ( n−m ) 1 ⎟ = ∑ ∑ ρm ⋅ m ⋅ ⎜ M ⋅ (n − m) ⋅∇ n ⎟ p ν ⎟ n = 0 m =0 ⎜ m! ros ⎝ ⎠ ∞ n ( ) = ∞ ∑ρ (3.2.3.3) n ( n) ⋅n⋅L n =0 where L( n) = ∞ 1 ∑ n !M (m−n) ⋅ (m − n) ⋅∇ m=n 1 m (3.2.3.4) ( ) p ν ros is obtained by gathering tensors that operate up on ρn . Next, we prescribe the means to traverse down from rop to ro . This theorem is almost a mirror of that used to go up the tree. 3.2.4 Local to Local Expansion (Theorem 3.2.4) (n) A local expansion O that exists in the domain Ωo centered around rop can be shifted to the domain Ωo centered at ro using L( n ) = ∞ ∑ ⎜ m − n ⎟O ( m ) ⋅ ( m − n) ⋅ ( rocp ) ⎛ m ⎞ m=n ⎝ m−n (3.2.4.1) ⎠ 97    Proof: cp From definition of φ (r) , we use ρ = r − ro + (ro − rop ) = ρoi + ro to obtain φ (r ) = ∞ ∑ O( n) ⋅ n ⋅ ρn n =0 = ∞ ( ∑ O(n) ⋅ n ⋅ ρoi + rocp n =0 = = ∞ ) n ⎡ n ⎛ n ⎞ cp O( n) ⋅ n ⋅ ⎢ ∑ ⎜ ⎟ ro ∑ ⎢ m =0 ⎝ m ⎠ n =0 ⎣ ( ) ∞ ∑ L(n) ⋅ n ⋅ ( ρoi ) ⎤ n−m ( ρoi )m ⎥ ⎥ ⎦ (3.2.4.2) n n =0 where L( n ) = ∞ ∑ ⎜ m − n ⎟O ( m ) ⋅ ( m − n) ⋅ ( rocp ) ⎛ m ⎞ m=n ⎝ m−n (3.2.4.3) ⎠ In deriving the above proof, we used Theorem A2.1 to permute the tensors, and then m gathered terms associated with ( ρoi ) to arrive at the final result. Finally, the potential at any point in the Ωo can be obtained using φ (r ) = ∞ ∑ L( n ) ⋅ n ⋅ ( ρoi ) n (3.2.4.3) n=0 3.3 Method 2: Method 2: Cartesian Expansions with Cascaded Taylor's Series In the above subsection, the scheme presented relies on the use of totally symmetric tensors, and exact operations to traverse up and down the tree, i.e., once the multipole expansions at the lowest level are known, traversal up the tree is exact. Similarly, once the local expansions are 98    known at a level, traversal down the tree is exact. Alternatively one can derive a fast algorithm that is based on cascaded Taylor's series expansions. In fact, it can be shown that the classical FMM falls into this category as do the algorithms proposed by [67-69, 79]. Note also, that for a given error, method 1 will typically require lower value of P than method 2. However, our motivation herein is to demonstrate that for ν = 1 , this algorithm can be formulated in terms of traceless tensors, thus, making the number of operations optimal in the sense of classical FMM. For ν ≠ 1 , the translation operator is not traceless but a symmetric tensor. Consequently, the asymptotic cost is the same as method 1. • Traceless operations for ν = 1 As is evident in Theorem 3.2.1, the multipole tensor is contracted with a tensor that is both totally symmetric and traceless. As was mentioned earlier, the latter has only (2n + 1) independent components while the former has (n + 1)(n + 2) / 2 independent components. However, as was shown in Lemma A2.2, it may well be possible to derive another tensor that results in lower number of operations; here, we develop a method using traceless tensors that are henceforth denoted by a subscript t . The following Lemma demonstrates this fact: 3.3.1 Traceless Multipole (Lemma 3.3.1) (n) If the potential at a point is given in terms of contraction between the multipole tensor M and a symmetric traceless tensor, then the same potential at that point may be obtained using an ( n) equivalent traceless symmetric tensor Mt . Proof: Starting with equation (3.2.1.1), it follows that 99    φ (r ) = ∞ ∑ M (n) ⋅ n ⋅ ⎡(−1)n r −2n−1 nr n ⎤ ⎣ ⎦ from (A11) n =0 (−1) n r −2 n−1 ( n) M ⋅ n ⋅  n n r n =∑ n =0 (2n − 1)!! ∞ from (Theorem A2.7) ∞ 1  n M ( n) ⋅ n ⋅ ⎡(−1) n r −2n−1 nr n ⎤ ⎣ ⎦ n =0 (2n − 1)!! =∑ = ∞ from (Theorem A2.6) 1 ∑ Mt(n) ⋅ n ⋅∇ n r n =0 (3.3.1.1) Next, we need to translate the multipoles located at rs to one that is located at rsp . In contrast to what was done in Theorem 3.2.2, the starting point of our expansion will be Theorem ( n) 3.2.1. The following lemma prescribes the relation between the traceless tensor Οt that exists ( n) at rs to a traceless tensor Mt . 3.3.2 Traceless Multipole to Multipole (Lemma 3.3.2) A traceless multipole tensor at rs = {0, 0, 0} is related to that centered at rsp via ( Mt m) = m ∑ n=0 ( ) P ( −1)  n rs n n! n (2n − 1)!! ( Ot m−n) (3.3.2.1) Proof: The proof presented herein relies on the repeated use of Theorems A2.6, A2.7, and Corollary A2.8. These theorems essentially permit manipulations of the detracted operator, as 100    long as one of the tensors involved in the contraction is traceless. As seen in equation A11, the tensor ∇n r −1 is traceless; consequently, all quantities that are contracted with it can be made traceless as well. Starting with Lemma 3.3.1.1 and using Taylor expansion (Theorem A2.10) results in φ (r ) = ∞ 1 ∑ Ot( n) ⋅ n ⋅∇n r n =0 = ∞ ∑ n =0 = = ( Ot n) ⋅ n ⋅∇ n ∞ ( ) ∞ k ∞ ∞ ⎡ ∞ ( −1) k p rs ⎢∑ ⎢ k =0 k ! ⎣ ⋅ k ⋅∇ k ⎡ ( −1) k (  k rsp Ot n ) ⋅ n ⋅∇ n ⎢ ∑∑ (2k − 1)!! k ! ⎢ n =0 k = 0 ⎣ ( ) (−1) k ∑ ∑ (2k − 1)!!k !Ot(n) k rsp n =0 k = 0 ( ) ∞ = ⎤ ⎥ | r − rsp | ⎥ ⎦ k 1 k ⋅ k ⋅∇ k ⋅ ( n + k ) ⋅∇ n+ k ⎤ ⎥ | r − rsp | ⎥ ⎦ 1 1 | r − rsp | 1 ∑ M t( m) ⋅ m ⋅ ∇ m | r − r p | m =0 s (3.3.2.2) where ( M t m) = m ∑ n =0 ( ) P ( −1)  n rs n n! n (2 n − 1)!! ( Ot m−n) (3.3.2.3) ( n) The next stage is the translation of the multipoles Mt to local expansion. Indeed, the procedure for doing so is similar to the one derived for symmetric tensors. 101    3.3.3 Traceless Multipole to Local (Lemma 3.3.3) Assume that the domains Ω s and Ωo are sufficiently separated, and the distance between p p their centers is ros =| ros |=| rop − rsp | is greater than diam{Ω s } and diam{Ωo } . If a traceless (n) ( n) multipole expansion Mt for all n is located at rsp , then another expansion L that produces the same field ∀r ∈ Ω o is given by φ (r ) = ∞ ∑ ρn ⋅ n ⋅ L(tn) n =0 where L(tn) ∞ 1 = ∑ ∇m m=n n ! (3.3.3.1) 1 ( ⋅ ( m − n) ⋅ M t m − n ) ν p ros ( ) Proof: The proof presented in this section relies on using another Taylor expansion to create the ( n) traceless local expansion Lt as opposed to using reciprocity to derive similar operators for Method 1. p Following Theorem 3.2.1, assume that a multipole expansion exists at rs ∈Ωs . Using Theorem 3.3, we can write the potential at any point r ∈ Ω as φ (r ) = ∞ ∑M n =0 ( n) ⋅ n ⋅∇ n 1 | r − rsp | = ∞ 1 ∑ Mt(n) ⋅ n ⋅∇n | r − r p | n =0 (3.3.3.2) s p Because r − rsp = r − rop + (rop − rsp ) = ρ + ros , ∇ = ∇ (where ∇ denotes a derivative with p respect to the ros ), using Theorems A2.6 and A2.10, we can rewrite the above equation as 102    φ (r ) = ∞ 1 ∑ M t(n) ⋅ n ⋅∇ n | ρ + r p | n =0 = os ∞ ∑ M t(n) ⋅ n ⋅∇ p | ρ + ros | n =0 = ∞ ∑ n =0 = n ( M t n ) ⋅ n ⋅∇ ∞ ∑ M t(n) ⋅ n ⋅∇ n ∞ ⎡1 k k 1 ⎤ ⎢ ρ ⋅ k ⋅∇ p ⎥ ∑ ros ⎥ k =0 ⎢ k ! ⎣ ⎦ ∞ ⎡1 ∑ ⎢ k ! ρtk ⋅ k ⋅∇ k =0 ⎢ ⎣ n =0 = 1 n k 1 ⎤ p⎥ ros ⎥ ⎦ ∞ ∑ L(n) ⋅ n ⋅ ρtn n =0 = ∞ ∑ L(tn) ⋅ n ⋅ ρtn (3.3.3.3) n =0 where ρtk = L(tn) = ( n) L = 1  k ρk (2k − 1)!! 1  n L( n) (2n − 1)!! ∞ k =n = 1 ∑ n !∇ ∞ k 1 p ros ( ⋅ ( k − n) ⋅ M t k −n ) (−1) k p ∑ n ! ros k =n ( ) −2 k −1 ( ) p  k ros k ( ⋅ ( k − n) ⋅ M t k − n ) (3.3.3.4) Next, we prescribe the means to shift the origin of the local expansion from rop to ro . This is facilitated by the following Lemma. 103    3.3.4 Traceless Local to Local (Lemma 3.3.4) ( n) Given a local expansion Ot that exist in the domain Ωo centered around rop , it can be shifted to the domain Ωo centered at ro using L(tm ) = ∞ ( )t ⎛ m + n ⎞ (m+n) cp ⋅ ( m ) ⋅ ro ⎟O t m ⎠ n =0 ⎝ ∑⎜ n (3.3.4.1) Proof: The process of proving this expansion is similar to what was done for developing the expression for multipole to local translation. The crux of this proof is that all the tensors involved are totally symmetric. The potential at any point r ∈ Ω o can be written as φ (r ) = = = ∞ n⎡ 1 ( c M t n) ⋅ n ⋅∇ ⎢ ∑ r − ro ∑ ⎢ m =0 m ! n =0 ⎣ ∞ ( )t m ∞ m⎛ 1 c ⋅ m ⋅∇ ⎜ ∑ ro − rop ⎜ ⎝ k =0 k ! ( ∞ ∞ ( )t ( ∞ ∞ ( ⎛m+ k ⎞ c ⎟ r − ro m ⎠ m =0 k =0 ⎝ ∑ ∑⎜ ∞ ∑( m =0 ∞ = ⋅ k ⋅∇ k 1 ⎞⎤ ⎟⎥ p ros ⎟ ⎥ ⎠⎦ )t (roc − rop )t ⋅ (m + k ) ⋅ Ot(m+k ) ⎛m+ k ⎞ ∑ ∑ ⎜ m ⎟ r − roc ⎠ m =0 k =0 ⎝ = )t k ∑( m =0 m c ro − rop m )t k ∞ n+m+ k 1 1 ∇ ⋅ n ⋅ M tn p ros n=0 ( m + k )! ⋅ (m + k ) ⋅ ∑ k c r − ro )t ∞ m+k ⎛ ⎞ c p ⋅m⋅ ∑ ⎜ ⎟ ro − ro m ⎠ k =0 ⎝ c r − ro )t ⋅ m ⋅ L(tm) m m ( )t ⋅ k ⋅ Ot(m+k ) k (3.3.4.2) finally, the fields at all observation points in the finest level can be obtained using φ (r ) = ∞ ∑( m =0 c r − ro )t m ⋅ m ⋅ L(tm ) (3.3.4.3) 104    The proof for this statement can be obtained trivially from the Lemma 3.3.4. 3.4 Error Bounds The error bounds derived herein will be for Method 1; the estimates for Method 2 can be obtained either by nesting those obtained here or using those  [63]. As was mentioned earlier, the shifting of origin of the multipole expansion is exact as is shifting the origin of the local expansion. This implies that the error primarily comes from two sources; (i) Taylor's expansion to create the multipole expansion at the level that it is being translated, and (ii) conversion to local expansion. We shall deal with both cases separately. In what follows, we will assume that the source/observation spheres are of radius a, then we have M (n) ( n1, n2 , n3 ) ≤ Cma n (3.4.1) ⎛ n 1 ⎞ − n −ν ⎜ ∇ ν ⎟ ( n1 , n2 , n3 ) ≤ r ⎝ r ⎠ (3.4.2) (n) Consider an arbitrary constant nth rank tensor, A . Then, the contractions will be 1 1 A(n) ⋅ n ⋅∇n ν ≤ Cm A(n) ⋅ n ⋅ 2n+ν rn r r (3.4.3) 1 ( n) A ⋅ n ⋅ rin ,max n! (3.4.4) A(n) ⋅ n ⋅ M(n) ≤ Cq 105    where Cm and Cq are constants, and ri,max is the vector corresponding to the charge that is farthest away from the origin. The proof for (3.4.3) can be derived using fumula (A12) or (A13). Likewise, the proof for (3.4.4) can be trivially derived. It follows from these expressions that the absolute error in making the multipole approximation may be obtained using P ε m = φ (r ) − ∑ M ( n) ⋅ n ⋅∇ n n =0 = ≤ ≤ ≤ 1 rν ∞ 1 M ( n ) ⋅ n ⋅∇ n ν r n = P +1 ∑ ∞ 1 M ( n ) ⋅ n ⋅∇ n ν r n = P +1 ∑ ∞ 1 ⎞ ⎛ M ( n) ( n1, n2 , n3 ) ⎜ ∇ n ν ⎟ (n1, n2 .n3 ) ⎝ r ⎠ n= P +1 n1,n2 ,n3 ∑ ∑ ∞ ∑ n= P +1 Cm 3n a n r − n−ν (3.4.5) ≤ n ⎛ 3a ⎞ Cm ⎜ ⎟ r −ν ⎝ r ⎠ n= P +1 ∑ ⎛ 3a ⎞ ≤ ν −1 ⎜ ⎟ r (r − 3a) ⎝ r ⎠ Cm P +1 Since the ratio of the distance to the radius of the sphere is always greater than 3, it implies that the series converges for all values of ν . Next, the error bounds on truncating the local expansions are derived as follows: 106    ∞ ∑ εl = ρ n ⋅ n ⋅ L( n) n = P +1 ∞ ∑ = ρn ⋅ n ⋅ n = P +1 ∞ ∑ ≤ n = P +1 ∞ 1 ∑ n !M (m−n) ⋅ (m − n) ⋅∇ m=n Cm 3n a n a m−n 3m−n ⎛ 3a ⎞ ≤ ⎜ p⎟ ⎜ ⎟ p ν −1 p (ros − 3a ) ⎝ ros ⎠ ros ( ) Cm 1 p ( ros ) ν 1 p ( ros ) ν +n P +1 (3.4.6) These equations imply that the approximations converge ∀ν ∈ . The above estimates are for the absolute error; similar estimates for the relative error can be readily obtained. Actually, using the Cauchy–Schwartz inequality, ri,max ⋅1⋅ r ≤ ri,max r ≤ ar , we could get more precise error bounds, those are: εm ≤ εl ≤ Cm 1 ⎛a⎞ ⎜ ⎟ ( P + 1)! rν −1 (r − a ) ⎝ r ⎠ 1 ( P + 1)! P +1 ⎛ a Cm ⎜ p ν −1 p ⎜ (ros − a ) ⎝ ros p ( ros ) ⎞ ⎟ ⎟ ⎠ (3.4.7) P +1 The proof of equation (3.4.7) can be obtained trivially followed by equations (3.4.5) and (3.4.6). Figure 3.4.1 shows the error bounds with different ν values (-3.3, -1.5, 1.0, 2.2, 3.3) and plots the predicted relative error for different values of P for interaction between two domains of 107    radius a whose centers are separated by 4a. The values of m chosen for this demonstration are those that are used in the numerical results section as well. (Fig. 3.4.1 analytically derived upper bound for the error for different values of ν for source/observation domains of radius a whose centers are separated by 4a.) The salient facts evident from Fig. 3.4.1 are: (i) the expansions converge for all m with increasing P, and (ii) the expansions converge faster for m < 0 than for m > 0. 108    Both these facts are borne out in numerical experiments in Chapter 4. On a slightly different note, relative error bounds derived here do not depend on the number of levels in the tree as no error is accrued in traversing either up (Multipole to Multipole expansion) or down (Local to Local expansion) the tree. The multipole (or local) error at any two levels in the tree are approximately the same due to the fact that both the distance r and the size of the box a at any level are scaled by the same factor with respect to the smallest box, and this factor drops out. 3.5 Computational Methodology and Cost As mentioned earlier, the entire computational domain is embedded in a cubical domain that is recursively partitioned into smaller cubes. Both near and far interaction list are created at all levels. In what follows, we describe the computational methodology as well as the cost i associated with each operation. The cost associated will be denoted by Cop where i = 1, 2 denotes the method used and “op” denotes the specific operation. It is to be noted that the cost 2 1 2 Cop is specifically for ν = 1 and Cop = Cop for ν ≠ 1 . The operations op ∈{NF , C 2M , M 2M , M 2L, L2L, L2O} that stand for 1) near field 2) charge to multipole 3) multipole to multipole 4) multipole to local 5) local to local, and (vi) local to observer. In what follows, we will denote the total number of interaction points by N, the number of harmonics by P, the number of levels in the tree by N l and the number of boxes at any level by N b ,l . It will be assumed that the interaction points are uniformly distributed in a volume. We will 109    also assume that the number of unknowns in each leaf box is s. It follows that Nb,1 = N / s , N b,l −1 = 8 N b,l , and Nl ∑ Nb,i ∝ N / s . With these preliminaries, the computational methodology i =1 can be prescribed as follows: 3.5.1 Near Field Evaluation At the leaf level, all boxes that lie in the near field are tabulated. This implies that one computes the interaction between the points that are in the vicinity of each other using whatever 2 method is classically used. Therefore, C1 = CNF , and NF C1 ∝ Total no. boxes × Cost of interaction of each box with its near field NF ∝ N / s × 27 s 2 (3.5.1.1) ∝ 27 Ns 3.5.2 Far Field Evaluation The far field evaluation comprises of four operations. The terminology used in this paper will be similar to the one introduced in [63]. A. Multipole Expansion For all boxes at the lowest level, compute the multipole expansion for all charges that reside in it. This is done using Theorem 3.2.1 for method 1 or Lemma 3.3.1 for method 2. The former forms a set of totally symmetric tensors whereas the latter forms a traceless tensors. The cost for this operation is 110    i CC 2 M ∝ total no of level 1 boxes × Cost for each box ∝ total no of level 1 boxes × no. charges per box × cost per tensor × no of tensors 1 CC 2 M ∝ P N × s × ∑ ( p + 1)( p + 2) / 2 s p =0 (3.5.2.1) ⎛N ⎞ ∝ ⎜ P3 ⎟ ⎝6 ⎠ 2 CC 2 M P N ∝ × s × ∑ (2 p + 1) s p =0 ∝ NP 2 B. Multipole to Multipole For all boxes, at any given level, we compute the multipole expansion for the parent box from those of its children. This operation is repeated at all levels. In ACE, same as the classical FMM, the number of operations is independent of child/parent levels. The cost for obtaining a th term in the n rank multipole tensor scales as (m - n + 1)(m - n + 2)/2 for m = n,…, P. Given th that there are (n + 1)(n + 2)/2 independent terms in n rank tensor, the cost for constructing all terms of the tensor scales as Cost Child Multipole to Parent Multipole1 P P n =0 = m=n ∑ (n + 1)(n + 2) / 2 ∑ (m − n + 1)(m − n + 2) / 2 6 ( P + i) i i =1 =∏ 111    (3.5.2.2) The number of operations specified in this equation is exact and exploits the total symmetry of the tensors involved. Likewise, the number of operations required when using traceless tensors is Cost Child Multipole to Parent Multipole2 P P n =0 m=n ∑ (2n + 1) ∑ (2(m − n) + 1) = (3.5.2.3) P ∑ ( P − n + 1)2 (2n + 1) n =0 This implies that the cost will be i CM 2 M ∝ Total number of multipole to multipole translations × Cost per translation C1 2 M M N 6 ( P + i) ∝ ×∏ s i =1 i 2 CM 2 M ∝ (3.5.2.4) N P × ∑ ( P − i + 1) 2 (2i + 1) s i =0 C. Multipole to Local Translation The cost for translating P+1 multipole tensors of one box to local tensors at another is that in equation (3.5.2.3) for symmetric tensors and (3.5.2.4). Consequently, the cost for translation scales as i CM 2 L ∝ No. translations per box × No of boxes × cost for one translation C1 2 L ∝ 189 M N 6 ( P + i) ∏ s i =1 i 2 CM 2 L ∝ 189 N P ∑ ( P − i + 1)2 (2i + 1) s i =0 (3.5.2.5) 112    D. Local to Local Translation The cost is exactly the same as that for multipole to multipole translation, i.e., 2 2 C1 2 L = C1 2M and CL2 L = CM 2M . L M E. Local to Observer Again, the cost for this operation is exactly the same as that for mapping charge to 1 2 2 multipoles, i.e., C1 2O = CC 2M and CL2O = CC 2M L The above analysis implies that the cost for the total analysis scales as i i i i i i i Ccost = C NF + CC 2 M + CM 2 M + CM 2 L + CL 2 L + CL 2O 1 Ccost for i = 1, 2 N P 6 NP 3 ∝ 27 Ns + 191 + s 720 3 2 Ccost ∝ 27 Ns + 191 N P4 + 2 NP 2 s 6 (3.5.2.6) It is readily apparent that the optimal number of unknowns per box is s ∝ P 3 / 10 for 2 method 1, and s ∝ P for method 2. Existing methods for R −ν [68, 80] do not use symmetry in their formulations. Consequently, their translation cost does not have a factor of 1/720, and their cost will be more expensive for a given value of P. Given that the cost reduction is a consequence of symmetric tensors, the application of method 2 to R −ν will have identical cost as method 1. The application of method 2 to R −1 results in a complexity that is very similar to that of the classical FMM algorithm that was presented in [63] with an exception of a factor of 1/2 in the translation term. However, recent improvements to the classical FMM scheme have reduced 4 3 the P scaling to P by using a plane wave based translation operator [81]; using a plane wave 113    based translation operator has enabled some algorithmic changes that have further reduced the ‘‘cost in front.’’ Similar modifications to method 2 are possible for ν = 1 . 3.6 Implementation Subtleties Two issues stand out when implementing ACE algorithm: th 1) the n rank operators used for multipole to multipole translation from level l → l + 1 and those used for doing the same from level l + 1 → l + 2 differ by a multiplicative constant; th 2) similarly, the n rank translation operator for translation of multipole from level l → l + 1 and that used for translating multipoles from l + 1 → l + 2 are related by a constant. Both these statements can be proven by examining the explicit forms of the operators involved. These facts imply that these operators need to be constructed and stored only for boxes at the lowest level and all others can be readily obtained by the operations listed above. It is also evident that the key component of the cost equation is the number of translations and the number of operations necessary to compute each translation. Some modifications that were originally suggested in [81] can be adapted to this algorithm. Indeed, since the error in shifting the multipole expansion is negligible, one can cluster translations to further reduce the total number of translations. Furthermore, one can exploit the structure of the tensor contractions involved in translation. Some of these improvements have already had a positive impact on the speed of the resulting code [82-84]. Finally, as with all tree codes (and FMM), the algorithm may be made adaptive in terms of the number of the tensors used, with very little loss in precision. For example, at higher levels in the tree (which correspond to interactions that are further away in space), one could reduce P as the potential is dominated by interactions that are closer. However, in the results presented herein, 114    this is not done. Finally, while the resulting power series expansion is convergent, a concern is the numerical stability of the resulting series. It will be evident in Section 3.8 that for the range of P, s and ν, this is not a concern. However, further insights into the behavior of the series can be gleaned from equation (A13), and using this expression to compute higher order derivatives yields stable results. 115    CHAPTER 4 Application of Accelerated Cartesian Expansion Methods We designed a new fast calculation method (ACE) (See Chapter 3) to calculate the interaction between N particles and our algorithm is based on the Cartesian tensor theorem and 2 reduced the calculation cost to O(N) from O(N ) by using the scheme of Fast Multipole Method. In this chapter, we will demonstrate that, by using the expansion, our algorithm can calculate not only the 1/R Coulomb interaction, but any interaction in the form of R-υ such as Lenard-Jones potential. Also we will show how to get the exact solutions of the translation operator from the multipole to the local translation for the kernel of r −ν and Yukawa e −κ r and how ACE works on r different interactions. 4.1 Preliminaries 4.1.1 Generalized Maxwell Expansion (GME) From ACE, for an interaction with form r −ν we know that the kernel of ACE is Multipole to Local translation (Translation Operator) because only this operator depends on the parameter υ. The key part of the translation is to solve the expansion of ∇ n r −ν . An expression for solid harmonics in terms of the gradient of the position r-1 was first derived by Maxwell of an arbitrary set of n axis [76]. In the Cartesian coordinate frame, his expressions reduce to ∇nr −1 = (−1)n r −(2n+1) nrn Maxwell equation This relationship has been obtained by others [85, 86] as well. 116    (4.1.1.1) Maxwell found the expression of ∇ n R −1 80 years ago. Till the work done by us, to the best of our knowledge, there is no the exact expression for ∇ n r −ν  for  any  real ν ≠ 1 .  To be a first one, I found that the element of the total symmetric tensor ∇ n r −ν takes the form ⎛ 1 ⎞ ∂ n1∂ n2 ∂ n3 ⎜ ν ⎟ x y z ⎝r ⎠ = (−1)n r −2n−ν n n n d 1t d 2 t d 3 t 2 2 2 ∑ ∑ ∑ (−1)m r 2m m1=0 m2 =0 m3 =0 ⋅ n−m+1 ∏ i =0 ⎡n ⎤⎡n ⎤⎡n ⎤ (ν + 2i ) ⋅ x n1−2m1 y n2 −2m2 z n3 −2m3 ⎢ 1 ⎥ ⎢ 2 ⎥ ⎢ 3 ⎥ ⎣ m1 ⎦ ⎣ m2 ⎦ ⎣ m3 ⎦ (4.1.1.2) to construct the n-fold gradient, where n = n1 + n2 + n3 , m = m1 + m 2 + m3 . It can be readily shown that this definition reduces to that obtained for ν = 1 . Proof: • For n = 0, we can easy confirm GME. ⎧ n1 = n2 = 0 • For n = 1, when n3 = 1, then ⎨ ⎩ m1 = m2 = m3 = 0 z the left side of GME = -ν ν + 2 r z the right side of GME = -ν ν + 2 r Same as above, when n1 =1 or n2=1, we can verify GME. • Suppose when n1 = N1, n2 = N2, n3 = N3, GME is right, 117    then if ( n3 = N3 + 1), we have N N n ⎛ 1 ⎞ ∂ x 1 ∂ y 2 ∂ z3 ⎜ ν ⎟ ⎝r ⎠ N N N +1 ⎛ 1 ⎞ = ∂ x 1∂ y 2 ∂ z 3 ⎜ ν ⎟ ⎝r ⎠ N N N ⎛ 1 ⎞ = ∂1 ∂ x 1∂ y 2 ∂ z 3 ⎜ ν ⎟ z ⎝r ⎠ = ∂1 z = N N N d 1t d 2 t d 3 t 2 2 2 ∑ ∑ ∑ (−1) N + m r 2 m−2 N −ν m1=0 m2 =0 m3 =0 N N N d 1t d 2 t d 3 t 2 2 2 ∑ ∑ ∑ (−1) = ∑ ∑ ∑ N +m = ∑ ∑ ∑ (ν + 2i )x N1− 2 m1 N 2 − 2 m2 N3 −2 m3 y z N − m+1 ∏ (−1) N +m N − m +1 ∏ N1−2 m1 N 2 − 2 m2 y ⎡ N1 ⎤ ⎡ N 2 ⎤ ⎡ N3 ⎤ N − 2 m −2 N −ν + 2 m ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⋅ ∂z z 3 3r ⎢ m1 ⎥ ⎢ m2 ⎥ ⎢ m3 ⎥ ⎣ ⎦⎣ ⎦⎣ ⎦ ( ) (ν + 2i )x N1−2 m1 N 2 − 2 m2 y ⎡ N1 ⎤ ⎡ N 2 ⎤ ⎡ N3 ⎤ N − 2 m −2 N −ν + 2 m ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⋅ ∂z z 3 3r ⎢ m1 ⎥ ⎢ m2 ⎦ ⎢ m3 ⎥ ⎣ ⎦⎣ ⎥⎣ ⎦ ( ) (ν + 2i )x N1−2 m1 N 2 − 2 m2 y ⎡ N1 ⎤ ⎡ N 2 ⎤ ⎡ N3 ⎤ ⎢ ⎥⎢ ⎥⎢ ⎥ ⎢ ⎥⎢ ⎦⎣ ⎦ ⎣ m1 ⎦ ⎣ m2 ⎥ ⎢ m3 ⎥ i =0 (−1) N + m m1=0 m2 =0 m3 =0 N − m+1 ∏ i =0 ⎛ N −2 m −1 N −2 m (2 N + ν − 2m) ⋅ z ⎞ ⋅ ⎜ ( N3 − 2m3 ) z 3 3 r −2 N −ν + 2m − z 3 3 ⋅ 2 N +ν −2 m+ 2 ⎟ ⎝ ⎠ r = (−1) N +1 r −2( N +1)−ν N N N d 1t d 2 t d 3 t 2 2 2 ∑ ∑ ∑ m1=0 m2 =0 m3 =0 ⋅x N1− 2 m1 N 2 −2 m2 N3 − 2 m3 y z (−1) m r 2 m N − m+1 ∏ (ν + 2i ) i =0 ⎡ N1 ⎤ ⎡ N 2 ⎤ ⎡ N3 ⎤ ⎛ N3 − 2m3 r 2 ⎞ ⋅ ⎟ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⋅⎜ z − 2 N + ν + 2m z ⎟ ⎢ m1 ⎥ ⎢ m2 ⎥ ⎢ m3 ⎦ ⎜ ⎥ ⎝ ⎣ ⎦⎣ ⎦⎣ ⎠ 118    ⎡ N1 ⎤ ⎡ N 2 ⎤ ⎡ N3 ⎤ ⎢ ⎥⎢ ⎥⎢ ⎥ ⎢ m1 ⎥ ⎣ m2 ⎦ ⎢ m3 ⎥ ⎣ ⎦⎢ ⎥⎣ ⎦ (ν + 2i )x i =0 m1=0 m2 =0 m3 =0 N N N d 1t d 2 t d 3 t 2 2 2 ∏ i =0 m1=0 m2 =0 m3 =0 N N N d 1t d 2 t d 3 t 2 2 2 N −m +1 = (−1) N +1 r −2( N +1)−ν N N N d 1 t d 2 t d 3 t+1 2 2 2 ∑ ∑ ∑ (−1)m r 2m N +1−m+1 m1=0 m2 =0 m3 =0 ∏ (ν + 2i) i =0 ⎡ N ⎤ ⎡ N ⎤ ⎡ N + 1⎤ ⋅ x N1−2m1 y N2 −2m2 z ( N3 +1)−2m3 · ⎢ 1 ⎥ ⎢ 2 ⎥ ⎢ 3 ⎥ ⎣ m1 ⎦ ⎣ m2 ⎦ ⎣ m3 ⎦ N N +1 We know that d 3 t + 1 = d 3 t , finally we get 2 2 ⎛ 1 ⎞ ∂ N1∂ N 2 ∂ N3 +1 ⎜ ν ⎟ = x y z ⎝r ⎠ N +1 N N d 1t d 2 t d 3 t 2 2 2 ∑ ∑ ∑ (−1) N +1+ m r 2m−2( N +1)−ν ⋅ (4.1.1.3) m1=0 m2 =0 m3 =0 N +1−m+1 ∏ i =0 ⎡n ⎤⎡n ⎤⎡n ⎤ (ν + 2i ) ⋅ x N1−2m1 y N2 −2m2 z ( N3 +1)−2m3 ⋅ ⎢ 1 ⎥ ⎢ 2 ⎥ ⎢ 3 ⎥ ⎣ m1 ⎦ ⎣ m2 ⎦ ⎣ m3 ⎦ For n1 = N1 + 1 and n2 = N2 + 1, the proof is same as above. GME is very important for ACE because a lot of interactions are in the form of r −ν (such as Lenard-Jones potential) or can be expanded in the form of r −ν (such Yukawa interaction). We will see the applications of the equation (4.1.1.3) in our numerical simulations. 4.1.2 New Expansion of Modified Bessel K (New K) For Multipole to Local translation of Yukawa interaction in ACE methods, we will find that the convergence of convention calculation method (by using GME) is not fast enough (See section 4.4). 119    In order to accelete the calculation speed of the translation operator, we need to find a new fast convergence formula to replace the convention one. Due to a good fortune, I found a new expression of Modified Bessel K function, that is ∞ l N −1 1 ⎞2 1 +N r ⎛2 (1 − l + 2i ) = − ⎜ ⎟ ( − r ) 2 K ∑ l! ∏ 1 (−r ) −N − ⎝π ⎠ l =0 i =0 2 (4.1.2.1) where, N is a positive integer and K is Modified Bessel function. By using mathematical induction, we can prove formula (4.1.2.1) Proof: When N=0, we easily prove ∞ r l 0−1 1 ⎞2 1 ⎛2 ∑ l ! ∏ (1 − l + 2i) = − ⎜ π ⎟ (− r ) 2 K − 1 (− r ) ⎝ ⎠ l =0 i =0 2 Suppose, for N, New K formula is correct, l N −1 ∞ 1 2 1 +N r ⎛2⎞ 2 ∑ l ! ∏ (1 − l + 2i) = − ⎜ π ⎟ (− r ) K − N − 1 (− r ) ⎝ ⎠ l =0 i =0 2 • For N+1, from the left side of New K formula, we have 120    ⎞ d ⎛ ∞ r l ( N +1)−1 (1 − l + 2i ) ⎟ ⎜∑ ∏ ⎟ d ( − r ) ⎜ l =0 l ! i =0 ⎝ ⎠ l ·r l −1 ( N +1)−1 ∏ (1 − l + 2i) l! l =0 i =0 ∞ = −∑ r l −1 ( N +1)−1 ∏ (1 − l + 2i) (l − 1)! i =0 l =0 ∞ = −∑ ⎤ r l −1 ⎡ ( N +1)−1 = −∑ ⎢ ∏ (1 − l + 2i ) ⎥ ⋅ (1 − l ) ⎥ l =0 (l − 1)! ⎢ i =1 ⎣ ⎦ ∞ r l −1 ( N +1−1)−1 ∏ [1 − l + 2(i′ + 1)] (l − 2)! i′=0 l =0 ∞ =∑ let i′ = i -1 r l −1 N −1 ∏ [1 − (l − 2) + 2i′] (l − 2)! i′=0 l =0 ∞ =∑ = r k +1 N −1 ∑ k ! ∏ (1 − k + 2i′) k =−2 i ′= 0 ∞ r k N −1 ∏ (1 − k + 2i′) k ! i′=0 k =0 ∞ = r· ∑ • let k = l - 2 ∵ (− n)! = 1 = ∞ for n ≥ 0 0 ⋅1⋅ 2 ⋅⋅⋅ (n + 1) For N+1, from the right part of New K formula, we have 1 1 ⎡ ⎤ 1 1 + ( N +1) d ⎢⎛ 2 ⎞ 2 ⎥ = − ⎛ 2 ⎞ 2 ( − r ) 2 + ( N +1) K 2 K 1 (−r ) ⎥ 1 (−r ) ⎜ ⎟ (− r ) ⎜ ⎟ − ( N +1) − −N − d ( − r ) ⎢⎝ π ⎠ ⎝π ⎠ 2 2 ⎢ ⎥ ⎣ ⎦ 1 ⎡ ⎤ 1 +N ⎛ 2 ⎞2 ⎢ ⎥ 2 K = r ·⎢ ⎜ ⎟ ( − r ) 1 (− r ) ⎥ −N − π⎠ ⎝ 2 ⎢ ⎥ ⎣ ⎦ 121    Base on the Eq. 4.1.1.3 and 4.1.2.1, we can generate a fast kernel of ACE, the results we got will be shown on the section “Yukawa (Shielded Coulomb) Potential Calculation Based on ACE” of this chapter. 4.2 The Error in Numerical Simulation Based on ACE In our simulations, the error will be computed by using the following: L2 = ∑ (φnum (ri ) − φexact (ri ) ) i 2 (4.2.1) 2 ∑φexact (ri ) i As is commonly done, all near field interactions are ignored. This gives us a measure of the error produced by the multipole representation. All timing runs are performed on a Linux desktop that has a 2.8 GHz Intel processor, and the run times reported are obtained using an intrinsic function dtime. Finally, P will denote the maximum rank of the tensor used in expansions. 4.3 Numerical Simulation Results In this section, we will demonstrate the validity of the numerical method presented via numerous examples. The overarching goals of this section are as follows: 1) numerically show that the traversal up and down the tree can be performed exactly, 2) demonstrate that the proposed method produces accurate results for different values of ν, 3) demonstrate that this scheme can be used seamlessly for computing potentials that are superposition of potential of the form R -ν for multiple values of m using a single tree traversal, and 4) Experimentally demonstrate that the proposed scheme scales as O(N). 122    Level=3 Level=5 P=2 3.268070962493E-03 3.268070962493E-03 P=5 2.866109269814E-05 2.866109269814E-05 P=8 4.207517301401E-07 4.207517302159E-07 P=11 7.470454043400E-09 7.470454038638E-09 Level=7 Level=10 P=2 3.268070962493E-03 3.268070962493E-03 P=5 2.866109269812E-05 2.866109269808E-05 P=8 4.207517301963E-07 4.207517301868E-07 P=11 7.470454030685E-09 7.470454009644E-09 . (Table 4.3.1 The variation of errors in multipole to multipole with fixed translation error for ν = 2.2) First, we will demonstrate the accuracy of operators used to traverse up and down the tree. The geometric configuration analyzed is as follows: 8000 source/observers populate a cube of dimension 4 x 4 x 4. Of these, 4000 are located at Ω1 = (0.0, 1.0) x (0.0, 1.0) x (0.0, 1.0), and the rest are located in Ω2 = (3.0, 4.0) x (3.0, 4.0) x (3.0, 4.0). The distribution of the points is uniformly random, i.e., the distribution in the domain is almost uniform. This arrangement ensures that following; particles in Ω1 and Ω2 interact with each other at level 3. For a given P, the error bound for this interaction can be computed. As we increase the number of levels, the change in the error norm can be attributed solely to the error multipole-to-multipole and local-tolocal translations. Table 4.3.1 documents the error obtained for different values of P and different levels (while we have data for all 10 levels, only some are presented). All computations 123    are carried out for ν = 2.2. As is evident from the results presented, the variation of the error obtained from using different levels in the tree is accurate to double precision. Next, we perform a similar experiment, but for the lattice gas potential φ (r) = ∑ qi ln r - ri that is very i commonly used in the electronic structure calculations. Again, as is evident from Table 4.3.2, the variation of errors for different levels is within double precision accuracy. Note, the translation operator for this function can be readily derived from the material presented earlier Level =3 Level =5 Level =7 P=2 2.194595063535080E-04 2.194595063534780E-04 2.194595063534940E-04 P=5 2.949087946097280E-07 2.949087945862070E-07 2.949087945902640E-07 P=9 4.696476490009490E-10 4.696476221096200E-10 4.696476168731460E-10 P=15 5.455886813905550E-14 5.453948560581600E-14 5.451163061523900E-14 (Table 4.3.2 Variation of errors in multipole to multipole operations with fixed translation error for computing the lattice gas potential function) In the next series of numerical experiment, we demonstrate the efficiency and convergence of the proposed method. This is done by analyzing potentials due to randomly distributed sources at random observation points. In what follows, the source/observation points are co-located, and 3 are randomly distributed. Four different computational domains are chosen: (0.0, 1.0) , (0.0, 3 3 3 2.0) , (0.0, 4.0) and (0.0, 8.0) . These domains are populated with 500, 4000, 32,000, and 256,000 source/observers. As mentioned earlier, the size of the smallest box Ω0 depends on the degree of approximation P. This implies that the number of levels in the tree will vary with P. 124    Tables 4.3.3 – 4.3.5 demonstrate convergence and speed for ν = 1, 3.3, -3.3. All errors reported are computed using equation (4.2.1). As is evident from these tables, the proposed method converges rapidly for different values of ν, and is faster than the direct computation. The convergence behavior follows the trends expected for different values of ν, as does the timing data with respect to the number of harmonics. Again, as reported earlier, it is possible to choose the size of the smallest box to optimize the timing data; while this is not done precisely here, the size of the smallest box is varied depending on the desired accuracy (P). The breakeven point, of course, depends on the -4 accuracy (and m). For ν = 1, and P = 2 (which results in an error of 10 ), the breakeven point is as low as 250 source/observers. It should be noted that in obtaining this timing data, we have not fully optimized the M2L stage in keeping with some of the development suggested in [81] and in -4 papers thereafter. Even so, the timing data obtained for ν = 1 and accuracy of 10 favorably compares with some of the most optimized codes. 125    (Table 4.3.3 Errors in Coulomb potential (ν = 1) computed using the proposed scheme and the directly timing data for the two methods.) 126    3.3 (Table 4.3.4 Errors in Coulomb potential (1/R ) computed using the proposed scheme and the directly timing data for the two methods) 127    3.3 (Table 4.3.5 Errors in Coulomb potential (R ) computed using the proposed scheme and the directly timing data for the two methods) 128    Next, we demonstrate the application of this technique to compute the Lennard-Jones 3 potential. The computational domain Ω = (0.0, 1.0) is filled with 12,167 source / observers whose location is uniformly random. The potential computed is of the form ⎛ 1 φ ( r ) = qi ⎜ ⎝r 12 6 ⎞ − 6⎟ r ⎠ (4.3.1) A uniform oct-tree with five levels are constructed. As was noted earlier, the traversal up and down the tree are independent of the potential, and only the translation across the tree depends on the specifics of the potential being computed. Table 3.7.6 tabulates the error with increasing P. As is evident, the potential computed converges rapidly with increasing P. P Error 10 4.614155995813000E-02 14 6.242473037897020E-03 18 3.954018566296790E-04 22 8.407075999784550E-05 32 1.246581347878100E-06 (Table 4.3.6 Error in the Lennard-Jones potential; the computations are performed using one tree) Finally, we compare the computational cost of the proposed scheme with that of direct computation. Timing data for domains of increasing size are obtained; the number of source/observation pairs vary from 500 to 1,000,000. The density of particles in the domain is chosen to be 500 per unit cube, the particles are randomly distributed in the domain, and ν = - 1.5. -3 The precision P = {2, 4, 10}, which translates to errors ranging from 10 , 10 129    -6 -8 and 10 , respectively. In the simulation, the size of the smallest box was kept the same, i.e., Ω0 = 0.25 for all values of P. All simulations are run on a Linux desktop (running Redhat9.0) with 2.8 GHz Intel Pentium processor with the Intel Fortan compiler. The timing data is obtained using the intrinsic function dtime. (Fig. 4.3.2. Cost scaling of the direct method and the fast algorithm (ACE) for different values of 1.5 the precision P for computing the potential R .) Fig. 4.3.2 compares the time required for computing pair-wise potentials classically and using the methodology presented herein. It is evident from Fig. 4.3.2 that the cost scales as O(N); the slopes of all three fast methods is approximately 1.0. A noteworthy fact is that the breakeven 130    point, i.e., the number of unknowns above which the proposed technique is computationally -3 more efficient, is as low as 250 source/observation points for an error of 10 . 4.4 Yukawa (Screened Coulomb) Potential Calculation Base on ACE In the molecular systems, especially for large-scale physically realistic charged systems, the long-range interactions play a major role. All interactions between charged particles must be accounted for so that, typically, the exact calculation of the interactions by direct summation requires O(N2) operations in the framework of a N-body problem. As we know, many molecular systems of biological and physical significance, however, are governed by the screened Coulomb (also called the Debye-Huckel or Yukawa) potential. The Yukawa potential energy of a system containing N charged particles, is given by N i −1 φ (r ) = ∑ ∑ qi q j i =1 j =1 e −κ rij (4.4.1) rij where κ is the Debye-Huckel screening parameter (which is proportional to the square root of the ionic strength of the solution), qi is the charge of particle i, and rij is the separation distance between particles i and j. The cost of computing the Coulomb potential was amortized to O(N) via the development of the fast multipole method. The development of similar methods for the Yukawa potential is almost non-existent. A paper was presented by [87]. This method formulates the problem in terms of the addition theorems for the modified Bessel function and derives a series of approximations. However, the methodology presented is not rigorous, in the sense, that error 131    bounds on the set of approximations made are not provided, the convergence is not shown, and the overall methodology is fairly inaccurate. Since the publication of the seminal papers on fast method to compute potentials of the form r -1 [63] and because the researchers believed that the Taylor series represents a poor convergence for exp(-r), there have been a number of attempts to compute potentials of the form ⎛ −κ r ⎞ exp ⎜ ⎟ , and the work relies on spherical harmonic expansions, ⎝ r ⎠ e −κ R 2κ = φ (R ) = R π ∞ ∑ (2n + 1) n =0 π π ˆ ˆ I 1 (κρ ) K 1 (κ r ) Pn ( ρ ⋅ r ) 2κρ n + 2kr n + 2 (4.4.2) 2 where R = r - ρ . Unfortunately, the fast calculation based on traditional FMM will introduce the add-on error between the translations from multipole to multipole expansion and local to local expansion. And also, all the development of the fast calculation algorithms based on traditional FMM were an O(NlogN) methods before ACE was done. So, the motivations for writing this section are three fold: 1) To introduce a new technique (Maxwell Cartesian tensor method) wherein the traversal up and down the tree are exact. 2) We also prove quick convergence for this new method. 3) This method can work on the wave equation e − ikr in low frequency case. r In this section, we will show, by combining Taylor's series expansion with Maxwell Cartesian tensors method, we can reach almost O(pN) algorithms. 4.4.1 General Statement of the Problem 132    Consider a domain Ωs ∈ 3 that is populated with k sources and a domain Ωo ∈ 3 that contains k observers. With no loss of generality, assume that these domains are spherical and of radius a. These spherical domains completely enclose one of the cubical subdomains generated earlier. The location of these points is random, however we will assume that the distribution in the domain is sufficiently dense and relatively uniform. Centers of Ω s and Ωo are denoted by rs and ro . It is assumed that Ωs ⊂ Ωs and Ωo ⊂ Ωo and Ω s ∩ Ωo = ∅ . In what follows, the domains Ωs and Ωo will be called parents of Ω s and Ωo , respectively. As before, the parent domains will be assumed to be spherical of size 2a , and their center are denoted by rsp and rop , respectively, and rsp ∉Ωs and rop ∉ Ωo . The potential due to sources ∀ri ∈ Ω s ∀ri ∈ Ω s observed at r ∈ Ω o is given by e−κ |r −ri | φ (r ) = ∑ qi | r − ri | i =1 m =∑∑ q (−1)n i rin ⋅ n ⋅∇ n e−κ r r ∞ m q (−1)n i rin ⋅ n ⋅∇ n e−κ r r m ∞ i =1 n=0 = ∑∑ n =0 i =1 n! n! (4.4.1.1) This equation is derived using Theorem 2.10, and qi are values of the sources at locations ri . The exchange of the summation indices is permissible as the summation converges. Unless otherwise stated, the operator ∇ operates on r. In what follows, we will prescribe the means that will enable that rapid computation of equation (3.1.1.1). 133    4.4.2 Cartesian Expansions with Totally Symmetric Tensors Theorem 4.4.2.1 (Multipole Expansion) The total potential at any point r ∈ Ω o due to k sources qi , i = 1, , k located at points ri ∈ Ω s is given φ (r ) = ∞ ∑M n =0 k (n) e −κ r ⋅ n ⋅∇ r M ( n ) = ∑ ( −1) n i =1 n qi n ri n! (4.4.2.1) where M ( n ) is the totally symmetric multipole tensor about the origin rs = {0, 0, 0} Proof: See formula (3.1.1.1) Theorem 4.4.2.2 (Multipole to Multipole Expansion) Given a multipole expansion of k sources about the rs = {0, 0, 0} k O( n) = ∑ (−1)n i =1 qi n ri n! (4.4.2.2.1) then the multipole expansion about the point rsp can be expressed in terms of above equation as k M ( n ) = ∑ ( −1) n i =1 n q! m! p rs (ri − rsp ) n = ∑ ∑ n! n! m =0 P ( m,n ) ( ) n−m O(m) (4.4.2.2.2) Where P(n, m) is the permutation of all partitions of n into sets n-m and m. (n) Proof: Using equation (3.2.2.1), Theorem A2.1, Theorem A2.10 and noting that the tensors O ∇ n r −ν are totally symmetric, results in 134    φ (r ) ∞ ⎧k q ⎪ = ∑ ⎨∑ ( −1) n i ri − rsp n! n =0 ⎪ i =1 ⎩ ( n⎫ ⎪ ) ⎬ ⋅ n ⋅∇ ⎪ ⎭ p n e−κ |r −rs | r − rsp | n ⎧k ⎪ n qi = ∑ ⎨∑ ( −1) ∑ (−1)n−m ∑ rsp n ! m =0 n =0 ⎪ i =1 P ( n,m ) ⎩ ∞ | ( ) p| ⎫ e −κ |r −rs ⎪ ⎭ n−m | r − rsp | ⎪ rsm ⎬ ⋅ n ⋅∇ n p ⎧ n m! p ⎪ = ∑⎨∑ ∑ rs ⎪ m =0 P ( n , m ) n ! n =0 ⎩ n−m −κ |r −rs | ⎫ ⎡k ne m qi m ⎤ ⎪ rs ⎥ ⎬ ⋅ n ⋅∇ ⎢ ∑ (−1) m! ⎥ ⎪ | r − rsp | ⎢ i =1 ⎣ ⎦⎭ ⎧ n m! p ⎪ rs = ∑⎨∑ ∑ n! n = 0 ⎪ m =0 P ( m , n ) ⎩ n−m −κ |r −rs | −κ |r −rs | ∞ ⎫ (m) ⎪ ( n) ne ne O ⎬ ⋅ n ⋅∇ = ∑ M ⋅ n ⋅∇ | r − rsp | n=0 | r − rsp | ⎪ ⎭ ∞ ∞ ( ) ( ) p p (4.4.2.2.3) p Next, we prescribe the means to translate the multipole expansion that exists about rs to a local expansion about rop . There are two ways of deriving this translation operator; 1) using reciprocity and 2) using a Taylor's series expansion. The theorem that is presented next uses the first approach whereas the second approach is used in presenting a similar operation for method 2. As is expected, both result in identical expressions. Theorem 4.4.2.3 (Multipole to Local Translation) 135    Assume that the domains Ωs and Ωo are sufficiently separated, and the distance between p p p { } p { } their centers ros =| ros |=| ro − rs | is greater than diam Ωs and diam Ωo . If a multipole p expansion M ( n ) is located at rs , then another expansion L( n ) that produces the same field ∀r ∈Ωo is given by ∞ φ (r) = ∑ ρn ⋅ n ⋅ L( n) n =0 ( n) L p −κ r m e os 1 ( m− n ) = ∑ M ⋅ (m − n) ⋅∇ p ros m=n n ! ∞ (4.4.2.3.1) where ρ = r − rop . Proof: The potential φ (r)∀r ∈Ωo is given by equation (4.4.1.1). It follows by reciprocity that p if the multipole M ( n ) were located at r it would produce the same potential at rs . In other words, the potential at all points r ∈ Ω o can be computed by placing the multipole moments at r p and evaluating at rs . φ (r ) = ∞ ∑M n =0 Here, (n) ⋅ n ⋅∇ n p e −κ |r −rs | | r − rsp | (4.4.2.3.2) p ∇ is the derivative with respect to rs . Since this valid at all points r ∈ Ω , the p multipole tensor M ( n ) at r may be translated to the center ro using the symmetric translation. p Denoting the multipole tensor at rs by O( n) and ρ = r − rop , using the multipole expansion theorem, we obtain the potential 136    φ (r ) = ∞ ∑O ( n) ⋅ n ⋅∇ n =0 n p e −κ ros p ros p −κ ros ⎛ n 1 ⎞ ne = ∑ ⎜ ∑ ρ m M ( n−m) ⎟ ⋅ n ⋅∇ p ⎜ ⎟ ros n =0 ⎝ m =0 m ! ⎠ ∞ p ⎛ −κ ros ⎜ 1 M ( n− m) ⋅ (n − m) ⋅∇ n e = ∑ ∑ ρ ⋅m⋅ p ⎜ m! ros ⎜ n =0 m = 0 ⎝ ∞ = n m ⎞ ⎟ ⎟ ⎟ ⎠ ∞ (4.4.2.3.3) ∑ ρn ⋅ n ⋅ L(n) n =0 where L( n ) p −κ ros me 1 = ∑ M ( m − n ) ⋅ ( m − n) ⋅ ∇ p ros m=n n ! ∞ (4.4.2.3.4) n is obtained by gathering tensors that operate up on ρ . Where the element of the total symmetric tensor ∇n −κ r n n n ⎛e ∂ i 1 ∂ j 2 ∂ k3 ⎜ ⎜ r ⎝ = ⎞ ⎟ ⎟ ⎠ n n n d 1t d 2 t d 3 t 2 2 2 ⎡n ⎤⎡n ⎤⎡n ⎤ −1n + m r 2 m−2 n −1 ⎢ 1 ⎥ ⎢ 2 ⎥ ⎢ 3 ⎥ ⎣ m1 ⎦ ⎣ m2 ⎦ ⎣ m3 ⎦ m1=0 m2 =0 m3 =0 ∑ ∑ ∑ x n1− 2 m1 y n2 −2 m2 z n3 −2 m3 ⋅ 2 π 1 −m+ n (κ r ) 2 K m−n− Proof : 137    e−κ r takes the form r (4.4.2.3.5) 1 (κ r ) 2 ⎛e ∂ n1∂ n2 ∂ n3 ⎜ x y z ⎜ −κ r ⎞ ⎟ r ⎟ ⎝ ⎠ ∞ ( −1)l l n1 n2 n3 ⎛ 1 ⎞ κ ∂ x ∂ y ∂ z ⎜ 1−l ⎟ l! ⎝r ⎠ l =0 =∑ = n n n d 1t d 2 t d 3 t 2 2 2 (4.4.2.3.6) ⎡n ⎤⎡n ⎤⎡n ⎤ (−1) n+ m ⎢ 1 ⎥ ⎢ 2 ⎥ ⎢ 3 ⎥ r 2m−2 n−1 ⎣ m1 ⎦ ⎣ m2 ⎦ ⎣ m3 ⎦ m1=0 m2 =0 m3 =0 ∑ ∑ ∑ (−1)l κ l r l n−m−1 ∏ (1 − l + 2i) l! l =0 i =0 ∞ ⋅ x n1−m1 y n2 −m2 z n3 −m3 ⋅ n ∑ By using New Modified Bessel K (formula 4.1.2.1), then we have n n n ⎛e ∂ i 1 ∂ j 2 ∂ k3 ⎜ ⎜ −κ r ⎞ ⎟ r ⎟ ⎝ ⎠ = n n n d 1t d 2 t d 3 t 2 2 2 ⎡n ⎤⎡n ⎤⎡n ⎤ −1n+ m r 2 m− 2 n −1 ⎢ 1 ⎥ ⎢ 2 ⎥ ⎢ 3 ⎥ ∑ ∑ ∑ ⎣ m1 ⎦ ⎣ m2 ⎦ ⎣ m3 ⎦ m1=0 m2 =0 m3 =0 x n1− 2 m1 y n2 − 2 m2 z n3 − 2 m3 ⋅ 2 π 1 −m+ n K (κ r ) 2 m−n− 1 (κ r ) 2 (4.4.2.3.7) Next, we prescribe the means to traverse down from rop to r0. This theorem is almost a mirror of that used to go up the tree. Theorem 4.4.2.4 (Local to Local Expansion) 138    A local expansion O ( n ) that exists in the domain Ωo centered around rop can be shifted to the domain Ωo centered at r0 using L( n ) = ∞ ∑ ⎜ m − n ⎟O ( m ) ⋅ (m − n ) ⋅ ( rocp ) ⎛ m ⎞ m=n ⎝ m−n ⎠ (4.4.2.4.1) Proof. The proof is exact the same as Theorem 3.2.4. Finally, the potential at any point in the Ωo can be obtained using φ (r ) = ∞ ∑ L( n) ⋅ n ⋅ ( ρoi ) n (4.4.2.4.3) n =0 4.4.3 Calculation Cost Based ACE -ν Same as the cost we got in calculation for R potential, the calculation cost for Yukawa potential includes the following parts: ⎛ N 3⎞ P ⎟ ⎝6 ⎠ • Charge to Multipole ∝ ⎜ • ⎛ N P6 ⎞ Multipole to Multipole ∝ ⎜ ⎜ s 720 ⎟ ⎟ ⎝ ⎠ • ⎛ N P6 ⎞ Multipole to Local ∝ 181⎜ ⎜ s 720 ⎟ ⎟ ⎝ ⎠ • Local to Local, and Local to Observer are the same as Multipole- to-Multipole and Charge-to-Multipole. 139    4.4.4 Error Bounds The absolute error in making the multipole approximation can be obtained using P εm = φ (r ) − ∑ M ( n) ⋅ n ⋅∇ n n =0 ∞ ∑ = M ( n ) ⋅ n ⋅∇ n e −κ r r M ( n) ⋅ n ⋅∇ n e −κ r r n = P +1 ∞ ∑ ≤ n = P +1 ≤ e−κ r r ⎛ e −κ r ⎞ M ( n ) ( n1, n2 , n3 ) ⎜ ∇ n ⎟ (n1, n2 , n3 ) ∑ ∑ ⎜ r ⎟ n = P +1 n1,n2 ,n3 ⎝ ⎠ ∞ ≤ ∞ ∑ n = P +1 ≤ ∞ (−κ )l l r l =0 l ! Cm a n r − n−1 ∑ n ⎛a⎞ ∑ Cm ⎜ r ⎟ r −1e−κ r ⎝ ⎠ n = P +1 Cm ⎛ a ⎞ ≤ ⎜ ⎟ (r − a) ⎝ r ⎠ P +1 (4.4.4.1) e −κ r where, we assume that the source/observation spheres are of radius a. Since the distance is always greater than the radius of the sphere, it implies that the series converges quickly for all negative values of κ and all imaginary value with low frequency k. Next, the error bounds on truncating the local expansions are derived as follows: 140    εl = = ∞ ∑ ρn ⋅ n ⋅ L( n) n= P +1 ∞ ∑ ρn ⋅ n ⋅ n= P +1 ≤ ∞ ∑ n= P +1 −κ r me 1 ( m−n) M ⋅ (m − n) ⋅∇ ∑ p ros m= n n ! ∞ (4.4.4.2) ∞ l (−κ ) p − m+l −1 (ros ) l =0 l ! Cm a n a m − n ∑ ⎛ a ≤ p ⎜ p (ros − a ) ⎜ ros ⎝ Cm ⎞ ⎟ ⎟ ⎠ P +1 p e −κ ros These equations imply that the approximations converge ∀ν ∈ . The above estimates are for the absolute error; similar the estimates for the relative error can be readily obtained. For example, from the Multipole expansion, εmr C ⋅r ⎛ a ⎞ = m ⎜ ⎟ ~ −κ r / r (r − a) ⎝ r ⎠ e εm P +1 (4.4.4.3) So, if we choose the computational accuracy of εmr , we can evaluate the truncation number "P" from the above equation (4.4.4.3). 4.4.5 Simulation Results First, we will demonstrate the accuracy of operators used to traverse up and down the tree. The geometric configuration analyzed is as follows: 9000 source/received populate a cube of dimension 4 x 4 x 4. Of these, 4500 source/observers are located at Ω1 = (0.0, 1.0) x (0.0, 1.0) x (0.0, 1.0), and the rest are located in Ω2 = (3.0, 4.0) x (3.0, 4.0) x (3.0, 4.0). The distribution of 141    the points is uniformly random, i.e., the distribution in the domain is almost uniform. This arrangement ensures that following; particles in Ω1 and Ω2 interact with each other at level 3. Level =3 Level =5 p= 2 1.329108391195769E-003 1.329108391195763E-003 p= 3 1.631733296883059E-004 1.631733296882647E-004 p= 4 2.461467077927862E-005 2.461467077927824E-005 p= 5 4.355505108030183E-006 4.355505108016297E-006 p=6 9.477150289163246E-007 9.477150288876174E-007 p= 7 2.036578472901254E-007 2.036578472487373E-007 p= 8 5.499653006794869E-008 5.499653006720579E-008 p=9 2.262098414476803E-008 2.262098417551747E-008 Level =7 Level =10 p= 2 1.329108391195766E-003 1.329108391195760E-003 p= 3 1.631733296882652E-004 1.631733296882728E-004 p= 4 2.461467077929170E-005 2.461467077929388E-005 p= 5 4.355505108018203E-006 4.355505108017509E-006 p=6 9.477150288988684E-007 9.477150288981971E-007 p= 7 2.036578472441334E-007 2.036578472504749E-007 p= 8 5.499653006677334E-008 5.499653006662830E-008 p=9 2.262098417652223E-008 2.262098417231141E-008 (Table 4.4.5.1 Variation of errors in multipole to multipole and local to local operations with fixed translation error) 142    no.source p a levels Error TFast Tdir 500 2 0.25 3 1.262927728730094E-003 0.5000E-002 2.0000E-002 4000 2 0.25 4 6.013826251951964E-004 4.5000E-002 1.3600 32000 2 0.25 5 5.25555366251479E-004 0.68500 116.5300 256000 2 0.25 6 4.763719518116608E-004 6.87500 7752.4000 1000000 2 0.25 7 4.60E-004(est) 29.685 121338.88505(est) 500 6 0.35 3 5.99055086807625E-005 2.0000000E-02 4000 6 0.35 4 2.213037934701564E-005 0.250000 1.2200 32000 6 0.35 5 1.0474747848110440E-005 2.84500 96.0000 256000 6 0.35 6 1.679191839847712E-005 23.16 6505.4500 4000 13 0.7 3 9.338207243842676E-007 0.46500 0.6800 32000 13 0.7 4 4.416413887399910E-007 6.8750 85.6300 256000 13 0.7 5 5.794151572000142E-007 69.22500 6500.7900 32000 22 1.0 3 1.737072973803084E-008 24.8600 70.6700 256000 22 1.0 4 5.436536138424175E-007 247.5200 6628.5000 1.0000000E-02 (Table 4.4.5.2 Errors in Yukawa potential κ = -0.17, computed using the proposed scheme and the directly timing data for the two methods, a is the size of smallest box) 143    TFast Error Tdir no.source p a levels 1000 2 0.25 3 1.436517045087933E-003 1.499999999E-002 0.22 1000 9 0.25 3 9.433469534878107E-006 4.5000002E-002 0.22 1000 16 0.25 3 3.397105761650021E-008 2.96 0.22 8000 2 0.25 3 1.151710346719531E-003 2.9999999E-002 5.58 8000 9 0.25 3 5.127398846981397E-006 9.1000000E-002 5.58 8000 16 0.25 3 1.205115662289324E-008 3.965 5.58 10000 2 0.25 3 1.106236796297262E-003 3.50000E-02 10.42 10000 9 0.25 3 9.872487112520653E-007 0.905 10.42 10000 16 0.25 3 1.160652577117371E-008 5.615 10.42 64000 2 0.25 3 1.064273818861804E-003 0.175 373.07 64000 9 0.25 3 1.957963023436753E-006 1.82 373.07 100000 2 0.25 3 1.080898257008474E-003 0.27 913.11 100000 9 0.25 3 9.051714873097575E-007 2.26 913.11 (Table 4.4.5.3 Errors in Yukawa potential κ = -1.0, computed using the proposed scheme and the directly timing data for the two methods a is the size od smallest box) For a given P, the error bound for this interaction can be computed. As we increase the number of level, the change in the error norm can be attributed solely to the error multipole-tomultipole and local-to-local translations. Table 4.4.5.1 documents the error obtained for different values of P and different levels (while we have data for all ten levels, only some are presented). Table 4.4.5.1 and Table 4.4.5.2 are carried out for κ = -0.17 because usually the pion's up-limit is about 0.17Mev. As is evident 144    from the results presented, the variation of the error obtained from using different levels in the tree is accurate to double precision. Source P Error Tdir/TACE RACE 1000 2 1.44E-03 0.22/1.50E-02 14.67 1000 9 9.43E-06 0.22/4.50E-02 4.89 8000 2 1.15E-03 5.58/3.00E-02 186 8000 9 5.13E-06 5.58/0.91E-01 61.32 8000 16 1.21E-08 5.58/3.965E-00 1.407 10,000 2 1.11E-03 10.42/3.50E-02 297.71 10,000 9 9.87E-07 10.42/0.905E-00 11.58 10,000 16 1.16E-08 10.42/5.615E-00 1.86 64,000 2 1.06E-03 373.07/0.175E-00 2131.83 64,000 9 1.96E-06 373.07/1.82E-00 204.98 100,000 2 1.08E-03 913.11/0.27E-00 1690.9 100,000 9 9.05E-07 913.11/2.26E-00 404.03 RPWR 8.9 57.83 (Table 4.4.5.4 Comparison with "Plane Wave Representation (PWR)" under same error. Errors in case κ =1 at level 3, RXXX is the relative calculation speed, RACE = Tdir/TACE, RPWR = Tdir/TPWR) 145    P Error 1 1.1726339E-002 3 4.8040578E-004 5 3.9361585E-005 9 4.3802832E-006 12 7.0297358E-007 e jkr (Table 4.4.5.5 ACE Errors in Helmholtz potential calculation in low frequency case with r the largest box size = λ/2) 4.5 Summary In this chapter, we have demonstrated the method for rapidly computing potentials of the e −κ r form R and Yukawa interaction . This method is founded on addition theorems based on r -ν Taylor expansions. Taylor’s series has a couple of inherent advantages: 1) it forms a natural framework for developing addition theorem based computational schemes for a range of potentials; 2) only Cartesian tensors (or products of Cartesian quantities) are used as opposed to special -ν functions. This makes creating a fast scheme possible for potential of the form R . Indeed, it is also possible to generalize the proposed methods to several potentials that are important in mathematical physics [52]. An interesting consequence of the approach has been the demonstration of the equivalence of FMMs that are based on traceless Cartesian tensors to those based on spherical expansions for ν = 1. 146    Finally, we have shown the application of this methodology to computing Columbic, Lennard-Jones, lattice gas potentials and Yukawa potential. We have also demonstrated the efficacy of this scheme for other (non-integer) potential functions. Current research is focused on generalizing the proposed methodology to analyze Yukawa and periodic columbic potentials, retarded and Helmholtz potential for sub-wavelength regimes. Finally, application of is scheme to problems in biophysics, electronic structure calculations, and MD codes are currently underway. 147    APPENDIX 148    APPENDIX This section introduces basic notation and theorems that will be used in my thesis. The material presented builds upon some of the earlier work by Applequist [75, 85, 88]. For completeness, we have described some of the theorems given in his papers (without proofs) as well as added some of our own (with the necessary proofs). 1. Tensors (n) A Cartesian tensor A n of rank n is an array of 3 components and will also be denoted ( ) either in component notation as Aαn...α where α j ∈{1, 2,3} . A totally symmetric tensor is one 1 n that is independent of the permutation of indices α1...α n ; in compressed form it contains (n + 1)(n + 2)/2 independent components, and is denoted by A(n) (n1, n2 , n3 ) where n1 + n2 + n3 = n . Here, ni is the number of times the index i is repeated. For example, consider the direct product of the vectors rr...rr = r n . It forms a tensor of rank n (and will be referred to henceforth as a n times polyadic) whose component can be expressed in compressed form as A(n) (n1, n2 , n3 ) = x n1 y n2 z n3 . The trace of one index pair of a tensor results in a tensor of rank n ( (n 2 and is denoted by Aαn:1)α = Aνν )α ...α ; the superscript (n : μ ) indicates a trace in μ index 3 ... n 3 n pairs, and will result in a tensor of rank n − μ − 1 . If the trace vanishes for any index pair then the tensor is totally traceless. It follows from the above description that if a tensor is symmetric and traceless in one index pair, then it is traceless for all index pairs. 2. Tensors Contraction 149    (m+n) Consider two tensors A (n) and B . The n-fold contraction between these two tensors is ( +n ( given by Aβm...β ) α ...α Bαn)...α and will be denoted using C ( m ) = A ( m + n ) ·n·B ( n ) . As usual, 1 m 1 n n 1 repeated indices denote a summation over that index. Similarly, a direct product between two (n) tensors A (m) and B (n+m) results in a tensor of rank n + m. If A (n) and B are two totally symmetric tensors, then the n-fold contraction between them can be written in compressed notation as C( m ) = A ( m+ n) ·n·B ( n) C ( m ) (m1, m2 , m3 ) = ∑ m1,m2 ,m3 n! A( n+ m ) (n1 + m1, n2 + m2 , n3 + m3 )B ( n) (n1, n2 , n3 ) n1 !n2 !n3 ! (A1) It is evident that the number of operations involved in evaluating each term of the tensor C ( m) (m1, m2 , m3 )  is (n + 1)(n + 2)/2, and since there are (m + 1)(m + 2)/2 terms, the total cost of the above contraction is (m + 1)(m + 2)(n + 1)(n + 2)/4. Next, we consider contraction between a totally symmetric tensor and two other tensors. Theorem A2.1. In evaluating an (n + m)-fold contraction between a totally symmetric rank (n+m) C (n) tensor and two tensors of B (m) and A it is permissible to permute the order of contraction. In other words A(m)B(n) ⋅ (n + m) ⋅ C(n+m) = B(n) A(m) ⋅ (n + m) ⋅ C(n+m) 150    (A2) (n+m) Proof. The proof is best derived in component form. The tensor C is totally symmetric, and a permutation of any pair of indices does not alter the value of the tensor. Carrying this (n+m) procedure between pairs of indices of the tensor C results in ( + ( +n Cβn...m) α ...α = Cαm...α ) . βn 1 m 1 1 m β1...β n Thus, A ( m ) B ( n ) ⋅ ( n + m ) ⋅ C( n + m ) ( ) ( ) ( + = Aαm...α Bβn...β Cβn...m ) α ...α 1 m 1 n 1 βn 1 m ( ) ( ) ( + = Bβn...β Aαm...α Cαn...m ) β ...β 1 n 1 m 1 αm 1 n = B ( n ) A ( m ) ⋅ ( n + m ) ⋅ C( n + m ) Finally, we note a trivial fact that will be useful in generating methods with lower computational complexity. (n) Lemma A2.2 (Non-uniqueness). Let A (m) ,B and C(m) be full rank tensors. Then it follows (n) − B ( n ) ) ⊥ n B ( n ) or A (n ) = C(n ) that if A ( n ) ·n·B ( n ) = C ( n ) ·n·B ( n ) , it implies that either ( A where ⊥ n defines an n-fold orthogonality. 3. Homogeneous Polynomials Consider a vector r ∈ 3 and a homogeneous polynomial f(r) of degree n. The following n lemma prescribes the relation between homogeneous polynomials and the polyadic r . 151    th Lemma A2.3 A polynomial of n degree is homogeneous if and only if it can be written as f n (r ) = A ( n) ⋅ n ⋅ r n (n) Where A (A3) th is an n rank Cartesian tensor that is independent of r. The proof for this lemma may be found in [75]. The following observations are also in order: (1) n The polyadic r is totally symmetric. This implies that one can recast Lemma A2.3 as f n (r ) = A( n) ⋅ n ⋅ r n where the symmetric tensor A ( n) is related to the tensor sym sym A ( n ) . This fact implies that a homogeneous polynomial can always be represented in terms of symmetric tensors. (2) The above expression can also be interpreted as a projection of an nth rank tensor along the vector r. (3) (n) If the tensor A n is totally symmetric and the n-fold contraction with r vanishes, (n) i.e., A ( n ) ⋅ n ⋅ r n ≡ 0 , then each component of A vanishes. The proof for this assertion can be found in [75]. Next, the Gradient and Euler’s theorems are as follows: Theorem A2.4. If f n (r ) is a homogeneous polynomial, i.e, f n (r ) = A ( n ) ⋅ n ⋅ r n , then ∇ k f n (r ) = n! A ( n) n − k ) r n−k (n − k ) ! for k ≤ n Theorem A2.5. If f n (r ) is a homogeneous polynomial of degree n ≥ 0 , then 152    (A4) r k ⋅ k ⋅∇ k f n (r ) = n! f n (r )(k ≤ n) (n − k )! (A5) n In both these equations, and hereafter, the tensor operator ∇ = ∇ ∇ . The proofs for these n (n) theorems can be found in [75]. Likewise, it can be shown that if A is totally traceless, then f n (r ) is a solid spherical harmonic of degree n. This fact will be used extensively to construct operators with low computational complexity. 4. Detracer As seen in the previous subsection, a homogeneous polynomial can be represented in terms of a contraction of a polyadic with a traceless tensor. Obtaining a traceless tensor from a totally symmetric tensor is tantamount to projecting out an nth rank irreducible tensor [89, 90]. In what follows, we shall use the detracer operator that has been used for constructing Cartesian tensorial forms of spherical harmonics [75]. Formally, the detracer operator is defined as  n , which, (n) , results in a traceless totally symmetric tensor. when acting on a totally symmetric tensor A More specifically, this operation is defined as (  n Aαn) = 1 αn n d t 2 ∑ (−1)m (2n − 2m − 1)!! ∑ δα1α2 T {α } m =0 ( δα 2 m−1α 2 m Aαn:m) 2 m +1 α n where the sum over T {α } is a sum over all permutations of double factorial of n. If A (n) α1 α n , and n!! denotes the is expressed in compressed form, the same equation can be rewritten as 153    (A6a)  n A ( n) ( n1, n2 , n3 ) = n n n d 1t d 2 t d 3 t 2 2 2 ⎡n ⎤⎡n ⎤⎡n ⎤ ( −1) m (2n − 2m − 1)!! ⎢ 1 ⎥ ⎢ 2 ⎥ ⎢ 3 ⎥ A( n:m ) ( n1 − 2m1, n2 − 2m2 , n3 − 2m3 ) ⎣ m1 ⎦ ⎣ m2 ⎦ ⎣ m3 ⎦ m1=0 m2 =0 m3 =0 ∑ ∑ ∑ (A6b) ⎡n⎤ n! where n=n1+n2+n3, m=m1+m2+m3, and ⎢ ⎥ = m ⎣ m ⎦ 2 m !( n − 2m)! Note, that a traceless totally symmetric tensor of rank n has only 2n+1 independent components. Some interesting properties associated with the Detracer (and the proofs for the theorems that follow) can be found in [85, 88] are as follows: (n) Theorem A2.6 (Exchange theorem). If A (n) and B are totally symmetric tensors, then A(n) ⋅ n ⋅  nB(n) =  n A(n) ⋅ n ⋅ B(n) (n) Theorem A2.7 If A (A7) is a traceless totally symmetric tensor, then  n A(n) = (2n − 1)!!A(n) Corollary A2.8 If A (n) (A8) (n) is a totally symmetric tensor and B is a traceless totally symmetric tensor then A ( n) ⋅ n ⋅ B ( n) = 1  n A (n) ⋅ n ⋅ B(n) (2 n − 1)!! (A9) Proof: 154    A ( n) ⋅ n ⋅ B( n) = 1 A ( n ) ⋅ n ⋅  n B( n) (2n − 1)!! == 1  n A( n) ⋅ n ⋅ B( n) (2n − 1)!! (l) (n) Corollary A2.9 If A , B (m) and C using Theorem A2.7 using Theorem A2.6 (A10) are traceless and symmetric tensors {∀(n, m) : l = n + m} and A(l ) = ∑ cnmB(n)C(m) } then nm (2l − 1)!! cnm  n B ( n ) m C ( m ) nm (2 n − 1)!!(2 m − 1)!!  l A (l ) = ∑ (A11) Proof. Use Theorem A2.7. 5. Maxwell Cartesian Tensors An expression for solid harmonics in terms of the gradient of the position r-1 was first derived by Maxwell of an arbitrary set of n axis [76]. In the Cartesian coordinate frame, his expressions reduce to ∇n r −1 = (−1)n r −(2n+1) nrn Maxwell equation (A12) This relationship has been obtained by others [85, 86] as well. It has been shown that the components of  nr n are solid harmonics of degree n. This relationship can be used to compute ∇ n r −ν . It can be shown that the following expressions are valid in component form: 155    1 ν 1 ∂i ν = ν −1 ∂i r r r 1 1 1 ν 1 ∂i ∂ j ν = ν∂i ν −1 ∂ j + ν −1 ∂ i ∂ j r r r r r 1 1 1 1 1 1 1 ν 1 ∂i ∂ j ∂ k ν = ν∂i ∂ j ν −1 ∂ k +ν∂ j ν −1 ∂i ∂ k + ν∂i ν −1 ∂ j ∂ k + ν −1 ∂ i ∂ j ∂ k r r r r r r r r r (A13) From the above equation, ∂ i denotes a partial derivative with respect to the component i. It is also evident that this function can be constructed in terms of the traceless tensors of the type defined in Maxwell equation. Furthermore, while the tensor is totally symmetric, it is not traceless. While this equation demonstrates the relationship between traceless Cartesian harmonics, it is easier to use n n n ⎛ 1 ⎞ ∂i 1∂ j2 ∂ k3 ⎜ ν ⎟ = ⎝r ⎠ (−1) n r −2n−ν n n n d 1t d 2 t d 3 t 2 2 2 ∑ ∑ ∑ (−1)m r 2m f (ν , n − m − 1) x n1−2 m1 y n2 −2m2 z n3 −2m3 m1=0 m2 =0 m3 =0 (A14) to construct the n-fold gradient, where n=n1+n2+n3, m= m1+m2+m3, and f (ν , n ) = ν × (ν + 2) × (ν + 4) × (ν + 2n) . It can be readily shown that this definition reduces to that obtained for ν = 1 [75]. Finally, consider a function f (r - r') where r and r’ are used to denote the location of the observation and source points, respectively. An addition theorem for this function may be obtained using Taylor’s expansion. In tensor form, this is stated as follows: 156    ∞ (−1) n n f (r - r') = ∑ r' ⋅ n ⋅∇ n f (r ) n =0 n ! (A15) Where r > r' Proof f ( r − r′ ) 3 ′ = f (r ) − ∑ r j ∂ j f (r ) + j =1 = f (r ) − (r '⋅∇) f (r ) + = ∞ 1 ∑ (−1)n n! ( r '⋅∇ ) n 1 3 3 ∑ ∑ r′j rk′ ∂ j ∂ k f (r) + 2! j =0 k =0 1 ( r '⋅∇ )2 f (r) + 2! f (r ) n =0 = ∞ 1 ∑ (−1)n n! r 'n ⋅ n ⋅∇n f (r) n =0 This theorem gives rise to the following corollary. Corollary A2.10 The function f (r - r') takes the form ⎧ ∞ (n) n ⎪ ∑ M ⋅ n ⋅∇ f (r ) ⎪ n =0 f (r - r') = ⎨ ∞ ⎪ n (n) ⎪∑ r ⋅n⋅L ⎩ n =0 (n) where M (n) and L for r>r' for r'>r are the multipole and local expansions. This formula is the foundation of fast methods that will be proposed in the next section. As an aside, it is interesting to note that an application of this theorem readily leads to an equivalence between Cartesian harmonics and spherical harmonics. 157    Let us consider the function f (r - r') = 1 , where R = R = r - r ' . By using Theorem 2.10 R and Maxwell equation (A11), one can readily arrive at R −1 ∞ (−1) n n =∑ r' ⋅ n ⋅∇ n r −1 n =0 n ! = ∞ 1 ∑ n !r −2n−1r'n ⋅ n ⋅  nrn n =0 ∞ n 1 r ′n n =∑ r' ⋅ n ⋅  n r n +1 n =0 n ! r = ∞ (A16) r ′n ∑ r n+1 Pn ( r' ⋅ r ) n =0 The above equation is immediately recognizable as being equivalent to an expansion in ( ) terms of Legendre polynomials Pn r' ⋅ r , and provides the required equivalence between traceless Cartesian tensors and Legendre polynomials (see references in [75] for more details). In the next two sections, we will use some of these ideas to develop fast methods for potentials of the form R −ν . 6. Gegenbauer Expansions (Addition Theorem A2.11) 158    eik |x+d| |x+d| P ˆ ˆ ≈ ik ∑ (−1)l (2l + 1) jl (kd )hl(1) (kx) Pl (d ⋅ x) l =0 P 1ˆ ˆ = ik ∑ (−1)l (2l + 1) jl (kd )hl(1) (kx) d ⋅ l ⋅  n x l! l =0 1 ⎡ (−1) 1 ⎢⎛ 2 ⎞ 2 (ikd)l ⋅ l ⋅ ⎢⎜ kK 1 (ikx)xl +1∇l ≈∑ ⎟ π (ikx) ⎠ (2l − 1)!! l + l =0 l ! ⎢⎝ 2 ⎣ P′ l ⎤ 1⎥ x⎥ ⎥ ⎦ P′ ik |x| 1 l l e = ∑ (d) ⋅ l ⋅∇ |x| l =0 l ! Taylor Expansion 159    (A17) REFERENCES 160    REFERENCES 1. T. C. King, Y. K. Kuo, M. J. Skove, and S.J. Hwu, Phys. Rev. B, 2001. 63(045405). 2. T. C. King, Y. K. Kuo, and M. J. Skove, Physica A, 2002. 313: p. 427. 3. T. C. King and Y. K. Kuo, Phys. Rev. E, 2003. 68: p. 036116. 4. V. Levashov, M. F. Thorpe, and B. W. Southern, Phys. Rev. B, 2003. 67: p. 224109. 5. J.R. Lee and S. Teitel, Phys. Rev. B, 1992. 46: p. 3247. 6. P. Gupta and S. Teitel, Phys. Rev. B, 1997. 55: p. 2756. 7. Ja. P. Selissky, Uporyadochenie atomov i ego vtiyanie nasvoistva splavov. RMS seriya Mefdofzika ViD. 20. Str. 23.Naukova bumka, Kiev, 1968. 8. Gerard Toulouse, Theory of the frustration effect in spin glasses : I. Communications on Physics, 1977. 2. 9. J. Vannimenus and G. Toulouse, Theory of the frustration effect. II. Ising spins on a square lattice. J. Phys. C: Solid State Phys., 1977. 10. 10. G. H. Wannier, Antiferromagnetism. The Triangular Ising Net. Phys. Rev. 79, 1950. 79. 11. T. A. Kaplan, Classical Spin-Configuration Stability in the Presence of Competing Exchange Forces. Phys. Rev., 1959. 116: p. 888-889. 12. A. Yoshimori, J. Phys. Soc. Japan, 1959. 14. 13. J. Villain, Phys. Chem. Solids, 1959. 11. 14. K. Binder, Phase Transitions and Critical Phenomena. C. Domb and M.S. Green, eds. Vol. 5b (Academic Press, London), 1976. 15. Kurt Binder and Erik Luijten, Monte Carlo tests of renormalization-group predictions for critical phenomena in Ising models. Physics Reports, 2001. 344(4-6). 16. Geoffrey D. Price, The Energetics of Polytypic Structures: a Computer Simulation of Magnesium Silicate Spinelloids. Acta Cryst., 1985. B41: p. 231-239. 17. Walter Selke, THE ANNNI MODEL - THEORETICAL ANALYSIS AND EXPERIMENTAL APPLICATION. PHYSICS REPORTS (Review Section of Physics Letters) 170. No. 4, 1988: p. 213-264. 18. R. J. Elliott, Phenomenological discussion of magnetic ordering in the heavy rare-earth metals. Phys. Rev. , 1961. 124. 19. Michael E. Fisher and Walter Selke, Infinitely many commensurate phases in a simple Ising model. Phys. Rev. Lett., 1980. 44: p. 1502. 20. H. T. Diep, Stephen Blundell “Magnetism in Condensed Matter” (Oxford Maser Series in Condensed Matter Physics). World Scientific, Singapore, 1994. 21. H. T. Diep, A. Ghazali, and P. Lallemand, J. Phys. C, 1985. 18: p. 5881. 22. H. T. Diep and H. Kawamura, Phys. Rev. B 40, 7019 1989. 161    23. C. Pinettes and H. T. Diep, Phase transition and phase diagram of the J1-J2 Heisenberg model on a simple cubic lattice. J. of Appl. Phys., 1998. 83: p. 6617-6619. 24. P. Lallemand, H. T. Diep, A. Ghazali, and G. Toulouse, J. Phys. Lett., 1985. 46: p. L1087. 25. T. Oguchi, H. Nishimori, and Y. Taguchi, J. Phys. Soc. JPN., 1985. 54. 26. C. Pinettes and H. T. Diep, J. of Appl. Phys. 83, 6317, 1998. 27. S. Loo K. F. Hsu, F. Guo, W. Chen, J. S. Dyck, C. Uher, T. Hogan, E. K. Polychroniadis, and M. G. Kanatzids, Cubic AgPbmSbTe2+m: Bulk Thermoelectric Materials with High Figure of Merit. Science, 2004. 303. 28. Joseph R. Sootsman, Duck Young Chung, and Mercouri G. Kanatzidis, New and Old Concepts in Thermoelectric Materials. Angew. Chem. Int. Ed, 2009. 48. 29. Keyur Desai Khang Hoang, And S. D. Mahanti Charge ordering and self-assembled nanostructures in a fcc Coulomb lattice gas. Phys. Rev. B, 2005. 72. 30. H. Mtiller-Krumbhaar and K. Binder, Dynamic Properties of the Monte Carlo Method in Statistical Mechanics. Journal of Statistical Physics, 1973. 8: p. 1. 31. N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, Equation of State Calculations by Fast Computing Machines The Journal of Chemical Physics 21(6):1087-1092 (1953), 1953. 32. N. J. Dowrick J. J. Binney, A. J. Fisher, and M. E. J. Newman, The theory of critical phenomena: An introduction to the renormalization group. International Journal of Quantum Chemistry, Clarendon Press, Oxford 464 pp., 1993. 46. 33. Y.C. Xu and B.L. Hao H. Shi, Acta Physica Sinica. Acta Physica Sinica, 1978. 27. 34. Thomas L. Bell and Kenneth G. Wilson, Nonlinear renormalization groups. Phys. Rev. B, 1974. 10. 35. Donald E. Knuth, Art of Computer Programming. Art of Computer Programming, 1979. Vol. 2. 36. Adauto J. F. De Souza Jae-Kwon Kim, And D. P. Landau Numerical computation of finite size scaling functions: An alternative approach to finite size scaling. Phys. Rev. E, 1996. 54. 37. P. Butera and M. Comi, N-vector spin models on the simple-cubic and the body-centeredcubic lattices: A study of the critical behavior of the susceptibility and of the correlation length by high-temperature series extended to order β21. Phys. Rev. B, 1997. 56. 38. Lars Onsager, Crystal Statistics. I. A Two-Dimensional Model with an Order-Disorder Transition. Phys. Rev. B, 1944. 65. 39. D. Stauffer and R. Knecht, ISING KINETICS WITH HUNDRED GIGA-SITES International Journal of Modern Physics C (IJMPC), 1996. 7: p. 893-897. 40. Rajan Gupta and Pablo Tamayo, CRITICAL EXPONENTS OF THE 3-D ISING MODEL. International Journal of Modern Physics C (IJMPC), 1996. 7: p. 305-319. 162    41. Jr. Abdulnour Y. Toukmaji and John A. Board, Ewald summation techniques in perspective: a survey. Computer Physics Communications, 1996. 95(2-3): p. 73-92. 42. John W. Perram; Henrik G. Petersen, An algorithm for the simulation of condensed matter which grows as the 3/2 power of the number of particles Molecular Physics, 1988. Vol. 65, No. 4. 43. E. Madelung, Das elektrische Feld in Systemen von regelmäßig angeordneten Punktladungen. Phys. Zs. XIX, 524–533, 1918. 44. M. K. Phani and Joel L. Lebowitz, Monte Carlo studies of an fcc Ising antiferromagnet with nearest- and next-nearest-neighbor interactions. Phys. Rev. B, 1980. 21. 45. Samuel M. Allen and John W. Cahn, Ground State Structures in Ordered Binary Alloys with Second Neighbor Interactions. Acta Met., 1972. 20. 46. J. Slawny, Low-temperature expansion for lattice systems with many ground states. Journal of Statistical Physics, 1979. 20. 47. M. Yussouff T. V. Ramakrishnan, First-principles order-parameter theory of freezing. Phys. Rev. B, 1979. 19. 48. A. D. J. Haymet and D. W. Oxtoby, J. Chem. Phys., 1981. 74(2559). 49. D. W. Oxtoby and A. D. J. Haymet, J. Chem. Phys., 1982. 76: p. 6262. 50. P. Harrowell and D. W. Oxtoby, J. Chem. Phys., 1984. 80: p. 1639. 51. S. T. Langer, Theory of Nucleation Rates. Phys. Rev. Lett., 1968. 21: p. 973-976. 52. B. Shanker and H. Huang, Accelerated cartesian expansions–an O(N) method for computing of non-oscillatory potentials. Technical Report MSU-ECE-Report-2006-06, Michigan State University, 2006. 53. M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids. Oxford University Press, 1987. 54. R.W. Hockney and J.W. Eastwood, Computer Simulation Using Particles. IOP Publishing, Bristol, 1988. 55. P. Macneice, C. Mobarry, and K. Olson, Particle-mesh techniques on the maspar. Nasa technical memorandum 104632, NASA, 1996. 56. H.M.P. Couchman, Mesh-refined p3m: a fast adaptive n-body algorithm. Astrophys. J. Lett, 1991. 368: p. 28. 57. T. Schlick, Molecular Modeling and Simulation. Springer-Verlag, 2002. 58. G.H. Kho and W.H. Fink, Rapidly converging lattice sums for nanoelectronic interactions. J. Comp. Chem., 2001. 23: p. 447-483. 59. A. Appel, An efficient program for many-body simulations. SIAM J. Sci. Comput., 1985. 6: p. 85-103. 60. J. Barnes and P. Hut, A hierarchical o(nlogn) force calculation algorithm. Nature, 1986. 324: p. 446–449. 163    61. J.A. Board, W.J. Blamke, D.C. Gray, Z.S. Hukara, W. Elliott, and J. Leathrum, Scalable implementations of multipole-accelerated algorithms for molecular dynamics. Scalable High Performance Computing Conference, 1994: p. 97-94. 62. L. Greengard and V. Rokhlin, A fast algorithm for particle simulations. J. Comput. Phys., 1987. 20: p. 63-71. 63. L. Greengard, The Rapid Evaluation of Potential Fields in Particle Systems. MIT Press, Cambridge, MA, 1988. 64. L. Greengard and W. Gropp, A parallel version of the fast multipole method. Comput. Math. Appl., 1990. 20: p. 63-71. 65. L. Greengard and V. Rokhlin, A new version of the fast multipole method for the laplace equation in three dimensions. Acta Numer. 6: p. 229-269. 66. H. Cheng, L. Greengard, and V. Rokhlin, A fast adaptive multipole algorithm in three dimensions. J. Comput. Phys., 1999. 155: p. 468-498. 67. W.D. Elliott and J.A. Board, Technical Report 94-005. Duke Univ Dept EECS. , 1994. 68. Z. Duan and R. Krasny, An adaptive treecode for computing nonbonded potential energy in classical molecular systems. J. Comp. Chem., 2001: p. 184-195. 69. M. Fenley, W.K. Olson, K. Chua, and A.H. Boschitsch, Fast adaptive multipole method for computation of electrostatic energy in simulations of polyelectrolyte dna. J. Comp. Chem., 1996. 17: p. 976-991. 70. I. Chowdhury and V. Jandhyala, Single level multipole expansions and operators for -k potentials of the form r . SIAM J. Sci. Comput., 2005. 26: p. 930-943. 71. I. Chowdhury and V. Jandhyala, Multilevel multipole and local operators for potentials -k of the form r . Appl. Math. Lett., 2005. 18: p. 1184-1189. 72. F. Zhao, An O(N) algorithm for three-dimensional n-body simulation. Master's thesis, Massachusetts Institute of Technology, 1987. 73. P.B. Visscher and D.M. Apalkov, Charge based recursive fast multipole micromagnetics. Physica B: Phys. Condens. Matter, 2004. 343: p. 184-188. 74. E.J. Weniger, Addition theorems as three-dimensional Taylor expansions. Int. J. Quant. Chem., 2000. 76: p. 280-295. 75. J. Applequist, Traceless cartesian tensor forms for spherical harmonic functions: new theorems and applications to electrostatics of dielectric media. J. Phys. A: Math. Gen., 1989. 22: p. 4303-4330. 76. E.W. Hobson, The Theory of Spherical and Ellipsoidal Harmonics. Cambridge University Press, 1931. 77. T.M. Macrobert, Spherical Harmonics. Dover, 1947. 164    78. H. Huang and S. Balasubramaniam, Cartesian harmonics and fast computation of -ν potentials of the form r North American Radio Science Symposium URSI CNC/USNC, Albuquerque, New Mexico 2006. 79. K. Srinivasan, H. Mahawar, and V. Sarin, A multipole based treecode using spherical -k harmonics for potentials of the form r . in:Proceedings of the International Conference on Computational Science, Lecture Notes in Computer Science, vol. 3514, SpringerVerlag, Atlanta, Georgia, 2005, pp. 107–114., 2005. 80. I. Chowdhury and V. Jandhyala, Single level multipole expansions and operators for -k potentials of the form r ,. SIAM J. Sci. Comput, 2005. 26: p. 930-943. 81. H. Cheng, L. Greengard, and V. Rokhlin, A fast adaptive multipole algorithm in three dimensions. J. Comput. Phys, 1999. 155: p. 468-498. 82. H. Huang and S. Balasubramaniam, Fast Evaluation of Fields from Sub-Wavelength Structures in Frequency Domain Using Accelerated Cartesian Expansions. North American Radio Science Symposium URSI - CNC/USNC, Ottawa, ON, Canada, 2007. 83. H. Huang and S. Balasubramaniam, Accelerated Cartesian Expansion based Method for Rapidly Computing the Shielded Coulomb Potential. North American Radio Science Symposium URSI - CNC/USNC, Ottawa, ON, Canada, 2007. 84. H. Huang, C. Lu, and S. Balasubramaniam, Maxwell Cartesian Harmonics and the Static Fast Multipole Method. IEEE Antennas and Propagation Society International Symposium, Albuquerque, New Mexico, 2006. 85. J. Applequist, Fundamental relationships in the theory of electric multipole methods and multipole polarizabilities in static fields. Chem. Phys., 1984. 85: p. 279–289. 86. E. Buros and H. Bonadeo, Mol. Phys., 1981. 44: p. 1-15. 87. Alexander H. Boschitsch, Marcia O. Fenley, and Wilma K. Olson, A Fast Adaptive Multipole Algorithm for Calculating Screened Coulomb (Yukawa) Interactions. Journal of Computational Physics, 1999. 151: p. 212-241. 88. J. Applequist, Cartesian polytensors. J. Math. Phys., 1983. 24: p. 736–742. 89. R. Mcweeny, Symmetry, MacMillan. 1963. 90. C.G. Grey and K.E. Gubbins, Theory of Molecular Fluids. Clarendon Press, 1984. 165