AB INITIO NANOSTRUCTURE DETERMINATION

By

Saurabh Gujarathi

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Physics - Doctor of Philosophy

2014

ABSTRACT

Reconstruction of complex structures is an inverse problem arising in virtually all areas of science and technology, from protein structure determination to bulk heterostructure solar cells and the structure of nanoparticles. This problem is cast as a complex network problem where the edges in a network have weights equal to the Euclidean distance between their endpoints. A method, called Tribond, is presented for reconstructing the locations of the nodes of a Euclidean network given only its edge weights. Timing results indicate that, in two dimensions, the computational cost of the algorithm is a low order polynomial in the number of nodes in the network. Using this implementation, Euclidean networks in two dimensions with about one thousand nodes are reconstructed in approximately twenty-four hours on a desktop computer. In three dimensions, the computational cost of the reconstruction is a higher order polynomial in the number of nodes, and the reconstruction of small Euclidean networks in three dimensions is shown. If a starting network of size five is assumed to be given, then for a network of size 100 the remaining reconstruction can be done in about two hours on a desktop computer. When the data are less precise, modifications of the method may be necessary, and these are discussed.

A related problem in one dimension, known as the optimal Golomb ruler (OGR), is also studied. A statistical physics Hamiltonian to describe the OGR problem is introduced, and the first order phase transition from a symmetric low constraint phase to a complex symmetry broken phase at high constraint is studied.
Despite the fact that the Hamiltonian is not disordered, the asymmetric phase is highly irregular with geometric frustration. The phase diagram is obtained and it is seen that even at a very low temperature T there is a phase transition at a finite and non-zero value of the constraint parameter γ/µ. Analytic calculations for the scaling of the density and free energy of the ruler are done and they are compared with those from the mean field approach. A scaling law is also derived for the length of the OGR, which is consistent with the Erdős conjecture and with numerical results.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
Chapter 1 Introduction
Chapter 2 The Liga algorithm
  2.1 Algorithm details
  2.2 Limitations
  2.3 Extension of the Liga algorithm
  2.4 Summary
Chapter 3 The Tribond 2D algorithm
  3.1 Rigidity theory of unassigned PD-IP
  3.2 Tribond 2D algorithm
  3.3 Applications
    3.3.1 Tribond for structures with high symmetry
    3.3.2 Reconstruction from an imprecise distance list
  3.4 Summary
Chapter 4 The Tribond 3D algorithm
  4.1 Tribond 3D algorithm
  4.2 Applications
    4.2.1 Reconstruction from an imprecise distance list
  4.3 Summary
Chapter 5 Statistical physics of the optimal Golomb ruler
  5.1 Statistical mechanics formulation
  5.2 Mean field approach
  5.3 Asymptotic analysis
    5.3.1 Scaling
    5.3.2 Phase boundary
      5.3.2.1 Low temperature
      5.3.2.2 High temperature
  5.4 Exact calculations
  5.5 Search for OGR
  5.6 Symmetric theory
  5.7 Summary
Chapter 6 Conclusion
APPENDIX
BIBLIOGRAPHY

LIST OF TABLES

Table 4.1: Results from the Tribond 3D algorithm.

LIST OF FIGURES

Figure 1.1: (color online) Simple examples of structures found from Euclidean distance lists.
The figures on the left are plots of the distance lists for: a) (top) a C60 fullerene that has a degenerate distance list, and b) (bottom) a random set of 10 points in the plane that has a non-degenerate distance list. The fullerene has a total of 1770 interatomic distances, but only 21 unique distances. The random point set has, with high probability, 45 unique distances. The multiplicity is on the vertical axis while the distance is on the horizontal axis (in arbitrary units). The figures on the right hand side are solutions to the inverse problem found using the Liga algorithm (fullerene) and Tribond (random point set) to find the structure from the given distance lists, without the use of any other information. For the random point set all interatomic distances are drawn in the figure. For clarity only the nearest neighbor bonds are drawn in the fullerene case. In this study, the distance lists are taken from the known structure and then we try to solve the inverse problem using only the distance list. In the real world, the structure is unknown and the distance lists are derived from experiments, particularly x-ray and neutron scattering data.

Figure 1.2: A common ruler

Figure 1.3: A Golomb ruler

Figure 2.1: An example of promotion and relegation in Liga for N = 10 and four structures at each level. The algorithm is at level 4, where a winner structure is randomly selected with probability proportional to the reciprocal of its cost. The winner attempts to add as many candidate points (denoted by 'x') with a low cost as possible. In this example, the winner can add five points and gets promoted to the 9th level. A loser structure is randomly selected from the 9th level; it loses five of its atoms and is relegated to the 4th level. The losing structure and the points that are removed in relegation are chosen randomly with a probability proportional to the cost.

Figure 2.2: Reconstruction of various Platonic solids using Liga.

Figure 2.3: Reconstruction of a cubic grid with side equal to 4 and N = 64 using Liga.

Figure 2.4: Reconstruction of Lennard-Jones clusters using Liga. On the left is the solution for N = 88 and on the right is the solution for N = 150. The solutions have a low error and are topologically identical to the LJ-88 and LJ-150 clusters. The atoms in blue have a low error, while those in red have high error.

Figure 2.5: Reconstruction of C60 from experimental PDF data. a) Experimental pair distribution function (G) as a function of distance (r). The background (G_bg) arising from interparticle correlations is shown in green. b) The radial distribution function (R) as a function of the distance (r). It is obtained after subtracting the background from the PDF data. The interatomic distances are obtained by using the peak maxima and the multiplicities are set equal to the peak areas. c) The solution structure obtained using the exact distance list obtained in the previous step. d) The solution obtained after the multiplicities in the distances are relaxed by 10%. The atoms in blue have a low error, while those in red have high error.

Figure 2.6: Liga's successes and failures in reconstructing structures with different amounts of symmetry. Low symmetry structures have a large number of unique interpoint distances, while those with high symmetry have a small number of unique interpoint distances. Failure is represented by the plus symbol while success is denoted by the star symbol. Success mostly occurs in the region closer to the X axis, which is representative of structures having high symmetry, while failure mostly occurs for structures having a large number of unique distances.

Figure 2.7: Steps involved in crystal structure determination using an experimental PDF. An automated peak extraction routine is used to obtain the distances and their multiplicities. This information, along with the lattice parameters for the crystal structure, is given as input to Liga. It gives as output a number of candidate solutions that are consistent with the input data. The next step is coloring, which assigns the atom species to each site by minimizing the atom radii overlap, and the structure with the lowest cost is declared as the solution.

Figure 3.1: (color online) An example of a core. In 2D, it consists of 4 points. The horizontal bond is the base (in black), the bonds below it (in blue) make up the base triangle while those above it (in red) make up the top triangle. The vertical bond is the bridge (in green).

Figure 3.2: Four possible positions for the top triangle are shown. The corresponding bridge bonds are shown using a dashed line.

Figure 3.3: The number of feasible triangles that can be formed using the bonds from a given distance list goes up when a larger bond is chosen as the base for the triangle. Statistically, using the shortest bond in the distance list as the base leads us to the core in the shortest time. This plot shows data from runs using 10 different structures with N = 128.

Figure 3.4: Small core hypothesis (N = 1024): when the smallest bond in the distance list is used as the base, the first core is found in a distance window an order of magnitude smaller than for other choices of the base bond. Hence, statistically, using the first bond as the base is our best bet when searching for the core.

Figure 3.5: Plot illustrating the role of the base bond. For N = 32, the Tribond algorithm was run using base bonds that were picked from 10 different places spread along the sorted distance list. If the smallest bond is chosen as the base, the core finding stage takes 3 orders of magnitude less time and the buildup stage takes an order of magnitude less time.

Figure 3.6: Experimental results for a series of reconstructions from distance lists generated from random point sets in two dimensions. The time for finding the core, the time for doing the buildup starting with the core and the total time are presented as a function of the number of bridge bond checks that were performed. Bridge bond checking is a fundamental process in Tribond and provides a system-independent measure of computational time. Each point on the plots is an average over 25 different instances of random point sets. We find that the total time scales as τ_total ∼ N^3.32.

Figure 3.7: A perturbed graphene cutout made from 144 atoms. The Tribond algorithm successfully reconstructed a similar structure in a few minutes.

Figure 3.8: A self-avoiding walk is a sequence of moves that does not visit the same point more than once and is used to model polymers. Tribond was able to successfully reconstruct the above structure (N = 100) in a few minutes.

Figure 3.9: Gently perturbed square grid made of 100 sites. Our algorithm was able to solve such a structure successfully in a few minutes.

Figure 3.10: Plot of minimum core size vs precision of the input distance list for N = 26, 50, 76 and 100. A bigger core is needed for a less precise distance list.

Figure 4.1: (color online) An example of a core. In 3D, it consists of 5 points. The points at the top and at the bottom are the apex points. The three points in the middle form the base triangle (in black). The base triangle along with the apex point at the bottom forms the base tetrahedron (in blue), while the base triangle along with the apex point at the top forms the top tetrahedron (in red). The vertical bond connecting the two apex points is the bridge (in green).

Figure 4.2: The number of feasible tetrahedra that can be formed using the bonds from a given distance list goes up when a larger bond is chosen as the base for the base triangle. Statistically, using the shortest bond in the distance list as the base bond leads us to the core in the shortest time. This plot shows data from runs using 10 different structures with N = 20.

Figure 4.3: Empirical example of the small-core hypothesis. The hypothesis states that there exists a core where at least 9 of the 10 total bonds are drawn from a relatively small window of the shortest bonds in the structure. Varying the base bond's fractional position in the distance list for ten different N = 50 structures, core finding shows that using the smallest distance as the base bond reduces the typical size of the window required to find a core by an order of magnitude.

Figure 4.4: Figure illustrating the effect of base bond size on the computational cost (bridge bond checks) of reconstruction for N = 10. The plots for the total and core finding steps are nearly indistinguishable because the core finding is orders of magnitude more expensive than buildup. If the smallest bond is chosen as the base, the total computational cost of reconstruction is nearly 2 orders of magnitude lower than for larger bonds.
Figure 4.5: Experimental results for a series of reconstructions from distance lists generated from random point sets in three dimensions. The computational cost (bridge bond checks) for finding the core, performing buildup and their total is presented as a function of the number of points. The plots for the total and core finding steps are nearly indistinguishable because core finding takes orders of magnitude more time than buildup. Each point on the plots is the median value from 10 different instances of random point sets.

Figure 4.6: Experimental results for a series of reconstructions from distance lists generated from random point sets in three dimensions. The computational cost (bridge bond checks) for performing buildup is presented as a function of the number of points. Each point on the plots is the average over 10 different instances of random point sets. We find that the buildup time scales as τ_buildup ∼ N^4.98.

Figure 4.7: Buildup for LSD (top) and Caffeine (bottom) molecules was done in 48.9 seconds and 2.1 seconds respectively.

Figure 4.8: Buildup for Cystine (top) and Lysine (bottom) molecules was done in 0.24 seconds and 2.8 seconds respectively.

Figure 4.9: Buildup for the Quinine molecule was done in 84.4 seconds.

Figure 4.10: Plot of minimum core size vs precision of the input distance list for N = 9, 17 and 25. A bigger core is needed for a less precise distance list. The typical run time for N = 9, 17, 25 was about 1 second, 20 minutes and 15 hours respectively, on a computer with a 2.2 GHz processor and 2 GB of memory.

Figure 5.1: Density (top figure) and free energy per site (bottom figure) as a function of γ/µ for T = 0.2 and L = 35, 56, 107, 200, 493. For each chain length two calculations obtained by iterating through the Golomb lattice gas mean field equations are presented. One trace, represented by the symbols, is obtained by starting at γ/µ = 0.01, choosing a uniform initial condition and then gradually increasing γ/µ. The solid lines are obtained by starting at γ/µ = 10, choosing an exact OGR state as the initial condition and then gradually decreasing γ/µ. The mean field solutions are clearly strongly metastable. Though the spinodal lines are strongly size dependent, the equilibrium transition is relatively size independent.

Figure 5.2: The symmetric (crosses) and symmetry broken (plusses) states of the mean field theory for L = 107, T = 0.2, γ/µ = 0.01 and γ/µ = 1.

Figure 5.3: Finite size scaling behavior of the density in the symmetric phase for T = 0.2 and for different values of γ/µ. The line with slope −2/3 is the prediction from scaling theory given by Eq. 5.20.

Figure 5.4: Rescaled free energy per site vs γ/µ for T = 2 × 10^−6. At low γ/µ and large L, it follows an L^(2/3) scaling.

Figure 5.5: The equilibrium phase diagram determined from the crossing points of the free energy curves, such as those shown in the lower half of Fig. 5.1.

Figure 5.6: In this log-log plot of the equilibrium phase diagram we see that it has a finite non-zero value for the intercept.

Figure 5.7: Plot showing the dependence of the critical γ/µ on T. At low T, the Y-intercept is γ/µ = 0.005, which is the phase boundary for rulers in the large L limit.

Figure 5.8: Density and free energy calculations done exactly and using mean field theory for L = 26 at T = 0.2.
Figure 5.9: Comparison of numerical results (+ + ++) for the length of the optimal Golomb ruler with the best lower bound (solid line), and with the statistical physics scaling law (dotted line) that provides a useful upper bound on all best OGRs. The main figure is for exact OGR states, while the inset is for approximate OGR states of large size.

Chapter 1

Introduction

Reconstruction of heterogeneous and complex systems using pair correlation functions or pair distance information is a problem that arises in many branches of materials physics [1, 2], in biology [3, 4] and also in a variety of engineering applications [5]. We distinguish between two problems.

(i) In the first, the objective is to find a statistical characterization of a heterogeneous system that is consistent with experimental information. In these cases the reconstruction is not unique, but instead generates an ensemble of structures that are on average consistent with the data. Reverse Monte Carlo methods [6] for the atomic structure of glasses and simulated annealing methods for a range of heterogeneous materials are in this class. Large samples are often used and the system is highly underconstrained, as there are many more degrees of freedom in the model for the atom locations than there is information in the data.

(ii) A related but significantly different problem is where we seek to reconstruct a specific, unique network or structure. The amount of information in the data must suffice to constrain the degrees of freedom in the structure. This problem can be hard even for structures with only ten to hundreds of atoms or components. Uniqueness is lost when the model has too many degrees of freedom compared to the available data.

This unique structure problem is the focus of our study. Surprisingly, we find that it is possible to efficiently reconstruct large complex structures in two dimensions, given only Euclidean distance information.
Crystallography represents the gold standard for structure determination. Provided that methods to overcome the phase problem are implemented and that there are no homometric variants [7], it yields a unique crystal structure. When crystals are not available, but a unique structure is still the objective, new methods are required. One successful approach is the determination of protein structure in solution using pair distance information extracted from NOESY NMR data [3, 4, 8, 9, 10]. Two other approaches are emerging. The first is determination of the structure of individual nanoparticles using lensless imaging algorithms [11, 12, 13, 14]. The second approach is to extract a list of interatomic distances from scattering data and to solve a new inverse problem to find the atom locations. Here we present a highly efficient method to solve the latter inverse problem for the case of complex or random point sets in two dimensions.

As discussed recently in [15, 16, 17] by Torquato and collaborators, reconstruction of heterogeneous systems in general requires multipoint correlation functions. However, pair correlations are by far the most readily available structural data for heterogeneous materials, as they are found by a Fourier transform of elastic electron, x-ray or neutron scattering data collected, for example, at national facilities. There is thus a strong motivation to find methods to determine the extent to which we can reconstruct heterogeneous systems using pair information only. The most fundamental pair information is the list of distances between points or atoms in a structure, which reduces the problem to an inverse problem, namely: given a set of interatomic distances, find the location of the atoms, up to global rotations and translations of the structure.
This pair distance inverse problem (PD-IP) may be interpreted as a complex network reconstruction problem where the edge weights are equal to the Euclidean distances between nodes in the network. Moreover, it has recently been shown that a list of pair distances may be extracted from scattering data using the pair distribution function (PDF) method [18].

The PD-IP is central to determining protein structure from NMR data; however, there are vital differences between the problem we study and the NMR PD-IP problem. The most important difference is that the list of residues or sequence of a protein is known, enabling mutation and other experiments to be carried out to specify the points between which each distance lies. This leads to the assigned pair distance inverse problem (APD). In contrast, in problems concerning materials and most heterogeneous media, the pair distances are not assigned, making the inverse problem significantly harder as there is less information in the data. This is the unassigned pair distance inverse problem (UPD).

In fact, reconstruction of atom locations from precise assigned distances is known to be easy, with APD algorithms having a cost of order the number of atoms in the structure (N). However, the NMR problem is plagued by uncertainties in the experimentally determined interatomic distances, with experimental imprecisions typically of order 25% or higher [19]. The problem of finding protein structure from NMR data is then best treated using loose restraints rather than hard distance constraints. The energy landscape of the APD with loose constraints has many of the features of spin glass problems, leading to the belief that NMR structure determination using loose assigned distances (loose APD) is computationally hard [20, 1, 21]. In almost all other Euclidean network reconstruction problems the distances are not assigned, as we do not know which nodes lie at the end of each distance.
For example, the pair distribution function method is used for the analysis of the local structure of nanoparticles and complex materials. In many complex materials, such as high performance thermoelectric materials [22], high temperature superconductors [23] and manganites [24], crystalline order and heterogeneous local distortions co-exist, so that crystallographic and PDF methods are complementary. Crystallography finds the average structure and the PDF the local structure [25, 26]. The pair distribution function gives a direct measure of the list of interatomic distances arising in the local structure; however, the endpoints of the distances are not known, so we face a computationally challenging UPD problem [27].

Recently, in collaboration with Professor Billinge's group, we developed efficient algorithms for the UPD problem for cases where there is significant symmetry in the structure, including C60 and a range of crystal structures. In those cases we found that two types of algorithm worked well: genetic algorithms and a novel algorithm called Liga [28, 29, 30]. Liga works well when reconstructing structures having high symmetry. However, for low symmetry problems such as random point sets with hundreds of points, Liga fails badly, because random structures have a large number of unique pair distances. These algorithms thus fail for the general problem of complex Euclidean networks. Here we present an algorithm that is specifically designed for reconstructing complex Euclidean networks where there are a large number of unique distances.

A formal statement of the UPD problem is as follows. We are given a list of distances {d_l}, l = 1, ..., M, between points in a D-dimensional Euclidean space. Our task is to find the co-ordinates of the points {r_i}, i = 1, ..., N, such that the distance between every pair of points |r_i − r_j| = r_ij is a member of the distance list {d_l}.
Moreover we require that every distance in the list {d_l} occurs for some pair of points (i, j) in the structure. The only inputs to the Euclidean network reconstruction algorithm described below are the number of points in the network N and a list of N(N − 1)/2 Euclidean distances.

Physically, it is useful to think of the Euclidean distances as natural lengths of Hookean springs, $l_{ij}^0$, so that we may define an energy function,

$$E(\{l_{ij}^0\}) = \sum_{ij} k_{ij} \left(l_{ij} - l_{ij}^0\right)^2 \qquad (1.1)$$

Figure 1.1: (color online) Simple examples of structures found from Euclidean distance lists. The figures on the left are plots of the distance lists for: a) (top) a C60 fullerene that has a degenerate distance list, and b) (bottom) a random set of 10 points in the plane that has a non-degenerate distance list. The fullerene has a total of 1770 interatomic distances, but only 21 unique distances. The random point set has, with high probability, 45 unique distances. The multiplicity is on the vertical axis while the distance is on the horizontal axis (in arbitrary units). The figures on the right hand side are solutions to the inverse problem found using the Liga algorithm (fullerene) and Tribond (random point set) to find the structure from the given distance lists, without the use of any other information. For the random point set all interatomic distances are drawn in the figure. For clarity only the nearest neighbor bonds are drawn in the fullerene case. In this study, the distance lists are taken from the known structure and then we try to solve the inverse problem using only the distance list. In the real world, the structure is unknown and the distance lists are derived from experiments, particularly x-ray and neutron scattering data.

In the ideal UPD problem the distance list is known precisely, but we don't know the mapping or assignment $d_l \to l_{ij}^0$. This is the precise UPD.
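The role of the energy function in Eq. 1.1 can be illustrated with a minimal Python sketch. It assumes a uniform spring constant k_ij = k and a sum over pairs i < j; the function names `pair_distances` and `spring_energy` and the three-point example are hypothetical and are not part of the Tribond implementation.

```python
import itertools
import math

def pair_distances(points):
    """Return {(i, j): Euclidean distance} for all pairs i < j."""
    return {(i, j): math.dist(points[i], points[j])
            for i, j in itertools.combinations(range(len(points)), 2)}

def spring_energy(points, natural_lengths, k=1.0):
    """Eq. 1.1 with a uniform spring constant k (an assumption of this sketch).

    natural_lengths maps a pair (i, j) to the distance assigned to it.
    """
    lengths = pair_distances(points)
    return sum(k * (lengths[pair] - l0) ** 2
               for pair, l0 in natural_lengths.items())

# A 3-4-5 right triangle: the correct assignment costs nothing.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
correct = pair_distances(points)

# Swapping two entries of the distance list stretches one spring
# and compresses another, so the energy becomes finite.
wrong = dict(correct)
wrong[(0, 1)], wrong[(0, 2)] = correct[(0, 2)], correct[(0, 1)]

print(spring_energy(points, correct))
print(spring_energy(points, wrong))
```

In this sketch the geometry is held fixed and only the assignment changes, which is enough to show why a zero energy singles out a consistent assignment.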
In the precise UPD the key computational difficulty is to find this mapping or assignment of d_l to l_ij. If the correct assignment is found the energy (Eq. 1.1) is zero, while wrong assignments lead to stretched or compressed springs and a finite energy. Strategies to treat the loose UPD problem are discussed in Section IV.

In the assigned pair distance inverse problem (APD), when the interpoint distances are known precisely, the problem can be solved in polynomial time. A problem can be solved in polynomial time (P) if the computational cost for an input of size N is O(N^k), where k is a non-negative integer. When there are uncertainties in the experimentally determined interatomic distances, the problem is computationally hard and is in NP [31, 32]. NP stands for non-deterministic polynomial. For problems in NP, a solution can be verified in polynomial time. Note that problems in P are also in NP. For the APD problem, the method for solving the precise case was the foundation for solving the problem with imprecise distances. Here, we present an algorithm that solves the unassigned problem (UPD) in the precise case, and we hope that it will offer insights that lead to techniques for solving the imprecise case.

Two examples of this problem are presented in Fig. 1.1. Fig. 1.1a (top) presents an example of a degenerate distance list, typical of structures which have high symmetry, while Fig. 1.1b (bottom) is an example of a random point set where all distances are, with high probability, unique. Since the number of Euclidean distances is M = N(N − 1)/2, a search over all permutations of the distances to find the correct assignment of d_l to l_ij requires a computational time proportional to the factorial of M, so that τ ∼ M!. This is worse than exponential time complexity and is a very poor way to proceed. The Tribond algorithm is presented in this work and is shown to have a polynomial complexity.
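To give a feel for how quickly the brute-force cost τ ∼ M! grows, the following short Python sketch evaluates M = N(N − 1)/2 and the number of decimal digits of M! for a few network sizes. The helper names are hypothetical; the logarithm of the factorial is computed with `math.lgamma` so that no huge integers are formed.

```python
import math

def num_distances(n):
    """Number of pair distances M = N(N - 1)/2 for N points."""
    return n * (n - 1) // 2

def log10_factorial(m):
    """log10(M!) computed via the log-gamma function, ln(M!) = lgamma(M + 1)."""
    return math.lgamma(m + 1) / math.log(10)

for n in (4, 10, 60):
    m = num_distances(n)
    print(f"N = {n:3d}  M = {m:5d}  log10(M!) = {log10_factorial(m):.1f}")
```

Even for the 10-point example of Fig. 1.1b, M = 45 and the number of permutations already exceeds 10^50, which is why an exhaustive search over assignments is not practical.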
A related problem in one dimension is the optimal Golomb ruler. Common rulers have marks which are equally spaced, so that any distance between 1 and the length of the ruler can be measured by placing an object between two marks separated by the desired distance. With a common ruler (Fig. 1.2) one can measure the distance 4 in multiple ways, say by placing the object between the marks 0 and 4 or between the marks 1 and 5. Golomb rulers can be thought of as a special kind of ruler in which every distance between two marks is different from all others. For example, if there are marks at positions 1 and 5, then no other pair of marks may be separated by a distance of 4. From this definition, we can see that a common ruler with more than 2 marks is not Golomb. Using a ruler with the marks 0, 1, 4, 9, 11 (Fig. 1.3) we can measure the distances {1, 2, 3, 4, 5, 7, 8, 9, 10, 11}, each with exactly one pair of marks, so the Golomb property is satisfied. Rulers which have the smallest possible length for a given number of marks are called optimal Golomb rulers (OGR).

Figure 1.2: A common ruler

Figure 1.3: A Golomb ruler

Maximizing irregularity and constructing optimal Golomb rulers are closely related [33]. Because of this property, Golomb rulers have applications in a wide variety of fields. Some of the real world applications include x-ray crystallography [34, 35, 36, 7], radio systems [37], radio astronomy [38, 39, 40, 41, 42, 43, 44, 45] and missile guidance [46].

Stated mathematically, a Golomb ruler is a set of non-negative integers with the property that the differences between the integers are all distinct. To be concrete, consider a set of n integer markers m_1, m_2, ..., m_n, and their associated distances d_ij = |m_i − m_j|. By convention, the first mark is set at position zero (m_1 = 0), so the length of the ruler is equal to m_n.
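The definition above translates directly into a short check. The Python sketch below (with hypothetical helper names `is_golomb` and `measurable_distances`) computes all pairwise differences d_ij = |m_i − m_j| and tests whether they are distinct, using the ruler of Fig. 1.3 and a common ruler with six marks as examples.

```python
from itertools import combinations

def is_golomb(marks):
    """True if all pairwise differences between marks are distinct."""
    diffs = [abs(a - b) for a, b in combinations(marks, 2)]
    return len(diffs) == len(set(diffs))

def measurable_distances(marks):
    """Sorted list of the distinct distances the ruler can measure."""
    return sorted({abs(a - b) for a, b in combinations(marks, 2)})

golomb_ruler = [0, 1, 4, 9, 11]      # marks of the ruler in Fig. 1.3
common_ruler = [0, 1, 2, 3, 4, 5]    # equally spaced marks

print(is_golomb(golomb_ruler), measurable_distances(golomb_ruler))
print(is_golomb(common_ruler))
```

The check costs n(n − 1)/2 differences for n marks; the hard part of the OGR problem is not verifying the Golomb property but finding the shortest ruler that satisfies it.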
A Golomb ruler is a set {mi } satisfying the constraint that all distances di> 21, it appears that there are far fewer distance constraints than is required to find the correct structure using the distance list alone. However, the distances with the same length are not necessarily degenerate as they may have different directions in the structure. Mathematical analysis of this issue is currently absent and is an important challenge. In contrast for generic random point sets that are of interest here, all of the distances are unique so that for a random Euclidean network with N = 60 nodes, there are 1770 different distance constraints, which is far more than is required to specify the Euclidean network in three dimensions. The above discussion indicates that there are more than enough pair constraints in complex Euclidean networks to specify the network structure. As described in the next section, these rigidity concepts may be used to develop an efficient reconstruction algorithm. However it is important to keep in mind the limitations of this approach, including the issues of degeneracy and the fact that Laman’s theorem [75] only strictly applies to planar graphs. The theoretical foundation of efficient algorithms for the UPD problem rests on rigidity theory discussed above that states that an isostatic structure in two dimensions (from Eq. 3.1) has Bc = 2N − 3 independent distance constraints. However, the key test of whether the assignment of distances to natural lengths is correct is to place at least one additional, overconstrained Euclidean distance into the structure. A distance incompatible with the isostatic structure leads to a finite strain energy cost in Eq. 1.1, due to stretched or compressed springs, while a distance compatible with the isostatic structure has zero energy cost. 
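The counting argument used above can be written down in a few lines. This sketch compares the number of pair distances with the isostatic count; the general form $dN - d(d+1)/2$ (which gives $2N - 3$ in two dimensions and $3N - 6$ in three) and the function names are assumptions introduced here for illustration.

```python
def pair_count(n):
    """Number of pair distances in a point set of n sites: n(n - 1)/2."""
    return n * (n - 1) // 2


def isostatic_count(n, dim):
    """Independent constraints of an isostatic structure in dim dimensions.

    Uses d*n - d*(d + 1)/2, i.e. 2n - 3 for dim = 2 and 3n - 6 for dim = 3.
    """
    return dim * n - dim * (dim + 1) // 2


if __name__ == "__main__":
    n = 60
    print(pair_count(n), isostatic_count(n, 3))  # distances vs. constraints needed
```

For the N = 60 example in the text this reproduces the 1770 distances and shows that only 174 independent constraints are needed in three dimensions.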
Note that many isostatic structures that are inconsistent with the final structure can be made, but with high probability, no overconstrained zero cost structures can be made that are inconsistent with the final reconstruction.

3.2 Tribond 2D algorithm

In two dimensions the smallest structure with at least one overconstrained bond is N = 4, where the total number of bonds is $\binom{4}{2} = 4 \times 3/2 = 6$, while the number required for isostaticity is (from Eq. 3.1) $2N - 3 = 5$. The key observation is that if six Euclidean distances are found that form a point set structure, and the cost function for this structure and these distances is zero, then a unique substructure has been found. A zero cost, correct, substructure with six distances and four sites is called a core. Once a core is found, and if there is no degeneracy, then this core is a correct substructure of the complete reconstruction. One may then build up from the core iteratively to find the complete structure. At each step there is an existing, correct substructure. Then add one site and search for three edges that are compatible with the new node and with three nodes that are in the existing structure. The addition of one site and two edges is an isostatic addition, while the addition of one site and three edges is overconstrained. If three edges that are compatible with one additional site and three sites in the existing structure are found, with high probability, this site is part of the correct reconstruction. In practice, to construct a core we choose the smallest bond as the base for all our triangles and loop over all triangle pairs which are feasible according to the triangle inequality. For every triangle pair we calculate the length of the bond that connects the two apex points, which we call the bridge bond. The length of the bridge in the candidate core is tested against the lengths in the distance list.
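The bridge computation described above can be sketched as follows. This is a minimal illustration, not the Tribond code: it assumes the base bond lies on the x axis from (0, 0) to (a, 0), places the base apex below it, and returns the bridge length for each of the four placements of the top apex. The names `apex` and `candidate_bridge_lengths` are hypothetical.

```python
import math


def apex(a, b, c):
    """Place the apex of a triangle with base from (0, 0) to (a, 0).

    b is the side attached to (0, 0) and c the side attached to (a, 0).
    Returns (x, y) with y >= 0, or None if the triangle inequality fails.
    """
    if a <= 0 or b + c < a or abs(b - c) > a:
        return None
    x = (a * a + b * b - c * c) / (2.0 * a)
    y = math.sqrt(max(b * b - x * x, 0.0))
    return x, y


def candidate_bridge_lengths(a, base_sides, top_sides):
    """Bridge lengths for the four placements of the top apex.

    The base apex is fixed below the base bond; the top apex may sit above
    or below it, with its two sides attached to the base in either order.
    """
    base = apex(a, *base_sides)
    top = apex(a, *top_sides)
    if base is None or top is None:
        return []
    bx, by = base[0], -base[1]
    tx, ty = top
    placements = [(tx, ty), (tx, -ty), (a - tx, ty), (a - tx, -ty)]
    return [math.hypot(px - bx, py - by) for px, py in placements]
```

Each returned length would then be looked up in the distance list; a match with an unused distance signals a candidate core.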
If the candidate bridge length is equal to an unused distance in the distance list, we have found a core (Fig. 3.1 and Fig. 3.2).

Figure 3.1: (color online) An example of a core. In 2D, it consists of 4 points. The horizontal bond is the base (in black), the bonds below it (in blue) make up the base triangle while those above it (in red) make up the top triangle. The vertical bond is the bridge (in green).

Figure 3.2: Four possible positions for the top triangle are shown. The corresponding bridge bonds are shown using a dashed line.

The buildup procedure consists of choosing a triangle as the base triangle in the existing structure, followed by an attempt to add a site to it. The addition of a site consists of choosing an edge in the base triangle as the base bond and generating test triangles with that base bond and using two distances from the distance list. After we place this site, we carry out bridge testing to determine whether the structure has zero strain energy. Our Tribond implementation of the above procedure for the unassigned PD-IP algorithm may be summarized as follows: We are given the sorted distance list $\{d_l\}$ with the number of nodes in the network N. We start with an empty set, then

A. Core finding procedure
1. Choose the shortest bond as the base bond and a window (subset) of W = 6 entries in the distance list for the core finding search.
2. Iterate over all triangles constructed with the triangle inequality that have the same base bond using distances in the window W.
3. Recursively search over all the pairs of the feasible triangles generated above and over all lengths in the Euclidean distance list to find a bridge. If a compatible core is found, remove the edges used from the distance list and exit.
4. Increment W and return to (1), making sure not to retest bond combinations.

B. Buildup procedure
1. Choose a base triangle to be the reference for our buildup.
2.
Search over all sets of two edges from the distance list to find a set compatible with the base triangle in the existing structure. Search over the distance list to find a bridge bond.
3. If successful, remove from the distance list the edges that are used in connecting the newly added node. If the size of the reconstructed structure is < N, return to step 1 of the buildup procedure.

A coarse upper bound on the computational time for this procedure consists of two parts: (i) the time to find the core; (ii) the time to carry out the buildup procedure. The number of unique cores in the point set is $\binom{N}{4}$, the number of ways of choosing 4 sites from N total sites. The number of ways of choosing six distances from the set of $M = N(N-1)/2$ distances is $\binom{M}{6}$. A brute force search then finds a core in computational time $\tau_{core} \sim \binom{M}{6} / \binom{N}{4} \sim N^8/1920$. Using similar reasoning, a brute force buildup algorithm takes a computational time that scales as $\tau_{buildup} \sim \binom{M}{3} \sim N^6/48$. This clearly shows that the method is polynomial, though the power of the polynomial is too high for this to be practical. The simple methods we have developed reduce the computational time very significantly from the coarse upper bounds of the last paragraph. The key observation is that many of the distances in the distance list violate the triangle inequality $d_1 + d_2 \geq d_3$, so they clearly cannot form a triangle together. A large fraction of the computational time in a brute force search is spent exploring these trivially inconsistent distance combinations. If we fix the base bond and the bridge bond is found using binary search, using simple combinatorial arguments we get $\tau_{core} \sim \binom{M}{4} \ln(N) / \binom{N}{2} \sim N^6 \ln(N)$. For a triangle with base bond a and second side b, the range of values for the third side c is $(b - a, b + a)$. So for a larger base bond a, there is a much bigger range of feasible values for the third side and hence the number of feasible triangles goes up.
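The effect of the base bond on the number of feasible triangles can be illustrated with a small counting routine. This is a sketch under stated assumptions: the distance list is sorted, degenerate (collinear) triangles are excluded, and the function name `count_feasible_triangles` is hypothetical. A plain double loop is used for clarity rather than speed.

```python
def count_feasible_triangles(distances, base_index):
    """Count pairs of other distances that close a triangle on the base.

    distances must be sorted in increasing order. A pair (b, c) with b <= c
    is feasible when c - b < a < b + c, where a is the base bond; degenerate
    (collinear) triangles are excluded.
    """
    a = distances[base_index]
    others = distances[:base_index] + distances[base_index + 1:]
    count = 0
    for i, b in enumerate(others):
        for c in others[i + 1:]:
            if c - b >= a:
                break  # others is sorted, so larger c only widens the gap
            if b + c > a:
                count += 1
    return count


if __name__ == "__main__":
    d = [1.0, 2.0, 2.5, 3.0, 4.0]
    print(count_feasible_triangles(d, 0), count_feasible_triangles(d, 4))
```

In this toy list the shortest base admits fewer feasible triangles than the longest one, which is the trend that motivates choosing a short base bond.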
But the actual number of triangles in the target structure is the same for any choice of base bond. This is seen in Fig. 3.3, where the number of feasible triangles goes up with the fractional position of the base bond in the distance list. Hence, statistically, we can find a core in the least time if we choose the shortest bond in the distance list as our base. Distances are also more likely to satisfy the triangle inequality if they are drawn from a list of comparable, rather than disparate, lengths. Since the base bond is short, a core is more likely to be found quickly by searching over other short distances first (the small-core hypothesis, Fig. 3.4), and including longer distances only as necessary. This is implemented as a window of the W shortest distances in the distance list, which increases periodically as core finding proceeds. Of the six bonds in the core, the base is fixed, four are drawn from the window, and the bridge bond may appear anywhere in the distance list. We observe that a window of size W ∼ N is usually sufficient to find a core. Therefore, the typical computation time is $\tau_{core} \sim \binom{N}{4} \ln(N) \sim N^4 \ln(N)$. From Fig. 3.3 and Fig. 3.4, we can guess that using the smallest bond as the base will lead to core finding and reconstruction in the shortest possible time. This is confirmed by our runs and can be seen in Fig. 3.5, where we used base bonds that were picked from 10 different places uniformly spread along the sorted distance list. When the smallest bond was chosen as the base, the entire reconstruction took significantly less time. Attempting to find the core for large point sets (N > 200) frequently leads to bad cores. Bad cores are overconstrained clusters whose distances are part of the given distance list, within our tolerance, but the substructure is not present in the target structure.
This occurs due to the fact that we are using a finite tolerance when checking for the bridge bond and we also have finite precision when we are doing the triangulation while placing the points.

[Plot: number of feasible triangles vs. fractional position of base bond in the distance list]
Figure 3.3: The number of feasible triangles using the bonds from a given distance list goes up when we choose a larger bond as base for the triangle. Statistically, using the shortest bond in the distance list as the base leads us to the core in the shortest time. This plot shows data from runs using 10 different structures with N = 128.

[Plot: bond window for core vs. fractional position of base bond in the distance list]
Figure 3.4: Small core hypothesis: (N = 1024) We see that when we have the smallest bond in the distance list as the base, the first core is in a distance window an order of magnitude smaller than other choices for the base bond. Hence, statistically, using the first bond as base is our best bet when searching for the core.

[Plot: number of bridge bond checks (core finding, buildup, total) vs. fractional position of base bond in the distance list]
Figure 3.5: Plot illustrating the role of the base bond. For N = 32, the Tribond algorithm ran using base bonds that were picked from 10 different places spread along the sorted distance list. If the smallest bond is chosen as the base, we see that it takes 3 orders of magnitude less time for the core finding stage and an order of magnitude less time for the buildup stage.

We also see a loss in precision when placing points by doing triangulation while using the smallest length in the distance list as the base bond. So, we try to use all 6 bonds (in the core) as the base bond and check if the corresponding bridge bond is valid or not. We only take cores for which the bridge bond is valid in all of the 6 cases.
This stringent check is very good at identifying bad cores. A confirmatory test is to use a structure comparison routine that flags the core as bad. We have developed a structure comparison routine that overlays the points in the reconstructed structure (r) with the points in the target structure (R) and calculates an overlay error (Eq. 3.2), which tells us how good the fit is. This routine can also tell us if a given substructure is part of the target structure or not. This is useful for testing purposes and not for the practical application, where we do not know the answer (target structure) ahead of time.

$\epsilon_{overlay} = \sum_i |r_i - R_i|^2$.   (3.2)

When we find a core, we attempt buildup and add more points to the substructure. If, after looping over a certain number of bonds from the distance list, the buildup is unable to add any points, then we call it a bad core, go back to the core finding stage and find the next one. This is a heuristic method that helps us identify bad cores in a short amount of time. We have observed that such a core is indeed bad: even when given a large amount of time, it fails to add a significant number of points during the buildup. It is important to choose an appropriate tolerance when checking if the bridge bond is part of the given distance list. Using a very loose tolerance leads to a large number of bad cores. On the other hand, if we use a very tight tolerance we miss out on good cores, because we have finite precision when carrying out the triangulation to place the points in our substructure. We use floating point numbers for the input distances, which have an accuracy of 18 digits. We found that using a relative tolerance of $10^{-12}$ is optimal to make sure that we get the good cores and filter out the bad ones. We observe a loss of precision when trying to place points that are collinear with the points that form the base bond. In such situations we relax the tolerance when checking for the correctness of the bridge bond.
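A tolerance-based lookup of the bridge bond in the sorted distance list might look like the sketch below. It is an illustration only: the `used` flag list, the function name `find_unused_distance` and the default `rel_tol` argument (set to the $10^{-12}$ quoted above) are assumptions, and positive distances are assumed throughout.

```python
from bisect import bisect_left


def find_unused_distance(sorted_d, used, value, rel_tol=1e-12):
    """Return the index of an unused entry matching value, or None.

    sorted_d is the distance list in increasing order and used[i] marks
    entries already consumed by the reconstruction. A match means the entry
    lies within the relative tolerance rel_tol of value (value > 0).
    """
    lower = value * (1.0 - rel_tol)
    upper = value * (1.0 + rel_tol)
    i = bisect_left(sorted_d, lower)  # binary search for the first candidate
    while i < len(sorted_d) and sorted_d[i] <= upper:
        if not used[i]:
            return i
        i += 1  # skip degenerate copies that are already in use
    return None
```

Relaxing the tolerance for nearly collinear placements, as described above, then amounts to calling the same routine with a larger `rel_tol`.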
To check the validity of a new point while doing buildup, in addition to the bridge bond check, we check 10 additional distances that it creates with the points already in the substructure. Only if these are part of the distance list do we add this new point to the structure. Whenever a new point is added to the substructure, we note the 3 bond lengths (two from the new triangle created and the third is the bridge) that were used and make them unavailable during further reconstruction. This reduces the list of available distances by 3. When placing the nth point, if we update all n − 1 distances created between the new point and the points already in the substructure, this reduces the number of available distances substantially, but we found that it does not lead to any significant speedup in the buildup routine. If after doing the buildup we still do not have the desired number of points (N), we relax the tolerance for the bridge bond checks and rerun the buildup procedure. If we are still short, we choose a different bond as the base and attempt to do the buildup using that bond. Once we have a full reconstruction, we calculate the distance error, which is based on the agreement between the given distance list and the distances derived from the reconstructed structure. We also calculate the overlay error (Eq. 3.2) using the structure comparison routine, which is useful for testing purposes. The Tribond algorithm was run for N = 8, 16, ..., 512 and the computational cost for the core finding and buildup stages was recorded. The cost is the number of bridge bonds that were checked when placing a point in the structure. It is useful as it is a system-independent measure of the cost. The timing runs are given in Fig. 3.6. We can see that the time required for doing the buildup is about an order of magnitude less than that for the core finding stage. The scaling is $\tau_{total} \sim N^{3.32}$, which shows that our algorithm has a polynomial run time, albeit a higher order one.
This scaling proves to be better than our estimate obtained earlier using simple combinatorial arguments, as we had not accounted for the speedup obtained by using the triangle inequality.

[Plot: number of bridge bond checks (core finding, buildup, total) vs. number of sites, N = 8 to 512]
Figure 3.6: Experimental results for a series of reconstructions from distance lists generated from random point sets in two dimensions. The time for finding the core, the time for doing the buildup starting with the core and the total time are presented, measured as the number of bridge bond checks that were performed, as a function of the number of sites. Bridge bond checking is a fundamental process in Tribond and provides a system-independent measure of computational time. Each point on the plots is an average over 25 different instances of random point sets. We find that the total time scales as $\tau_{total} \sim N^{3.32}$.

Figure 3.7: A perturbed graphene cut out made from 144 atoms. The Tribond algorithm successfully reconstructed a similar structure in a few minutes.

3.3 Applications

In the previous section we showed that Tribond is able to reconstruct random point sets. We now try to solve some structures which occur in the real world. Structures occurring in nature are usually symmetric, but because of finite size effects they have defects which cause small deviations from their “ideal” locations. Fig. 3.7 shows a graphene nanoparticle with 144 atoms. The location of each point differs from its “ideal” one by a small noise added to simulate natural imperfections. Tribond reconstructed this graphene sheet in a few minutes. Tribond can also solve 2D polymers modeled by a self-avoiding random walk (Fig. 3.8). If the polymer is modeled as a random walk in the continuum, then it is just like a random point set. Since we have all the pairwise distances for the structure, it forms a complete graph.
Figure 3.8: A self-avoiding walk is a sequence of moves that does not visit the same point more than once and is used to model polymers. Tribond was able to successfully reconstruct the above structure (N = 100) in a few minutes.

From graph theory we know that every complete graph is Hamiltonian, i.e. there is a path that visits every point exactly once. Hence polymers modeled as a self-avoiding walk equate to a random point set for reconstruction purposes. We are able to reconstruct polymers (like in Fig. 3.8) knowing only their unassigned distances using the Tribond algorithm. We also reconstructed a gently perturbed 100 site point set on a square grid, as shown in Fig. 3.9, in a few minutes with our algorithm.

3.3.1 Tribond for structures with high symmetry

We refined the idea behind Tribond to make a modified algorithm that can deal with symmetric structures, which have a highly degenerate distance list. We use only one instance of each distance from the given distance list and find a core. During the buildup, at each step of the reconstruction we use only one instance of each distance and keep track of its multiplicity. So, at any given time, the distances that are available to the algorithm are all unique. This cuts down on the number of bad cores and points and helps guide our search. Using this modified approach we have been able to solve square grids with up to N = 1024 (32 × 32) points in under 10 minutes on a desktop computer (but there is some trouble for N = 400, 676, 900 as the algorithm tries to grow into a bigger lattice instead of completing the grid).

3.3.2 Reconstruction from an imprecise distance list

So far, we always started with a distance list whose entries were known to a very high precision of 18 digits. Imprecise distance lists have less information in them and that makes it more difficult to solve our inverse problem. So we modified our original algorithm to deal with this situation as follows.
1.
Start with a core (or substructure) and an empty pool which can save the coordinates and their associated cost for up to a maximum of 20 candidate points.
2. We randomly choose a bond in the substructure as the base for our buildup. Then we search over all sets of two edges from the distance list to make a test triangle and the new vertex is our test point. Evaluate the complete cost of the new substructure if this test point were to be added. If this cost is low and is less than that of the worst point in the pool, then add this point to the pool in the correct place based on its cost. Remove the worst point from the pool so that its size never exceeds the maximum.
3. Now choose another base bond randomly in the current substructure and repeat the previous step. Combine the two pools obtained so far based on the cost such that we have up to 20 candidate points.
4. Iterate over all possible 2 point combinations (in 20 choose 2 ways) from the pool and find the pair which will have the minimum cost if added to the substructure.
5. Add the 2 points found in the above step to get a bigger substructure. If its size is < N, then go to step 1.

Figure 3.9: Gently perturbed square grid made of 100 sites. Our algorithm was successfully able to solve such a structure in a few minutes.

As compared to our buildup procedure (when we have precise data), we now do the buildup in multiple stages, first generating a pool of candidate points which have a low error with respect to the current substructure (based on single point addition). Then the 2 points from the pool which have the lowest error are added to the substructure. We do this iteratively until we have the complete structure. As we gradually grow the structure and generate the pool of candidate points multiple times, we avoid all the bad points (low cost but wrong) and correctly guide the search. Our results can be seen in Fig.
3.10, which shows the minimum core size needed to reconstruct a structure of size N = 26, 50, 76, 100 for different values of the precision (P) of the input distance list. The precision of the distances is measured in number of digits. Our criterion for success was that the algorithm should be able to successfully reconstruct at least 5 out of 10 different random point set structures. We can see that as the distances become less precise, a core of a larger size is needed for successful reconstruction. The typical run time for N = 26, 50, 76, 100 was about 1 minute, 10 minutes, 4 hours and 20 hours respectively, on a node in our high performance computing center. When the input data has a higher precision (P > 8) than what is shown in the plot, we found that a core size of 4 was sufficient to reconstruct the structure.

[Plot: minimum initial substructure size vs. precision of distance list (number of digits), curves for N = 26, 50, 76, 100]
Figure 3.10: Plot of minimum core size vs precision of the input distance list for N = 26, 50, 76 and 100. We can see that a bigger core is needed for a less precise distance list.

3.4 Summary

In this chapter, the details of the Tribond 2D algorithm for the reconstruction of low symmetry structures represented by random point sets were presented. The Tribond 2D algorithm consists of two steps: core finding and buildup. The core is the smallest substructure with at least one overconstrained bond and is of size 4 in two dimensions. Choosing the smallest bond as the base bond for reconstruction led to a dramatic improvement in performance. The computational cost of core finding was orders of magnitude more than that of buildup. Tribond 2D was able to reconstruct random point sets in 2 dimensions of size ∼ 1000 in about 24 hours on a desktop computer when given precise distances. A modified approach was presented for the buildup step with less precise data and given a known substructure.
As precision decreases, it is clear that we need a substructure of larger size, underscoring the importance of the core finding step. We successfully reconstructed random point sets of size 100, with the distances having 4 digits of precision, given a known substructure of size 24.

Chapter 4 The Tribond 3D algorithm

The extension of the Tribond algorithm to three dimensions is discussed in this chapter.

4.1 Tribond 3D algorithm

In three dimensions the smallest structure with at least one overconstrained bond is N = 5, where the total number of bonds is $\binom{5}{2} = 5 \times 4/2 = 10$, while the number required for isostaticity is (from Eq. 3.1) $3N - 6 = 9$. The key observation is that if we find ten Euclidean distances that form a point set structure, and the cost function for this structure and these distances is zero, then we have found a unique substructure. We call a zero cost correct substructure with ten distances and five sites a core. If the distance list is non-degenerate, then with high probability, this core is a correct substructure of the target structure. We may then build up from the core iteratively to find the complete structure. At each step we have an existing, correct substructure. We then add one site and search for four edges that are compatible with the new node and with four nodes that are in the existing structure. The addition of one site and three edges is an isostatic addition, while the addition of one site and four edges is overconstrained. If we find four edges compatible with one additional site then, with high probability, this site is part of the target structure. In practice, to construct a core (Fig. 4.1) we choose the smallest bond as the “base bond”. We then test all the bond combinations using the triangle inequality to generate feasible tetrahedron pairs. This is performed in two steps: first we fix a tetrahedron as the “base tetrahedron” and then search through all other candidate (“top”) tetrahedra that share the same base triangle. After all the top tetrahedra have been exhausted, a new base tetrahedron is selected and the process continues. For every tetrahedron pair we calculate the length of the bond that connects the two apex points, which we call the bridge bond. The length of the bridge in the candidate core is tested against the lengths in the distance list. If the candidate bridge length matches an unused distance in the distance list, we have found a core.

Figure 4.1: (color online) An example of a core. In 3D, it consists of 5 points. The points at the top and at the bottom are the apex points. The three points in the middle form the base triangle (in black). The base triangle along with the apex point at the bottom forms the base tetrahedron (in blue), while the base triangle along with the apex point at the top forms the top tetrahedron (in red). The vertical bond connecting the two apex points is the bridge (in green).

In the buildup procedure, we try to add more sites to the core. The addition of a site consists of generating candidate top tetrahedra using the base triangle and three distances from the distance list. After we place this site, we carry out bridge testing to determine whether the structure has zero strain energy. While core finding requires a search over all possible base and top tetrahedra, buildup requires only a search through top tetrahedra as the base tetrahedron is a known part of the structure. Consequently, buildup requires significantly fewer computations than core finding. Our Tribond implementation of the above procedure for the unassigned PD-IP algorithm may be summarized as follows: We are given the sorted distance list $\{d_l\}$ with the number of nodes in the network N. (The target network is generated by randomly placing N points in a cubic box with side of length N.) We start with an empty set, then

A. Core finding procedure
1.
Choose the shortest bond as the base bond and a window (subset) of the W = 10 smallest entries in the distance list for the core finding search.
2. Iterate over all triangles constructed with the triangle inequality that have the same base bond using distances in the window W and generate tetrahedra.
3. Search over all pairs of the feasible tetrahedra generated above and calculate the bridge bond. Using a binary search, test if there is an unused distance that matches the bridge bond. If such a value is found, we have a core. Remove the edges used from the distance list and exit.
4. Increment W by 10 and return to (1), making sure not to retest bond combinations.

B. Buildup procedure
1. Search over all sets of three edges from the distance list to find a set compatible with the base tetrahedron in the existing structure. Search over the distance list to test the bridge bond.
2. If successful, remove from the distance list the edges that are used in connecting the newly added node. If the size of the reconstructed structure is < N, return to the previous step and resume the search.

A coarse upper bound on the computational time for this procedure consists of two parts: (i) the time to find the core; (ii) the time to carry out the buildup procedure. The number of unique cores in the point set is $\binom{N}{5}$, the number of ways of choosing 5 sites from N total sites. The number of ways of choosing ten distances from the set of $M = N(N-1)/2$ distances is $\binom{M}{10}$. If we had done a brute force search then we would find a core in computational time $\tau_{core} \sim \binom{M}{10} / \binom{N}{5} \sim N^{15}$. Similarly, using brute force for the buildup would take a computational time that scales as $\tau_{buildup} \sim \binom{M}{4} \sim N^8$. This clearly shows that the brute force approach is polynomial, although a high order one. The simple methods we have developed reduce the computational time significantly from the coarse upper bounds of the last paragraph.
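The exponents of these coarse bounds can be checked numerically with exact binomial coefficients. The sketch below estimates the local log-log slope of the brute force costs between N and 2N; the function names are hypothetical, and the 2D case (6 bonds, 4 sites) is included alongside the 3D case (10 bonds, 5 sites) for comparison.

```python
import math


def log_brute_force_core_cost(n, dim):
    """log of C(M, bonds) / C(n, sites) for the smallest overconstrained core."""
    sites = dim + 2                    # 4 sites in 2D, 5 sites in 3D
    bonds = sites * (sites - 1) // 2   # 6 bonds in 2D, 10 bonds in 3D
    m = n * (n - 1) // 2
    return math.log(math.comb(m, bonds)) - math.log(math.comb(n, sites))


def log_brute_force_buildup_cost(n, dim):
    """log of C(M, dim + 1): edge sets tried when adding one overconstrained site."""
    m = n * (n - 1) // 2
    return math.log(math.comb(m, dim + 1))


def local_exponent(log_cost, n, dim):
    """Estimate d log(cost) / d log(N) from the values at N and 2N."""
    return (log_cost(2 * n, dim) - log_cost(n, dim)) / math.log(2)


if __name__ == "__main__":
    for dim in (2, 3):
        core = local_exponent(log_brute_force_core_cost, 10_000, dim)
        build = local_exponent(log_brute_force_buildup_cost, 10_000, dim)
        print(f"{dim}D: core exponent ~ {core:.3f}, buildup exponent ~ {build:.3f}")
```

For large N the slopes approach the powers quoted in the text, which confirms the leading-order counting without relying on the asymptotic approximations.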
The key observation is that many of the distances in the distance list violate the triangle inequality $d_1 + d_2 \geq d_3$. A large fraction of the computational time in a brute force search is spent exploring these trivially inconsistent distance combinations. If we fix the base bond and the bridge bond is found using binary search, simple combinatorial arguments give $\tau_{core} \sim \binom{M}{8} \ln(N) / \binom{N}{3} \sim N^{13} \ln(N)$. For a triangle with base bond a and second side b, the range of values for the third side c is $(b - a, b + a)$. So a larger base bond allows a much larger range of feasible values for the third side and, hence, the number of feasible triangles and tetrahedra increases.

[Plot: number of feasible tetrahedra vs. fractional position of base bond in the distance list]
Figure 4.2: The number of feasible tetrahedra using the bonds from a given distance list goes up when we choose a larger bond as base for the base triangle. Statistically, using the shortest bond in the distance list as the base bond leads us to the core in the shortest time. This plot shows data from runs using 10 different structures with N = 20.

But the actual number of triangles and tetrahedra in the target structure is the same for any choice of base bond. This is seen in Fig. 4.2, where the number of feasible tetrahedra increases with the fractional position of the base bond in the distance list. Hence, statistically, we find a core in the least time if we choose the shortest bond as our base. Distances are also more likely to satisfy the triangle inequality if they are drawn from a list of comparable, rather than disparate, lengths. Since the base bond is short, a core is more likely to be found quickly by searching over other short distances first (the small-core hypothesis, Fig. 4.3), and including longer distances only as necessary. This is implemented as a window of the W shortest distances in the distance list, which is increased periodically as core finding proceeds.
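One way to grow the window without retesting bond combinations is to enumerate, after each increment, only those subsets that contain at least one newly admitted distance. The sketch below works on indices into the sorted distance list; the names `new_combinations` and `all_windowed` are hypothetical and the actual bookkeeping in Tribond may differ.

```python
from itertools import combinations


def new_combinations(window_old, window_new, k):
    """Yield k-subsets of range(window_new) not contained in range(window_old).

    Every yielded tuple holds at least one index from the newly added part
    of the window, so combinations tested for the old window never recur.
    """
    for j in range(1, k + 1):
        for new_part in combinations(range(window_old, window_new), j):
            for old_part in combinations(range(window_old), k - j):
                yield old_part + new_part


def all_windowed(window_sizes, k):
    """Collect the combinations produced while the window grows step by step."""
    seen = []
    previous = 0
    for size in window_sizes:
        seen.extend(new_combinations(previous, size, k))
        previous = size
    return seen


if __name__ == "__main__":
    combos = all_windowed([3, 6, 8], 3)
    print(len(combos), len(set(combos)))  # both counts agree: no repeats
```

Across all increments each subset of the final window is produced exactly once, so the total work equals that of a single pass over the largest window that was needed.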
Of the ten bonds in the core, the base is fixed, eight are drawn from the window and the bridge bond need not be in the window. It is observed that for small structures, a window of size W ∼ N is usually sufficient to find a core. Therefore, the typical computation time is $\tau_{core} \sim \binom{N}{8} \ln(N) \sim N^8 \ln(N)$. From these arguments, supported by Fig. 4.2 and Fig. 4.3, we expect that using the smallest bond as the base will lead to core finding and buildup in a much shorter time. Fig. 4.4 shows the improvement is about 2 orders of magnitude. Attempting to find the core for large point sets (N > 10) frequently leads to bad cores. Bad cores are overconstrained substructures whose distances are part of the given distance list within a given numerical tolerance, but the substructure is not present in the target structure. This occurs due to finite tolerance when checking for the bridge bond and also finite precision while placing the points using triangulation. Triangles with both small and large distances are likely to have small angles, resulting in a greater loss of numerical precision. A base bond of intermediate length would limit this loss, but the gain is not sufficient to justify forsaking the performance benefits of a small base bond previously outlined. Instead, we try to use all 10 bonds (in the core) as the base bond and check if the corresponding bridge bond is valid or not. We only take cores for which the bridge bond is valid in all of the 10 cases. This check is very good at identifying bad cores.

[Plot: bond window for core vs. fractional position of base bond in the distance list]
Figure 4.3: Empirical example of the small-core hypothesis. The hypothesis states that there exists a core where at least 9 of the 10 total bonds are drawn from a relatively small window of the shortest bonds in the structure. Varying the base bond’s fractional position in the distance list for ten different N = 50 structures, core finding shows that using the smallest distance as the base bond reduces the typical size of the window required to find a core by an order of magnitude.

[Plot: number of bridge bond checks (core finding, buildup, total) vs. fractional position of base bond in the distance list]
Figure 4.4: Figure illustrating the effect of base bond size on the computational cost (bridge bond checks) of reconstruction for N = 10. The plots for the total and core finding steps are nearly indistinguishable because the core finding is orders of magnitude more expensive than buildup. If the smallest bond is chosen as the base, the total computational cost of reconstruction is nearly 2 orders of magnitude lower than for larger bonds.

A structure comparison routine provides another test for bad cores by overlaying the points in the reconstructed structure (r) with the points in the target structure (R) and calculating an overlay error,

$\epsilon_{overlay} = \sum_i |r_i - R_i|^2$.   (4.1)

This error tells us if a given (sub)structure is part of the target structure. This proves useful for testing purposes only, since in a practical application the target structure is not known ahead of time. It is also used to verify the correctness of the final structure. If the buildup step fails to add any points after looping over a certain number of bonds from the distance list, likely due to a bad core, then we discard the substructure. We resume the core finding step and attempt another buildup from a new core. This heuristic helps identify probable bad cores efficiently. It is important to choose an appropriate tolerance when checking if the bridge bond is part of the distance list. Using a very loose tolerance leads to a large number of bad cores. On the other hand, using a very tight tolerance excludes good cores, due to finite precision when carrying out the triangulation to place the points in our substructure. We use floating point numbers for the input distances which have an accuracy of 18 digits.
We found that a relative tolerance of $10^{-12}$ is optimal to retain good cores and filter out bad ones. When trying to place points which are nearly collinear with the base bond, a loss of precision is observed due to small angles, as discussed earlier. In such situations we relax the tolerance when checking for the bridge bond.

To check the validity of a new point while doing buildup, in addition to the bridge bond check, we check 10 additional distances that it creates with the points already in the substructure. Only if these are part of the distance list does the new point get added to the structure. The 4 bond lengths that were used (three from the newly created tetrahedron and the fourth being the bridge) are removed from further reconstruction. This reduces the list of available distances by 4. After placing the nth point, updating all (n − 1) distances created between the new point and the points already in the substructure reduces the number of available distances substantially. However, due to the computational cost of this update procedure, we see only a small speedup in the buildup routine.

If, after buildup, the structure has fewer than the desired number of points (N), we relax the tolerance for the bridge bond checks and rerun the procedure. If the structure remains incomplete, we choose a different bond as the base bond and attempt another buildup. After reconstruction, we calculate the distance error, which is based on the agreement between the given distance list and the distances derived from the final structure.

The Tribond algorithm was run for N = 6, 7, 8, ..., 12 and the computational cost was measured in a system-independent manner by counting the number of bridge bond checks while placing a point in both the core finding and buildup steps (Fig. 4.5). The time required for buildup is several orders of magnitude less than that for core finding.
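The distance checks described above, which test whether a computed distance belongs to the distance list within a relative tolerance, can be sketched as follows. This is a simplified illustration under stated assumptions, not the Tribond code: the helper names `in_distance_list` and `point_is_consistent` are hypothetical, the sketch checks the new point against every placed point rather than a fixed number of them, and it does not remove used distances from the list.

```python
import math
from bisect import bisect_left


def in_distance_list(sorted_distances, d, rel_tol=1e-12):
    """True if some entry of the sorted list matches d within rel_tol."""
    i = bisect_left(sorted_distances, d)
    # The nearest entry to d is either just below or at the insertion point.
    for j in (i - 1, i):
        if 0 <= j < len(sorted_distances) and math.isclose(
            sorted_distances[j], d, rel_tol=rel_tol, abs_tol=0.0
        ):
            return True
    return False


def point_is_consistent(new_point, placed_points, sorted_distances, rel_tol=1e-12):
    """True if every distance from new_point to the placed points is listed."""
    return all(
        in_distance_list(sorted_distances, math.dist(new_point, p), rel_tol)
        for p in placed_points
    )


# Toy example: three placed points plus the candidate (0, 0, 1).
placed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
root2 = math.sqrt(2.0)
distance_list = sorted([1.0, 1.0, root2, 1.0, root2, root2])
good = point_is_consistent((0.0, 0.0, 1.0), placed, distance_list)
bad = point_is_consistent((0.0, 0.0, 2.0), placed, distance_list)
```

Relaxing the tolerance for nearly collinear placements, as the text describes, would correspond to passing a larger `rel_tol` for those checks.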
As core finding is computationally very expensive, complete reconstruction was attempted only for small structures. Assuming that the core is given, buildup was attempted for larger structures having size N = 25, 50, 75, 100. The timing results are shown in Fig. 4.6.

Figure 4.5: Experimental results for a series of reconstructions from distance lists generated from random point sets in three dimensions. The computational cost (bridge bond checks) for finding the core, performing buildup and their total is presented as a function of the number of points. The plots for the total and core finding steps are nearly indistinguishable because core finding takes orders of magnitude more time than buildup. Each point on the plots is the median value from 10 different instances of random point sets. (Axes: number of bridge bond checks vs. number of sites, N = 6 to 12.)

Figure 4.6: Experimental results for a series of reconstructions from distance lists generated from random point sets in three dimensions. The computational cost (bridge bond checks) for performing buildup is presented as a function of the number of points. Each point on the plots is the average over 10 different instances of random point sets. We find that the buildup time scales as $\tau_{buildup} \sim N^{4.98}$. (Axes: number of bridge bond checks vs. N.)

4.2 Applications

Tribond 3D was used to solve the structure of some well known organic molecules and the results were compared with those from the Liga algorithm. While Liga was able to do the complete reconstruction for 19 structures, Tribond was able to do so for 14 of them. If a starting 5 atom structure is given, then Tribond is successful in reconstructing 56 structures, while Liga can reconstruct only 21. These organic molecules have a large number of unique distances and are of intermediate symmetry.
Hence, Tribond is more successful in doing the buildup as compared to Liga. Tribond first attempted core finding and buildup for 48 hours. If the core finding step did not succeed, then only the buildup was attempted for 2 hours.

Table 4.1: The following table lists the results, where success is denoted by 1 and failure by 0. CF stands for core finding and BU for buildup.

structure          N   Liga-BU  Liga  Tribond-CF  Tribond-BU  Tribond
2me-3ane           14  1        1     1           1           1
adrenln            24  0        0     0           1           0
alanine            13  1        1     0           0           0
arginine           27  0        0     0           1           0
asa                21  0        0     0           1           0
asparagine         17  0        0     0           1           0
aspartate          15  0        0     0           1           0
aspirin            21  0        0     0           1           0
b-10ane            47  0        0     0           1           0
b-11ane            71  0        0     0           1           0
borane01           44  0        0     0           1           0
butane-a           14  1        1     1           1           1
butane-e           14  1        1     1           1           1
butane-g           14  1        1     1           1           1
caffeine           24  0        0     0           1           0
carboplatin        23  0        0     0           1           0
cdecalin           28  0        0     0           1           0
cholicac           69  0        0     0           1           0
cisplatin          11  1        1     1           1           1
cubane             16  1        1     0           1           0
cy-5ane            15  0        0     1           1           1
cystine            14  1        0     0           1           0
d-7ane             32  0        0     0           1           0
ethane             8   1        1     1           1           1
ethanol            9   1        1     1           1           1
glutamate          19  0        0     0           1           0
glutamine          20  0        0     0           0           0
glycine            10  1        1     1           1           1
heptane            23  1        1     0           1           0
histidine          20  0        0     0           1           0
i-cy5ane           21  0        0     0           1           0
isoleucine         22  0        0     0           0           0
leucine            22  0        0     0           1           0
lsd                49  0        0     0           1           0
lysine             25  0        0     0           1           0
menthol            31  0        0     0           1           0
methane            5   1        1     1           1           1

Table 4.1 (cont'd).
structure          N   Liga-BU  Liga  Tribond-CF  Tribond-BU  Tribond
methanol           6   1        1     1           1           1
methionine         20  0        0     0           0           0
mustardgas         15  1        1     0           1           0
nicotine           26  0        0     0           1           0
octane             26  1        1     0           1           0
pbpc               41  0        0     0           1           0
pentane            17  1        1     1           1           1
phenylalanine      23  0        0     0           0           0
piperine           43  0        0     0           1           0
proline            17  0        0     0           1           0
propane            11  1        1     1           1           1
qcyclene           15  0        0     0           1           0
quinine            48  0        0     0           1           0
r2bu-ts            31  0        0     0           1           0
rr-tacid           16  0        0     0           1           0
rs-tacid           16  0        0     0           1           0
s34ane             22  0        0     0           1           0
serine             14  1        0     0           1           0
srdimecp           15  0        0     0           1           0
ssdimecp           15  0        0     1           1           1
threonine          17  0        0     0           1           0
tnt                21  0        0     0           1           0
transplatin-hack   11  1        1     0           1           0
tryptophan         27  0        0     0           1           0
tyrosine           24  0        0     0           1           0
valine             19  0        0     0           0           0
valium             33  0        0     0           1           0
vanillin           19  1        1     0           0           0
total                  21       19    14          56          14

Figure 4.7: Buildup for LSD (top) and Caffeine (bottom) molecules was done in 48.9 seconds and 2.1 seconds respectively.

Figure 4.8: Buildup for Cystine (top) and Lysine (bottom) molecules was done in 0.24 seconds and 2.8 seconds respectively.

Figure 4.9: Buildup for Quinine molecule was done in 84.4 seconds.

4.2.1 Reconstruction from an imprecise distance list

Thus far distances have been known to a precision of about 18 digits, such that in our trials substructures are indistinguishable (to within a very small tolerance) from those consistent with a theoretical distance list of infinite precision. When we have an imprecise distance list, many small substructures may be consistent with the distance list even though they are not part of the target structure. The inverse problem under these conditions is significantly more challenging, both theoretically and practically.

We have attempted to address structure buildup from a known core with an imprecise distance list in the case of random point sets. The modification of the original buildup algorithm described in Section 4.1 is as follows. Assume a known substructure that serves as the core (seed) for reconstruction.
The modified buildup step (adding a point) now has multiple stages; it uses a pool of candidate points which have low error with respect to the current substructure, and adds the two candidates which jointly lead to the lowest cost substructure. Because the pool examines many possible ways to grow the substructure, the likelihood of adding bad points is reduced. Adding two points at once is justified empirically, as this appeared to give the most acceptable trade-off between success and run time. The detailed steps follow.

1. Define an empty pool that saves the coordinates of k1 ≃ 20 candidate points to add to the current substructure. Associated with each candidate is the cost of the new substructure if that point were added. Populate the pool with candidate points as follows. First, randomly choose a triangle in the current substructure. Generate all tetrahedra using distances from the distance list which share the chosen triangle. Calculate the cost for each candidate point (the new vertices). If this cost is below a user-defined threshold, add the candidate to the pool, and if the pool exceeds its maximum size remove the worst candidate. The threshold significantly improves run time without affecting the final structure.

2. Randomly choose another triangle in the current substructure and generate a new pool of size k2 ≃ 20 as described above.

3. Select the best candidates from either pool to make a combined pool with k ≃ 20 points.

4. Calculate the pair cost for adding 2 candidates to the current substructure for each of the $k^2$ possible pairs.

5. Add the 2 candidates with the least pair cost to the substructure. If its size is less than the target size then go to step 1.

The results can be seen in Fig.
4.10, which shows the minimum core size needed to reconstruct structures of size N = 9, 17 and 25 for different values of the precision (P) of the input distance list. The criterion for success was that the algorithm successfully reconstructs at least 5 of 10 different random point sets. It can be seen that as the distances become less precise, a core of a larger size is needed for successful reconstruction.

Figure 4.10: Plot of minimum core size vs precision of the input distance list for N = 9, 17 and 25. We can see that a bigger core is needed for a less precise distance list. The typical run time for N = 9, 17, 25 was about 1 second, 20 minutes and 15 hours respectively, on a computer with a 2.2 GHz processor and 2 GB of memory. (Axes: minimum core size vs. precision of distance list.)

A notable case with imprecise distances is the PDF of nanostructured materials, which can give distance lists with uncertainties of order 0.01 Å. For a typical nanoparticle of size ∼ 15 Å, this means the input distances from experimental data will have 3-4 digits of precision, and the algorithm is a promising approach. Future work will involve developing an algorithm that can better deal with missing, incorrect or less precise distances. Chemical information such as the presence of functional groups (aromatic rings, etc.) can serve as a core and also help construct the larger core necessary for buildup in the case of less precise distances. Some approaches to these issues are discussed in the context of reconstructing high symmetry nanostructures from experimental PDF data using the Liga algorithm [28, 29, 30]. A hybrid approach using Tribond (low symmetry) and Liga (high symmetry) could potentially solve structures of intermediate symmetry.

4.3 Summary

In this chapter, the details of the Tribond 3D algorithm for the reconstruction of low symmetry structures represented by random point sets were presented.
The Tribond 3D algorithm consists of two steps: core finding and buildup. The core is the smallest substructure with at least one over-constrained bond and is of size 5 in three dimensions. Choosing the smallest bond as the base bond for reconstruction led to a dramatic improvement in performance. The computational cost of core finding was orders of magnitude higher than that of buildup. Tribond 3D was able to reconstruct random point sets in 3 dimensions of size about ten in a short amount of time, and if the core is assumed to be given it is able to complete the reconstruction for structures with size N = 100 in about 2 hours on a desktop computer. A modified approach was presented for the buildup step with less precise data and a known substructure. As precision decreases, it is clear that we need a substructure of larger size, underscoring the importance of the core finding step.

Chapter 5

Statistical physics of the optimal Golomb ruler

In this chapter, a statistical physics approach to the combinatorial optimization problem of the optimal Golomb ruler (OGR) is taken and the resulting phase transition is studied.

5.1 Statistical mechanics formulation

We define the Golomb lattice gas on a chain of length L, where each site i = 1, 2, ..., L of the chain has a lattice gas variable $y_i$ that may take the values zero or one. If $y_i = 1$ the site i is occupied, while if $y_i = 0$ it is unoccupied. For example, one of the two degenerate n = 4 OGR states has marker set {m} = {0, 1, 4, 6}. The lattice gas representation of this OGR is a chain of length L = 7, with site occupancies $\{y_i\}$ = {1, 1, 0, 0, 1, 0, 1}. We introduce the Golomb lattice gas Hamiltonian, which consists of a chemical potential term and an energy term associated with the Golomb ruler constraints,

\[ H_1 = -\mu \sum_{i=1}^{L} y_i + \gamma \sideset{}{'}\sum_{j \neq i,\, l \neq k} y_i y_j y_k y_l \, \delta(|j - i| - |l - k|). \tag{5.1} \]

The chemical potential (µ) is the amount by which the energy of the system would change if an additional site were occupied.
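The lattice gas representation in the example above can be made concrete with a short sketch. It maps a marker set to site occupancies and counts how often each distance between occupied sites occurs. The penalty used here, D(D − 1) summed over distances, is an assumption chosen for illustration: it vanishes exactly when no distance repeats, but it is not claimed to reproduce the combinatorial factors of the constraint term in Eq. 5.1. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def occupancies(markers):
    """Site occupancies y_i for a marker set such as {0, 1, 4, 6}."""
    y = [0] * (max(markers) + 1)
    for m in markers:
        y[m] = 1
    return y


def distance_counts(y):
    """Number of times each distance occurs between occupied sites."""
    sites = [i for i, occupied in enumerate(y) if occupied]
    return Counter(j - i for i, j in combinations(sites, 2))


def repeated_distance_cost(y):
    """Illustrative penalty: zero if and only if all distances are distinct."""
    return sum(count * (count - 1) for count in distance_counts(y).values())


def lattice_gas_energy(y, mu, gamma):
    """Chemical potential term plus the illustrative constraint penalty."""
    return -mu * sum(y) + gamma * repeated_distance_cost(y)


ruler = occupancies([0, 1, 4, 6])
```

For the n = 4 ruler above the six distances 1, 2, 3, 4, 5 and 6 each occur once, so the penalty is zero and only the chemical potential term contributes to the energy.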
The first term tries to maximize the density of our lattice gas. The prime on the second sum indicates that degenerate cases where both j = l and i = k are omitted. The second term imposes the Golomb ruler constraints that the distances between occupied sites should be non-degenerate. The parameter γ tunes the constraint energy and in the limit γ/µ → ∞, and at low temperature, OGR states are the ground states of this Hamiltonian. This is due to the fact that OGR states make no contribution to the constraint energy, while the lowest energy state has the highest density, ensuring the optimal energy from the chemical potential term.

An alternative formulation is to define a distance degeneracy function, D(d), in terms of the occupancy variables $y_i$ through

\[ D(d) = \sum_i y_i y_{i+d}, \]

which gives the number of times a distance, d, appears in a marker set {m}. Since n occupied sites form n(n − 1)/2 pairs, we have the property

\[ \sum_d D(d) = n(n-1)/2. \]

Moreover we also have

\[ \sum_d [D(d)]^2 \geq n(n-1)/2, \]

where equality holds iff the distances satisfy the uniqueness condition. The uniqueness condition is then imposed by the equation

\[ \sum_d [D(d)]^2 = n(n-1)/2 = \sum_d D(d). \]

This motivates the introduction of an alternative energy function,

\[ E_2 = \sum_d \left[ (D(d))^2 - D(d) \right]. \]

This energy or cost function is always a positive integer or zero, with zero being attained for OGR marker sets. This leads to the lattice gas Hamiltonian

\[ H_2 = -\mu \sum_{i=1}^{L} y_i + \gamma_2 \sum_{d=1}^{L-1} \left[ (D(d))^2 - D(d) \right], \]

which in lattice gas variables is

\[ H_2 = -\mu \sum_{i=1}^{L} y_i + \gamma_2 \sum_{d=1}^{L-1} \Big( \sum_i y_i y_{i+d} \Big) \Big( \sum_i y_i y_{i+d} - 1 \Big). \tag{5.2} \]

It is easy to show that the Hamiltonians obtained in these two different ways are indeed equivalent. In Eq. 5.1 the second sum can be broken into 2 parts: an exhaustive count over all possible index combinations, and a part which subtracts off the binary terms,

\[ \sideset{}{'}\sum_{j \neq i,\, l \neq k} = \sum_{i,j,k,l} - \sum_{j = i,\, l = k}. \]

The term with the binaries leads to the factor of −1 in Eq. 5.2. In Eq. 5.2 let j = i + d and, with k the site index of the second sum, l = k + d; thus we have l = k + j − i.
The −1 term accounts for the binaries, which have to be subtracted off because they are the cases where (i, j) = (k, l). The difference of the two sums gives us the second sum in Eq. 5.1.

In the low temperature limit and with γ → 0 the chemical potential is the only term and the sites are all occupied, so the density is $\rho = \sum_i \langle y_i \rangle / L = 1$. At high temperatures, entropy is maximized as empty and occupied sites occur with equal probability, so that ρ = 1/2. Three limiting states of the Golomb lattice gas are then: (i) the dense phase ρ = 1, (ii) the high temperature phase ρ = 1/2 and (iii) the OGR state occurring at low temperatures and as γ/µ → ∞. The mean field analysis also confirms these three limiting states.

In the OGR phase, the density approaches zero in the large lattice limit. The way in which it approaches zero can be estimated based on probabilistic reasoning as follows. Define the probability that both sites i and j are occupied to be $D_{ij}$. This probability can be related to the probability that any other pair of sites (k, l) share the same distance through

\[ D_{ij} = \prod_k \left( 1 - D_{k,\,j+k-i} \right). \]

Within a uniform approximation, this reduces to $D = (1 - D)^L$, or $D^{1/L} = e^{\ln D / L} = 1 - D$. Solving to leading order gives D ∼ 1/L. Since the average density is ρ = n/L and $D \sim \rho^2$, we have $(n/L)^2 \sim 1/L$, so that $n \sim L^{1/2}$, which is consistent with rigorous bounds derived from analysis of Sidon sets [47]. A rigorous upper bound on the length of optimal Golomb rulers, $m_n \leq n(n + 1)$, indicates that the density of the high constraint ground state goes to zero at large L as $\rho_{OGR} \propto a / L^{1/2}$. Now we explore the statistical physics of the Hamiltonian using an effective medium approach that contains the exact OGR state as a limiting solution.

5.2 Mean field approach

The Golomb lattice gas mean field theory is developed in the usual way, by writing $y_i = \rho_i + \delta y_i$, where $\delta y_i = y_i - \rho_i$ is the fluctuation and $\rho_i = \langle y_i \rangle$ is the average density at site i. Substitute this into Eq. 5.1.
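\[ H_{MF} = -\mu \sum_i (\rho_i + \delta y_i) + \gamma \sideset{}{'}\sum_{j \neq i,\, l \neq k} (\rho_i + \delta y_i)(\rho_j + \delta y_j)(\rho_k + \delta y_k)(\rho_l + \delta y_l) \, \delta(|j - i| - |l - k|) \]

Now, keep terms up to first order in the fluctuations $\delta y_i$ and ignore higher order terms.

\[ H_{MF} = -\mu \sum_i (\rho_i + \delta y_i) + \gamma \sideset{}{'}\sum_{j \neq i,\, l \neq k} (\rho_i \rho_j \rho_k \rho_l + \rho_j \rho_k \rho_l \, \delta y_i + \rho_i \rho_k \rho_l \, \delta y_j + \rho_i \rho_j \rho_l \, \delta y_k + \rho_i \rho_j \rho_k \, \delta y_l) \, \delta(|j - i| - |l - k|) \]

Then substitute $\delta y_i = y_i - \rho_i$ to get an equation which only has terms in $y_i$ and $\rho_i$.

\[ H_{MF} = -\mu \sum_i y_i + \gamma \sideset{}{'}\sum_{j \neq i,\, l \neq k} (-3 \rho_i \rho_j \rho_k \rho_l + y_i \rho_j \rho_k \rho_l + y_j \rho_i \rho_k \rho_l + y_k \rho_i \rho_j \rho_l + y_l \rho_i \rho_j \rho_k) \, \delta(|j - i| - |l - k|) \]

Using the symmetry of the variables, we get the following equation,

\[ H_{MF} = -\mu \sum_i y_i + \gamma \sum_i (4 y_i - 3 \rho_i) \alpha_i \tag{5.3} \]

where

\[ \alpha_i = \sideset{}{'}\sum_{j \neq i,\, l \neq k} \rho_j \rho_k \rho_{j+k-i}. \tag{5.4} \]

Alternatively,

\[ H_{MF} = \sum_i H_i, \tag{5.5} \]

where

\[ H_i = -\mu y_i + \gamma (4 y_i - 3 \rho_i) \alpha_i. \tag{5.6} \]

The density at a site may then be found using

\[ \rho_i = \langle y_i \rangle = \sum_{y_i} y_i e^{-\beta H_{MF}} \Big/ \Big[ \sum_{y_i} e^{-\beta H_{MF}} \Big] \tag{5.7} \]

\[ \rho_i = \frac{0 + e^{-\beta[-\mu + \gamma(4 - 3\rho_i)\alpha_i]}}{e^{-\beta[\gamma(-3\rho_i)\alpha_i]} + e^{-\beta[-\mu + \gamma(4 - 3\rho_i)\alpha_i]}} \tag{5.8} \]

yielding the Golomb lattice gas mean field equations,

\[ \rho_i = \frac{e^{\beta\mu - 4\beta\gamma\alpha_i}}{1 + e^{\beta\mu - 4\beta\gamma\alpha_i}}. \tag{5.9} \]

The partition function for a lattice site is given by

\[ Z_i = \sum_{y_i = 0, 1} e^{-\beta H_i} = e^{-\beta[\gamma(-3\rho_i)\alpha_i]} + e^{-\beta[-\mu + \gamma(4 - 3\rho_i)\alpha_i]}. \tag{5.10} \]

Note that this is also the denominator in Eq. 5.8. Factoring out the first term gives

\[ Z_i = e^{3\beta\gamma\rho_i\alpha_i} \left( 1 + e^{\beta\mu - 4\beta\gamma\alpha_i} \right). \tag{5.11} \]

The Golomb lattice gas mean field free energy is given by

\[ F = -kT \ln(Z) = -kT \ln\Big( \prod_i Z_i \Big) = -kT \sum_i \ln(Z_i) \tag{5.12} \]

\[ F = -kT \sum_i \ln\left[ e^{3\beta\gamma\rho_i\alpha_i} \left( 1 + e^{\beta\mu - 4\beta\gamma\alpha_i} \right) \right]. \tag{5.13} \]

Hence, we get

\[ F = -3\gamma \sum_i \rho_i \alpha_i - kT \sum_i \ln\left[ 1 + e^{\beta\mu - 4\beta\gamma\alpha_i} \right]. \tag{5.14} \]

The mean field equations Eq. 5.4 and Eq. 5.9 are a coupled set of non-linear equations that have many metastable solutions at large γ/µ and at low temperatures. In this regime, the optimal or equilibrium state of the system is the lowest free energy solution to these equations. Because the exact ground state in the OGR regime is known for n ≤ 26, a systematic study of the phase diagram and of the behavior of the MFT equations over the whole phase space is possible. Mean field theory may also be developed from Eq.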
5.2, where a similar analysis leads to

\[ \rho_i = \frac{e^{\beta\mu - 4\beta\gamma_2\alpha'_i}}{1 + e^{\beta\mu - 4\beta\gamma_2\alpha'_i}} \]

or equivalently,

\[ \rho_i = \frac{1}{1 + e^{-(\beta\mu - 4\beta\gamma_2\alpha'_i)}} \tag{5.15} \]

where

\[ \alpha'_i = \sum_d \Big[ 2 \sum_j \rho_j \rho_{j+d} - 1 \Big] (\rho_{i+d} + \rho_{i-d}) \tag{5.16} \]

with the constraints i + d ≤ L and i − d ≥ 1 on the quantities $\rho_{i+d}$ and $\rho_{i-d}$ respectively.

We use Eq. 5.15 for $\rho_i$ as it requires evaluating only one exponential. It is also more robust, as the exponent will not overflow at low temperatures (when β becomes very large). Looking closely at Eq. 5.16, one can see that there are two nested summations to be carried out, so the computation of $\alpha'_i$ (and hence $\rho_i$) has $O(L^2)$ complexity. In the innermost summation over j for a given value of d, $\sum_j \rho_j \rho_{j+d}$, only 3 terms change when doing the site updates. Hence we store the values of $\sum_j \rho_j \rho_{j+d}$ and use an intelligent update procedure instead of evaluating the entire sum every time. Trading memory for computational time, we have an implementation that has $O(L)$ complexity for the $\alpha'_i$ calculation and $O(L^2)$ complexity for one sweep of $\rho_i$.

We now try to solve Eq. 5.16 and Eq. 5.15 iteratively. If we do a sequential site update we find that it leads to a trivial oscillation between a fully occupied state with $n_i = 1$ and an unoccupied state with $n_i = 0$. Random site updates prevent this state from occurring. The Golomb lattice gas mean field free energy is given by

\[ F = -k_b T \sum_i \ln\left( 1 + e^{\beta\mu - 4\beta\gamma_2\alpha'_i} \right) - 3\gamma_2 \sum_i \rho_i \alpha'_i. \tag{5.17} \]

Using the random site update procedure we obtain the results presented in Fig. 5.1. The density exhibits a transition from a smooth dependence on γ/µ at low values γ/µ < (γ/µ)c to an irregular behavior at higher values γ/µ > (γ/µ)c. Nevertheless, in both cases the steady solution $\rho_i$ that we find at long times is highly dependent on i, so in all cases translational symmetry is broken. Moreover, for γ/µ > (γ/µ)c the spatial variation in $\rho_i$ is more extreme and consists of a large number of sites with $\rho_i = 0$, while for γ/µ < (γ/µ)c sites with $\rho_i = 0$ rarely occur.
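The random site update iteration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the figures: it reads Eq. 5.16 as a sum over d of [2 Σ_j ρ_j ρ_{j+d} − 1](ρ_{i+d} + ρ_{i−d}), it uses 0-based site indices, it recomputes the sums naively instead of using the stored-sum update, and it starts from a uniform density of 1/2. The function names, the number of sweeps and the random seed are hypothetical.

```python
import math
import random


def alpha_prime(rho, i):
    """Naive evaluation of the assumed reading of Eq. 5.16 for site i."""
    length = len(rho)
    total = 0.0
    for d in range(1, length):
        pair_sum = sum(rho[j] * rho[j + d] for j in range(length - d))
        neighbours = 0.0
        if i + d < length:
            neighbours += rho[i + d]
        if i - d >= 0:
            neighbours += rho[i - d]
        total += (2.0 * pair_sum - 1.0) * neighbours
    return total


def logistic(x):
    """1 / (1 + exp(-x)), written so that exp never overflows."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mean_field_densities(length, mu, gamma, temperature, sweeps=50, seed=0):
    """Iterate Eq. 5.15 with random site updates from a uniform start."""
    rng = random.Random(seed)
    beta = 1.0 / temperature
    rho = [0.5] * length
    for _ in range(sweeps * length):
        i = rng.randrange(length)
        rho[i] = logistic(beta * mu - 4.0 * beta * gamma * alpha_prime(rho, i))
    return rho


free_gas = mean_field_densities(length=10, mu=1.0, gamma=0.0, temperature=0.2)
constrained = mean_field_densities(length=10, mu=1.0, gamma=1.0, temperature=0.2)
```

With γ = 0 the sites decouple and every updated density equals the non-interacting value 1/(1 + e^{−βµ}), which gives a simple sanity check on the iteration before the constraint term is switched on.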
The trajectories of all sites for γ/µ > (γ/µ)c are asymmetrical, as illustrated by a typical trace shown in Fig. 5.2. We use the crossing points of the free energy curves to obtain the phase diagram shown in Fig. 5.5. Fig. 5.3 gives the scaling behavior of the density in the symmetric phase. In the next section we carry out a scaling calculation and find that it is close to what is observed in our simulations.

Figure 5.1: Density (top figure) and free energy per site (bottom figure) as a function of γ/µ for T = 0.2 and L = 35, 56, 107, 200, 493. For each chain length two calculations obtained by iterating the Golomb lattice gas mean field equations are presented. One trace, represented by the symbols, is obtained by starting at γ/µ = 0.01, choosing a uniform initial condition and then gradually increasing γ/µ. The solid lines are obtained by starting at γ/µ = 10, choosing an exact OGR state as the initial condition and then gradually decreasing γ/µ. The mean field solutions are clearly strongly metastable. Though the spinodal lines are strongly size dependent, the equilibrium transition is relatively size independent.

Figure 5.2: The symmetric (crosses) and symmetry broken (plusses) states of the mean field theory for L = 107, T = 0.2, γ/µ = 0.01 and γ/µ = 1. (Axes: site density vs. site index.)

5.3 Asymptotic analysis

5.3.1 Scaling

In the symmetric phase we can use a uniform approximation $\rho_i = \rho_s$, justified by Fig. 5.2, to give us an estimate for the free energy per site,

\[ f_s \approx -\mu \rho_s + a \gamma L^2 \rho_s^4 + T \left[ \rho_s \ln(\rho_s) + (1 - \rho_s) \ln(1 - \rho_s) \right]. \tag{5.18} \]

The optimal density is found from $\partial f_s / \partial \rho_s = 0$, which gives

\[ -\mu + 4 a \gamma L^2 \rho_s^3 + T \left[ \ln(\rho_s) - \ln(1 - \rho_s) \right] = 0, \tag{5.19} \]

so that, when $\rho_s$ is small, we can ignore the $T \ln(1 - \rho_s)$ term to obtain

\[ \rho_s \approx \left( \frac{\mu - T \ln(\rho_s)}{4 a \gamma L^2} \right)^{1/3}. \]

As $\rho_s$ is small but finite, at sufficiently low temperatures we can also ignore the $T \ln(\rho_s)$ term. The value of a can be found by considering the sums in Eq. 5.1, and our estimate is a = 1/3. This gives us

\[ \rho_s = \left( \frac{3\mu}{4 \gamma L^2} \right)^{1/3}. \tag{5.20} \]

The density in the symmetric phase is then predicted to scale as $L^{-2/3}$, which is verified by numerical solutions of the mean field equations Eq. 5.15 and Eq. 5.16 at small values of γ/µ (see Fig. 5.3). If we substitute $\rho_s \sim L^{-2/3}$ into Eq. 5.18, we see that at low temperature the free energy also has the same scaling behavior ($f_s \sim L^{-2/3}$). The rescaled free energy is plotted in the bottom part of Fig. 5.4, where we can see that the curves overlap and follow the same scaling behavior. At low γ/µ there is some deviation for the small rulers, which is due to finite size effects and matters less and less for large L.

Figure 5.3: Finite size scaling behavior of the density in the symmetric phase for T = 0.2 and for different values of γ/µ. The line with slope −2/3 is the prediction from scaling theory given by Eq. 5.20. (Panels: density vs. length of ruler L for γ = 0.001 to 0.005; $L^{2/3}$ × density vs. γ/µ for L = 35, 56, 107, 200, 493; $L^{2/3}$ × density vs. $(\gamma/\mu)^{-1/3}$ compared with theory.)

Figure 5.4: Rescaled free energy per site vs γ/µ for T = 2 × $10^{-6}$. At low γ/µ and large L, we can see that it follows a $L^{2/3}$ scaling. (Panels: log(|free energy per site|) and log($L^{2/3}$ × |free energy per site|) vs. γ/µ for L = 35, 56, 107, 200, 493.)

5.3.2 Phase boundary

5.3.2.1 Low temperature

To find the phase boundary, we equate the free energy in the symmetric phase to the free energy of the asymmetric phase ($f_s = f_a$). To get $f_s$ we substitute $\rho_s$ from Eq. 5.20 into the low temperature free energy expression given in Eq. 5.18. In the asymmetric phase the ground state is the OGR state. From Eq. 5.1 we can see that the constraint energy is zero and we have $f_a = f_{OGR} = -\mu n_{OGR} / L$. At low temperatures we can ignore the contribution due to entropy, and using $n_{OGR} \sim L^{1/2}$ we get an estimate of the zero temperature phase boundary,

\[ \left( \frac{\gamma}{\mu} \right)_c = \frac{81}{256} \frac{1}{L^{1/2}}. \tag{5.21} \]

For L = 493 we get (γ/µ)c = 0.014, which is close to the intercept plotted in our phase diagram on a log-log scale. If we make a plot of (γ/µ)c vs 1/L (Fig. 5.7) and calculate the y-intercept for T = 2 × $10^{-6}$ we get (γ/µ)c = 0.005, which agrees to within an order of magnitude with the value obtained above.

5.3.2.2 High temperature

We start with Eq. 5.18 for the free energy per site in the symmetric phase. Eq. 5.19 gives us the optimal density,

\[ -\mu + 4 a \gamma L^2 \rho_s^3 + T \ln\left( \frac{\rho_s}{1 - \rho_s} \right) = 0. \]

At large T, as $\rho_s$ is small, we can use $\frac{\rho_s}{1 - \rho_s} \approx \rho_s$, and with $\rho_s \sim \frac{1/2}{L^{2/3}}$ in the symmetric phase we get

\[ -\mu + 4 a \gamma L^2 \rho_s^3 - T \ln(2 L^{2/3}) = 0. \]

As µ = 1, the first term is small compared to the other terms and we can ignore it to get

\[ \rho_s^3 = \frac{T \ln(2 L^{2/3})}{(\gamma/\mu) \, 4 a L^2}. \tag{5.22} \]

Now expanding the expression for the free energy per site, Eq. 5.18, we get

\[ f_s \approx -\mu \rho_s + a \gamma L^2 \rho_s^4 + T \left[ \rho_s \ln(\rho_s) + \ln(1 - \rho_s) - \rho_s \ln(1 - \rho_s) \right] \]

\[ f_s \approx -\mu \rho_s + a \gamma L^2 \rho_s^4 + T \left[ \rho_s \ln\left( \frac{\rho_s}{1 - \rho_s} \right) + \ln(1 - \rho_s) \right]. \]

Using $\frac{\rho_s}{1 - \rho_s} \approx \rho_s$ and $\rho_s \sim \frac{1/2}{L^{2/3}}$ as we did earlier,

\[ f_s \approx -\mu \rho_s + a \gamma L^2 \rho_s \frac{1}{8 L^2} + T \left[ -\rho_s \ln(2 L^{2/3}) + \ln(1 - \rho_s) \right]. \tag{5.23} \]

When we start at high γ/µ and initialize the ruler with the OGR state, the number of possible states W for the ruler is given by $2^{n_{OGR}}$, where $n_{OGR}$ is the number of marks in the optimal Golomb ruler for length L.
At high temperature the individual density for occupied sites will be 0.5 and the entropy per site will be given by

\[ s = \frac{1}{N} \ln(W) = \frac{1}{N} \ln(2^{n_{OGR}}) = \frac{n_{OGR}}{N} \ln(2) = 2 \rho_a \ln(2). \tag{5.24} \]

The free energy per site in the asymmetric phase at large T, when starting with the OGR state, is then given by

\[ f_a = -\mu \rho_a - 2 T \rho_a \ln(2). \tag{5.25} \]

Equating $f_s$ and $f_a$ from Eq. 5.23 and Eq. 5.25 we get the phase boundary

\[ \gamma = \frac{8}{a \rho_s} \times \left[ \mu (\rho_s - \rho_a) + T \left( \rho_s \ln(2 L^{2/3}) - \ln(1 - \rho_s) - 2 \rho_a \ln(2) \right) \right]. \tag{5.26} \]

Figure 5.5: The equilibrium phase diagram determined from the crossing points of the free energy curves, such as those shown in the lower half of Fig. 5.1. (Axes: temperature vs. γ/µ for L = 35, 56, 107, 200, 493.)

As ρ and $\rho_{OGR}$ are small and µ = 1, we can ignore the first term in the above equation. Using a = 1/3, we get

\[ \left( \frac{\gamma}{\mu} \right)_c = 24 T \left[ \ln(2 L^{2/3}) - \frac{\ln(1 - \rho_s)}{\rho_s} - 2 \frac{\rho_a}{\rho_s} \ln(2) \right]. \tag{5.27} \]

This result is qualitatively correct, as we see from the mean field runs in Fig. 5.6 that (γ/µ)c increases with L and also increases linearly with T. From the figure we see that (γ/µ)c ≈ 5T at large temperatures. For L = 493 and T = 200 our analytic expression gives (γ/µ)c ≈ 44T.

Figure 5.6: In this log-log plot of the equilibrium phase diagram we see that it has a finite non-zero value for the intercept. (Axes: temperature vs. γ/µ for L = 35, 56, 107, 200, 493.)

Figure 5.7: Plot showing the dependence of the critical γ/µ on T. At low T, the y-intercept is γ/µ = 0.005, which is the phase boundary for rulers in the large L limit. (Axes: γ/µ vs. 1/L, one curve per temperature.)
Figure 5.8: Density and free energy calculations done exactly and using mean field theory for L = 26 at T = 0.2. (Panels: density vs. γ/µ and free energy vs. γ/µ; curves Exact-F, Exact-B, MF-F, MF-B, Diff-F, Diff-B.)

5.4 Exact calculations

We did exact and mean field calculations for L = 26 at T = 0.2 to get the density and free energy. Fig. 5.8 shows the density and free energy as well as the difference between the mean field and the exact calculations. The end points of the ruler are always set to 1, so for L = 26 there are $2^{24}$ possible states. For the exact calculation we iterate over all these possible states and evaluate the density and free energy. We see that there is a lag between the lines obtained by the exact and mean field calculations, which is due to hysteresis.

5.5 Search for OGR

Homotopy methods [76] have been used in statistical physics to obtain the global minimum. We tried to use a similar approach to find the optimal Golomb ruler. We start with a value of γ/µ very close to the phase boundary such that the ruler is in the symmetric phase, and then we gradually increase γ/µ so that there is a phase transition into the asymmetric phase. As the optimal Golomb ruler state is the global minimum, we expected the asymmetric phase to reach this global minimum, but because of metastability this approach was successful only for small rulers.

5.6 Symmetric theory

The lattice gas formulation of OGR starts with defining variables $n_i = 0, 1$ on a one dimensional chain where i = 0, 1, 2, ..., L. Setting $n_i = 1$ for values of i that are in an OGR marker set, with the other values of i having $n_i = 0$, maps an OGR solution to a lattice gas configuration. To construct the OGR lattice gas Hamiltonian, we need to ensure that repeated distances between the lattice gas particles are unfavorable.
We define l to label a distance, so that 1 ≤ l ≤ L, and we define the degeneracy of the distance l to be $D_l$. A valid Golomb ruler must have degeneracy $D_l = 0, 1$. If a higher degeneracy were to occur, the distances would not be unique and the marker set would not be a maximally irregular set. A little thought reveals that the degeneracies $D_l$ are related to the lattice gas variables $n_i$ through the relation

\[ D_l = \sum_{i=0}^{L-1} n_i n_{i+l}. \tag{5.28} \]

When the degeneracies $D_l$ are summed over all l, we must obtain the total number of distances, so that for any configuration $\{n_i\}$ with m occupied sites we have the constraint

\[ \sum_{l=1}^{L} D_l = \frac{1}{2} m(m - 1). \tag{5.29} \]

Now we define the lattice gas Hamiltonian in terms of the variables $n_i$ and $D_l$, by noting that for an OGR system of length L we want to maximize the lattice gas density, which is given by $\sum_i n_i$, while minimizing $\sum_l D_l (D_l - 1)$, where the latter sum is zero when the degeneracies $D_l$ are zero or one as required for a Golomb ruler state. The OGR lattice gas Hamiltonian is then

\[ H_{OGR} = -\mu \sum_{i=0}^{L} n_i + \gamma \sum_{l=1}^{L} D_l (D_l - 1). \tag{5.30} \]

This Hamiltonian may be written in terms of the variables $n_i$ by using Eq. 5.28, which leads to a frustrated lattice Hamiltonian with long-range four-particle interactions. Provided the parameters µ and γ are positive, the first term in $H_{OGR}$ maximizes the density of the lattice gas, while the second term minimizes the number of times an interparticle distance is repeated.

To find the scaling behavior of the optimal solutions to $H_{OGR}$, L(m), we consider the partition function of OGR,

\[ Z = \sum_{k=0}^{2^L - 1} e^{-\beta H_k} \tag{5.31} \]

\[ Z = \sum_{k=0}^{2^L - 1} e^{-\beta \left[ -\mu \sum_{i=0}^{L} n_i + \gamma \sum_{l=1}^{L} D_l (D_l - 1) \right]} \tag{5.32} \]

\[ Z = \sum_{k=0}^{2^L - 1} e^{-\beta \left[ -\mu m + \gamma \sum_{l=1}^{L} D_l^2 - \gamma \sum_{l=1}^{L} D_l \right]} \tag{5.33} \]

\[ Z = \sum_{k=0}^{2^L - 1} e^{\beta\mu \sum_{i=0}^{L} n_i - \beta\gamma \left( \sum_{l=1}^{L} D_l^2 - m(m-1)/2 \right)} \tag{5.34} \]

where we used the identity Eq. 5.29. To reduce the $D_l^2$ term to linear form we introduce Gaussian integrals,

\[ e^{D^2} = A \int e^{(-x^2 + 2xD)} \, dx \tag{5.35} \]

so that

\[ Z = A_G \sum_{k=0}^{2^L - 1} e^{\beta\mu m + \beta\gamma m(m-1)/2} \int \ldots \int \prod_l dx_l \, e^{-\sum_l x_l^2 + 2i\sqrt{\beta\gamma} \sum_l x_l D_l} \tag{5.36} \]

where $A_G$ normalizes the Gaussian integrals. This remains intractable, but becomes tractable when we make the symmetric assumption $x_l = x$,

\[ \int \ldots \int \prod_l dx_l \, e^{-\sum_l x_l^2 + 2i\sqrt{\beta\gamma} \sum_l x_l D_l} = \left[ \int dx \, e^{-x^2 + 2i\sqrt{\beta\gamma}(x/L) \sum_l D_l} \right]^L. \tag{5.37} \]

Then we use the identity Eq. 5.29 again to get

\[ \int \ldots \int \prod_l dx_l \, e^{-\sum_l x_l^2 + 2i\sqrt{\beta\gamma} \sum_l x_l D_l} = \left[ \int dx \, e^{-x^2 + 2i\sqrt{\beta\gamma}(x/L)(m/2)(m-1)} \right]^L. \tag{5.38} \]

We convert the integral to an exponential using the Gaussian integral introduced earlier to get

\[ \int \ldots \int \prod_l dx_l \, e^{-\sum_l x_l^2 + 2i\sqrt{\beta\gamma} \sum_l x_l D_l} = e^{-(\beta\gamma/L)[(m/2)(m-1)]^2}. \tag{5.39} \]

Thus,

\[ Z = B_G \sum_{k=0}^{2^L - 1} e^{\beta\mu m + \frac{\beta\gamma}{2} m(m-1) - \frac{\beta\gamma}{4L} [m(m-1)]^2} \tag{5.40} \]

\[ Z = B_G \times \sum_m \binom{L}{m} e^{\beta\mu m + \frac{\beta\gamma}{2} m(m-1) - \frac{\beta\gamma}{4L} [m(m-1)]^2} \tag{5.41} \]

where $B_G$ is a constant. To find the scaling behavior of optimal Golomb rulers we consider the strong interaction limit γ → ∞ where the OGR constraints dominate, and solving yields the scaling law

\[ L(m) = m^2 - m. \tag{5.42} \]

This is consistent with the known lower bound $L(m) > m^2 - 2m^{3/2} - m$ and with the Erdős conjecture $L(m) < m^2$ [47], as well as with large scale simulations (see Fig. 5.9).

Figure 5.9: Comparison of numerical results (++++) for the length of optimal Golomb rulers with the best lower bound (solid line), and with the statistical physics scaling law (dotted line) that provides a useful upper bound on all best OGRs. The main figure is for exact OGR states, while the inset is for approximate OGR states of large size. (Axes: density vs. length of ruler L, with reference slopes −2/3 and −1/2; inset: density vs. length of OGR.)

5.7 Summary

In this work, a new connection between statistical physics and the combinatorial optimization problem of the optimal Golomb ruler is made. The statistical physics of the Golomb ruler problem is studied using the mean field approach and the phase diagram is obtained. It is
seen that even at very low temperature the system shows a first-order phase transition at a finite, non-zero value of the constraint parameter γ/µ. Analytic and exact calculations were done for the scaling of the density and free energy of the ruler, and they were compared with those from the mean field approach. A new scaling law for the length of the OGR is derived, which is consistent with the Erdős conjecture.

Chapter 6

Conclusion

In this work, efficient methods for reconstructing complex Euclidean networks in two and three dimensions, given only their unassigned Euclidean distance lists, were presented. The unassigned problem is complicated by the combinatorial explosion in the number of ways that atoms may be assigned to the endpoints of each distance in the distance list, which leads to the theoretical and algorithmic problems discussed in the preceding chapters. It was found that distance lists, which contain no vector information, carry enough information to reconstruct the coordinates, and that this reconstruction is unique. Using the Tribond algorithm, random point sets in two dimensions of size about one thousand were successfully reconstructed. In three dimensions the core finding step has a high computational cost, and the algorithm was able to do core finding for random point sets of size about ten in a small amount of time. If the core is assumed to be given, it was able to do the buildup for random point sets of size about 100. The algorithm was also used to solve for the structures of various organic molecules, and the results were compared with those obtained using the Liga algorithm. While the Liga algorithm is successful in reconstructing structures that have high symmetry (and a highly degenerate distance list), the Tribond algorithm is successful with random point sets that have a non-degenerate distance list. A hybrid algorithm should solve structures that fall between those having high symmetry and those having low symmetry.
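As a reminder of what the input to the reconstruction looks like, the following minimal C++ sketch builds an unassigned distance list from a set of two-dimensional points. It assumes a hypothetical `Point2D` struct and a hypothetical function `unassignedDistanceList`; neither is part of the Tribond code, whose own `Structure::updateCurrDL` in the appendix plays a similar role for the `Point` class.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2D point type used only for this illustration.
struct Point2D
{
    double x, y;
};

// Hypothetical helper: return the sorted list of all N(N-1)/2 pairwise
// Euclidean distances. The list keeps no record of which pair of points
// produced each distance, which is what makes it "unassigned".
std::vector<double> unassignedDistanceList( const std::vector<Point2D>& pts )
{
    std::vector<double> dl;
    for ( std::size_t i = 0; i < pts.size(); ++i )
    {
        for ( std::size_t j = i + 1; j < pts.size(); ++j )
        {
            const double dx = pts[i].x - pts[j].x;
            const double dy = pts[i].y - pts[j].y;
            dl.push_back( std::sqrt( dx*dx + dy*dy ) );
        }
    }
    std::sort( dl.begin(), dl.end() );
    return dl;
}
```

For the four corners of a unit square this gives six distances: the side length 1 four times and the diagonal √2 twice, an example of the degenerate distance lists produced by symmetric structures.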
Practical applications of the distance list method must overcome errors in the data, including missing distances, shifted distances, and errors in the multiplicity of peaks in degenerate cases. These issues are discussed in recent studies that attempt to reconstruct nanostructures from experimental PDF data [28, 29, 30]. The focus of this work was the broader theoretical and algorithmic issues related to the underlying inverse problem of finding structure from precise Euclidean distances.

A statistical physics approach to the combinatorial optimization problem of the optimal Golomb ruler is also presented. The phase diagram is studied and scaling calculations for the density, free energy and phase boundary were done. An analytic calculation using a continuum field theory gives a scaling for the length of the OGR that is consistent with the Erdős conjecture and also with the proposed optimal rulers of large length.

APPENDIX

// Tribond.h : The header file for all the Tribond 2D & 3D functions.
//
#ifndef Tribond_h
#define Tribond_h

// NOTE: the names of the eight included headers were lost in extraction;
// the list below is inferred from the identifiers used in this file.
#include <iostream>
#include <iomanip>
#include <fstream>
#include <vector>
#include <string>
#include <cmath>
#include <algorithm>
#include <cassert>

using std::vector;
using std::string;
using std::cout;
using std::endl;
using std::setprecision;
using std::ios;
using std::setw;

//// Point Class ////
class Point
{
public:
    long double x, y, z; // Coordinates
    long double cost;
    Point()
    {
        x = y = z = 0.0;
        cost = -1;
    }
    friend bool compareCost( const Point& pt1, const Point& pt2 );

};

// Overloading "less than" operator so that points can be sorted.
bool operator<( const Point& pt1, const Point& pt2 );

// Overloading output operator for Point class.
std::ostream& operator<<( std::ostream& os, const Point& pt );

// NOTE: template arguments (vector<...>) were lost in extraction; the
// element types shown in this file are inferred from usage.
int print2structures( vector<Point>& stru1, vector<Point>& stru2 ); // element type assumed
long double getAngle( Point pt1, Point pt2 );
Point getAxis( Point pt1, Point pt2 );

// Calculate the Euclidean distance between 2 points.
inline long double getDistance( const Point& pt1, const Point& pt2 )
{
    // return sqrt( ( pt1.x - pt2.x )*( pt1.x - pt2.x ) +
    //              ( pt1.y - pt2.y )*( pt1.y - pt2.y ) +
    //              ( pt1.z - pt2.z )*( pt1.z - pt2.z ) );
    return sqrt( pow( pt1.x - pt2.x, 2.0L ) +
                 pow( pt1.y - pt2.y, 2.0L ) +
                 pow( pt1.z - pt2.z, 2.0L ) );
}

//// Structure Class ////
class Structure
{
public:
    // Relative tolerance to check if 2 distances are close enough.
    static long double toler;
    static const int maxPoolSize;
    vector<Point> atoms;
    vector<Point> pool;
    vector<long double> targetDL, currDL, freeDL;
    vector<bool> usedDist;
    long double cost;
    int dim, targetSize, currSize;

    Structure( int DIM, int N, string dlistFile )
    {
        targetSize = N;
        dim = DIM;
        atoms.resize( N );

        int sizeDL = N*( N - 1 )/2;
        usedDist.resize( sizeDL, false );
        cost = 0;

        getDLfromFile( dlistFile );
    }

    Structure( int& N, string xyzFile )
    {
        getStruFromFile( xyzFile );

        if ( atoms.size() != N )
        {
            N = atoms.size();
            cout << "Updating structure size to " << atoms.size() << endl;
        }
        targetSize = N;
        dim = 3;
        // atoms.resize( N );

        int sizeDL = N*( N - 1 )/2;
        usedDist.resize( sizeDL, false );
        cost = 0;
    }

    Structure( int DIM, int N, vector<long double> inputDL )
    {
        currSize = 0;
        targetSize = N;
        dim = DIM;
        int sizeDL = N*( N - 1 )/2;
        usedDist.resize( sizeDL, false );

        targetDL = inputDL;
    }

    int updateCurrDL()
    {
        currDL.clear();
        for ( int i = 0; i < atoms.size(); ++i )
        {
            for ( int j = i + 1; j < atoms.size(); ++j )
            {
                currDL.push_back( getDistance( atoms[i], atoms[j] ) );
            }
        }
        sort( currDL.begin(), currDL.end() );
        // cout << "Size of current DL: " << currDL.size() << endl;
        return 0;
    }

    int print()
    {
        int Precision = 8;
        int Width = 12;
        int Width2 = 6;

        cout.precision( Precision );
        cout.setf( ios::fixed, ios::floatfield );

        for ( int i = 0; i < atoms.size(); ++i )
        {
            cout << setw( Width ) << atoms[i].x << '\t'
                 << setw( Width ) << atoms[i].y << '\t'
                 << setw( Width2 ) << atoms[i].z << '\t' << atoms[i].cost
                 << endl;
        }
        // for ( int i = 0; i < atoms.size(); ++i )
        // {
        //     cout << i << '\t' << setprecision( 10 ) << atoms[i].x << '\t'
        //          << setprecision( 10 ) << atoms[i].y << '\t'
        //          << setprecision( 10 ) << atoms[i].z << endl;
        // }
        // cout << "****" << endl;
        return 0;
    }

    bool findCore();
    bool findCore3D();
    bool findCore3D( int baseIdx );
    bool doBuildup();
    bool doBuildup3D( vector<int> idxArr );
    // bool doBuildup3D( int newBase );
    bool doBuildup3Dv2( int basePt1, int basePt2, int basePt3 );
    bool doBuildup2( int newBase );
    bool doBuildup3( int newBase );
    bool reconstruct();
    bool reconstruct2();
    bool reconstruct3( int baseIdx );
    long double feasibleTetra( int baseIdx );

    long double distListError();
    long double distListError( vector<long double> dlist );
    long double distListError( vector<long double> dlist1, vector<long double> dlist2 );
    long double checkMoreBridges( int numChecks, Point testPt );

    bool home();
    bool home3D( int distIdx );
    bool reflect( string axis );
    bool rotate( long double angle );
    bool rotate( long double angle, long double axisX,
                 long double axisY, long double axisZ );

    bool translate( long double distX, long double distY, long double distZ );

    bool testCore( vector<int> idxArr, int idxM, int idxN );
    int getDLfromFile( string fileName );
    int getStruFromFile( string fileName );
    int printDLtoFile( string fileName="" );

    int reduceDLprecision( int precision );
    vector<Point> getCore( int coreSize ); // element type assumed
    int updateUsedDists();
    int updateFreeDL();
    long double getPtCost( Point pt );
    long double getPtsCost( Point pt1, Point pt2 );

    bool updatePool( Point pt );
    int insertPoint( Point pt );
    bool growStru();
    int printPool();
    int getPools();
    int getPools2();
    bool findCoreMPI( int windowStart );

};

inline long double compareStru( Structure testStru, Structure targetStru )
{
    testStru.updateCurrDL();
    long double overlapError = 0;
    sort( testStru.atoms.begin(), testStru.atoms.end() );

    for ( int i = 0; i < testStru.targetSize; ++i )
    {
        overlapError += pow( getDistance( testStru.atoms[i],
                                           targetStru.atoms[i] ), 2.0 );
    }

    cout << "Overlap Error: " << overlapError << endl;
    cout << "Distance Error: " << testStru.distListError() << endl;
    return 0;

}

#endif

// Tribond.cpp : The implementation file for all the 2D & 3D functions.
//
#include "Tribond.h"
// NOTE: the names of the six standard headers included here were lost in
// extraction; the list below is inferred from the identifiers used.
#include <iostream>
#include <fstream>
#include <cmath>
#include <cassert>
#include <algorithm>
#include <iterator>
using namespace std;

long double Structure::toler = 1e-12L;
const int Structure::maxPoolSize = 20;

bool operator<( const Point& pt1, const Point& pt2 )
{
    // Overloading the "less than" operator for the Point class. Useful when
    // sorting points in the structure so that their ordering is unique.
    Point zero;
    return getDistance( pt1, zero ) < getDistance( pt2, zero );

}

std::ostream& operator<<( std::ostream& os, const Point& pt )
{
    // Overloading output operator for Point class.
    int outputPrecision = 8;
    int colWidth = 12;

    os.precision( outputPrecision );
    os.setf( ios::fixed, ios::floatfield );

    os << setw( colWidth ) << pt.x
       << setw( colWidth ) << pt.y
       << setw( colWidth ) << pt.z;
    return os;

}

// NOTE: template arguments (vector<...>) were lost in extraction; the
// element types shown are inferred from usage.
void placeApex( Point& apexPt, int idxA, int idxB, int idxC,
                vector<long double> dlist )
{
    // Place the top point of the triangle by solving the loci equations.
    // For skinny triangles, the y-coordinate may turn out to be negative.
    // In those cases, I am setting it to zero.

    apexPt.x = dlist[idxA]/2 - ( ( dlist[idxC] - dlist[idxB] )*
               ( dlist[idxC] + dlist[idxB] )/( 2*dlist[idxA] ) );

    apexPt.y = sqrt( ( dlist[idxC] + apexPt.x - dlist[idxA] )*
                     ( dlist[idxC] - apexPt.x + dlist[idxA] ) );
    apexPt.z = 0;

    if ( apexPt.y != apexPt.y )
    {
        apexPt.y = 0;
    }
    return;

}

void placeTop( Point basePt1, Point basePt2, Point basePt3, Point& apexPt,
               vector<int> idxArr, vector<long double> dlist )
{
    /* Place Apex in space.
     *
     * Relating bonds to points:
     * a <=> p1-p2    f <=> p3-p4
     * b <=> p1-p3    e <=> p2-p4
     * c <=> p2-p3    d <=> p1-p4
     *
     * p1 is placed at the origin.
     * p2 is placed at ( dlist[a], 0, 0 ).
     * p3 is defined to have positive y-value.
     * p4 is defined to have positive z-value.
     */
    long double d12, d13, d14, d23, d24, d34;
    d12 = dlist[ idxArr[0] ];    d34 = dlist[ idxArr[5] ];
    d13 = dlist[ idxArr[1] ];    d24 = dlist[ idxArr[4] ];
    d23 = dlist[ idxArr[2] ];    d14 = dlist[ idxArr[3] ];

    // placement of apex by solving simple loci equations
    // faster than evaluating trig expressions
    apexPt.x = ( d14*d14 - d24*d24 + d12*d12 )/( 2.0L*d12 );
    apexPt.y = ( ( d14*d14 - d34*d34 + basePt3.x*basePt3.x
                   + basePt3.y*basePt3.y )/( 2.0L*basePt3.y ) )
               - ( basePt3.x/basePt3.y )*apexPt.x;
    if ( ( d14*d14 - apexPt.x*apexPt.x - apexPt.y*apexPt.y ) < 0.0L )
    {
        // cout << "img dist!" << endl;
        // apexPt.z = 0;
        // cout << "pars: " << idxArr[0] << ',' << idxArr[1] << ','
        //      << idxArr[2] << ',' << idxArr[3] << ','
        //      << idxArr[4] << ',' << idxArr[5] << endl;
        // cout << "pts: " << endl;
        // cout << basePt1 << endl;
        // cout << basePt2 << endl;
        // cout << basePt3 << endl;
        // cout << apexPt << endl;
        // getchar();
        apexPt.z = 0; // sqrt( d14*d14 - apexPt.x*apexPt.x - apexPt.y*apexPt.y );
    }
    else
    {
        apexPt.z = sqrt( d14*d14 - apexPt.x*apexPt.x - apexPt.y*apexPt.y );
    }
}

// NOTE: template arguments (vector<...>) were lost in extraction; the
// element types shown are inferred from usage.
int closestDist( const long double& value, const vector<long double>& dlist )
{
    // Function to find the index of the distance in the given list
    // which is closest to the given value.

    vector<long double>::const_iterator const it =
        lower_bound( dlist.begin(), dlist.end(), value );

    int bestIdx = distance( dlist.begin(), it );

    if ( bestIdx == dlist.size() )
    {
        bestIdx -= 1;
    }
    else if ( bestIdx > 0 and
              ( fabs( dlist[bestIdx - 1] - value ) < fabs( dlist[bestIdx] - value ) ) )
    {
        bestIdx -= 1;
    }

    // cout << "value, bestIdx, bestDist: " << setprecision( 11 ) << value
    //      << ", " << bestIdx << ", "
    //      << setprecision( 11 ) << dlist[bestIdx] << '\t';
    // cout << "( " << dlist[bestIdx - 1] << ", " << dlist[bestIdx + 1] << " )" << endl;
    return bestIdx;

}

bool Structure::findCore()
{
    // Find a core made of 4 points by iterating over all triangle
    // combinations. Also, the function to do the buildup is called after
    // we find a core because it is more convenient this way.

    int idxA = 0, idxB, idxC, idxD, idxE, idxF; // indices for the bonds
    int idxM, idxN; // indices for the orientation of the triangle
    vector<int> idxArr( 6, -1 );
    int inc = 6; // width of the bond window
    int winStart = 0, winStop = inc; // indices for the window
    vector<long double> dlist = targetDL;

    long double bridgeDist = 0.0;
    int bridgeIdx = 0;
    int bridgeCount = 0; // count the number of bridge bond checks
    long double fracError = 1e6;

    Point basePt1, basePt2, apexPt1, apexPt2;
    basePt1.x = 0; basePt1.y = 0;
    basePt2.x = dlist[idxA]; basePt2.y = 0;
    cout << "basePt1: " << basePt1.x << " " << basePt1.y << endl;
    cout << "basePt2: " << basePt2.x << " " << basePt2.y << endl;
    cout << "Bond window: ";

    while ( true )
    {
        cerr << " -> " << winStop;
        for ( idxB = idxA + 1; idxB < winStop; ++idxB )
        {
            for ( idxC = idxB + 1; idxC < winStop; ++idxC )
            {
                if ( dlist[idxA] + dlist[idxB] + toler < dlist[idxC] )
                {
                    break;
                }
                placeApex( apexPt1, idxA, idxB, idxC, dlist );

                for ( idxD = idxA + 1; idxD < winStop; ++idxD )
                {
                    if ( idxD == idxB or idxD == idxC )
                    {
                        continue;
                    }
                    for ( idxE = idxD + 1; idxE < winStop; ++idxE )
                    {
                        if ( idxB < winStart and idxC < winStart and
                             idxD < winStart and idxE < winStart )
                        {
                            continue;
                        }
                        if ( ( idxD < idxB and idxE < idxC ) or
                             ( idxE > idxC and idxD < idxB ) )
                        {
                            continue;
                        }
                        if ( idxE == idxB or idxE == idxC )
                        {
                            continue;
                        }

                        if ( dlist[idxA] + dlist[idxD] + toler < dlist[idxE] )
                        {
                            break;
                        }
                        placeApex( apexPt2, idxA, idxD, idxE, dlist );

                        for ( idxM = 0; idxM < 2; ++idxM )
                        {
                            // cout << "idxM: " << idxM << endl;
                            if ( idxM == 1 )
                            {
                                apexPt2.x = dlist[idxA] - apexPt2.x;
                            }
                            for ( idxN = 0; idxN < 2; ++idxN )
                            {
                                // cout << "idxN: " << idxN << endl;
                                if ( idxN == 1 )
                                {
                                    apexPt2.y = - apexPt2.y;
                                }
                                bridgeDist = getDistance( apexPt1, apexPt2 );
                                bridgeCount += 1; // count number of bridge checks

                                bridgeIdx = closestDist( bridgeDist, dlist );
                                if ( bridgeIdx == idxA or bridgeIdx == idxB or
                                     bridgeIdx == idxC or bridgeIdx == idxD or
                                     bridgeIdx == idxE )
                                {
                                    // Make sure that the bridge bond is not the same as any of
                                    // the distances in use.
                                    continue;
                                }
                                fracError = fabs( dlist[bridgeIdx] - bridgeDist )/bridgeDist;
                                // cout << "fracError: " << fracError << endl;

                                if ( fabs( apexPt2.y ) < 0.5 )
                                {
                                    // Skinny triangles have been found to have a large error,
                                    // hence reducing their error "by hand" so that we don't
                                    // miss out on them.
                                    fracError /= 1000;
                                }
                                if ( fracError < toler )
                                {
                                    idxArr[0] = idxA; idxArr[1] = idxB;
                                    idxArr[2] = idxC; idxArr[3] = idxD;
                                    idxArr[4] = idxE; idxArr[5] = bridgeIdx;

                                    cout << endl;
                                    if ( testCore( idxArr, idxM, idxN ) )
                                    {
                                        atoms.push_back( basePt1 );
                                        atoms.push_back( basePt2 );
                                        atoms.push_back( apexPt1 );
                                        atoms.push_back( apexPt2 );

                                        for ( int i = 0; i < atoms.size(); ++i )
                                        {
                                            cout << "Point: " << i + 1 << '\t' << atoms[i] << endl;
                                        }
                                        usedDist[idxA] = usedDist[idxB] = true;
                                        usedDist[idxC] = usedDist[idxD] = true;
                                        usedDist[idxE] = usedDist[bridgeIdx] = true;

                                        updateCurrDL();
                                        // return true;

                                        // Attempt buildup to get the remaining points.
doBuildup ( ) ; 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 } } i f ( atoms . s i z e ( ) >= min ( 8 , t a r g e t S i z e ) ) { // I f b u i l d u p was a b l e t o add 4 more ֒→ ←֓ p o i n t s th e n w i t h // h i g h p r o b a b i l i t y , we have t h e r i g h t ֒→ ←֓ s t r u c t u r e . return true ; } else { // I f b u i l d u p c o u l d n o t even 4 p o i n t s ֒→ ←֓ th e n w i t h a v e r y // h i g h p r o b a b i l i t y , we have t h e wrong ֒→ ←֓ s t r u c t u r e . S t a r t // o v e r and f i n d t h e n e x t c o r e . co ut << ”No buildup , bad c o r e . ” ” F i ndi ng t he next c o r e . . . ” << ֒→ ←֓ e n d l ; atoms . c l e a r ( ) ; updateCurrDL ( ) ; co ut << ”Bond window : ” ; } } // n l o o p } // m l o o p } // idxE l o o p } // idxD l o o p 100 } // idxC l o o p } // idxB l o o p 309 310 311 // When idxB i f ( idxB == { winStart = winStop += 312 313 314 315 316 h i t s t h e window e d g e i n c re m e n t i t . winStop ) winStop ; inc ; 317 i f ( w i n S t a r t == d l i s t . s i z e ( ) ) { co ut << e n d l ; break ; } 318 319 320 321 322 323 324 325 326 327 } 328 329 } // w h i l e ( t r u e ) l o o p 330 331 return f a l s e ; 332 333 334 i f ( winStop > d l i s t . s i z e ( ) ) { winStop = d l i s t . s i z e ( ) ; } } 335 336 337 338 339 340 341 342 343 bool S t r u c t u r e : : doBuildup ( ) { // S t a r t i n g w i t h a c o r e o f s i z e 4 , f i n d t h e re m a i n i n g p o i n t s . // P a r t i a l up d a te o f t h e new d i s t a n c e s c r e a t e d used w h i l e ֒→ ←֓ a d d i n g a p o i n t . bool s u c c e s s F l a g = f a l s e ; int idxA = 0 , idxB , idxC , idxD , idxE , idxF ; // i n d i c e s f o r ֒→ ←֓ t h e bonds int idxM , idxN ; 344 345 346 347 348 349 long double b r i d g e D i s t = 0 . 
0 ; int b r i d g e I d x = 0 ; int bridgeCount = 0 ; // c o un t t h e number o f t r i a n g l e s v e c t o r d l i s t = targetDL ; long double f r a c E r r o r = 1 e6 , f r a c E r r o r 2 = 1 e6 ; 350 351 a s s e r t ( ( ” In buildup , c o r e p r e s e n t ? ” , atoms . s i z e ( ) > 0 ) ) ; 101 352 353 Po i nt basePt1 = atoms [ 0 ] , basePt2 = atoms [ 1 ] , apexPt = atoms [ 2 ] , t e s t P t ; 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 fo r ( idxD = 1 ; idxD < d l i s t . s i z e ( ) ; ++idxD ) { i f ( u s e d D i s t [ idxD ] ) { continue ; } fo r ( idxE = idxD + 1 ; idxE < d l i s t . s i z e ( ) ; ++idxE ) { i f ( u s e d D i s t [ idxE ] ) { continue ; } 373 i f ( d l i s t [ idxA ] + d l i s t [ idxD ] + t o l e r < d l i s t [ idxE ] ) { break ; } 374 placeApex ( t e s t P t , idxA , idxD , idxE , d l i s t ) ; 369 370 371 372 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 fo r ( idxM = 0 ; idxM < 2 ; ++idxM ) { i f ( idxM == 1 ) { t e s t P t . x = d l i s t [ idxA ] − t e s t P t . x ; } fo r ( idxN = 0 ; idxN < 2 ; ++idxN ) { i f ( idxN == 1 ) { testPt . y = − testPt . y ; } bridgeCount += 1 ; // c o un t number o f b r i d g e c h e c k s b r i d g e D i s t = g e t D i s t a n c e ( apexPt , t e s t P t ) ; i f ( bridgeDist < d l i s t [ 0 ] ) { // Make s u r e t h e t e s t p o i n t i s n o t t o o c l o s e t o any ֒→ ←֓ p o i n t . continue ; 102 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 } bridgeIdx = c l o s e s t D i s t ( bridgeDist , d l i s t ) ; i f ( b r i d g e I d x == idxD or b r i d g e I d x == idxE or usedDist [ bridgeIdx ] ) { // Make s u r e t h e b r i d g e bond i s n o t t h e same as a ֒→ ←֓ used d i s t a n c e . continue ; } f r a c E r r o r = f a b s ( d l i s t [ b r i d g e I d x ] − b r i d g e D i s t ) / ֒→ ←֓ b r i d g e D i s t ; i f ( fabs ( testPt . 
y ) < 0.5 ) { // Found s k i n n y t r i a n g l e s t o have a h i g h e r r o r . ֒→ ←֓ Hence , r e d u c i n g // t h e e r r o r ” by hand ” so t h a t we don ’ t miss o ut on ֒→ ←֓ them . f r a c E r r o r /= 1 0 0 0 ; } if ( fracError > toler ) { continue ; } else { f r a c E r r o r 2 = checkMo r eBr i dg es ( 1 0 , t e s t P t ) ; i f ( fracError2 > sqrt ( toler ) ) { continue ; } else { u s e d D i s t [ idxD ] = true ; u s e d D i s t [ idxE ] = true ; u s e d D i s t [ b r i d g e I d x ] = true ; 431 atoms . push back ( t e s t P t ) ; co ut << ” Po i nt : ” << atoms . s i z e ( ) << ’ \ t ’ << t e s t P t << e n d l ; 432 433 434 435 436 } } 103 437 } // n l o o p } // m l o o p } // idxE l o o p } // idxD l o o p 438 439 440 441 442 c u r r S i z e = atoms . s i z e ( ) ; updateCurrDL ( ) ; 443 444 445 450 i f ( atoms . s i z e ( ) == t a r g e t S i z e ) { s u c c e s s F l a g = true ; } 451 return s u c c e s s F l a g ; 446 447 448 449 452 453 } 454 455 456 457 458 long double S t r u c t u r e : : d i s t L i s t E r r o r ( ) { // C a l c u l a t e t h e e r r o r b a s e d on t h e c l o s e n e s s o f t h e c u r r e n t ֒→ ←֓ and t h e t a r g e t // d i s t a n c e l i s t s . 459 a s s e r t ( currDL . s i z e ( ) == targetDL . s i z e ( ) ) ; long double e r r o r = 0 ; 460 461 462 467 fo r ( int i = 0 ; i < currDL . s i z e ( ) ; ++i ) { e r r o r += pow ( currDL [ i ] − targetDL [ i ] , 2 . 0 ) ; } 468 return s q r t ( e r r o r / currDL . s i z e ( ) ) ; 463 464 465 466 469 470 } 471 472 473 474 475 476 477 478 long double S t r u c t u r e : : d i s t L i s t E r r o r ( v e c t o r d l i s t 1 , v e c t o r ֒→ ←֓ d l i s t 2 ) { // C a l c u l a t e t h e e r r o r b a s e d on t h e c l o s e n e s s o f 2 d i s t a n c e ֒→ ←֓ l i s t s . i f ( d l i s t 1 . s i z e ( ) != d l i s t 2 . s i z e ( ) ) { 104 return 1 e6 ; } s o r t ( d l i s t 1 . b e g i n ( ) , d l i s t 1 . end ( ) ) ; s o r t ( d l i s t 2 . 
b e g i n ( ) , d l i s t 2 . end ( ) ) ; long double e r r o r = 0 ; 479 480 481 482 483 484 489 fo r ( int i = 0 ; i < d l i s t 1 . s i z e ( ) ; ++i ) { e r r o r += pow ( d l i s t 1 [ i ] − d l i s t 2 [ i ] , 2 . 0 ) ; } 490 return s q r t ( e r r o r / d l i s t 1 . s i z e ( ) ) ; 485 486 487 488 491 492 } 493 494 495 496 497 498 499 500 long double S t r u c t u r e : : d i s t L i s t E r r o r ( v e c t o r d l i s t ) { // F un c ti o n t o c a l c u l a t e t h e e r r o r b a s e d on how c l o s e t h e 2 ֒→ ←֓ d l i s t s a re . long double dEr r o r = 0 ; long double t o t a l E r r o r = 0 ; s i z e t bIdx = 0 ; 501 502 503 504 505 v e c t o r countB ( targetDL . s i z e ( ) , 0 ) ; int r e p e a t = 0 ; upda t eUsedD i st s ( ) ; v e c t o r freeDL ; 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 fo r ( int i = 0 ; i < targetDL . s i z e ( ) ; ++i ) { i f ( u s e d D i s t [ i ] == f a l s e ) { freeDL . push back ( targetDL [ i ] ) ; } } fo r ( int i = 0 ; i < d l i s t . s i z e ( ) ; i++ ) { // b I d x = c l o s e s t D i s t ( d l i s t [ i ] , ta rg e tD L ) ; // dError = d l i s t [ i ] − ta rg e tD L [ b I d x ] ; bIdx = c l o s e s t D i s t ( d l i s t [ i ] , freeDL ) ; dEr r o r = d l i s t [ i ] − freeDL [ bIdx ] ; t o t a l E r r o r += dEr r o r ∗ dEr r o r ; 105 523 524 525 526 527 528 529 530 531 532 } 533 534 dEr r o r += 0 . 0 1 ∗ r e p e a t ; // c o u t << ” p t c o s t : ” << s e t p r e c i s i o n ( 8 ) << s q r t ( ֒→ ←֓ t o t a l E r r o r / d l i s t . s i z e ( ) ) << e n d l ; // g e t c h a r ( ) ; return s q r t ( t o t a l E r r o r / d l i s t . s i z e ( ) ) ; 535 536 537 538 539 540 countB [ bIdx ] += 1 ; i f ( countB [ bIdx ] >= 2 ) { ++r e p e a t ; } // FIXME: Commenting o ut c a s e when t h e r e i s a l a r g e e r r o r . i f ( dEr r o r > 0 . 5 ) { // r e t u r n 1 . 
0 ; } } 541 542 543 544 545 long double S t r u c t u r e : : checkMo r eBr i dg es ( int numChecks , Po i nt ֒→ ←֓ t e s t P t ) { // Check more b r i d g e bonds f o r t h e t e s t P t t o make s u r e t h a t ֒→ ←֓ i t i s // i n d e e d p a r t o f t h e a c t u a l s t r u c t u r e . 546 547 548 549 550 long double f r a c E r r o r = 0 ; long double b r i d g e D i s t = 0 ; int b r i d g e I d x = 0 ; v e c t o r d l i s t = targetDL ; 551 552 553 554 555 556 557 558 559 560 561 562 563 fo r ( int i = 0 ; i < numChecks ; ++i ) { i f ( i >= atoms . s i z e ( ) ) break ; b r i d g e D i s t = g e t D i s t a n c e ( t e s t P t , atoms [ i ] ) ; i f ( bridgeDist < d l i s t [ 0 ] ) { f r a c E r r o r += 0 . 1 ; continue ; } bridgeIdx = c l o s e s t D i s t ( bridgeDist , d l i s t ) ; f r a c E r r o r += f a b s ( d l i s t [ b r i d g e I d x ] − b r i d g e D i s t ) / ֒→ ←֓ b r i d g e D i s t ; 106 } 564 565 return f r a c E r r o r /numChecks ; 566 567 568 } 569 570 571 572 573 bool S t r u c t u r e : : r e c o n s t r u c t ( ) { // R e c o n s t r u c t t h e s t r u c t u r e by f i r s t f i n d i n g t h e c o r e and ֒→ ←֓ th e n d o i n g // b u i l d u p ( i f needed ) . 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 // Need t o f i n d t h e c o r e f i r s t . i f ( atoms . s i z e ( ) == 0 ) { i f ( dim == 2 ) { bool f i n d C o r e F l a g = f i n d C o r e ( ) ; co ut << ” Core found ? ” << b o o l a l p h a << f i n d C o r e F l a g << e n d l ; } e l s e i f ( dim == 3 ) { // c o u t << ” findCore3D ( 2 ) ” << e n d l ; findCore3D ( ) ; } } // Core i s a l r e a d y t h e r e , j u s t do t h e b u i l d u p . i f ( dim == 3 and atoms . 
size() >= 4 )
    {
        // Note: atoms[4] is read below, so this branch needs at least five
        // core points even though the guard only checks for four.
        vector<int> idxArr( 6, 0 );
        idxArr[0] = closestDist( getDistance( atoms[0], atoms[1] ), targetDL );
        idxArr[1] = closestDist( getDistance( atoms[0], atoms[2] ), targetDL );
        idxArr[2] = closestDist( getDistance( atoms[1], atoms[2] ), targetDL );
        idxArr[3] = closestDist( getDistance( atoms[0], atoms[3] ), targetDL );
        idxArr[4] = closestDist( getDistance( atoms[1], atoms[3] ), targetDL );
        idxArr[5] = closestDist( getDistance( atoms[2], atoms[3] ), targetDL );
        cout << "idxArr: " << idxArr[0] << ',' << idxArr[1] << ','
             << idxArr[2] << ',' << idxArr[3] << ','
             << idxArr[4] << ',' << idxArr[5] << endl;
        print();

        int idxG, idxH, idxI, idxJ;

        idxG = closestDist( getDistance( atoms[0], atoms[4] ), targetDL );
        idxH = closestDist( getDistance( atoms[1], atoms[4] ), targetDL );
        idxI = closestDist( getDistance( atoms[2], atoms[4] ), targetDL );
        idxJ = closestDist( getDistance( atoms[3], atoms[4] ), targetDL );
        cout << "idx: " << idxG << ',' << idxH << ','
             << idxI << ',' << idxJ << endl;
        usedDist[idxG] = usedDist[idxH] = usedDist[idxI] =
            usedDist[idxJ] = true;

        doBuildup3D( idxArr );

        // attempt buildup again if short of a few points
        int attempts = 0;
        long double oldToler = toler;
        while ( attempts < 5 and atoms.
size() < targetSize )
        {
            toler *= 10;
            doBuildup3D( idxArr );
            ++attempts;
            cout << "attempt: " << attempts << endl;
        }
        toler = oldToler;
    }
    if ( atoms.size() == targetSize )
    {
        return true;
    }

    bool successFlag = doBuildup();
    if ( successFlag )
    {
        cout << "Finished reconstruction" << endl;
    }
    return successFlag;
}

bool Structure::reflect( string axis )
{
    // Reflect the structure about the X or the Y axis, used by the overlap
    // function.

    if ( axis == "X" )
    {
        for ( int i = 0; i < atoms.size(); ++i )
        {
            atoms[i].y = -atoms[i].y;
        }
    }
    else if ( axis == "Y" )
    {
        for ( int i = 0; i < atoms.size(); ++i )
        {
            atoms[i].x = -atoms[i].x;
        }
    }
    else if ( axis == "Z" )
    {
        for ( int i = 0; i < atoms.size(); ++i )
        {
            atoms[i].z = -atoms[i].z;
        }
    }

    return true;
}

bool Structure::home()
{
    // Orient the structure in a unique manner so that it becomes easy to check
    // if two or more structures are identical to one another or not.

    long double minDist = 1e6;
    int idx1 = -1, idx2 = -1;
    long double dist = 1e6;

    // Find the smallest bond in the structure
    for ( int i = 0; i < atoms.size(); ++i )
    {
        for ( int j = i + 1; j < atoms.
size(); ++j )
        {
            dist = getDistance( atoms[i], atoms[j] );
            if ( dist < minDist )
            {
                minDist = dist;
                idx1 = i;
                idx2 = j;
            }
        }
    }
    // cout << "idx1, idx2: " << idx1 << '\t' << idx2 << endl;

    // Locate the apex point of the base triangle
    long double minDist1 = 1e6, minDist2 = 1e6;
    long double dist1, dist2;
    int minIdx1, minIdx2;

    for ( int i = 0; i < atoms.size(); ++i )
    {
        if ( i == idx1 or i == idx2 ) continue;

        dist1 = getDistance( atoms[i], atoms[idx1] );
        if ( dist1 < minDist1 )
        {
            minDist1 = dist1;
            minIdx1 = i;
        }
        dist2 = getDistance( atoms[i], atoms[idx2] );
        if ( dist2 < minDist2 )
        {
            minDist2 = dist2;
            minIdx2 = i;
        }
    }

    int idx3 = minIdx1;
    if ( minDist1 > minDist2 )
    {
        idx3 = minIdx2;
        swap( idx1, idx2 );
    }

    // Correctly place the base triangle
    translate( -atoms[idx1].x, -atoms[idx1].y, -atoms[idx1].z );
    // find angle and rotate
    long double angle = atan2( atoms[idx2].y, atoms[idx2].x );
    rotate( angle );

    if ( atoms[idx3].y < 0 )
    {
        reflect( "X" );
    }

    // Sort the points so that there is some unique order
    sort( atoms.begin(), atoms.
end() );

    return true;
}

bool Structure::testCore( vector<int> idxArr, int idxM, int idxN )
{
    // Check to make sure that the core is correct by using all possible bonds
    // as the base and checking if the resulting bridge bonds are part of the
    // target distance list.

    Point basePt1, basePt2, apexPt1, apexPt2;
    basePt1.x = 0;
    basePt1.y = 0;

    vector<long double> dlist = targetDL;
    long double bridgeDist, fracError;
    int idxA, idxB, idxC, idxD, idxE, givenBridgeIdx;

    int bondA = idxArr[0], bondB = idxArr[1], bondC = idxArr[2],
        bondBP = idxArr[3], bondCP = idxArr[4], bondBridge = idxArr[5];
    if ( idxM == 1 )
    {
        swap( bondB, bondC );
    }
    vector<vector<int> > idxCombin( 6, idxArr );
    idxCombin[0][0] = bondA; idxCombin[0][1] = bondB; idxCombin[0][2] = bondC;
    idxCombin[0][3] = bondBP; idxCombin[0][4] = bondCP; idxCombin[0][5] = bondBridge;

    idxCombin[1][0] = bondB; idxCombin[1][1] = bondC; idxCombin[1][2] = bondA;
    idxCombin[1][3] = bondBridge; idxCombin[1][4] = bondBP; idxCombin[1][5] = bondCP;

    idxCombin[2][0] = bondC; idxCombin[2][1] = bondB; idxCombin[2][2] = bondA;
    idxCombin[2][3] = bondBridge; idxCombin[2][4] = bondCP; idxCombin[2][5] = bondBP;

    idxCombin[3][0] = bondBP; idxCombin[3][1] = bondB; idxCombin[3][2] = bondBridge;
    idxCombin[3][3] = bondA; idxCombin[3][4] = bondCP; idxCombin[3][5] = bondC;

    idxCombin[4][0] = bondCP; idxCombin[4][1] = bondBP; idxCombin[4][2] = bondA;
    idxCombin[4][3] = bondBridge; idxCombin[4][4] = bondC; idxCombin[4][5] = bondB;

    idxCombin[5][0] = bondBridge; idxCombin[5][1] = bondBP; idxCombin[5][2] = bondB;
    idxCombin[5][3] = bondCP; idxCombin[5][4] = bondC; idxCombin[5][5] = bondA;

    bool success = true, localSuccess = false;
    for ( int i = 0; i < idxCombin.size(); ++i )
    {
        idxA = idxCombin[i][0]; idxB = idxCombin[i][1]; idxC = idxCombin[i][2];
        idxD = idxCombin[i][3]; idxE = idxCombin[i][4];
        givenBridgeIdx = idxCombin[i][5];
        basePt2.x = dlist[idxA];
        basePt2.y = 0;

        placeApex( apexPt1, idxA, idxB, idxC, dlist );
        placeApex( apexPt2, idxA, idxD, idxE, dlist );

        for ( int idxY = 0; idxY < 2; ++idxY )
        {
            if ( idxY == 1 )
            {
                apexPt2.y = -apexPt2.y;
            }

            bridgeDist = getDistance( apexPt1, apexPt2 );
            fracError = fabs( dlist[givenBridgeIdx] - bridgeDist ) /
                dlist[givenBridgeIdx];

            if ( fabs( apexPt2.y ) < 0.5 )
            {
                fracError /= 1000;
            }

            if ( fracError < toler )
            {
                // cout << "fracError: " << fracError << endl;
                localSuccess = true;
                break;
            }
        }

        if ( localSuccess == false )
        {
            success = false;
        }
    }

    cout << "Test core. Good? " << boolalpha << success << endl;
    return success;
}

int Structure::getDLfromFile( string dlistFile )
{
    // Read file to get the list of distances for the reconstruction.
    // If filename is "rand2", then generate a random point set.

    long double inputDist;
    assert( ( "Target distance list must be empty", targetDL.size() == 0 ) );
    ifstream inputFile;

    if ( dlistFile == "rand2" or dlistFile == "rand3" )
    {
        int N = targetSize;
        for ( int i = 0; i < N; ++i )
        {
            atoms[i].x = N*float( rand() ) / RAND_MAX;
            atoms[i].y = N*float( rand() ) / RAND_MAX;

            if ( dlistFile == "rand3" )
            {
                atoms[i].z = N*float( rand() ) / RAND_MAX;
            }
            atoms[i].cost = 0;
        }

        currSize = N;
        updateCurrDL();
        targetDL = currDL;
    }
    else
    {
        inputFile.open( dlistFile.data() );
        assert( ( "Trouble opening distance list file", inputFile ) );

        while ( inputFile >> inputDist )
        {
            targetDL.push_back( inputDist );
        }
        inputFile.close();

        sort( targetDL.begin(), targetDL.end() );
    }
    return 0;
}

int Structure::printDLtoFile( string fileName )
{
    // Save the list of distances to the given file.

    ofstream outFile( fileName.data() );
    outFile.precision( 20 );

    cout << "currDL size: " << currDL.size() << endl;
    if ( outFile.is_open() )
    {
        for ( int i = 0; i < currDL.size(); ++i )
        {
            outFile << currDL[i] << endl;
        }
        outFile.
close();
    }
    else
    {
        cout << "unable to open file" << endl;
    }

    return 0;
}

int Structure::getStruFromFile( string fileName )
{
    // Read the structure from the given file so that it can be used as a core.
    assert( ( "List of atoms should be empty", atoms.size() == 0 ) );
    ifstream inputFile;
    Point inputPt;
    inputPt.cost = 0;

    inputFile.open( fileName.data() );

    while ( inputFile >> inputPt.x >> inputPt.y >> inputPt.z )
    {
        atoms.push_back( inputPt );
        cout << "Pt: " << inputPt << endl;
    }
    inputFile.close();

    currSize = atoms.size();
    updateCurrDL();
    targetDL = currDL;
    sort( atoms.begin(), atoms.end() );

    return 0;
}

bool Structure::translate( long double distX, long double distY,
                           long double distZ )
{
    // Shift all the points in the structure by the given amounts in X, Y and Z
    // directions.

    for ( int i = 0; i < atoms.size(); ++i )
    {
        atoms[i].x += distX;
        atoms[i].y += distY;
        atoms[i].z += distZ;
    }

    return true;
}

long double getAngle( Point pt1, Point pt2 )
{
    // Assume that pt2 is along the x-axis
    long double norm = sqrt( pt1.x*pt1.x + pt1.y*pt1.y + pt1.z*pt1.z );
    return acos( pt1.x/norm );
    // return atan2( sqrt( pt1.y*pt1.y + pt1.z*pt1.z ), pt1.x );
}

Point getAxis( Point pt1, Point pt2 )
{
    // Assume pt2 is along the x-axis
    // long double norm = sqrt( pt1.x*pt1.x + pt1.y*pt1.y + pt1.z*pt1.z );
    long double norm = sqrt( pt1.y*pt1.y + pt1.z*pt1.z );
    // cout << "pt1 in getAxis: " << pt1 << endl;
    // cout << "norm in getAxis: " << norm << endl;
    Point axis;

    axis.x = 0;
    axis.y = pt1.z/norm;
    axis.z = -pt1.y/norm;

    return axis;
}

bool Structure::rotate( long double angle )
{
    // Turn the structure about the Z axis by the given angle (in radians).
    long double cosT = cos( angle );
    long double sinT = sin( angle );
    long double oldX, oldY;

    for ( int i = 0; i < atoms.size(); ++i )
    {
        oldX = atoms[i].x;
        oldY = atoms[i].y;
        atoms[i].x = oldX*cosT + oldY*sinT;
        atoms[i].y = -oldX*sinT + oldY*cosT;
    }

    return true;
}

bool Structure::rotate( long double angle, long double axisX,
                        long double axisY, long double axisZ )
{
    // Make sure the axis components are normalized

    // If angle is really small, don't bother with anything
    if ( fabs( angle ) < 1e-6 )
    {
        // cout << "small angle: " << angle << endl;
        return true;
    }
    // http://inside.mines.edu/~gmurray/ArbitraryAxisRotation/
    long double cosT = cos( angle );
    long double sinT = sin( angle );
    long double oldX, oldY, oldZ;

    for ( int i = 0; i < atoms.
size(); ++i )
    {
        oldX = atoms[i].x;
        oldY = atoms[i].y;
        oldZ = atoms[i].z;
        atoms[i].x = axisX*( axisX*oldX + axisY*oldY + axisZ*oldZ )*( 1 - cosT )
            + oldX*cosT + ( -axisZ*oldY + axisY*oldZ )*sinT;
        atoms[i].y = axisY*( axisX*oldX + axisY*oldY + axisZ*oldZ )*( 1 - cosT )
            + oldY*cosT + ( axisZ*oldX - axisX*oldZ )*sinT;
        atoms[i].z = axisZ*( axisX*oldX + axisY*oldY + axisZ*oldZ )*( 1 - cosT )
            + oldZ*cosT + ( -axisY*oldX + axisX*oldY )*sinT;
    }

    return true;
}

int Structure::reduceDLprecision( int newPrecision )
{
    vector<Point> oldAtoms = atoms;

    for ( int i = 0; i < atoms.size(); ++i )
    {
        // FIXME declaring variable inside loop as a quick fix to make it work
        stringstream lessPreciseX;
        lessPreciseX.precision( newPrecision );
        lessPreciseX << atoms[i].x;
        lessPreciseX >> atoms[i].x;

        stringstream lessPreciseY;
        lessPreciseY.precision( newPrecision );
        lessPreciseY << atoms[i].y;
        lessPreciseY >> atoms[i].y;

        stringstream lessPreciseZ;
        lessPreciseZ.precision( newPrecision );
        lessPreciseZ << atoms[i].z;
        lessPreciseZ >> atoms[i].z;
    }
    updateCurrDL();
    targetDL = currDL;
    atoms = oldAtoms;

    for ( int i = 0; i < targetDL.size(); ++i )
    {
        stringstream lessPrecise;
        lessPrecise.
precision( newPrecision );

        lessPrecise << targetDL[i];
        lessPrecise >> targetDL[i];
    }

    return 0;
}

vector<Point> Structure::getCore( int coreSize )
{
    cout << "coreSize, atoms.size(): " << coreSize << ',' << atoms.size() << endl;
    assert( ( "Core size <= Size of structure", coreSize <= atoms.size() ) );
    vector<Point> corePoints;
    corePoints.insert( corePoints.end(), atoms.begin(), atoms.begin() + coreSize );
    return corePoints;
}

int Structure::updateUsedDists()
{
    int numUsed = 0;
    int usedNum = 0;
    // return 0; // FIX ME debug mode
    int idx;
    long double err1, err2;
    updateCurrDL();

    for ( int i = 0; i < currDL.size(); ++i )
    {
        idx = closestDist( currDL[i], targetDL );
        // cout << "idx, currD, tarD, flag: " << idx << ',' << currDL[i] << ','
        //      << targetDL[idx] << ',' << usedDist[idx] << endl;

        // If idx is already excluded; then exclude the neighbour
        // FIX ME
        if ( usedDist[idx] )
        {
            // ++numUsed;
        }

        // if ( usedDist[idx] and idx-1 >= 0 and idx + 1 < usedDist.size()-1 )
        if ( false )
        {
            err1 = fabs( targetDL[idx + 1] - currDL[i] );
            err2 = fabs( targetDL[idx - 1] - currDL[i] );

            if ( err1 < err2 )
            {
                if ( err1 < 0.1 )
                {
                    usedDist[idx + 1] = true;
                    ++numUsed;
                }
            }
            else
            {
                if ( err2 < 0.1 )
                {
                    usedDist[idx - 1] = true;
                    ++numUsed;
                }
            }
        }
        else
        {
            usedDist[idx] = true;
            ++numUsed;
        }
        // cout << "excluding " << dlist[idx] << " for " << coreDlist[i] << endl;
    }
    // cout << "Number of used distances: " << numUsed << endl;
    return 0;
}

int Structure::updateFreeDL()
{
    int idx;
    long double err1, err2;
    vector<bool> exclude( targetDL.size(), false );

    // updateCurrDL();
    for ( int i = 0; i < currDL.size(); ++i )
    {
        idx = closestDist( currDL[i], targetDL );

        // If idx is already excluded; then exclude the neighbour
        // FIX ME
        if ( exclude[idx] and idx-1 >= 0 and idx + 1 < exclude.size()-1 )
        // if ( false )
        {
            err1 = fabs( targetDL[idx + 1] - currDL[i] );
            err2 = fabs( targetDL[idx - 1] - currDL[i] );

            if ( err1 < err2 )
            {
                if ( err1 < 0.1 )
                {
                    exclude[idx + 1] = true;
                }
            }
            else
            {
                if ( err2 < 0.1 )
                {
                    exclude[idx - 1] = true;
                }
            }
        }
        else
        {
            exclude[idx] = true;
        }
        // cout << "excluding " << targetDL[idx] << " for " << currDL[i] << endl;
    }

    freeDL.
clear();
    vector<long double> usedDlist;
    for ( int i = 0; i < exclude.size(); ++i )
    {
        if ( exclude[i] == false )
        {
            freeDL.push_back( targetDL[i] );
        }
        else
        {
            usedDlist.push_back( targetDL[i] );
        }
    }

    cout << "targetDLErr(usedDl, coreDl): "
         << distListError( usedDlist, currDL ) << endl;
    // assert( freeDlist.size() + currDL.size() >= dlist.size() );
    cout << "Number of free distances: " << freeDL.size() << endl;
    cout << "Number of used distances: " << usedDlist.size() << endl;
    return 0;
}

long double Structure::getPtCost( Point pt )
{
    // Function to calculate the cost of an individual point with respect to
    // the structure and to the target dlist

    vector<long double> ptDlist( atoms.size(), 0 );
    // cout << "currSize: " << currSize << endl;

    for ( int i = 0; i < ptDlist.size(); ++i )
    {
        ptDlist[i] = getDistance( pt, atoms[i] );
        // cout << i << '\t' << ptDlist[i] << endl;
    }

    // cout << endl;
    // getchar();
    // cout << "pt cost for " << pt << endl;
    return distListError( ptDlist );
}

bool compareCost( const Point& pt1, const Point& pt2 )
{
    return ( pt1.cost < pt2.
cost );
}

bool Structure::updatePool( Point pt )
{
    if ( pool.size() == maxPoolSize and pt.cost > pool[pool.size()-1].cost )
    {
        return false;
    }
    // Make sure that the same point is not in the pool
    bool repeat = false;
    int orgIdx = -1;
    for ( int i = 0; i < pool.size(); ++i )
    {
        if ( getDistance( pt, pool[i] ) < 0.2 )
        {
            if ( pt.cost < pool[i].cost )
            {
                pool.erase( pool.begin() + i );
            }
            else
            {
                repeat = true;
                orgIdx = i;
            }
        }
    }

    if ( repeat == false )
    {
        vector<Point>::const_iterator const it =
            lower_bound( pool.begin(), pool.end(), pt, compareCost );
        int idx = it - pool.begin();
        pool.insert( pool.begin() + idx, pt );
        cout << "adding pt: " << pt << ',' << pt.cost << endl;
    }

    while ( pool.size() > maxPoolSize )
    {
        pool.pop_back();
    }

    return true;
}

bool Structure::doBuildup2( int newBase )
{
    Point smPt;
    smPt.x = 2.42671232;
    smPt.y = 0.24021148;
    long double triToler = 0.1;
    Point zero;
    zero.x = zero.y = zero.z = 0;
    if ( newBase == 0 )
    {
        newBase = 1;
    }
    // FIXME quick hack as pt1 of base is already at origin
    long double distA = getDistance( zero, atoms[newBase] );
    long double distB = 0, distC = 0;
    long double error = 0.0L;
    Point testPt;
    int numTestPts = 0;
    cout << "distA: " << distA << endl;

    for ( int bIdx = 0; bIdx < targetDL.size(); bIdx++ )
    {
        if ( usedDist[bIdx] )
        {
            continue;
        }
        distB = targetDL[bIdx];
        for ( int cIdx = bIdx + 1; cIdx < targetDL.size(); cIdx++ )
        {
            if ( usedDist[cIdx] )
            {
                continue;
            }
            distC = targetDL[cIdx];

            if ( distA > distC )
            {
                if ( ( distB + distC + triToler ) < distA )
                {
                    continue;
                }
            }
            else
            {
                if ( ( distA + distB + triToler ) < distC )
                {
                    break;
                }
            }
            // Place testPt
            // cout << "placing testPt" << endl;
            testPt.x = distA/2 - ( distC - distB )*( distC + distB ) / ( 2*distA );
            testPt.y = sqrt( ( distC + testPt.x - distA )*
                             ( distC - testPt.x + distA ) );
            if ( getDistance( testPt, smPt ) < 0.2 )
            {
                cout << "Missing point!" << endl;
                // getchar();
            }
            // In the case of nearly collinear points, to calculate y, we may be
            // taking the square root of a negative number. In such cases, damp
            // it to zero.
            if ( testPt.y != testPt.y )
            {
                testPt.y = 0;
            }
            for ( int m = 0; m < 2; m++ )
            {
                if ( m == 1 )
                {
                    testPt.x = distA - testPt.x;
                }
                for ( int n = 0; n < 2; n++ )
                {
                    if ( n == 1 )
                    {
                        testPt.y = -testPt.y;
                    }

                    testPt.cost = error = getPtCost( testPt );
                    if ( ( testPt.
cost - 0.00478091 ) < 0.001 )
                    {
                        cout << "testPt, error: " << testPt << ", " << error << endl;
                        cout << "bIdx, cIdx, m, n: " << bIdx << ',' << cIdx
                             << ',' << m << ',' << n << endl;
                        // getchar();
                    }
                    if ( error < 0.2 )
                    {
                        updatePool( testPt );
                        ++numTestPts;
                    }
                } // n loop
            } // m loop
        } // c loop
    } // b loop

    cout << "Number of points tested: " << numTestPts << endl;
    return 0;
}

bool Structure::doBuildup3( int newBase )
{
    Point smPt;
    smPt.x = 2.42671232;
    smPt.y = 0.24021148;
    long double triToler = 0.1;
    Point zero;
    zero.x = zero.y = zero.z = 0;
    if ( newBase == 0 )
    {
        newBase = 1;
    }
    // FIXME quick hack as pt1 of base is already at origin
    long double distA = getDistance( zero, atoms[newBase] );
    long double distB = 0, distC = 0;
    long double error = 0.0L;
    Point testPt;
    int numTestPts = 0;
    cout << "distA: " << distA << endl;

    // updateFreeDL();
    // for ( int bIdx = 0; bIdx < targetDL.size(); bIdx++ )
    for ( int bIdx = 0; bIdx < freeDL.size(); bIdx++ )
    {
        // if ( usedDist[bIdx] )
        // {
        //     continue;
        // }
        distB = freeDL[bIdx];
        for ( int cIdx = bIdx + 1; cIdx < freeDL.
size(); cIdx++ )
        {
            // if ( usedDist[cIdx] )
            // {
            //     continue;
            // }
            distC = freeDL[cIdx];

            if ( distA > distC )
            {
                if ( ( distB + distC + triToler ) < distA )
                {
                    continue;
                }
            }
            else
            {
                if ( ( distA + distB + triToler ) < distC )
                {
                    break;
                }
            }
            // cout << "bIdx, cIdx: " << bIdx << '\t' << cIdx << endl;
            // Place testPt
            // cout << "placing testPt" << endl;
            testPt.x = distA/2 - ( distC - distB )*( distC + distB ) / ( 2*distA );
            testPt.y = sqrt( ( distC + testPt.x - distA )*
                             ( distC - testPt.x + distA ) );
            if ( getDistance( testPt, smPt ) < 0.2 )
            {
                cout << "Missing point!" << endl;
                // getchar();
            }
            // In the case of nearly collinear points, to calculate y, we may be
            // taking the square root of a negative number. In such cases, damp
            // it to zero.
            if ( testPt.y != testPt.y )
            {
                testPt.y = 0;
            }
            for ( int m = 0; m < 2; m++ )
            {
                if ( m == 1 )
                {
                    testPt.x = distA - testPt.x;
                }
                for ( int n = 0; n < 2; n++ )
                {
                    if ( n == 1 )
                    {
                        testPt.y = -testPt.y;
                    }
                    testPt.cost = error = getPtCost( testPt );
                    // if ( ( testPt.cost - 0.00478091 ) < 0.001 )
                    // {
                    //     cout << "testPt, error: " << testPt << ", " << error << endl;
                    //     cout << "bIdx, cIdx, m, n: " << bIdx << ',' << cIdx << ','
                    //          << m << ',' << n << endl;
                    //     // getchar();
                    // }
                    // cout << "testPt, cost: " << testPt << ',' << error << endl;

                    if ( error < 0.2 )
                    {
                        updatePool( testPt );
                        ++numTestPts;
                    }
                } // n loop
            } // m loop
        } // c loop
    } // b loop

    cout << "Number of points tested: " << numTestPts << endl;
    return 0;
}

long double Structure::getPtsCost( Point pt1, Point pt2 )
{
    // Replacing brute force implementation with the one based on
    // reusing the cost of p1, p2 calculated earlier.

    vector<long double> ptsDlist;
    for ( int i = 0; i < atoms.size(); ++i )
    {
        ptsDlist.push_back( getDistance( pt1, atoms[i] ) );
        ptsDlist.push_back( getDistance( pt2, atoms[i] ) );
    }
    ptsDlist.push_back( getDistance( pt1, pt2 ) );
    sort( ptsDlist.begin(), ptsDlist.end() );

    // cout << "pt1, pt2" << endl;
    // cout << pt1 << endl;
    // cout << pt2 << endl;
    // for ( int i = 0; i < ptsDlist.size(); ++i )
    // {
    //     cout << "Dlist: " << i << '\t' << ptsDlist[i] << endl;
    // }
    if ( ptsDlist[0] < 0.33*targetDL[0] )
    {
        // cout << "overlap" << endl;
        // getchar();
        return 1.0;
    }

    return distListError( ptsDlist );
}

int Structure::insertPoint( Point pt )
{
    // FIXME faster to use binary search
    Point zero;

    for ( int idx = 0; idx < atoms.size(); ++idx )
    {
        if ( getDistance( pt, zero ) < getDistance( atoms[idx], zero ) )
        {
            atoms.insert( atoms.begin() + idx, pt );
            return 0;
        }
    }

    atoms.push_back( pt );
    return 0;
}

bool Structure::growStru()
{
    printPool();

    bool growthFlag = false;
    if ( pool.size() < 2 )
    {
        cout << "Small pool" << endl;
        return false;
    }
    long double cost2pts = 0;
    long double minCost = 1e6;
    size_t idx1best = 0, idx2best = 0;

    for ( int idx1 = 0; idx1 < pool.size(); ++idx1 )
    {
        for ( int idx2 = idx1 + 1; idx2 < pool.
s i z e ( ) ; ++i d x 2 ) { cost2pts = getPtsCost ( pool [ idx1 ] , pool [ idx2 ] ) ; 130 // c o u t << ” i d x 1 , i d x 2 , c o s t : ” << i d x 1 << ” , ” << i d x 2 ֒→ ←֓ << ” , ” // << s e t p r e c i s i o n ( 8 ) << c o s t 2 p t s << e n d l ; i f ( c o s t 2 p t s < minCost ) { idx1best = idx1 ; idx2best = idx2 ; minCost = c o s t 2 p t s ; } } // i d x 2 l o o p } // i d x 1 l o o p 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 i f ( minCost < 1 ) { // c o u t << ” b e s t i d x : ” << i d x 1 b e s t << ’ , ’ << i d x 2 b e s t << ’ ; ’ // << ” p o t e n t i a l c o s t : ” << s e t p r e c i s i o n ( 8 ) << ֒→ ←֓ minCost << e n d l ; 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 } 1611 1612 // g e t c h a r ( ) ; return growthFlag ; 1613 1614 1615 1616 insertPo int ( pool [ idx1best ] ) ; insertPo int ( pool [ idx2best ] ) ; growthFlag = true ; updateCurrDL ( ) ; co ut << ” adding p t s with i d x : ” << i d x 1 b e s t << ” , ” << ֒→ ←֓ i d x 2 b e s t << e n d l ; co ut << ” ∗∗∗ ” << p o o l [ i d x 1 b e s t ] << e n d l ; co ut << ” ∗∗∗ ” << p o o l [ i d x 2 b e s t ] << e n d l ; } 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 int S t r u c t u r e : : p r i n t P o o l ( ) { co ut << ” ∗∗∗∗ ” << ” P o i n t s i n t he p o o l and t h e i r c o s t ” << ” ֒→ ←֓ ∗∗∗∗ ” << e n d l ; fo r ( int i = 0 ; i < p o o l . s i z e ( ) ; ++i ) { // c o u t << i << ’\ t ’ << p o o l [ i ] << ’\ t ’ << s e t p r e c i s i o n ( 8 ֒→ ←֓ ) << p o o l [ i ] . 
c o s t // << e n d l ; long double ptCost = g et Pt Co st ( p o o l [ i ] ) ; co ut << i << ’ \ t ’ << p o o l [ i ] << ’ \ t ’ << s e t p r e c i s i o n ( 8 ) ֒→ 131 ←֓ << ptCost << e n d l ; 1628 } co ut << e n d l ; // g e t c h a r ( ) ; 1629 1630 1631 1632 return 0 ; 1633 1634 1635 } 1636 1637 1638 1639 1640 int S t r u c t u r e : : g e t P o o l s ( ) { // f un c t h a t g e t s and combines t h e p o o l s from t h e 3 bonds i n ֒→ ←֓ t h e b a s e t r i a n g l e // and th e n i s re a d y f o r d o i n g b u i l d u p 1641 1642 1643 // FIXME q u i c k hack t o f i x some f a i l e d r e c o n s t r u c t i o n s // s ra n d ( time ( NULL ) ) ; 1644 1645 1646 1647 1648 1649 int numPools = 1 ; int basePt1 = 0 , basePt2 = 1 ; int oldBasePt1 = basePt1 , oldBasePt2 = basePt2 ; v e c t o r oldAtoms = atoms , newPool ; long double newOx , newOy , newOz , a n g l e ; 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 fo r ( int i = 1 ; i <= numPools ; ++i ) { do { basePt1 = rand ( ) % atoms . s i z e ( ) ; basePt2 = rand ( ) % atoms . s i z e ( ) ; i f ( basePt1 > basePt2 ) { swap ( basePt1 , basePt2 ) ; } } while ( basePt2 == basePt1 or basePt1 == oldBasePt1 or basePt2 == oldBasePt2 or basePt1 == oldBasePt2 or basePt2 == oldBasePt1 ) ; 1666 1667 1668 1669 1670 oldBasePt1 = basePt1 ; oldBasePt2 = basePt2 ; co ut << ” newBase : ” << basePt1 << ’ \ t ’ << basePt2 << e n d l ; 132 newOx = atoms [ basePt1 ] . x ; newOy = atoms [ basePt1 ] . y ; newOz = atoms [ basePt1 ] . z ; 1671 1672 1673 1674 // a n g l e = atan2 ( atoms [ b a s e Pt2 ] . y − atoms [ b a s e Pt1 ] . y , // atoms [ b a s e Pt2 ] . x − atoms [ b a s e Pt1 ] . 
x ) ; // t r a n s l a t e ( −newOx , −newOy , −newOz ) ; // r o t a t e ( a n g l e ) ; 1675 1676 1677 1678 1679 // c o u t << ” b e f o r e b u i l d u p ” << e n d l ; // p r i n t ( ) ; // d o B ui l d up 2 ( b a s e Pt2 ) ; 1680 1681 1682 1683 // d o B ui l d up 3 ( b a s e Pt2 ) ; 1684 1685 // // // // // // 1686 1687 1688 1689 1690 1691 t r a n s l a t e ( −newOx , −newOy , −newOz ) ; Po i n t p t 2 ; pt2 . x = 1; pt2 . y = 0; pt2 . z = 0; l o n g d o u b l e a n g l e = g e t A n g l e ( atoms [ b a s e Pt2 ] , p t 2 ) ; Po i n t a x i s = g e t A x i s ( atoms [ b a s e Pt2 ] , p t 2 ) ; r o t a t e ( angle , a x i s . x , a x i s . y , a x i s . z ) ; 1692 doBuildup3Dv2 ( 0 , 1 , 2 ) ; atoms . i n s e r t ( atoms . end ( ) , p o o l . b e g i n ( ) , p o o l . end ( ) ) ; 1693 1694 1695 // r o t a t e ( −a n g l e , a x i s . x , a x i s . y , a x i s . z ) ; // t r a n s l a t e ( newOx , newOy , newOz ) ; 1696 1697 1698 // r o t a t e ( −a n g l e ) ; // t r a n s l a t e ( newOx , newOy , newOz ) ; 1699 1700 1701 // c o u t << ” a f t e r b u i l d u p ” << e n d l ; // p r i n t ( ) ; 1702 1703 1704 newPool . i n s e r t ( newPool . end ( ) , atoms . b e g i n ( ) + ֒→ ←֓ oldAtoms . s i z e ( ) , atoms . end ( ) ) ; pool . c l e a r () ; atoms = oldAtoms ; 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 } fo r ( int j = 0 ; j < newPool . 
s i z e ( ) ; ++j ) { updatePool ( newPool [ j ] ) ; } 133 1715 return 0 ; 1716 1717 1718 } 1719 1720 1721 1722 1723 int S t r u c t u r e : : g e t P o o l s 2 ( ) { // f un c t h a t g e t s and combines t h e p o o l s from t h e 3 bonds i n ֒→ ←֓ t h e b a s e t r i a n g l e // and th e n i s re a d y f o r d o i n g b u i l d u p 1724 1725 1726 // FIXME q u i c k hack t o f i x some f a i l e d r e c o n s t r u c t i o n s // s ra n d ( time ( NULL ) ) ; 1727 1728 1729 1730 1731 1732 int numPools = 2 ; int basePt1 = 0 , basePt2 = 1 , basePt3 = 2 ; int oldBasePt1 = basePt1 , oldBasePt2 = basePt2 , oldBasePt3 = ֒→ ←֓ basePt3 ; v e c t o r oldAtoms = atoms , newPool ; long double newOx , newOy , newOz , a n g l e ; 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 fo r ( int i = 1 ; i <= numPools ; ++i ) { do { basePt1 = rand ( ) % atoms . s i z e ( ) ; basePt2 = rand ( ) % atoms . s i z e ( ) ; basePt3 = rand ( ) % atoms . 
s i z e ( ) ; // } w h i l e ( b a s e Pt3 == o l d B a s e Pt3 // or b a s e Pt3 == b a s e Pt1 // or b a s e Pt3 == b a s e Pt2 ) ; } while ( basePt2 == basePt1 or basePt3 == basePt2 or basePt3 == basePt1 or basePt1 == oldBasePt1 or basePt2 == oldBasePt2 or basePt1 == oldBasePt3 or basePt1 == oldBasePt2 or basePt2 == oldBasePt1 or basePt3 == oldBasePt3 or basePt3 == oldBasePt1 or basePt3 == oldBasePt2 ) ; 1755 1756 1757 co ut << ” newBase : ” << basePt1 << ’ \ t ’ << basePt2 << ’ \ t ’ << basePt3 << e n d l ; 134 1758 1759 1760 1761 oldBasePt1 = basePt1 ; oldBasePt2 = basePt2 ; oldBasePt3 = basePt3 ; 1762 1763 1764 1765 1766 int sum basePt1 basePt3 basePt2 1767 1768 1769 1770 1771 1772 = = = = basePt1 + basePt2 + basePt3 ; min ( min ( oldBasePt1 , oldBasePt2 ) , oldBasePt3 ) ; max( max( oldBasePt1 , oldBasePt2 ) , oldBasePt3 ) ; sum − basePt1 − basePt3 ; oldBasePt1 = basePt1 ; oldBasePt2 = basePt2 ; oldBasePt3 = basePt3 ; co ut << ” newBase : ” << basePt1 << ’ \ t ’ << basePt2 << ’ \ t ’ << basePt3 << e n d l ; 1773 1774 1775 1776 1777 1778 // b a s e Pt1 = 0 ; b a s e Pt2 // b a s e Pt1 = 1 ; b a s e Pt2 newOx = atoms [ basePt1 newOy = atoms [ basePt1 newOz = atoms [ basePt1 = 1 ; b a s e Pt3 = atoms . s i z e ( ) − 1 ; = 2 ; b a s e Pt3 = 3 ; ].x; ].y; ]. z; 1779 1780 1781 1782 1783 t r a n s l a t e ( −newOx , −newOy , −newOz ) ; // c o u t << ” T r a n s l a t e t o new o r i g i n ” << e n d l ; // p r i n t ( ) ; // g e t c h a r ( ) ; 1784 1785 1786 1787 1788 Po i nt pt2 ; pt2 . x = 1 ; pt2 . y = 0 ; pt2 . z = 0 ; long double a n g l e = g et Ang l e ( atoms [ basePt2 ] , pt2 ) ; Po i nt a x i s = g e t A x i s ( atoms [ basePt2 ] , pt2 ) ; 1789 1790 1791 1792 1793 // c o u t << atoms [ b a s e Pt2 ] << e n d l ; // c o u t << ” a n g l e , a x i s : ” << a n g l e << ’ , ’ << a x i s . x << ’ , ’ // << a x i s . y << ’ , ’ << a x i s . z << e n d l ; r o t a t e ( a ng l e , a x i s . x , a x i s . y , a x i s . 
z ) ; 1794 1795 1796 1797 1798 1799 1800 1801 // // // // // // l o n g d o u b l e a n g l e 2 = g e t A n g l e ( atoms [ b a s e Pt2 ] , p t 2 ) ; Po i n t a x i s 2 = g e t A x i s ( atoms [ b a s e Pt2 ] , p t 2 ) ; c o u t << atoms [ b a s e Pt2 ] << e n d l ; c o u t << ” a n g l e , a x i s : ” << a n g l e 2 << ’ , ’ << a x i s 2 . x << ’ , ’ << a x i s 2 . y << ’ , ’ << a x i s 2 . z << ֒→ ←֓ e n d l ; // r o t a t e ( a n g l e 2 , a x i s 2 . x , a x i s 2 . y , a x i s 2 . z ) ; 135 1802 // c o u t << ” R o ta te t o make new b a s e bond a l o n g t h e X a x i s ” ֒→ ←֓ << e n d l ; // p r i n t ( ) ; // g e t c h a r ( ) ; long double a n g l e 3 = atan2 ( atoms [ basePt3 ] . z , atoms [ ֒→ ←֓ basePt3 ] . y ) ; co ut << ”−a ng l e3 , a x i s : ” << −a n g l e 3 << ’ , ’ << ”1 ” << ’ , ’ << ” 0” << ’ , ’ << ”0 ” << e n d l ; r o t a t e ( −a ng l e3 , 1 , 0 , 0 ) ; // p r i n t ( ) ; // c o u t << ” a n g l e 3 , a x i s : ” << a n g l e 3 << ’ , ’ << ”1” << ’ , ’ // << ”0” << ’ , ’ << ”0” << e n d l ; // r o t a t e ( a n g l e 3 , 1 , 0 , 0 ) ; // p r i n t ( ) ; // g e t c h a r ( ) ; // r e t u r n 1 ; 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 doBuildup3Dv2 ( basePt1 , basePt2 , basePt3 ) ; atoms . i n s e r t ( atoms . end ( ) , p o o l . b e g i n ( ) , p o o l . end ( ) ) ; r o t a t e ( a ng l e3 , 1 , 0 , 0 ) ; // p r i n t ( ) ; r o t a t e ( −a ng l e , a x i s . x , a x i s . y , a x i s . z ) ; // p r i n t ( ) ; t r a n s l a t e ( newOx , newOy , newOz ) ; // p r i n t ( ) ; // g e t c h a r ( ) ; 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 co ut << ” S i z e o f newPool : ” << newPool . s i z e ( ) << e n d l ; newPool . i n s e r t ( newPool . end ( ) , atoms . b e g i n ( ) + ֒→ ←֓ oldAtoms . s i z e ( ) , atoms . end ( ) ) ; co ut << ” S i z e o f newPool : ” << newPool . s i z e ( ) << e n d l ; pool . c l e a r () ; 1828 1829 1830 1831 1832 1833 atoms . 
c l e a r ( ) ; atoms = oldAtoms ; 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 } fo r ( int j = 0 ; j < newPool . s i z e ( ) ; ++j ) { co ut << ” update , ” ; updatePool ( newPool [ j ] ) ; // g e t c h a r ( ) ; } 136 1844 return 0 ; 1845 1846 1847 } 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 int p r i n t 2 s t r u c t u r e s ( v e c t o r & s t r u 1 , v e c t o r & ֒→ ←֓ s t r u 2 ) { // For c o n v e n i e n c e p r i n t s t r u 1 , s t r u 2 f o rm a t a s s e r t ( s t r u 1 . s i z e ( ) >= s t r u 2 . s i z e ( ) ) ; co ut << ” ∗∗∗∗ ” << ” P r i n t i n g 2 s t r u c t u r e s ” << ” ∗∗∗∗ ” << e n d l ; Po i nt z e r o ; long double r1 , r2 , d i f f ; int i = 0 , j = 0 ; long double d i s t T o l e r = 0 . 5 ; 1858 1859 1860 1861 1862 1863 while ( { r1 = r2 = diff i < s t r u 1 . s i z e ( ) and j < s t r u 2 . s i z e ( ) ) getDistance ( stru1 [ i ] , zero ) ; getDistance ( stru2 [ j ] , zero ) ; = getDistance ( stru1 [ i ] , stru2 [ j ] ) ; 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 i f ( f a b s ( r 1 −r 2 ) < d i s t T o l e r and d i f f < d i s t T o l e r ) { co ut << i << ’ \ t ’ << ” ( ” << s t r u 1 [ i ] << ” ) ” << ’ \ t ’ << ” ( ” << s t r u 2 [ j ] << ” ) ” << ’ \ t ’ << g e t D i s t a n c e ( s t r u 1 [ i ] , s t r u 2 [ j ] ) ; ++i ; ++j ; } else i f ( r1 < r2 ) { co ut << i << ’ \ t ’ << ” ( ” << s t r u 1 [ i ] << ” ) ” << ’ \ t ’ << ” ( ” << z e r o << ” ) ” << ’ \ t ’ << g e t D i s t a n c e ( s t r u 1 [ i ] , z e r o ) ; ++i ; } else { co ut << ”−1” << ’ \ t ’ << ” ( ” << z e r o << ” ) ” << ’ \ t ’ << ” ( ” << s t r u 2 [ j ] << ” ) ” << ’ \ t ’ << g e t D i s t a n c e ( zer o , s t r u 2 [ j ] ) ; ++j ; } 137 co ut << e n d l ; 1888 } 1889 1890 while ( i != s t r u 1 . s i z e ( ) ) { co ut << i << ’ \ t ’ << ” ( ” << s t r u 1 [ i ] << ” ) ” << e n d l ; ++i ; } 1891 1892 1893 1894 1895 1896 while ( j != s t r u 2 . 
s i z e ( ) ) { co ut << ”−1” << ’ \ t ’ << ” ( ” << z e r o << ” ) ” << ’ \ t ’ << ” ( ” << s t r u 2 [ j ] << ” ) ” << ’ \ t ’ << g e t D i s t a n c e ( zer o , s t r u 2 [ j ] ) << e n d l ; ++j ; } co ut << e n d l ; 1897 1898 1899 1900 1901 1902 1903 1904 1905 // c o u t << ” i , j : ” << i << ’ , ’ << j << e n d l ; return 0 ; 1906 1907 1908 1909 } 1910 1911 1912 1913 1914 1915 1916 1917 bool S t r u c t u r e : : r e c o n s t r u c t 2 ( ) { bool growth = f a l s e ; int resetNum = 0 ; v e c t o r g i venCo r e = atoms ; updateCurrDL ( ) ; co ut << ” s i z e o f g i v e n c o r e : ” << g i venCo r e . s i z e ( ) << e n d l ; 1918 1919 1920 1921 1922 1923 1924 while ( atoms . s i z e ( ) < t a r g e t S i z e ) { // u p d a t e U s e d D i s t s ( ) ; // updateCurrDL ( ) ; updateFreeDL ( ) ; co ut << ” s i z e o f freeDL : ” << freeDL . s i z e ( ) << e n d l ; 1925 1926 1927 1928 1929 1930 1931 1932 i f ( f a l s e and growth and atoms . s i z e ( ) >= 20 and t a r g e t S i z e > 20 and ( atoms . s i z e ( ) / 2 ) % 2 == 0 ) { } else { pool . c l e a r () ; 138 // g e t P o o l s ( ) ; getPools2 () ; 1933 1934 } co ut << ” p o o l . s i z e : ” << p o o l . s i z e ( ) << e n d l ; co ut << ” Pool pt0 , c o s t : ” << p o o l [ 0 ] << ’ , ’ << g et Pt Co st ( ֒→ ←֓ p o o l [ 0 ] ) << e n d l ; 1935 1936 1937 1938 growth = growStru ( ) ; s o r t ( atoms . b e g i n ( ) , atoms . end ( ) ) ; // c o u t << ” s o l u t i o n s i z e : ” << atoms . s i z e ( ) << e n d l ; i f ( resetNum >= 2 ) { break ; } i f ( growth == f a l s e ) { print () ; atoms = g i venCo r e ; updateCurrDL ( ) ; ++resetNum ; } // g e t c h a r ( ) ; 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 } 1954 1955 return ( atoms . 
s i z e ( ) == t a r g e t S i z e ) ; // r e t u r n g ro w th ; 1956 1957 1958 1959 } 1960 1961 1962 1963 bool S t r u c t u r e : : findCore3D ( ) { bool s u c c e s s F l a g = f a l s e ; 1964 1965 1966 1967 1968 1969 1970 // 10 bonds i n t h e t e t r a h e d r o n int idxA = 0 , idxB , idxC , idxD , idxE , idxF , idxG , idxH , i d x I , ֒→ ←֓ idxJ , idxM ; int b r i d g e I d x ; long double c o u n t e r = 0 ; // number o f t e t r a h e d r a long double b r i d g e D i s t = 0 ; long double fmin , fmax , imin , imax , f r a c E r r o r ; 1971 1972 1973 1974 int invC = 0 ; // # i n v a l i d c o r e s // bond window int w i n S t a r t = 0 , i n c = 1 0 , winStop = w i n S t a r t + i n c ; // ֒→ ←֓ windowing on 139 1975 1976 v e c t o r d l i s t = targetDL ; co ut << ” targetDL s i z e : ” << targetDL . s i z e ( ) << e n d l ; 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 Po i nt basePt1 , basePt2 , basePt3 , apexPt1 , apexPt2 ; basePt1 . x = 0 ; basePt1 . y = 0 ; basePt1 . z = 0 ; basePt2 . x = d l i s t [ idxA ] ; basePt2 . y = 0 ; basePt2 . z = 0 ; co ut << ” basePt1 : ” << basePt1 . x << ” ” << basePt1 . y << ” ” ֒→ ←֓ << basePt1 . z << e n d l ; co ut << ” basePt2 : ” << basePt2 . x << ” ” << basePt2 . y << ” ” ֒→ ←֓ << basePt2 . 
z << e n d l ; co ut << ”Bond window : ” ; 1987 1988 1989 1990 1991 1992 int countC = 0 ; // c o un t number o f c o r e s v e c t o r i dxAr r ( 6 , 0 ) ; while ( true ) // window l o o p { c e r r << ”−>” << winStop ; 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 // b a s e t r i a n g l e fo r ( idxB = 1 ; idxB < winStop ; ++idxB ) { // c o u t << e n d l << ” idxB : ” << idxB << ” , ” ; // e n d l ; // c o u t << ” idxC : ” ; fo r ( idxC = idxB + 1 ; idxC < winStop ; ++idxC ) { // c o u t << ” ” << idxC << ” ” ; // << e n d l ; // NOTE: c>b so we have t o c h e c k o n l y 1 t r i a n g l e ֒→ ←֓ i n e q u a l i t y i f ( d l i s t [ idxA ] + d l i s t [ idxB ] + t o l e r < d l i s t [ idxC ֒→ ←֓ ] ) break ; placeApex ( basePt3 , idxA , idxB , idxC , d l i s t ) ; 2006 2007 2008 2009 2010 2011 2012 2013 2014 fo r ( idxD = 1 ; idxD < winStop ; ++idxD ) { i f ( idxD == idxB or idxD == idxC ) continue ; i f ( idxD < idxB ) continue ; fo r ( idxE = 1 ; idxE < winStop ; ++idxE ) { i f ( idxE < idxB ) continue ; i f ( idxE == idxB or idxE == idxC or idxE == idxD ) ֒→ ←֓ continue ; 140 2018 // t r i a n g l e i n e q u a l i t i e s i f ( d l i s t [ idxA ] + d l i s t [ idxE ] + t o l e r < d l i s t [ ֒→ ←֓ idxD ] ) continue ; i f ( d l i s t [ idxA ] + d l i s t [ idxD ] + t o l e r < d l i s t [ ֒→ ←֓ idxE ] ) break ; 2019 placeApex ( apexPt1 , idxA , idxD , idxE , d l i s t ) ; 2015 2016 2017 2020 2021 2022 2023 fmin = g e t D i s t a n c e ( basePt3 , apexPt1 ) ; apexPt1 . y = − apexPt1 . y ; fmax = g e t D i s t a n c e ( basePt3 , apexPt1 ) ; 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 fo r ( idxF = 1 ; idxF < d l i s t . 
s i z e ( ) ; ++idxF ) { i f ( d l i s t [ idxF ] < ( fmin − t o l e r ) ) continue ; i f ( d l i s t [ idxF ] > ( fmax + t o l e r ) ) break ; i f ( idxF == idxB or idxF == idxC or idxF == idxD ֒→ ←֓ or idxF == idxE ) continue ; i dxAr r [ 0 ] = idxA ; i dxAr r [ i dxAr r [ 1 ] = idxB ; i dxAr r [ i dxAr r [ 2 ] = idxC ; i dxAr r [ placeTop ( basePt1 , basePt2 , ←֓ idxArr , d l i s t ) ; 5 ] = idxF ; 4 ] = idxE ; 3 ] = idxD ; basePt3 , apexPt1 , ֒→ fo r ( idxG = 1 ; idxG < winStop ; ++idxG ) { i f ( idxG < idxB ) continue ; i f ( idxG == idxB or idxG == idxC or idxG == ֒→ ←֓ idxD or idxG == idxE or idxG == idxF ) continue ; i f ( idxG < idxD ) continue ; fo r ( idxH = 1 ; idxH < winStop ; ++idxH ) { i f ( idxH < idxB ) continue ; i f ( idxH ==idxB or idxH == idxC or idxH == ֒→ ←֓ idxD or idxH == idxE or idxH == idxF or idxH == ֒→ ←֓ idxG ) continue ; i f ( d l i s t [ idxA ] + d l i s t [ idxH ] + t o l e r < ֒→ ←֓ d l i s t [ idxG ] ) continue ; // t r i a n g l e ֒→ ←֓ i n e q u a l i t y i f ( d l i s t [ idxA ] + d l i s t [ idxG ] + t o l e r < ֒→ ←֓ d l i s t [ idxH ] ) break ; // t r i a n g l e ֒→ ←֓ i n e q u a l i t y 141 2049 2050 placeApex ( apexPt2 , idxA , idxG , idxH , d l i s t ) ; 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 imin = g e t D i s t a n c e ( basePt3 , apexPt2 ) ; apexPt2 . y = − apexPt2 . y ; imax = g e t D i s t a n c e ( basePt3 , apexPt2 ) ; // c o u t << ” bp3 , ap2 : ” << b a s e Pt3 << ’ , ’ << ֒→ ←֓ apexPt2 << e n d l ; fo r ( i d x I = 1 ; i d x I < d l i s t . 
s i z e ( ) ; ++i d x I ) { i f ( i d x I == idxB or i d x I == idxC or i d x I == ֒→ ←֓ idxD or i d x I == idxE or i d x I == idxF or i d x I == ֒→ ←֓ idxG or i d x I == idxH ) continue ; // c o r r e c t windowing i f ( idxH < w i n S t a r t and idxG < w i n S t a r t and idxE < w i n S t a r t and idxD < w i n S t a r t and idxC < w i n S t a r t and idxB < w i n S t a r t ) ֒→ ←֓ continue ; i f ( d l i s t [ i d x I ] < ( imin − t o l e r ) ) ֒→ ←֓ continue ; i f ( d l i s t [ i d x I ] > ( imax + t o l e r ) ) break ; 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 // c o u t << ” a b c d e f g h i : ” << idxA << ’ , ’ << ֒→ ←֓ idxB << ’ , ’ // << idxC << ’ , ’ << ֒→ ←֓ idxD << ’ , ’ // << idxE << ’ , ’ << ֒→ ←֓ idxF << ’ , ’ // << idxG << ’ , ’ << ֒→ ←֓ idxH << ’ , ’ // << i d x I << e n d l ; // c o u t << ” d l i s t −I , imin , imax : ” << ֒→ ←֓ d l i s t [ i d x I ] << ’ , ’ << imin << ’ , ’ << ֒→ ←֓ imax << e n d l << e n d l ; i dxAr r [ 0 ] = idxA ; i dxAr r [ i dxAr r [ 1 ] = idxB ; i dxAr r [ i dxAr r [ 2 ] = idxC ; i dxAr r [ placeTop ( basePt1 , basePt2 , ←֓ apexPt2 , idxArr , d l i s t 142 5 ] = idxI ; 4 ] = idxH ; 3 ] = idxG ; basePt3 , ֒→ ); 2082 2083 2084 2085 2086 2087 fo r ( idxM = 0 ; idxM < 2 ; ++idxM ) { apexPt2 . z = pow( −1.0L , idxM ) ∗ ֒→ ←֓ apexPt2 . z ; // r e f l e c t a b o u t XY p l a n e b r i d g e D i s t = g e t D i s t a n c e ( apexPt1 , ֒→ ←֓ apexPt2 ) ; c o u n t e r += 1 ; 2088 2089 2090 2091 2092 i f ( b r i d g e D i s t != b r i d g e D i s t ) g e t c h a r ( ) ; ֒→ ←֓ // c o u t << p4 << ’\ t ’ << p5 << e n d l ; i f ( ( b r i d g e D i s t < ( d l i s t [ 0 ] − ֒→ ←֓ b r i d g e D i s t ∗ t o l e r ) ) or ( b r i d g e D i s t > ( d l i s t [ d l i s t . 
s i z e ( ) ֒→ ←֓ − 1 ] + b r i d g e D i s t ∗ t o l e r ) ) ) ֒→ ←֓ continue ; i dxJ = c l o s e s t D i s t ( b r i d g e D i s t , d l i s t ) ; 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 i f ( i dxJ == idxA and i dxJ == idxB and ֒→ ←֓ i dxJ == idxC and i dxJ == idxD and i dxJ == idxE and ֒→ ←֓ i dxJ == idxF and i dxJ == idxG and i dxJ == idxH and ֒→ ←֓ i dxJ == i d x I ) { continue ; } f r a c E r r o r = f a b s ( d l i s t [ i dxJ ] − ֒→ ←֓ b r i d g e D i s t ) / b r i d g e D i s t ; if ( fracError < toler ) { atoms . push back ( basePt1 ) ; atoms . push back ( basePt2 ) ; atoms . push back ( basePt3 ) ; 2107 2108 2109 atoms . push back ( apexPt1 ) ; atoms . push back ( apexPt2 ) ; 2110 2111 2112 2113 co ut << e n d l << ” 3D c o r e found ! ! ! ” << ֒→ ←֓ e n d l ; co ut << b r i d g e D i s t << ’ \ t ’ << d l i s t [ ֒→ ←֓ i dxJ ] << ’ \ t ’ << f a b s ( b r i d g e D i s t − d l i s t [ i dxJ ֒→ ←֓ ] ) << e n d l ; 143 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 co ut << ” b r i d g e D i s t : ” << b r i d g e D i s t << ֒→ ←֓ e n d l ; co ut << ” c o r e f i n d e r c o u n t e r : ” << ֒→ ←֓ c o u n t e r << e n d l ; fo r ( int i = 0 ; i < atoms . s i z e ( ) ; ++i ) { co ut << ” Po i nt : ” << i + 1 << ’ \ t ’ << ֒→ ←֓ atoms [ i ] << e n d l ; } // Adding t h e b u i l d u p i n s i d e c o r e ֒→ ←֓ f i n d e r t o d e a l // w i t h bad c o r e s doBuildup3D ( i dxAr r ) ; i f ( atoms . s i z e ( ) >= min ( 8 , t a r g e t S i z e ֒→ ←֓ ) ) { // I f b u i l d u p was a b l e t o add 4 more ֒→ ←֓ p o i n t s th e n w i t h // h i g h p r o b a b i l i t y , we have t h e ֒→ ←֓ r i g h t s t r u c t u r e . co ut << ” atoms s i z e : ” << ֒→ ←֓ atoms . 
s i z e ( ) << e n d l ; return true ; } else { // I f b u i l d u p c o u l d n o t even 4 p o i n t s ֒→ ←֓ th e n w i t h a v e r y // h i g h p r o b a b i l i t y , we have t h e ֒→ ←֓ wrong s t r u c t u r e . S t a r t // o v e r and f i n d t h e n e x t c o r e . co ut << ”No buildup , bad c o r e . ” ” F i ndi ng t he next c o r e . . . ” ֒→ ←֓ << e n d l ; atoms . c l e a r ( ) ; updateCurrDL ( ) ; // c o u t << ”Bond window : ” ; } // r e t u r n t r u e ; } } // m l o o p } // i l o o p } // h l o o p 144 } // g l o o p } // f l o o p } // e l o o p } // d l o o p } // c l o o p } // b l o o p 2148 2149 2150 2151 2152 2153 2154 co ut << ” c o u n t e r : ”<< c o u n t e r << e n d l ; w i n S t a r t = winStop ; winStop += i n c ; i f ( w i n S t a r t == d l i s t . s i z e ( ) ) break ; 2155 2156 2157 2158 i f ( winStop > d l i s t . s i z e ( ) ) { winStop = d l i s t . s i z e ( ) ; } 2159 2160 2161 2162 2163 2166 co ut << ” s e a r c h i n g i n a ”<< winStop << ” bond window” << e n d l ; } // window w h i l e l o o p 2167 return s u c c e s s F l a g ; 2164 2165 2168 2169 } 2170 2171 2172 2173 bool S t r u c t u r e : : doBuildup3D ( v e c t o r i dxAr r ) { bool s u c c e s s F l a g = f a l s e ; 2174 2175 2176 2177 2178 // 10 bonds i n t h e t e t r a h e d r o n int idxA = i dxAr r [ 0 ] , idxB = i dxAr r [ 1 ] , idxC = i dxAr r [ 2 ] , idxD = i dxAr r [ 3 ] , idxE = i dxAr r [ 4 ] , idxF = i dxAr r [ 5 ] , idxG , idxH , i d x I , idxJ , idxM ; 2179 2180 2181 2182 2183 int b r i d g e I d x ; long double c o u n t e r = 0 ; // number o f t e t r a h e d r a long double b r i d g e D i s t = 0 ; long double fmin , fmax , imin , imax , f r a c E r r o r ; 2184 2185 2186 2187 2188 int invC = 0 ; // # i n v a l i d c o r e s // bond window v e c t o r d l i s t = targetDL ; int w i n S t a r t = 0 , winStop = d l i s t . 
s i z e ( ) ; // windowing o f f 2189 2190 2191 Po i nt basePt1 = atoms [ 0 ] , basePt2 = atoms [ 1 ] , basePt3 = ֒→ ←֓ atoms [ 2 ] , apexPt1 = atoms [ 3 ] , apexPt2 ; 145 2192 2193 int countC = 0 ; // c o un t number o f c o r e s 2194 2195 2196 2197 2198 2199 fo r ( idxG = 0 ; idxG < winStop ; ++idxG ) { i f ( idxG == idxB or idxG == idxC or idxG == idxD or idxG == idxE or idxG == idxF ) continue ; i f ( u s e d D i s t [ idxG ] ) continue ; 2200 2201 2202 2203 2204 2205 fo r ( idxH = 0 ; idxH < winStop ; ++idxH ) { i f ( idxH ==idxB or idxH == idxC or idxH == idxD or idxH == idxE or idxH == idxF or idxH == idxG ) continue ; i f ( u s e d D i s t [ idxH ] ) continue ; 2206 2210 i f ( d l i s t [ idxA ] ←֓ ) continue ; i f ( d l i s t [ idxA ] ←֓ ) break ; // i f ( d l i s t [ idxH ] ←֓ ) continue ; 2211 placeApex ( apexPt2 , idxA , idxG , idxH , d l i s t ) ; 2207 2208 2209 + d l i s t [ idxH ] + t o l e r < d l i s t [ idxG ] ֒→ // t r i a n g l e i n e q u a l i t y + d l i s t [ idxG ] + t o l e r < d l i s t [ idxH ] ֒→ triangle inequality + d l i s t [ idxG ] + t o l e r < d l i s t [ idxA ] ֒→ // t r i a n g l e i n e q u a l i t y 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 imin = g e t D i s t a n c e ( basePt3 , apexPt2 ) ; apexPt2 . y = − apexPt2 . y ; imax = g e t D i s t a n c e ( basePt3 , apexPt2 ) ; // c o u t << ” bp3 , ap2 : ” << b a s e Pt3 << ’ , ’ << apexPt2 << ֒→ ←֓ e n d l ; fo r ( i d x I = 0 ; i d x I < d l i s t . 
s i z e ( ) ; ++i d x I ) { i f ( i d x I == idxB or i d x I == idxC or i d x I == idxD or i d x I == idxE or i d x I == idxF or i d x I == idxG or i d x I == idxH ) continue ; i f ( u s e d D i s t [ i d x I ] ) continue ; 2224 2225 2226 i f ( d l i s t [ i d x I ] < ( imin − t o l e r ) ) continue ; i f ( d l i s t [ i d x I ] > ( imax + t o l e r ) ) break ; 2227 2228 2229 2230 2231 i dxAr r [ 0 ] = idxA ; i dxAr r [ i dxAr r [ 1 ] = idxB ; i dxAr r [ i dxAr r [ 2 ] = idxC ; i dxAr r [ placeTop ( basePt1 , basePt2 , ←֓ d l i s t ) ; 146 5 ] = idxI ; 4 ] = idxH ; 3 ] = idxG ; basePt3 , apexPt2 , idxArr , ֒→ 2232 2233 2234 2235 2236 fo r ( idxM = 0 ; idxM < 2 ; ++idxM ) { apexPt2 . z = pow ( −1.0L , idxM ) ∗ apexPt2 . z ; // r e f l e c t ֒→ ←֓ a b o u t XY p l a n e c o u n t e r += 1 ; 2237 2238 2239 2240 2241 2242 b r i d g e D i s t = g e t D i s t a n c e ( apexPt1 , apexPt2 ) ; i f ( b r i d g e D i s t != b r i d g e D i s t ) g e t c h a r ( ) ; // c o u t << ֒→ ←֓ p4 << ’\ t ’ << p5 << e n d l ; i f ( ( b r i d g e D i s t < ( d l i s t [ 0 ] − b r i d g e D i s t ∗ t o l e r ) ֒→ ←֓ ) or ( b r i d g e D i s t > ( d l i s t [ d l i s t . s i z e ( ) − 1 ] + ֒→ ←֓ b r i d g e D i s t ∗ t o l e r ) ) ) continue ; i dxJ = c l o s e s t D i s t ( b r i d g e D i s t , d l i s t ) ; 2243 2244 2245 2246 2247 2248 2249 2250 i f ( i dxJ == idxA and i dxJ == idxD and i dxJ == idxG and { continue ; } i f ( u s e d D i s t [ i dxJ ] i dxJ == idxB and i dxJ == idxC and i dxJ == idxE and i dxJ == idxF and i dxJ == idxH and i dxJ == i d x I ) ) continue ; 2251 2252 2253 2254 2255 f r a c E r r o r = f a b s ( d l i s t [ i dxJ ] − b r i d g e D i s t ) / ֒→ ←֓ b r i d g e D i s t ; if ( fracError < toler ) { atoms . push back ( apexPt2 ) ; 2256 2257 2258 2259 2260 2261 co ut << e n d l << ” Po i nt found ! ! ! 
” << e n d l ; co ut << b r i d g e D i s t << ’ \ t ’ << d l i s t [ i dxJ ] << ’ \ t ’ << f a b s ( b r i d g e D i s t − d l i s t [ i dxJ ] ) << e n d l ; co ut << ” b r i d g e D i s t : ” << b r i d g e D i s t << e n d l ; co ut << ” c o u n t e r : ” << c o u n t e r << e n d l ; 2262 2263 2264 2265 2266 2267 2268 2269 2270 co ut << ” Po i nt : ” << atoms . s i z e ( ) << ’ \ t ’ << atoms [ atoms . s i z e ( ) − 1 ] << e n d l ; u s e d D i s t [ idxG ] = u s e d D i s t [ idxH ] = u s e d D i s t [ ֒→ ←֓ i d x I ] = u s e d D i s t [ i dxJ ] = true ; i f ( atoms . s i z e ( ) == t a r g e t S i z e ) { co ut << ” b u i l d u p c o u n t e r : ”<< c o u n t e r << e n d l ; 147 return true ; 2271 } 2272 2273 } } // m l o o p } // i l o o p } // h l o o p } // g l o o p 2274 2275 2276 2277 2278 2279 co ut << ” c o u n t e r : ”<< c o u n t e r << e n d l ; return f a l s e ; 2280 2281 2282 2283 } 2284 2285 2286 2287 2288 bool S t r u c t u r e : : home3D ( int d i s t I d x ) { // O ri e n t t h e s t r u c t u r e i n a un i q ue manner so t h a t i t becomes ֒→ ←֓ e a s y t o c h e c k // i f two or more s t r u c t u r e a re i d e n t i c a l t o one a n o t h e r or n o t . 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 long double minDist = 1 e6 ; int i d x 1 = 0 , i d x 2 = 1 ; long double d i s t = 1 e6 , e r r , minErr ; minErr = f a b s ( targetDL [ d i s t I d x ] − g e t D i s t a n c e ( atoms [ i d x 1 ֒→ ←֓ ] , atoms [ i d x 2 ] ) ) ; // Find t h e c o r r e c t bond i n t h e s t r u c t u r e fo r ( int i = 0 ; i < atoms . s i z e ( ) ; ++i ) { fo r ( int j = i + 1 ; j < atoms . 
s i z e ( ) ; ++j ) { d i s t = g e t D i s t a n c e ( atoms [ i ] , atoms [ j ] ) ; e r r = f a b s ( targetDL [ d i s t I d x ] − d i s t ) ; i f ( e r r < minErr ) { minErr = e r r ; idx1 = i ; idx2 = j ; } } } // c o u t << ” i d x 1 , i d x 2 : ” << i d x 1 << ’\ t ’ << i d x 2 << e n d l ; 2311 2312 2313 // Lo c a te t h e apex p o i n t o f t h e b a s e t r i a n g l e long double minDist1 = 1 e6 , minDist2 = 1 e6 ; 148 2314 2315 long double d i s t 1 , d i s t 2 ; int minIdx1 , minIdx2 ; 2316 2317 2318 2319 fo r ( int i = 0 ; i < atoms . s i z e ( ) ; ++i ) { i f ( i == i d x 1 or i == i d x 2 ) continue ; 2320 d i s t 1 = g e t D i s t a n c e ( atoms [ i ] , atoms [ i d x 1 ] ) ; i f ( d i s t 1 < minDist1 ) { minDist1 = d i s t 1 ; minIdx1 = i ; } 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 } d i s t 2 = g e t D i s t a n c e ( atoms [ i ] , atoms [ i d x 2 ] ) ; i f ( d i s t 2 < minDist2 ) { minDist2 = d i s t 2 ; minIdx2 = i ; } int i d x 3 = minIdx1 ; i f ( minDist1 > minDist2 ) { i d x 3 = minIdx2 ; swap ( idx1 , i d x 2 ) ; } // C o r r e c t l y p l a c e t h e b a s e t r i a n g l e t r a n s l a t e ( −atoms [ i d x 1 ] . x , −atoms [ i d x 1 ] . y , −atoms [ i d x 1 ֒→ ←֓ ] . z ) ; // f i n d a n g l e and r o t a t e Po i nt pt2 ; pt2 . x = 1 ; pt2 . y = 0 ; pt2 . z = 0 ; long double a n g l e = g et Ang l e ( atoms [ i d x 2 ] , pt2 ) ; Po i nt a x i s = g e t A x i s ( atoms [ i d x 2 ] , pt2 ) ; // c o u t << ” a n g l e , a x i s : ” << a n g l e << ’ , ’ << a x i s . x << ’ , ’ ֒→ ←֓ << a x i s . y << ’ , ’ // << a x i s . z << e n d l ; r o t a t e ( a ng l e , a x i s . x , a x i s . y , a x i s . z ) ; 2353 2354 2355 2356 i f ( atoms [ i d x 3 ] . y < 0 ) { r e f l e c t ( ”X” ) ; 149 } 2357 2358 // S o r t t h e p o i n t s so t h a t t h e r e i s some un i q ue o r d e r s o r t ( atoms . b e g i n ( ) , atoms . 
end() );

  int idxApex = 2;
  long double angle3 = atan2( atoms[idxApex].z, atoms[idxApex].y );
  // cout << "-angle3, axis: " << -angle3 << ',' << "1" << ','
  //      << "0" << ',' << "0" << endl;
  rotate( -angle3, 1, 0, 0 );

  if ( atoms[3].z < 0 )
  {
    reflect( "Z" );
  }
  return true;
}

bool Structure::doBuildup3Dv2( int pt1, int pt2, int pt3 )
{
  bool successFlag = false;
  freeDL = targetDL;
  // updateUsedDists();
  // vector<long double> freeDL;

  // for ( int i = 0; i < targetDL.size(); ++i )
  // {
  //   if ( usedDist[i] == false )
  //   {
  //     freeDL.push_back( targetDL[i] );
  //   }
  // }
  vector<int> idxArr( 6, 0 );
  idxArr[0] = closestDist( getDistance( atoms[pt1], atoms[pt2] ),
                           targetDL );
  idxArr[1] = closestDist( getDistance( atoms[pt1], atoms[pt3] ),
                           targetDL );
  idxArr[2] = closestDist( getDistance( atoms[pt2], atoms[pt3] ),
                           targetDL );
  // idxArr[0] = closestDist( getDistance( atoms[0], atoms[1] ),
  //                          freeDL );
  // idxArr[1] = closestDist( getDistance( atoms[0], atoms[2] ),
  //                          freeDL );
  // idxArr[2] = closestDist( getDistance( atoms[1], atoms[2] ),
  //                          freeDL );
  // idxArr[3] = closestDist( getDistance( atoms[0], atoms[3] ),
  //                          freeDL );
  // idxArr[4] = closestDist( getDistance( atoms[1], atoms[3] ),
  //                          freeDL );
  // idxArr[5] = closestDist( getDistance( atoms[2], atoms[3] ),
  //                          freeDL );
  cout << "idxArr: " << idxArr[0] << ',' << idxArr[1] << ','
       << idxArr[2] << ',' << idxArr[3] << ','
       << idxArr[4] << ',' << idxArr[5] << endl;

  // 10 bonds in the tetrahedron
  int idxA = idxArr[0], idxB = idxArr[1], idxC = idxArr[2],
      idxD = idxArr[3], idxE = idxArr[4], idxF = idxArr[5],
      idxG, idxH, idxI, idxJ, idxM;
  int numTestPts = 0; // number of tetrahedra

  // Point basePt1 = atoms[0], basePt2 = atoms[1], basePt3 = atoms[2],
  //       apexPt1 = atoms[3], apexPt2;
  Point basePt1 = atoms[pt1], basePt2 = atoms[pt2], basePt3 = atoms[pt3],
        apexPt1, apexPt2;

  long double imin, imax;
  long double triToler = 0.1;
  long double distA = getDistance( basePt1, basePt2 ), distG, distH,
              distI, error;

  for ( idxG = 0; idxG < freeDL.size(); ++idxG )
  {
    distG = freeDL[idxG];

    for ( idxH = 0; idxH < freeDL.size(); ++idxH )
    {
      distH = freeDL[idxH];
      if ( distA + distH + triToler < distG ) continue; // triangle inequality
      if ( distA + distG + triToler < distH ) break; // triangle inequality
      placeApex( apexPt2, idxA, idxG, idxH, freeDL );

      imin = getDistance( basePt3, apexPt2 );
      apexPt2.y = -apexPt2.y;
      imax = getDistance( basePt3, apexPt2 );
      // cout << "bp3, ap2: " << basePt3 << ',' << apexPt2 << endl;
      for ( idxI = 0; idxI < freeDL.size(); ++idxI )
      {
        distI = freeDL[idxI];
        if ( distI < ( imin - triToler ) ) continue;
        if ( distI > ( imax + triToler ) ) break;

        idxArr[0] = idxA; idxArr[5] = idxI;
        idxArr[1] = idxB; idxArr[4] = idxH;
        idxArr[2] = idxC; idxArr[3] = idxG;
        placeTop( basePt1, basePt2, basePt3, apexPt2, idxArr, freeDL );

        for ( idxM = 0; idxM < 2; ++idxM )
        {
          apexPt2.z = pow( -1.0L, idxM ) * apexPt2.z; // reflect about XY plane
          apexPt2.cost = error = getPtCost( apexPt2 );

          if ( error < 0.2 )
          {
            // cout << "adding pt, cost: " << apexPt2 << ',' << error << endl;
            updatePool( apexPt2 );
            ++numTestPts;
            // FIX ME
            // return false;
          }
        } // m loop
      } // i loop
    } // h loop
  } // g loop

  cout << "pool.size: " << pool.size() << endl;
  cout << "Number of points tested: " << numTestPts << endl;
  cout << "End of doBuildup3Dv2" << endl;
  return false;
}

bool Structure::findCoreMPI( int windowStart )
{
  // Find a core made of 4 points by iterating over all triangle
  // combinations. Also, the function to do the buildup is called after
  // we find a core because it is more convenient this way.

  int idxA = 0, idxB, idxC, idxD, idxE, idxF; // indices for the bonds
  int idxM, idxN; // indices for the orientation of the triangle
  vector<int> idxArr( 6, -1 );
  int inc = 6; // width of the bond window
  int winStart = windowStart, winStop = windowStart + inc; // indices for the window
  vector<long double> dlist = targetDL;

  long double bridgeDist = 0.0;
  int bridgeIdx = 0;
  int bridgeCount = 0; // count the number of bridge bond checks
  long double fracError = 1e6;

  Point basePt1, basePt2, apexPt1, apexPt2;
  basePt1.x = 0;
  basePt1.y = 0;
  basePt2.x = dlist[idxA];
  basePt2.y = 0;
  cout << "basePt1: " << basePt1.x << " " << basePt1.y << endl;
  cout << "basePt2: " << basePt2.x << " " << basePt2.y << endl;
  cout << "Bond window: ";

  while ( true )
  {
    cerr << " -> " << winStop;
    for ( idxB = idxA + 1; idxB < winStop; ++idxB )
    {
      for ( idxC = idxB + 1; idxC < winStop; ++idxC )
      {
        if ( dlist[idxA] + dlist[idxB] + toler < dlist[idxC] )
        {
          break;
        }
        placeApex( apexPt1, idxA, idxB, idxC, dlist );

        for ( idxD = idxA + 1; idxD < winStop; ++idxD )
        {
          if ( idxD == idxB or idxD == idxC )
          {
            continue;
          }
          for ( idxE = idxD + 1; idxE < winStop; ++idxE )
          {
            if ( idxB < winStart and idxC < winStart and
                 idxD < winStart and idxE < winStart )
            {
              continue;
            }
            if ( ( idxD < idxB and idxE < idxC ) or
                 ( idxE > idxC and idxD < idxB ) )
            {
              continue;
            }
            if ( idxE == idxB or idxE == idxC )
            {
              continue;
            }
            if ( dlist[idxA] + dlist[idxD] + toler < dlist[idxE] )
            {
              break;
            }
            placeApex( apexPt2, idxA, idxD, idxE, dlist );

            for ( idxM = 0; idxM < 2; ++idxM )
            {
              // cout << "idxM: " << idxM << endl;
              if ( idxM == 1 )
              {
                apexPt2.x = dlist[idxA] - apexPt2.x;
              }
              for ( idxN = 0; idxN < 2; ++idxN )
              {
                // cout << "idxN: " << idxN << endl;
                if ( idxN == 1 )
                {
                  apexPt2.y = -apexPt2.y;
                }
                bridgeDist = getDistance( apexPt1, apexPt2 );
                bridgeCount += 1; // count number of bridge checks

                bridgeIdx = closestDist( bridgeDist, dlist );
                if ( bridgeIdx == idxA or bridgeIdx == idxB or
                     bridgeIdx == idxC or bridgeIdx == idxD or
                     bridgeIdx == idxE )
                {
                  // Make sure that the bridge bond is not the same as any of
                  // the distances in use.
                  continue;
                }

                fracError = fabs( dlist[bridgeIdx] - bridgeDist ) / bridgeDist;
                // cout << "fracError: " << fracError << endl;

                if ( fabs( apexPt2.y ) < 0.5 )
                {
                  // Skinny triangles have been found to have a large error,
                  // hence reducing their error "by hand" so that we don't
                  // miss out on them.
                  fracError /= 1000;
                }
                if ( fracError < toler )
                {
                  idxArr[0] = idxA; idxArr[1] = idxB;
                  idxArr[2] = idxC; idxArr[3] = idxD;
                  idxArr[4] = idxE; idxArr[5] = bridgeIdx;

                  cout << endl;
                  if ( testCore( idxArr, idxM, idxN ) )
                  {
                    // atoms.push_back( basePt1 );
                    // atoms.push_back( basePt2 );
                    // atoms.push_back( apexPt1 );
                    // atoms.push_back( apexPt2 );

                    cout << basePt1 << endl;
                    cout << basePt2 << endl;
                    cout << apexPt1 << endl;
                    cout << apexPt2 << endl;

                    // for ( int i = 0; i < atoms.size(); ++i )
                    // {
                    //   cout << "Point: " << i + 1 << '\t' << atoms[i] << endl;
                    // }
                    // usedDist[idxA] = usedDist[idxB] = true;
                    // usedDist[idxC] = usedDist[idxD] = true;
                    // usedDist[idxE] = usedDist[bridgeIdx] = true;
                    // updateCurrDL();
                    // return true;

                    // Attempt buildup to get the remaining points.
                    // doBuildup();

                    // if ( atoms.size() >= min( 8, targetSize ) )
                    // {
                    //   // If buildup was able to add 4 more points then with
                    //   // high probability, we have the right structure.
                    //   return true;
                    // }
                    // else
                    // {
                    //   // If buildup could not even add 4 points then with a very
                    //   // high probability, we have the wrong structure. Start
                    //   // over and find the next core.
                    //   cout << "No buildup, bad core. "
                    //           "Finding the next core..." << endl;
                    //   atoms.clear();
                    //   updateCurrDL();
                    //   cout << "Bond window: ";
                    // }
                  }
                }
              } // n loop
            } // m loop
          } // idxE loop
        } // idxD loop
      } // idxC loop
    } // idxB loop

    // // When idxB hits the window edge increment it.
    // if ( idxB == winStop )
    // {
    //   winStart = winStop;
    //   winStop += inc;
    //
    //   if ( winStart == dlist.size() )
    //   {
    //     cout << endl;
    //     break;
    //   }
    // }
    //
    // if ( winStop > dlist.size() )
    // {
    //   winStop = dlist.size();
    // }

  } // while ( true ) loop

  return false;
}

bool Structure::findCore3D( int baseIdx )
{
  // Function to get timings runs for different choices of base bond
  bool successFlag = false;

  // 10 bonds in the tetrahedron
  int idxA = baseIdx, idxB, idxC, idxD, idxE, idxF, idxG, idxH,
      idxI, idxJ, idxM;
  int invC = 0; // # invalid cores
  vector<long double> dlist = targetDL;
  cout << "targetDL size: " << targetDL.size() << endl;

  int inc = 10,
      winStart = max( baseIdx - inc / 2, 0 ),
      winStop = min( winStart + inc / 2, int( dlist.size() ) - 1 ); // windowing on
  int bridgeIdx;
  long double counter = 0; // number of tetrahedra
  long double bridgeDist = 0;
  long double fmin, fmax, imin, imax, fracError;

  Point basePt1, basePt2, basePt3, apexPt1, apexPt2;
  basePt1.x = 0;
  basePt1.y = 0;
  basePt1.z = 0;
  basePt2.x = dlist[idxA];
  basePt2.y = 0;
  basePt2.z = 0;
  cout << "basePt1: " << basePt1.x << " " << basePt1.y << " "
       << basePt1.z << endl;
  cout << "basePt2: " << basePt2.x << " " << basePt2.y << " "
       << basePt2.z << endl;
  cout << "Bond window: ";

  int countC = 0; // count number of cores
  vector<int> idxArr( 6, 0 );
  while ( true ) // window loop
  {
    // Break after the entire distance list is exhausted
    if ( winStart <= 0 and winStop >= dlist.size() - 1 )
    {
      break;
    }
    // Make sure to check if window edges are legit
    if ( winStart < 0 )
    {
      winStop += -winStart;
      winStart = 0;
    }

    if ( winStop >= dlist.size() )
    {
      winStart -= winStop - dlist.size() + 1;
      winStop = dlist.size() - 1;

      if ( winStart < 0 )
      {
        winStop += -winStart;
        winStart = 0;
      }
    }
    cerr << "->" << winStop;

    // base triangle
    for ( idxB = 0; idxB < winStop; ++idxB )
    {
      // cout << endl << "idxB: " << idxB << ", "; // endl;
      // cout << "idxC: ";
      for ( idxC = idxB + 1; idxC < winStop; ++idxC )
      {
        // cout << " " << idxC << " "; // << endl;
        // NOTE: c>b so we have to check only 1 triangle inequality
        if ( dlist[idxA] + dlist[idxB] + toler < dlist[idxC] ) break;
        if ( dlist[idxC] + dlist[idxB] + toler < dlist[idxA] ) continue;
        placeApex( basePt3, idxA, idxB, idxC, dlist );

        for ( idxD = 0; idxD < winStop; ++idxD )
        {
          if ( idxD == idxB or idxD == idxC ) continue;
          if ( idxD < idxB ) continue;
          for ( idxE = 0; idxE < winStop; ++idxE )
          {
            if ( idxE < idxB ) continue;
            if ( idxE == idxB or idxE == idxC or idxE == idxD ) continue;
            // triangle inequalities
            if ( dlist[idxA] + dlist[idxE] + toler < dlist[idxD] ) continue;
            if ( dlist[idxA] + dlist[idxD] + toler < dlist[idxE] ) break;
            if ( dlist[idxE] + dlist[idxD] + toler < dlist[idxA] ) continue;
            placeApex( apexPt1, idxA, idxD, idxE, dlist );

            fmin = getDistance( basePt3, apexPt1 );
            apexPt1.y = -apexPt1.y;
            fmax = getDistance( basePt3, apexPt1 );

            for ( idxF = 0; idxF < dlist.size(); ++idxF )
            {
              if ( dlist[idxF] < ( fmin - toler ) ) continue;
              if ( dlist[idxF] > ( fmax + toler ) ) break;
              if ( idxF == idxB or idxF == idxC or idxF == idxD
                   or idxF == idxE ) continue;
              idxArr[0] = idxA; idxArr[5] = idxF;
              idxArr[1] = idxB; idxArr[4] = idxE;
              idxArr[2] = idxC; idxArr[3] = idxD;
              placeTop( basePt1, basePt2, basePt3, apexPt1, idxArr, dlist );

              for ( idxG = 0; idxG < winStop; ++idxG )
              {
                if ( idxG < idxB ) continue;
                if ( idxG == idxB or idxG == idxC or idxG == idxD or
                     idxG == idxE or idxG == idxF ) continue;
                if ( idxG < idxD ) continue;
                for ( idxH = 0; idxH < winStop; ++idxH )
                {
                  if ( idxH < idxB ) continue;
                  if ( idxH == idxB or idxH == idxC or idxH == idxD or
                       idxH == idxE or idxH == idxF or idxH == idxG ) continue;
                  if ( dlist[idxA] + dlist[idxH] + toler <
                       dlist[idxG] ) continue; // triangle inequality
                  if ( dlist[idxA] + dlist[idxG] + toler <
                       dlist[idxH] ) break; // triangle inequality
                  if ( dlist[idxH] + dlist[idxG] + toler <
                       dlist[idxA] ) continue; // triangle inequality
                  placeApex( apexPt2, idxA, idxG, idxH, dlist );

                  imin = getDistance( basePt3, apexPt2 );
                  apexPt2.y = -apexPt2.y;
                  imax = getDistance( basePt3, apexPt2 );
                  // cout << "bp3, ap2: " << basePt3 << ',' << apexPt2 << endl;
                  for ( idxI = 0; idxI < dlist.size(); ++idxI )
                  {
                    if ( idxI == idxB or idxI == idxC or idxI == idxD or
                         idxI == idxE or idxI == idxF or idxI == idxG or
                         idxI == idxH ) continue;
                    // correct windowing
                    if ( idxH < winStart and idxG < winStart and
                         idxE < winStart and idxD < winStart and
                         idxC < winStart and idxB < winStart ) continue;
                    if ( dlist[idxI] < ( imin - toler ) ) continue;
                    if ( dlist[idxI] > ( imax + toler ) ) break;

                    // cout << "abcdefghi: " << idxA << ',' << idxB << ','
                    //      << idxC << ',' << idxD << ','
                    //      << idxE << ',' << idxF << ','
                    //      << idxG << ',' << idxH << ','
                    //      << idxI << endl;
                    // cout << "dlist-I, imin, imax: " << dlist[idxI] << ','
                    //      << imin << ',' << imax << endl << endl;
                    idxArr[0] = idxA; idxArr[5] = idxI;
                    idxArr[1] = idxB; idxArr[4] = idxH;
                    idxArr[2] = idxC; idxArr[3] = idxG;
                    placeTop( basePt1, basePt2, basePt3, apexPt2, idxArr,
                              dlist );

                    for ( idxM = 0; idxM < 2; ++idxM )
                    {
                      apexPt2.z = pow( -1.0L, idxM ) * apexPt2.z; // reflect about XY plane
                      bridgeDist = getDistance( apexPt1, apexPt2 );
                      counter += 1;

                      if ( bridgeDist != bridgeDist ) getchar();
                      // cout << p4 << '\t' << p5 << endl;
                      if ( ( bridgeDist < ( dlist[0] - bridgeDist * toler ) ) or
                           ( bridgeDist > ( dlist[dlist.size() - 1] +
                                            bridgeDist * toler ) ) ) continue;
                      idxJ = closestDist( bridgeDist, dlist );

                      // Make sure that the bridge bond is not the same as
                      // any of the distances in use.
                      if ( idxJ == idxA or idxJ == idxB or idxJ == idxC or
                           idxJ == idxD or idxJ == idxE or idxJ == idxF or
                           idxJ == idxG or idxJ == idxH or idxJ == idxI )
                      {
                        continue;
                      }
                      fracError = fabs( dlist[idxJ] - bridgeDist ) / bridgeDist;
                      if ( fracError < toler )
                      {
                        atoms.push_back( basePt1 );
                        atoms.push_back( basePt2 );
                        atoms.push_back( basePt3 );
                        atoms.push_back( apexPt1 );
                        atoms.push_back( apexPt2 );

                        cout << endl << "3D core found!!!" << endl;
                        cout << bridgeDist << '\t' << dlist[idxJ] << '\t'
                             << fabs( bridgeDist - dlist[idxJ] ) << endl;
                        cout << "bridgeDist: " << bridgeDist << endl;
                        cout << "core finder counter: " << counter << endl;

                        for ( int i = 0; i < atoms.size(); ++i )
                        {
                          cout << "Point: " << i + 1 << '\t' << atoms[i] << endl;
                        }
                        doBuildup3D( idxArr );

                        if ( atoms.size() >= min( 8, targetSize ) )
                        {
                          // If buildup was able to add 4 more points then with
                          // high probability, we have the right structure.
                          cout << "atoms size: " << atoms.size() << endl;
                          return true;
                        }
                        else
                        {
                          // If buildup could not even add 4 points then with a
                          // very high probability, we have the wrong structure.
                          // Start over and find the next core.
                          cout << "No buildup, bad core. "
                                  "Finding the next core..." << endl;
                          atoms.clear();
                          updateCurrDL();
                          // cout << "Bond window: ";
                        }
                        // return true;
                      }
                    } // m loop
                  } // i loop
                } // h loop
              } // g loop
            } // f loop
          } // e loop
        } // d loop
      } // c loop
    } // b loop

    cout << "counter: " << counter << endl;
    winStart = winStop;
    winStop += inc;
    if ( winStart == dlist.size() ) break;

    if ( winStop > dlist.size() )
    {
      winStop = dlist.size();
    }

    cout << "searching in a " << winStop << " bond window" << endl;
  } // window while loop

  return successFlag;
}

long double Structure::feasibleTetra( int baseIdx )
{
  // Function to get number of feasible tetrahedra for a given base bond
  long double numTet = 0;
  long double numTri = 0;
  bool successFlag = false;

  // 10 bonds in the tetrahedron
  int idxA = baseIdx, idxB, idxC, idxD, idxE, idxF, idxG, idxH,
      idxI, idxJ, idxM;
  int invC = 0; // # invalid cores
  vector<long double> dlist = targetDL;
  cout << "targetDL size: " << targetDL.size() << endl;

  int inc = 10,
      // winStart = max( baseIdx - inc/2, 0 ),
      // winStop = min( winStart + inc, int( dlist.size() ) - 1 ); // windowing on
      winStart = 0,
      winStop = dlist.size() - 1; // windowing on
  int bridgeIdx;
  int counter = 0; // number of tetrahedra
  long double bridgeDist = 0;
  long double fmin, fmax, imin, imax, fracError;

  Point basePt1, basePt2, basePt3, apexPt1, apexPt2;
  basePt1.x = 0;
  basePt1.y = 0;
  basePt1.z = 0;
  basePt2.x = dlist[idxA];
  basePt2.y = 0;
  basePt2.z = 0;
  cout << "basePt1: " << basePt1.x << " " << basePt1.y << " "
       << basePt1.z << endl;
  cout << "basePt2: " << basePt2.x << " " << basePt2.y << " "
       << basePt2.z << endl;

  int countC = 0; // count number of cores
  vector<int> idxArr( 6, 0 );
  while ( true ) // window loop
  {
    for ( idxB = 0; idxB < winStop; ++idxB )
    {
      // cout << endl << "idxB: " << idxB << ", "; // endl;
      // cout << "idxC: ";
      for ( idxC = idxB + 1; idxC < winStop; ++idxC )
      {
        // cout << " " << idxC << " "; // << endl;
        // NOTE: c>b so we have to check only 1 triangle inequality
        if ( dlist[idxA] + dlist[idxB] + toler < dlist[idxC] ) break;
        placeApex( basePt3, idxA, idxB, idxC, dlist );
        numTri += 1;

        for ( idxD = 1; idxD < winStop; ++idxD )
        {
          if ( idxD == idxB or idxD == idxC ) continue;
          if ( idxD < idxB ) continue;
          for ( idxE = 1; idxE < winStop; ++idxE )
          {
            if ( idxE < idxB ) continue;
            if ( idxE == idxB or idxE == idxC or idxE == idxD ) continue;
            // triangle inequalities
            if ( dlist[idxA] + dlist[idxE] + toler < dlist[idxD] ) continue;
            if ( dlist[idxA] + dlist[idxD] + toler < dlist[idxE] ) break;
            placeApex( apexPt1, idxA, idxD, idxE, dlist );
            numTri += 1;

            fmin = getDistance( basePt3, apexPt1 );
            apexPt1.y = -apexPt1.y;
            fmax = getDistance( basePt3, apexPt1 );

            for ( idxF = 1; idxF < dlist.size(); ++idxF )
            {
              if ( dlist[idxF] < ( fmin - toler ) ) continue;
              if ( dlist[idxF] > ( fmax + toler ) ) break;
              if ( idxF == idxB or idxF == idxC or idxF == idxD
                   or idxF == idxE ) continue;

              // placeTop( basePt1, basePt2, basePt3, apexPt1, idxArr, dlist );

              numTet += 1;
              continue;
            } // f loop
          } // e loop
        } // d loop
      } // c loop
    } // b loop

    cout << "numTri: " << numTri << endl;
    return numTet;
  } // window while loop
}

bool Structure::reconstruct3( int baseIdx )
{
  // Reconstruct the structure by first finding the core and then doing
  // buildup (if needed).

  if ( atoms.size() == 0 )
  {
    if ( dim == 2 )
    {
      bool findCoreFlag = findCore();
      cout << "Core found? " << boolalpha << findCoreFlag << endl;
    }
    else if ( dim == 3 )
    {
      cout << "findCore3D, " << baseIdx << endl;
      findCore3D( baseIdx );
    }
  }
  else if ( dim == 3 )
  {
  }

  if ( atoms.size() == targetSize )
  {
    return true;
  }
  else
  {
    return false;
  }
}

// Test2D.cpp : File with all the test functions for Tribond 2D.
//
#include <iostream> // header name lost in extraction; assumed
#include <cstdlib>  // header name lost in extraction; assumed
#include "Tribond.h"
using namespace std;

int processArgs( int argc, char** argv, int& N, string& file,
                 int& rngSeed )
{
  // Process input from user to get problem parameters.
  if ( argc == 4 )
  {
    N = atoi( argv[1] );
    file = argv[2];
    rngSeed = atoi( argv[3] );
  }
  else
  {
    cout << "Usage: " << "./Test2d N(>4) rand2 rngSeed" << endl;
    cout << "  N: number of sites in the random point set" << endl;
    cout << "  rand2: command to use a random 2D point set" << endl;
    cout << "  rngSeed: seed for the random number generator" << endl;
    exit( 0 );
  }
  return 0;
}

int test1( int DIM, int N, string file, int rngSeed )
{
  // Initialize the random number generator.
  if ( rngSeed <= -1 )
  {
    rngSeed = time( NULL );
  }
  srand( rngSeed );

  // Setup target structure and attempt reconstruction.
  Structure targetStru( DIM, N, file );
  Structure testStru( DIM, N, targetStru.currDL );

  testStru.reconstruct();

  // Compare reconstructed structure with the target.
  targetStru.home();
  testStru.home();
  compareStru( testStru, targetStru );
  // testStru.printDLtoFile( "distanceList.txt" );

  return 0;
}

int main( int argc, char** argv )
{
  int DIM = 2, N, rngSeed;
  string file;
  processArgs( argc, argv, N, file, rngSeed );

  test1( DIM, N, file, rngSeed );
  return 0;
}

// Test3D.cpp : File with all the test functions for Tribond 3D.
//
#include <iostream> // header name lost in extraction; assumed
#include <cstdlib>  // header name lost in extraction; assumed
#include "Tribond.h"
using namespace std;

int processArgs( int argc, char** argv, int& N, string& file,
                 int& rngSeed )
{
  // Process input from user to get problem parameters.
  if ( argc == 4 )
  {
    N = atoi( argv[1] );
    file = argv[2];
    rngSeed = atoi( argv[3] );
  }
  else
  {
    cout << "Usage: " << "./Test2d N(>4) rand2 rngSeed" << endl;
    cout << "  N: number of sites in the random point set" << endl;
    cout << "  rand2: command to use a random 2D point set" << endl;
    cout << "  rngSeed: seed for the random number generator" << endl;
    exit( 0 );
  }
  return 0;
}

int test1( int DIM, int N, string file, int rngSeed )
{
  // Initialize the random number generator.

  // FIXME hard coding the rnd seed
  int idx = rngSeed;
  // rngSeed = 1;

  if ( rngSeed <= -1 )
  {
    rngSeed = time( NULL );
  }
  srand( rngSeed );

  for ( int i = 0; i <= 0; ++i )
  {
    // Setup target structure and attempt reconstruction.
    // Structure targetStru( DIM, N, file );
    Structure targetStru( N, file );
    targetStru.home3D( 0 );
    targetStru.print();

    Structure testStru( DIM, N, targetStru.currDL );
    testStru.atoms = targetStru.getCore( 5 );

    int baseIdx = idx * ( testStru.targetDL.size() - 1 ) / 10;
    cout << "baseIdx: " << baseIdx << endl;
    // testStru.reconstruct3( baseIdx );
    testStru.reconstruct();

    // Compare reconstructed structure with the target.
    testStru.home3D( 0 );
    compareStru( testStru, targetStru );
    print2structures( testStru.atoms, targetStru.atoms );
    compareStru( testStru, targetStru );
  }

  // testStru.printDLtoFile( "distanceList.txt" );
  return 0;
}

int test2( int DIM, int N, string file, int rngSeed )
{
  // Initialize the random number generator.
  if ( rngSeed <= -1 )
  {
    rngSeed = time( NULL );
  }
  srand( rngSeed );

  // Setup target structure and attempt reconstruction.
  Structure targetStru( DIM, N, file );
  // Structure targetStru( N, file );
  // targetStru.home3D();

  Structure testStru( DIM, N, targetStru.currDL );
  // testStru.atoms = targetStru.getCore( 5 );
  long double numTetra = 0;
  for ( int i = 0; i <= 10; ++i )
  {
    numTetra = testStru.feasibleTetra(
        i * ( testStru.targetDL.size() - 1 ) / 10 );
    cout << "i: " << i << ", " << "numTetra: " << numTetra << endl;
  }
  return 0;
}

int main( int argc, char** argv )
{
  int DIM = 3, N, rngSeed;
  string file;
  processArgs( argc, argv, N, file, rngSeed );

  test1( DIM, N, file, rngSeed );
  return 0;
}

// Test2D-v2.cpp : File with all the functions to test the
// imprecise version of the Tribond 2D algorithm.
//

#include <iostream> // header name lost in extraction; assumed
#include <cstdlib>  // header name lost in extraction; assumed
#include "Tribond.h"
using namespace std;

int iniRandGen( int rngSeed )
{
  // Initialize the random number generator.
    if (rngSeed <= -1)
    {
        rngSeed = time(NULL);
    }
    srand(rngSeed);

    return 0;
}

int processArgs(int argc, char **argv, int& N, string& file,
                int& rngSeed, int& precision, int& coreSize)
{
    // Process input from user to get problem parameters.
    if (argc == 6)
    {
        N = atoi(argv[1]);
        file = argv[2];
        rngSeed = atoi(argv[3]);
        precision = atoi(argv[4]);
        coreSize = atoi(argv[5]);
        cout << "N, file, rngSeed, precision, coreSize: " << N << ", "
             << file << ", " << rngSeed << ", " << precision << ", "
             << coreSize << endl;
    }
    else
    {
        cout << "Usage: " << "./test2d N(>4) \"rand2\" rngSeed precision coreSize" << endl;
        exit(0);
    }
    return 0;
}

int test1(int DIM, int N, string file, int rngSeed)
{
    // // Initialize the random number generator.
    // if (rngSeed <= -1)
    // {
    //     rngSeed = time(NULL);
    // }
    // srand(rngSeed);
    iniRandGen(rngSeed);

    // Setup target structure and attempt reconstruction.
    Structure targetStru(DIM, N, file);
    Structure testStru(DIM, N, targetStru.currDL);
    testStru.reconstruct();

    // Compare reconstructed structure with the target.
    targetStru.home();
    testStru.home();
    compareStru(testStru, targetStru);
    // testStru.printDLtoFile("distanceList.txt");

    return 0;
}

int test2(int DIM, int N, string file, int rngSeed, int precision, int coreSize)
{
    // cout << "check pars: " << DIM << ", " << N << ", " << file << ", "
    //      << rngSeed << ", " << precision << ", " << coreSize << endl;
    iniRandGen(rngSeed);

    // Setup target structure and attempt reconstruction.
    Structure targetStru(DIM, N, file);
    targetStru.home();

    vector core = targetStru.getCore(coreSize);
    targetStru.reduceDLprecision(precision);
    Structure testStru(DIM, N, targetStru.targetDL);
    testStru.atoms = core;

    // testStru.atoms = targetStru.getCore(coreSize);
    testStru.currSize = testStru.atoms.size();
    cout << "core size: " << testStru.atoms.size() << endl;
    testStru.print();

    // Point smPt;
    // smPt.x = -2.04305609; smPt.y = 1.36504993; smPt.z = 0;
    // cout << "Point, cost: " << smPt << '\t' << testStru.getPtCost(smPt) << endl;
    // return 1;
    // smPt.x = 1.17447201; smPt.y = -2.80072967; smPt.z = 0;
    // cout << "Point, cost: " << smPt << '\t' << testStru.getPtCost(smPt) << endl;
    // return 1;

    bool successFlag = testStru.reconstruct2();
    cout << "testStru size: " << testStru.atoms.size() << endl;
    testStru.print();

    print2structures(testStru.atoms, targetStru.atoms);
    // Compare reconstructed structure with the target.
    if (successFlag)
    {
        targetStru.home();
        testStru.home();
        compareStru(testStru, targetStru);
    }
    else
    {
        cout << "Reconstruction failed!" << endl;
    }

    return 0;
}

int main(int argc, char **argv)
{
    int DIM = 2, N, rngSeed, precision, coreSize;
    string file;
    processArgs(argc, argv, N, file, rngSeed, precision, coreSize);

    // test1(DIM, N, file, rngSeed);
    test2(DIM, N, file, rngSeed, precision, coreSize);
    return 0;
}

// Test3D-v2.cpp: File with all the functions to test the
// imprecise version of the Tribond 3D algorithm.
//

#include
#include
#include "Tribond.h"
using namespace std;

int iniRandGen(int rngSeed)
{
    // Initialize the random number generator.
    if (rngSeed <= -1)
    {
        rngSeed = time(NULL);
    }
    srand(rngSeed);

    return 0;
}

int processArgs(int argc, char **argv, int& N, string& file,
                int& rngSeed, int& precision, int& coreSize)
{
    // Process input from user to get problem parameters.
    if (argc == 6)
    {
        N = atoi(argv[1]);
        file = argv[2];
        rngSeed = atoi(argv[3]);
        precision = atoi(argv[4]);
        coreSize = atoi(argv[5]);
        cout << "N, file, rngSeed, precision, coreSize: " << N << ", "
             << file << ", " << rngSeed << ", " << precision << ", "
             << coreSize << endl;
    }
    else
    {
        cout << "Usage: " << "./test2d N(>4) \"rand2\" rngSeed precision coreSize" << endl;
        exit(0);
    }
    return 0;
}

int test1(int DIM, int N, string file, int rngSeed)
{
    // // Initialize the random number generator.
    // if (rngSeed <= -1)
    // {
    //     rngSeed = time(NULL);
    // }
    // srand(rngSeed);
    iniRandGen(rngSeed);

    // Setup target structure and attempt reconstruction.
    Structure targetStru(DIM, N, file);
    Structure testStru(DIM, N, targetStru.currDL);
    testStru.reconstruct();

    // Compare reconstructed structure with the target.
    targetStru.home();
    testStru.home();
    compareStru(testStru, targetStru);
    // testStru.printDLtoFile("distanceList.txt");

    return 0;
}

int test2(int DIM, int N, string file, int rngSeed, int precision, int coreSize)
{
    cout << "check pars: " << DIM << ", " << N << ", " << file << ", "
         << rngSeed << ", " << precision << ", " << coreSize << endl;
    iniRandGen(rngSeed);

    // Setup target structure and attempt reconstruction.
    Structure targetStru(N, file);
    // targetStru.home();

    vector core = targetStru.getCore(coreSize);
    targetStru.reduceDLprecision(precision);
    Structure testStru(DIM, N, targetStru.targetDL);
    testStru.atoms = core;

    // testStru.atoms = targetStru.getCore(coreSize);
    testStru.currSize = testStru.atoms.
        size();
    cout << "core size: " << testStru.atoms.size() << endl;
    testStru.print();

    // Point smPt;
    // smPt.x = -2.04305609; smPt.y = 1.36504993; smPt.z = 0;
    // cout << "Point, cost: " << smPt << '\t' << testStru.getPtCost(smPt) << endl;
    // return 1;
    // smPt.x = 1.17447201; smPt.y = -2.80072967; smPt.z = 0;
    // cout << "Point, cost: " << smPt << '\t' << testStru.getPtCost(smPt) << endl;
    // return 1;

    bool successFlag = testStru.reconstruct2();
    cout << "testStru size: " << testStru.atoms.size() << endl;
    testStru.print();

    print2structures(testStru.atoms, targetStru.atoms);
    // Compare reconstructed structure with the target.
    if (successFlag)
    {
        targetStru.home();
        testStru.home();
        compareStru(testStru, targetStru);
    }
    else
    {
        cout << "Reconstruction failed!" << endl;
    }

    return 0;
}

int main(int argc, char **argv)
{
    int DIM = 3, N, rngSeed, precision, coreSize;
    string file;
    processArgs(argc, argv, N, file, rngSeed, precision, coreSize);

    // test1(DIM, N, file, rngSeed);
    test2(DIM, N, file, rngSeed, precision, coreSize);
    return 0;
}