g5. w. . um «mamzégma 5?: .J ...;.....n« Emu. S‘MM’JW..IPHHRHI as ‘FI 1.354.". vs .. Ratqmwfi... a. 4i i :(l: ‘ Yaw... hag ., an: \ .34.“... 51. .45 qui i] s 5 . , .3. 9.1.1.... .. 356$ 2... Eran»? . #35... f. x . .. s: 23:. 233.. . .f . .9 i.l.1\:.il...s .8 . I r .I‘. o. . . . Waginhfinru 2...: , . .{lblia . .311 4 - . z 33.... . 2... .1- 3. 1.3.1.... . p , . . :1 I‘m.‘ Ila’..’ 725.0! g , :..J‘1&h..vlflai. .2, . z. E... . n p 30.1.!!! hum! 00.363?! .dI’ASFQ‘I. .3} 3:5}. n|\~? 5 311.3!!! )2 t! is: 2. invuyil. :lu :Hh... .12.! .11...- «sunrohn; hugs... 13.34.; .. 9.41.: ‘01:. 232.01.3va 11 .l! 0"? , yttftrflrc... E: : :1. r ‘l «vols . . t. THES'J‘ 20“} This is to certify that the thesis entitled COMPLEX NETWORK PROBLEMS IN PHYSICS COMPUTER SCIENCE AND BIOLOGY presented by RADU IONUT COJOCARU has been accepted towards fulfillment of the requirements for the Ph.D. degree in Physics and Astronomy flu QM Llel“L Major Professor's gignature AWL/2 006 Date MSU is an Affirmative Action/Equal Opportunity Institution ”LT—”EEARY * -_- Micl'.l=,-.. State University -.-.-.-.-.-.-.-.-.-.-.-.-.-.- PLACE IN RETURN BOX to remove this checkout from your record. TO AVOID FINES return on or before date due. MAY BE RECALLED with earlier due date if requested. DATE DUE DATE DUE DATE DUE 2/05 p:/ClRC/DaIeDue.indd-p.1 COMPLEX NETWORK PROBLEMS IN PHYSICS, COMPUTER SCIENCE AND BIOLOGY By Radu Ionut Cojocaru A DISSERTATION Submitted tO Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Physics and Astronomy 2006 ABSTRACT COMPLEX NETWORK PROBLEMS IN PHYSICS, COMPUTER SCIENCE AND BIOLOGY By Radu Ionut Cojocaru There is a close relation between physics and mathematics and the exchange Of ideas between these two sciences are well established. However until few years ago there was no such a close relation between physics and computer science. Even more, only recently biologists started to use methods and tOOls from sta- tistical physics in order to study the behavior Of complex system. In this thesis we concentrate on applying and analyzing several methods borrowed from computer science to biology and also we use methods from statistical physics in solving hard problems from computer science. In recent years physicists have been interested in studying the behavior of com- plex networks. Physics is an experimental science in which theoretical predictions are compared to experiments. In this definition, the term prediction plays a very important role: although the system is complex, it is still possible to get predictions for its behavior, but these predictions are of a probabilistic nature. Spin glasses, lat- tice gases or the Potts model are a few examples of complex systems in physics. Spin glasses and many frustrated antiferromagnets map exactly tO computer science problems in the NP-hard class defined in Chapter 1. In Chapter 1 we dis- cuss a common result from artificial intelligence (AI) which shows that there are some problems which are NP-complete, with the implication that these problems are difficult to solve. We introduce a few well known hard problems from com- puter science (Satisfiability, Coloring, Vertex Cover together with Maximum Inde- pendent Set and Number Partitioning) and then discuss their mapping to prob- lems from physics. In Chapter 2 we provide a short review Of combinatorial Optimization algo- rithms‘and their applications to ground state problems in disordered systems.We discuss the cavity method initially developed for studying the Sherrington- Kirkpatrick model Of spin glasses. We extend this model to the study of a specific case Of spin glass on the Bethe lattice at zero temperature and then we apply this formalism to the K-SAT problem defined in Chapter 1. The phase transition which physicists study often corresponds to a change in the computational complexity of the corresponding computer science problem. Chapter 3 presents phase transitions which are Specific to the problems discussed in Chapter 1 and also known results for the K-SAT problem. We discuss the replica method and experimental evidences of replica symmetry breaking. The physics approach to hard problems is based on replica methods which are difficult to understand. In Chapter 4 we develop novel methods for studying hard problems using methods similar to the message passing techniques that were dis- cussed in Chapter 2. Although we concentrated on the symmetric case, cavity methods show promise for generalizing our methods to the un-symmetric case. As has been highlighted by John Hopfield, several key features of biological systems are not shared by physical systems. Although living entities follow the laws Of physics and chemistry, the fact that organisms adapt and reproduce in- troduces an essential ingredient that is missing in the physical sciences. In order to extract information from networks many algorithm have been developed. In Chapter 5 we apply polynomial algorithms like minimum spanning tree in order to study and construct gene regulatory networks from experimental data. As fu- ture work we propose the use of algorithms like min-cut/max-flow and Dijkstra for understanding key properties of these networks. ACKNOWLEDGMENTS A Ph.D. dissertation is far from being a one person work, although finally it comes to a single name on the cover. There are so many people I am thankful for getting this far. First, and foremost, I would like tO thank my adviser and probably the most influential person on my future career, Phillip Duxbury, who introduced me to combinatorial Optimization problems, taught me almost everything I know about them, guided my questions and helped me to develop some ideas I run into. His generosity and his character to discuss physics in particular and anything else in general made our discussions some of the best in my life and it is very hard to find someone else tO thank for something like that. I would like tO thank Wolfgang Bauer, Jonathan Hall, Carlo Piermarocchi and Stuart Tessmer who, as members Of my advisory committee, Offered precious suggestions and criticism. The Department of Physics provided generous financial support in my first two years, which made my life in East Lansing comfortable. To that end, I thank our graduate secretary Debbie Simmons, CMP secretary Cathy Cords and secretary to the Chair of our department, Lisa Ruess. I would like tO thank my Office mates Chris Farrow, Chip Fay, Jiwu Liu, Dan Olds and one Of my former Office mate, now Dr. Erin McGarrity, for many interesting discussions from the physics realm and not only. I surely couldn’t be here without my dear physics professors from Romania: Andrei Medar, Vasile Constantin or Raluca Rebedea (the last two passed away but not in my heart). Definitely the most important person that I want to thank is my wife Nadia. Her support and incredible understanding during the graduate years and espe- iv cially in the last one year made this work possible. I would like to thank my son Alex who, although he is only three years and half, he understood that sometimes instead of Spending evenings together I had to come at school and work. Also many thanks to my one month and a half son Paul who teaches me something new every day. Being successful on both plans, family and school, is not an easy job so this is why I also thank totea Lida who traveled thousands Of miles to help us, my wife, kids and I. I would be remiss if I did not mention the people with whom I enjoyed my free time since I was almost a kid. These include my friends Cristian Cocheci, Marian Maruta and Claudiu Soare. Last but not least, I would also like to thank my mother and my sister Geor- giana for keeping me sane and for encouragements especially when I didn’t have my best days. This work is dedicated to my father. Vi TABLE OF CONTENTS LIST OF FIGURES xi 1 Introduction 1 1.1 Computational Complexity for Physicists .................. 1 1.2 Examples of key problems from the NP-complete class .......... 3 1.2.1 The SAT problem ............................... 3 1.2.2 The Coloring Problem ............................ 4 1.2.3 Vertex Cover and Maximum Independent Set .............. 5 1.2.4 Maximum Cut ................................. 6 1.2.5 Number Partitioning ............................. 6 1.3 Mappings .................................... 7 1.3.1 Mapping 3-SAT to VC ............................ 7 1.3.2 Mapping 3-Coloring to VC ......................... 9 1.4 Mapping of NP-complete problems to statistical physics ......... 10 1.4.1 The SAT problem ............................... 10 1.4.2 Coloring .................................... 11 1.4.3 Spin Glasses .................................. 12 1.4.4 Number partitioning ............................. 15 1.4.5 Maximum Independent Set and Vertex Cover .............. 15 1.5 Novel Techniques for Attacking NP-C Problems: The Satisfiability Problem on a DNA computer ........... I ............ 16 1.5.1 The Satisfiability Problem on a DNA computer .............. 17 2 Algorithms 25 2.1 Minimum Spanning Tree ............................ 26 vii 2.1.1 Applications .................................. 26 2.1.2 Kruskal’s Algorithm and Prim’s Algorithm ................ 27 2.2 Shortest Path ................................... 28 2.3 Flow Algorithms ................................ 30 2.3.1 Flow Networks ................................ 30 2.4 Message Passage Techniques ......................... 32 2.4.1 Brief Survey of Inference Problems ..................... 32 2.4.2 An Example from Computer Vision: Pairwise Random Markov Fields (PRMF) ................................ 35 2.4.3 Mapping to Statistical Physics ........................ 35 2.4.4 Tanner Graphs and Factor Graphs ..................... 37 2.4.5 Standard Belief Propagation ......................... 39 2.4.6 The Free Energy ................................ 42 2.4.7 The Mean-Field Free Energy ......................... 43 2.4.8 The Bethe Free Energy ............................ 44 2.5 Novel Techniques: Message Passing Algorithms and the Cavity Method 46 2.5.1 The Message Passing Solution of SAT on a Tree .............. 47 2.5.2 The Cavity Method .............................. 50 3 Phase Transitions in Random Combinatorial Problems 55 3.1 Phase Transitions ................................ 55 3.1.1 Phase transition for the SAT problem ................... 56 3.1.2 Phase transition for the Coloring problem ................. 59 3.1.3 Phase transition for the Vertex Cover problem .............. 62 3.1.4 Phase transition for the Number Partition Problem ........... 63 3.2 More details on K-SAT problem ........................ 67 3.2.1 Known results for the K-SAT problem ................... 67 3.2.2 Statistical mechanics of the K-SAT problem ................ 68 viii 3.2.3 The simplest case, K = 1 ........................... 69 3.2.4 Replica symmetric solutions for all K ................... 70 3.3 Physical meaning of breaking the symmetry ................ 73 3.3.1 Replica symmetric solution for the Sherington-Kirkpatrick model . . . 73 3.3.2 Experimental evidence of replica symmetry breaking .......... 77 4 Geometric Approach 80 4.1 Introduction ................................... 80 4.2 Connectivity and Rigidity percolation .................... 83 4.3 Viana-Bray model ................................ 85 4.4 K-SAT ....................................... 87 4.5 Energy per variable ............................... 93 4.6 Coloring ..................................... 98 4.7 The model and Limiting results ........................ 99 4.8 Coloring on Bethe Lattice ........................... 102 4.9 Results ...................................... 106 5 Applications to System Biology 111 5.1 Combinatorial Optimization Methods for Dissecting Gene Regulatory Networks During Neuronal Differentiation ............... 112 5.1.1 Computational Background ......................... 113 5.1.2 Biological Bakground ............................ 113 5.1.3 Construction of gene regulatory networks (GRN) from experimental data .................................. 118 5.1.4 Discovery of functions / pathways from constructed retinal GRN . . . 119 5.2 Preliminary results ............................... 121 6 Conclusion 1 136 ix BIBLIOGRAPHY 139 1.1 1.2 1.3 1.4 1.5 LIST OF FIGURES Vertex Cover instance resulting from the 3 — SAT instance ........ 7 Vertex Cover instance resulting from the 3-Coloring instance ....... 9 Frustration in a square lattice: on the left hand side an unfrustrated pla- quette is shown while on the right hand side a frustrated plaquette is shown .................................... 13 The DNA computer [101] ........................... 23 Analysis of the full library. Purified full library was PCR-amplified under standard conditions for 15 cycles. PCR products were ana- lyzed on 4% agarose gels. Lanes 1 and 2 correspond to primer set < X1T,, XkT > , lanes 3 and 4 correspond to primer pair < XITH XkT > , lanes 5 and 6 correspond to primer pair < X1T,, le > , lanes 7 and 8 correspond to primer pair < X15, X ,f > , lanes 9 and 10 correspond to primer pair < X1F,,X}CC > , where: (A) k = 11; (B) k =14; (C) k = 17; (D) k = 20. Molecular weight markers are on the leftmost lane of each gel ................................... 24 xi 1.6 Readout of the answer: 1 — yl aliquots of a 50-fold dilution of the an- swer stock were PCR-amplified under standard conditions for 25 cycles. PCR products were analyzed on 4% agarose gels. Lanes 1 and 2 correspond to primer set < XIT” X; > , lanes 3 and 4 cor- respond to primer pair < X17” XE > , lanes 5 and 6 correspond to primer pair < X15, X}: > , lanes 7 and 8 correspond to primer pair < X15,le > where where: (A) k = 2; (B) k = 5; (C) k = 8; (D) k = 11 (E) k = 14 (F) k = 17 (G) k = 20. Molecular weight markers are on the leftmost lane of each gel ................ 24 2.1 The fictional Asia Bayesian network [72] ................... 33 2.2 A square lattice Pairwise Random Markov Field .............. 36 2.3 A factor graph representing the joint probability distribution given by Eq. (2.10) ................................... 38 2.4 A. An illustration of the messages passed in Belief Propagation; B. A diagramatic representation of (2.13); C. A diagramatic representa- tion of the BP message update rules 2.14. The summation symbol indicates that the summation is over all the states of node i; D. A pairwise MRF with four hidden nodes ................... 40 2.5 A function node a and its neighborhood. The survey of cavity bias cab be computed from the knowledge of the joint probability distribu- tion for all the cavity-biases in the set U, so those coming onto all variables node j which are neighbors of a, except j = i .......... 49 xii 2.6 3.1 3.2 3.3 3.4 3.5 3.6 An example, for the case k = 2, of a GN,6 cavity graph where q = 6 randomly chosen cavity spins have two neighbors only. Fig. A: All the other N — 6 spins outside the cavity are connected through a random graph such that every spin has k + 1 = 3 neighbors. Fig. B: Starting from G M6 cavity graph we can create a G N+2,o graph by adding two sites. Fig. C: Starting from GN,6 cavity graph we can create a GN,0 graph by adding three links. [82]. ............. 53 The 4.3 point for the 3-SAT problem ..................... 58 The ratio of frozen pairs to (g) plotted against the ratio M / N, where M is the number of frozen pairs [24] ...................... 60 The ratio of free pairs to ('2’) plotted against the ratio M / N, where M is the number of frozen pairs [24] ....................... 61 Probability Pcoz,(x) that a cover exists for a random graph (c = 2) as a function of the fraction x of covered vertices. The result is shown for three different system sizes N = 25, 50, 100 (averaged for 103 — 104 samples). Lines are guides for the eyes only [57] ............. 64 Typical fraction of violated clauses (bold line) and entropy (thin line) vs. a for K = 1 in the limit N -—> oo [89] .................. 71 FC- and ZFC-magnetization (higher and lower curve respectively) vs. temperature of Cu(Mn13.5°/o), H = 1 Ce. For such a low field the magnetization is proportional to susceptibility [28]. .......... 78 xiii 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 The factor graph used to construct the recurrence relations. The circles denote variable nodes, while the square nodes are the clause nodes. V is the probability that a variable node is frozen, while F is the probability that a clause node is frozen (see the text). We assume that a variable at level 1 is frozen and find the probability that a variable at level 3 is frozen. The clause nodes have co-ordination K, while the variable nodes have co-ordination M ............. The probability that a clause is frozen, F, as a function of a, for 2-SAT. . . The probability that a clause is frozen, F, as a function of a, for 3-SAT. . . The probability that a clause is frozen, P, as a function of a, for 4-SAT. . . Ground state energy for K = 3 using Eq.4.30 (upper curve) and using the cavity approach (lower curve line) as a function of a ......... Ground state energy for K 2 4 using Eq.4.30 (upper curve) and using the cavity approach (lower curve) as a function of a :- M / N ....... The coloring order parameters for q = 2. The lower two curves are the probability that a site is frozen and colorable, G (the s = 0 term in eq.(4.40)), and the probability that a site is frozen and frustrated, H (the s 2 1 term in eq.(4.40)). The top curve is the probability that a site has a frozen color F = G + H, which is found by solving 89 91 92 93 96 97 eq.(4.40) with q = 2. ............................ 107 The coloring order parameters for q = 3. The lower two curves are the probability that a site is frozen and colorable, G (the s = 0 term in eq.(4.40)), and the probability that a site is frozen and frustrated, H (the s _>_ 1 term in Eq.(4.40)). The top curve is the probability that a site has a frozen color F = G + H, which is found by solving Eq.(4.40) with q = 3. ............................ 108 4.9 Energy density for the q = 3 case ....................... 109 xiv 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 Time-line of rod photoreceptor birth, major developmental events, and the kinetics of N RL and rhodopsin (Rho) gene expression. Rod birth peaks at P1 — 2. At P6, expression of rhodopsin and several other rod-specific genes is observed. Outer segments begin to form at P10 and by P28 mature rods are formed. Adapted from Akimoto et al. [361115 A proposed model of photoreceptor differentiation, integrating the transcriptional regulatory functions of N RL and N R2153 [22] ...... 115 The phenotype matrix. E". (t) represents the expression level of a set of genes, with each gene labeled by 1', corresponding to the I-th experi- ment (zle. knockout phenotype) at time t. ................ 119 Clustering of genes based on correlation level. Small clusters of high correlation appear first. When clusters join, they are linked by an edge (thick edges in the figure) of lower correlation which gives a measure of the confidence associated with the larger cluster. The two different degree of grayness are for the two types of co- expressions: up-, respectively down-regulated genes. ......... 120 The FDRCI output of wild type compared to N RL — / — at 5 different time points run with a minimum fold change of 2. Only the most down- and -up 30 regulated genes are shown. ............. 122 Minimum Spanning Tree corresponding to the most highly anti— correlated 30 genes at embryonic 16, post-natal 6 and 2 months (adult) for erKO-thfp. ......................... 123 Minimum Spanning Tree corresponding to the highly anti-correlated genes at embryonic 16 for erKO—thfp. ................ 124 Minimum Spanning Tree corresponding to the highly anti-correlated genes at post-natal 2 for erKO-thfp. ................. 125 XV 5.9 Minimum Spanning Tree corresponding to the highly anti-correlated genes at post-natal 6 for erKO-thfp. ................. 126 5.10 Minimum Spanning Tree corresponding to the highly anti-correlated genes at post-natal 10 for erKO-thfp .................. 127 5.11 Minimum Spanning Tree corresponding to the highly anti-correlated genes at 2 months (adult) for N rlKO-thfp ................ 128 5.12 k vs. N(k) for the anti-correlated case corresponding to the 5 networks corresponding to the anti-correlated case. ................ 130 5.13 Minimum Spanning Tree corresponding to the highly correlated genes at embryonic 16 for erKO—thfp. .................... 131 5.14 Minimum Spanning Tree corresponding to the highly correlated genes at post-natal 2 for erKO-thfp. ..................... 132 5.15 Minimum Spanning Tree corresponding to the highly correlated genes at post-natal 6 for erKO—thfp. ..................... 133 5.16 k vs. N(k) for the anti-correlated case corresponding to the 5 networks corresponding to the correlated case .................... 134 xvi Chapter 1 Introduction You are a chief of protocol for the embassy ball. The crown prince instructs you either to invite Peru or to exclude Qatar. The queen asks you to invite either Qatar or Romania or both. The king, in a spiteful mood, wants to snub either Romania or Peru or both. Is there a guest list that will satisfy the whims of the entire royal family? 1.1 Computational Complexity for Physicists This contrived little puzzle is an instance of a problem that lies near the root of theoretical computer science. It is called the satisfiability problem (or SAT) and it was the first member of the well know class called the nondeterministic polynomial or NP class (in the next section we define a few problems that are in NP, the SAT problem being one of them). Computational complexity is a part of computer science which deals with clas- sifying problems according to the computational resources required to solve them. There are two basic classes: the polynomial class (or the P class) and the nondeter- ministic polynomial (or NP) class. The NP term was introduced in the early 1970’s and describes the abyss of in- herent intractability that programmers face as they try to solve larger and more 1 complex problems. Problems that belong to the nondeterministic-polynomial NP class seem intrinsically hard, but after 35 years of attempts nobody proved that they are necessarily difficult. Characteristic for the NP class is that if we can guess the answer, than we can check its correctness in polynomial time (i.e. efficiently). For the above example, the checking procedure is straight forward: given a pro- posed labeling, we just have to substitute the specified invite (or true) and do not invite (or false) for the three variables (P from Peru, Q from Qatar and R from Ro- mania) and make sure that the formula is true. There are many examples of prob- lems with competing objectives (like the one above) that appear in both physics (e.g. the spin glasses (SG problem)) and in computer science (e.g. scheduling or planning). These kinds of problems with many degrees of freedom have been an- alyzed for more than half a century by the computer science community and more recently by physicists. The SAT problem is a member of an even more exclusive class called NP-complete. Completeness is a key property to the entire set of NP- complete problems: if a polynomial time algorithm could be found for any NP - complete problem, then the algorithm could be adapted to all problems in NP. The SAT problem was the first problem shown by Stephen Cook [46] to belong to the NP-complete class. For the other class, the polynomial class, we say that a problem belongs to the P class if it can be solved by a polynomial-time algorithm. Unfortunately, the converse assumption is not true: just because no one found a polynomial-time algorithm to solve a problem doesn’t mean that the problem is not in class P. As Hayes [58] states: It remains possible (though unlikely) that we are simply attacking them by clumsy methods, and if we could dream up a clever algorithm they would all turn out to be easy. There are thousands of other N P-complete problems that are suspended in this computational limbo, and in the next section we introduce some of them. FT 2:: ca an eit mt he he Np file 1.2 Examples of key problems from the N P-complete class 1.2.1 The SAT problem An instance of the satisfiability problem is defined in terms of N Boolean variables, and a set of M constraints between them, where each constraint takes the special form of a clause. A clause is the logical OR of some variables or their negations. The notation that we use is the following: a variable x,, with i 6 {1,2, - - - , N}, takes value in {0, 1} with 1 corresponding to true and 0 corresponding to false; the negation of x,- is f,- E 1 — x,-. A variable or its negation is called a literal, denoted with 1;, (i.e 1;, denotes either x,- or ii). A clause a, with a 6 {1,2,- - - , M}, involv- ing Ka variables is a constraint which forbids exactly one among the 2Ka possible assignments to these Kn variables. The clause a is written as C, = I}, V 1,2,, - - - , V15. An instance of the satisfiability problem can be written as: F=C1/\C2/\-°°/\CM 0-1) called a conjunctive normal form (CNF). Given the formula P, the question is whether there exists an assignment of the variables xi, such that the formula F is true. An algorithm solving the satisfiability problem must be able, given the formula F, to either answer YES (the formula is then said to be SAT and provide such an assign- ment), or to answer NO in which case the formula is called UNSAT. The restriction of the satisfiability problem by requiring that all the clauses in F have the same length K“ = K, is called the K—satisfiability problem (or K-SAT). As we already mentioned the satisfiability problem was the first problem shown to be N P-complete. For K, S 2 the problem is solved in polynomial time but for K, 2 3 the problem belongs to the N P-complete class. 3 An optimization problem is associated to the decision version of satisfiability: given a formula F, one is asked to find an assignment which violates the smallest number of clauses. This is called the MA X-SAT problem. 1.2.2 The Coloring Problem The Graph Coloring problem is a well known problem in combinatorics and in statistical physics. The problem is simply stated but is very difficult to solve either analytically or numerically. Given a graph, or a lattice, and given a number q of available colors, the problem consists in finding a coloring of vertices such that no edge has the two ending vertices of the same color. The possibility of finding such a solution, depends on the way the graph is constructed and also on the number of colors. The minimally needed number of colors is the chromatic number of the graph. In modern computer science, graph coloring is taken as one of the most widely used benchmarks for the evaluation of algorithm performance. The interest in col- oring stems from the fact that many real—world combinatorial optimization prob- lems have component sub-problems which can be easily represented as coloring problems (for example, a classical application is the scheduling of registers in the central processing unit of computers). The q-coloring problem of random graphs is an active field of research in discrete mathematics and represents the natural evolution of the percolation theory initiated by Erdos and Rényi in the 50’s [33]. One point of contact between computer science and random graph theory arises from the observation that for large random graphs, there exists a critical average connectivity beyond which the graphs become uncolorable with probabil- ity going to one as the graph size goes to infinity. The precise value of the critical connectivity depends on the number of allowed colors and on the ensemble of random graplm under consideration. Graphs generated close to, but below their 4 critical connectivity are hard to color. 1.2.3 Vertex Cover and Maximum Independent Set In order to have a more intuitive understanding for defining the vertex cover (VC) problem we will use an example in the same way we did for defining the satisfia- bility problem. Imagine that a director of a museum situated in a large park with numerous paths wants to put guards on crossroads to observe any path, but in or- der to economize costs the director has to use as few guards as possible. Let N be the number of crossroads and let X S N be the number of guards. Then there are C )1}! possibilities of placing the guards, but most configurations will lead to unob- served paths. Deciding whether there exists any perfect solution, or finding one, can take a time that grows exponentially with N. The mathematical formulation of the VC problem can be stated as follows: Con- sider an undirected graph G = (V, E) with N vertices i E V = 1,2, - - - , N (the crossroads in our example) and edges (i, j) E E C V x V (the paths in the consid- ered example). A vertex cover is a subset ch C V of vertices such that for all edges (i, j) E E there is at least one of its endpoints i or 1' in WC (the path is observed). We call the vertices that are in WC covered, whereas the vertices in its complement V \ ch are called uncovered. The decision version is whether a VC of fixed cardi- nality exists or not. Maximum independent set is trivially related to the minimum VC. First we de- fine the independent set as a subset of vertices which are pairwise disconnected in the graph. Any set V \ ch thus forms an independent set, and maximal indepen- dent sets are complementary to minimal vertex cover. 5 1.2.4 Maximum Cut In order to define the Max Cut we first need to define the notion of a cut. In graph theory, a cut is a partition of the vertices of a graph into two sets. More formally, let G(V, E) denote a graph. A cut is a partition of the vertices V into two sets S and T. Any edge (i,j) E E withi E S andj E T (ori E T andj E S, in case ofa directed graph) is said to be crossing the cut and it is called a cut edge. The size of a cut is the total number of edges crossing the cut. In weighted graphs, the size of the cut is defined to be the sum of weights of the edges crossing the cut. If we consider non- negative weights w; 1- = wj, on the edges (i, j) E E, the maximum cut problem Max Cut is that of finding the set of vertices S that maximizes the weight of the edges in the cut (5, T) that is, the weight of the edges with one endpoint in S and the other in T E S. For simplicity, we usually set wij = 0 for (i, j) ¢ E and denote the weight of a cut by w(S, T) = Eies'jeTwij. The Max Cut problem is one of the Karp’s original NP-complete problems [64] and has long been known to be NP-complete even if the problem is unweighted; that is, if wij = 1 for all (i, j) E E [46]. The Max Cut problem is solvable in polynomial time for some special classes of graphs (e.g. if the graph is planar [54]). Besides its theoretical importance, the MAX CUT problem has applications in circuit layout design and statistical physics [8]. 1.2.5 Number Partitioning The number partition problem (N PP) is another problem of Garey and Johnson’s [46] six basic NP-complete problems that lie at the heart of N P-completeness. It is defined as follows: given a sequence of positive real numbers a1, a2, ..., a N the NPP consists of partitioning them into two disjoint sets A1,A2 such that the difference I Eaje A1 a i -— ZajeAz a jl is minimized. Despite its simplicity, the N PP was shown to belong to the N P-complete class. Figure 1.1: Vertex Cover instance resulting from the 3 — SAT instance The decision version of NPP is: given a fixed k determine if there is a partition of A = A1 U A2 such that IE a;- E a1|£k (1.2) aj-EAI ajE/iz The four problems that we defined in this section (SAT, Coloring, VC and NPP) are among the six celebrated and most used problems in proving NP-completeness (the other two are the traveling salesman problem TSP and Hamiltonian circuit HC). 1.3 Mappings In this section we use the reduction methodology to provide NP-completeness proofs for some of the problems discuses above. 1.3.1 Mapping 3-SAT to VC The proof of the N P-completeness of the vertex cover problem works by reducing 3 —— SAT to VC in polynomial time. First, we show VC 6 NP: It is very easy to 7 decide for a given subset V’ of vertices, whether all edges are covered, i.e. whether V’ is a vertex cover, by just iterating over all edges. Hence, it remains to Show that 3 - SAT is polynomially reducible to VC. Let F = C1/\, - - - , Cm be a 3 — SAT formula with variables X = x1,- - - ,x,, and Cp 2 1,1, V [’73 V 1?, for all p where each literal is a variable (1;, = x1) or a negated variable (I;7 = ii). We have to create a graph G and a threshold K, such that G has a VC of size lower than or equal to K, iff F is satisfiable. For this purpose, we set: (i) V1 E {v1,51,--- ,v,,,5,,} with (I V1 |: 2n) and E1 E {(221,61), (222,52), -- , (v,,,v,,)}, i.e. for each variable occurring in P we create a pair of vertices and an edge between them. To cover the edges in E1, we have to include at least one vertex per pair in the covering set. In this part of the graph, each cover corresponds to an assignment of the variables with the following idea behind it: If variable x,- = 1, then v,- should be covered, while if x,- = 0 then v—,- is to be covered. (ii) For each clause in P we introduce three vertices connected in a triangle: V2 E {ahaiaiaaaéagpn,a},,,a?,,,a?n} and E2 E {(a1,a1),(a1,a?,) (ainthIai), (“EIQEI (aimin- 1(anlan )I (anta3)I (“manll For each clause, we have to include at least two vertices in a VC. In a cover of minimum size, the uncovered vertex corresponds to a literal which is satisfied. (iii) Finally, for each position i in a clause p, vertex a; is connected with the vertex representing the literal 1;, appearing at that position of the clause: E3 E {a;,,v,- | p = 1,--- ,m; i: 1,2,3 if 1;, 2 xi} LJ {aiflvl- | p = 1,-~- ,m; i: 1,2,3 if 1;, = 71-} Hence, E3 contains edges each connecting one vertex from V1 with one vertex from V2. (iv) The graph G is the combination of the above introduced vertices and edges: G = (V,E), V ‘2 V1 U V2, E 2‘ E1 UEzUE3. (v) For a SAT formula, the size of the minimum vertex cover con- [j Figure 1.2: Vertex Cover instance resulting from the 3-Coloring instance structed has cardinality n + 2m. As a small example we consider F = (x1 V 363 V if) /\ (71 V x2 V 34). The resulting graph G(V, E) is displayed in Fig. 1.1. 1.3.2 Mapping 3-Coloring to VC In [Fig.(1.2) the mapping of the VC problem to 3-Coloring problem is illustrated. For each vertex of the square we introduce a sub-lattice with size 3 (i.e. a triangle); if we would like to color the square with q colors, then at each vertex of the square we would introduce a clique of size q. Each of these triangles (or cliques) correspond to one color. In order to ensure that two neighbors are never colored the same, we connect these two nodes with an edge. We continue this process for each of the four cliques/ triangles and if we can find a MIS with size equal to the number of nodes in the lattice, then our original square is colorable. If not, the cost function (energy) is equal to the number of uncolorable sites. 9 I l 1.4 till pro 0‘10 EVA 1.4 In . Val qm the 1.4 Mapping of NP-complete problems to statistical physics One of the main goals for an optimization problem is to find the configuration of variables to minimize (maximize) a multi-variable function. A combinatorial opti- mization problem corresponds to the case when the variables take only discrete val- ues under some combinatorial constraints. The function that we want to minimize (maximize) f (x1, x2, ..., x,,) is called the cost function. The principal goal of statisti- cal mechanics is to understand the macroscopic properties of many-body systems starting from the knowledge of interactions between microscopic elements. As an example we can consider the case of water which can exist in three different states: vapor (gas), water (liquid) or ice (solid) which look very different from each other although the microscopic elements are the same molecules of H20. Macroscopic properties of these three phases are quite different from each other because inter— molecular interactions drastically change the macroscopic behavior according to external conditions like for example temperature or pressure. 1.4.1 The SAT problem In order to map the K — SAT problem onto random diluted systems we introduce spin variables S,- = 1 if the Boolean variable x,- is true and S,- = —1 if the Boolean variable x,- is false. The structure of the clauses is taken into account by an M x M quenched matrix C1,. = —1(1) if x;(f,~) belongs to clause I and 0, otherwise. Then, the energy cost function (the number of violated clauses) is given by: M N E[C, 5] = E 5Q: Q's,- - K) (1.3) [=1 i=1 10 m] In] Wh 1.4 FIG stat For subject to the constraints DAL C118,- : K and 21:1 Ci 2 K, I = 1 . . . M. The symbol (5 denotes the delta function. The energy cost turns out to be equal to the number of violated clauses in that the quantity £1111 C118,: equals K if and only if all the Boolean variables in the clause 1 take the values opposite to the desired ones, i.e if the clause itself is false. The constraints ensure that the number of Boolean variables in any clause is exactly K. From a physical point of view, the K—SAT en- ergy function is similar to the Hamiltonian of spin glasses. These systems, char- acterized by a frozen-in structural disorder, have been intensively studied in the last twenty five years. Spin glasses are materials weakly diluted with magnetic ions. The random positions of the ions induce random (in sign and strength) mag- netic interactions. The lack of homogeneity results in an extremely complex en- ergy landscape. In particular, the enormous number of meta-stable states makes the low-temperature behavior very unusual and interesting from a fundamental point of view. In the K = 3 case, the disorder is induced by the random clauses which make the problem more and more frustrated as the ration M / N increases. The ground state properties of the cost function given by Eq.(3.8) will reflect those of K-SAT (E55 = 0) and MAX-K-SAT (E65 > 0). In Eq.(3.8), K may be in- terpreted as the number of neighbors to which each spin is coupled inside a clause. In Chapter 3 we study the ground state properties of the cost function (3.8) using the replica approach and in Chapter 4 we develop a novel geometrical technique which turns out to be much simpler than the replica technique. 1.4.2 Coloring From the physics point of view, the q-coloring problem corresponds to the ground state configuration of a Potts anti-ferromagnetic with q-state variables [114], [12]. For most lattices such a system is frustrated and displays all the equilibrium and 11 out-of-equilibrium features of spin glasses. The Hamiltonian is, E : Enjoy,” (1.4) i,j where bi]- = 1 (0) if an edge is present (absent) between sites i and j. The Potts anti-ferromagnetic is an example of a physical system with geometrical frustration [114]. Optimizing the color configuration with q colors is equivalent to finding the ground state of the q-state Potts anti-ferromagnetic on the same graph. 1.4.3 Spin Glasses Spin was first discovered in the context of the emission spectrum of alkali met- als. In 1924 Pauli introduced what he called a two-valued quantum degree of freedom associated with the electron in the outermost shell. This allowed him to formu- late the Pauli exclusion principle, stating that no two electrons can share the same quantum numbers. Spin glasses (also called amorphous magnets) are magnetic substances in which the interaction among the spins is sometimes ferromagnetic, sometimes anti-ferromagnetic. The rigorous formulation of the Ising spin glass problem is the following: we have N variables 5; where s; can take only two values: +1 or —1 (Ising spin); for a given set of Iii we are interesting in minimizing the function: H/{S} = ‘211751'3,’ (1.5) i> j The function H is called the cost function or the Hamiltonian (or energy). From the point of view of complexity theory this problem is NP-complete [7], which means that very likely there is no algorithm that can find the minimum (called the ground state) in polynomial time. 12 y -- | ? - | Figure 1.3: Frustration in a square lattice: on the left hand side an unfrustrated plaquette is shown while on the right hand side a frustrated plaquette is shown. A spin glass has two properties: frustration and quenched disorder. The term frustration refers to this inability to satisfy all the bonds To illustrate the physics of frustration, consider a two-dimensional Ising model on a square lat- tice with nearest-neighbor couplings [,7 which can take on only the values :t I. We examine its energetics at a level of a single square of 4 spins and their 4 mutual couplings. Such an elementary unit of lattice is called a plaquette. If the number of negative bonds is even, then it is always possible to find a pair of spin configura- tions which satisfy all the bonds. One simply chooses one of the spins to be, for example up and then move, say, clockwise around the loop determining the value of the next spin to be that of the previous one multiplied by the sign of the bond connecting them. Then there will be no conflicting instructions coming from the originally-fixed spin when we get all the way around the loop to the starting point again. If the number of negative bonds is odd like in Fig.(1.3), there will be a conflict when one gets all the way arround the loop. The bond connecting the last spin and the originally fixed spin will not be satisfied. If one tries to satisfy it by flipping either of these two spins, one will break another bond instead. 13 In frustrated systems the individual entities that construct the model (spins, for the spin glass system) feel some sort of frustration in the literal sense. Toulouse [107] introduced the concept of frustration for this plaquette that is specific to spin gasses. In mathematical terms, quenched disorder is harder to analyze than its annealed counterpart since the thermal and the disorder averaging play very dif- ferent roles. In fact the problem is so hard that few techniques to approach it are known, most of them relying on approximations. The most used is replica theory (which we describe in Chapter 3), a technique based on a mathematical analytical continuation (known as the replica trick) which although it gives results in accord with experiments in a large range of problems, is not a rigorous mathematical pro- cedure and is still the subject of research. If we perform for example a Monte Carlo simulation on quenched disordered systems we have to overcome large energy barriers between the various minima. As a consequence, the relaxation times become very large and it is well known in the community of computational physicists that investigating disordered sys- tems at equilibrium is almost impossible for large systems. So an important aspect is that the system size has to be relatively small. Another important observation known as ”large sample-to-sample fluctuations” is that different samples, i.e. dif- ferent disorder realizations can have completely different properties. This obser- vation originates in the lack of self-averaging in some physical observable so we can have rare events (i.e. disorder configurations with small probability) that have strong impact on averaged quantities like susceptibilities or autocorrelations. So when disordered systems are investigated we have to sample a huge number of disordered configurations and to be modest in system size. Ising spin glasses map to one of the optimization problems defined in the pre- vious section, namely Max-Cut [8]. Of course, due to the completeness property, a spin glass can be mapped to all other NP-complete problems so finding a solution 14 to one of them would solve the Ising spin glass problem. 1.4.4 Number partitioning Apartition can be encoded by Ising spins s,- : :i:1 : s,- : 1 if a,- 6 A1 and s,- = —1, otherwise. So, we search for the Ising spin configurations s = (51,52, ...,sN) that minimizes the energy or cost function: N E(s) : | ZaI-sjl (1.6) i=1 Despite its simplicity, the N PP was shown to belong to the N P-complete class no deterministic polynomial algorithm that solves all the instances of this problem in polynomial time, is currently available. The fact that the N PP problem is frustrated can easily be understood by squaring Eq.(1.6), so that the problem of minimizing the energy E becomes equivalent to finding the ground state of Ising Hamiltonian: 1 H = E2 "—" EZZaMI-sisj (1.7) i j>i In statistical mechanics, this is an infinite range Ising spin glass with Mattis-like, anti-ferromagnetic couplings [i]- = —-a,-a]~ [44]. 1.4.5 Maximum Independent Set and Vertex Cover Maximum independent set (MIS) is a problem of broad interest in both the statisti- cal physics and computer science communities. Finding a maximum independent set MIS, or the minimum vertex cover MV C to which it is trivially related, is one of the six fundamental NP-complete problems [46], and is NP-complete even on planar graphs. As we already defined in the previous section, an independent set (IS) in a 15 graph is a set of vertices such that no two independent sites share an edge. The MIS is an IS that contains the maximum number of sites [46] . In statistical physics, an MIS corresponds to the maximum packing state of a hard core lattice gas [56] . The hard core lattice gas Hamiltonian for the MIS is H =1£e,,-Iz,nj—y):;n, (1.8) ij i where n,- = 1 if a site is part of the MIS and n,- = 0 if a site is part of the minimum vertex cover MVC . In order to find the MIS, we take the limit I —+ co to ensure that no bond has an MIS site at both of it’s ends and 6,7 = 1 if a bond exists between sites i and j. The chemical potential u weights the cardinality of an independent set. In order to find the MIS, we take the limit Bu —> 00, with B = 1 /kBT k3 being the Boltzmann’s constant and I / u —I co to ensure that the independent set (hard core) condition is preserved. Here 8 = 1 /kBT ‘where k3 is Boltzmann’s constatnt and T is the temperature. 1.5 Novel Techniques for Attacking NP-C Problems: The Satisfiability Problem on a DNA computer Remarkably, the new computer science predicts that quantum computers will be able to perform certain computational tasks in phenomenally fewer steps than any conventional classical computer. Moreover, quantum effects allow unprecedented tasks to be performed, such as teleportation of information, breaking supposedly unbreakable codes, generating true random numbers, and communicating with messages that expose the presence of eavesdropping. Although attacking NP-complete problems using quantum computing is not our focus of research in this thesis, it is worth mentioning that Lidar [73] devel- 16 oped an algorithm which allows one to construct a superposition of qubit states, such that each state uniquely codes for a single configuration of Ising spins. A central feature of the algorithm is that the quantum probability of each state in the superposition is exactly equal to the thermodynamic weight of the corresponding configuration. When a measurement is performed, it causes the superposition to collapse into a single state. The probabilities of measuring states are ordered by the energies of the corresponding spin configurations, with the ground state having the highest probability. Therefore, statistical averages needed for calculations of thermodynamic quantities obtained from the partition function are approximated in the fastest converging order in the number of measurements. Unlike Monte Carlo simulations on a classical computer, consecutive measurements on a quan- tum computer are totally uncorrelated. Research at the intersection of the biological and computational sciences holds the potential to enable a number of important advances for both communities. Computational models of cell regulatory networks and biochemical signaling cas- cades are being constructed to elucidate the inner working of living cells. Biol- ogists are utilizing these models to explore the logical implication of alternative competing hypothesis, to design drugs that are highly selective for specific targets, and to control the behavior of cells in response to external inputs. Similarly, ad- vances in the biological sciences are being used to drive innovation in the design of new computing architectures based on biomolecules. The inherent ability of DNA and RNA nucleotides to perform very big computations is being exploited to solve hard problems, as outlined in the next section. 1.5.1 The Satisfiability Problem on a DNA computer Recently, the vast parallelism in molecular computation proved the capability of attacking N P-complete problems that have resisted conventional methods. For 17 molecular computations have been proposed different methods, [3], [74], [11], [102], [52], [100], [38] few of them being applied experimentally with promising re- sults [3], [100], [52], [37], [53], [39]. The 3 — SAT problem, became the benchmark for testing the performance of DNA computers, after Lipton [74] demonstrated that it was well suited to take advantage of the parallelism specific to molecular computation. A group led by Smith [37] used surface-based chemistry to solve a four-variable (16 possible truth assignments) instance of the problem. Yoshida and Suyama [53] also solved a four-variable instance using a DNA program im- plementing a breadth first search. Sakamoto et al. [35] solved a six-variable (64 possible truth assignments) problem using hairpin DNA. A group led by Landwe- ber [39] used RNA to solve an instance of a nine-variable (512 possible truth as- signments) satisfiability problem related to the ”Knights Problem” in chess. In the following we describe an experiment done by Adleman [3]. A 20-variable (1, 048, 576 possible truth assignments) instance of the 3 — SAT problem is solved using a simple DNA computer. This computational problem is the largest solved using non-electronic means. The architecture used is related to the Sticker Model [103] which uses two basic operations for computation: separation based on subse- quence and application of stickers. We use only separations which are carried out using oligonucleotide probes immobilized in polyacrylamide gel-filled glass mod- ules where information-carrying DNA strands are moved through the modules by electrophoresis. In the module are kept strands with subsequences complemen- tary to those of the immobilized hybridized probes. By running electrophoresis at temperature higher than the melting temperature, the capture strands are released from the probes and then transported using electrophoresis to new modules for further separations. We focus on the computational part and we do not emphasize technical de- tails about the experiment. The input was a a formula with 20-variables 24-clause 18 3-conjunctive normal form F =(Y3Vf16VX18)/\(X5VX12 Vfg) /\ ° ' ' (X8VY7VY15) /\ (YBVXMVYlo) In order to make the computation more challenging, this formula was chosen to have a unique truth assignment (i.e. a unique solution): x1213,x2=T,x3:F,x4:F,x5:F,x6=F,x7=T, x8=TI x9=FI x1027,x11=TIX12=TIx13=FI IMZFI 115 = T, x16 = TI x17 = TI x18 = F, Jr19 = PI 120 = F (1-9) To represent all possible truth assignments, a Lipton encoding [74] was used. For each of the 20 variables xk, k = 1,- - ~ ,20, two distinct 15 base value sequences were designed: one representing true (T), X]? and one representing false (F), X If . Below are shown few examples of sequences written 5’ to 3’: xT = TTA CAC CAA TCT CTT, xF = CTC CTA CAA TTC CTA, x; = ATT TCC AAC ATA CTC, x5 = AAA CCT AAT ACT CCT, X]; = ACC CAT TAC TAC CAT, xi, = ACC CAT TAC TAC CAT, xgo = ACA CAA ATA CAC ATC, x50 2 CAA CCA AAC ATA AAC. (1.10) Each of the 220 truth assignments was represented by a library sequence of 300 bases consisting of the ordered concatenation of one value sequence for each vari- able. Single-stranded DNA molecules with library sequences were termed library strands and a collection of all library strands duplexed with complements was termed afull library. For each of the 40 sequences X2, k = 1,- - - ,20, Z = T or F, 19 5’-end Acrydite-modified oligonucelotides were obtained and used as probes during separation operations. To test the full library, PC R amplifications were run with primer sets: < X17", X-kT >, < XIT, if >, < Xf, X—kT >, < Xf, if >, < Xi’”, X_1F>, < XE, if > for various k. Gel analysis of the resulting products showed bands of the expected lengths as it is illustrated in Fig.(1.5). In Fig.(1.4) it is shown the computer consisted of an electrophoresis box with a hot chamber and a cold chamber, a glass library module filled with polyacrylamide gel containing cova- lently bound full library, and for each of the 24 clauses of formula P a glass clause module filled with polyacrylamide gel containing covalently bound probes and designed to capture only library strands encoding truth assignments satisfying that clause [101]. The computational protocol that was used is as follows: Step 1: Insert the library module into the hot chamber of the electrophoresis box and the first clause module into the cold chamber of the box. Begin elec- trophoresis. The library strands melt off their Acrydite-modified complements in the library module and migrate to the first clause module. Library strands encoding truth assignments satisfying the first clause are captured in the capture layer, while library strands encoding non-satisfying assignments run through the capture layer and continue into the buffer reservoir. Step 2: Remove both modules from the box. Discard the module from the hot chamber. Wash the box and add new buffer. Insert the module from the cold chamber into the hot chamber and the module for the next clause into the cold chamber.Begin electrophoresis. During Step 2, library strands melt off their Acrydite-modified probes in the clause module located in the hot chamber and migrate to the clause module in the cold chamber. Library strands encoding truth assignments satisfying the clause associated with the module in the cold chamber 20 will be captured, while library strands encoding non—satisfying assignments will run through the capture layer and continue into the buffer reservoir. Step 3: Repeat Step 2 for each of the remaining 22 clauses. At the end of Step 3, the final (24th) clause module will contain only those library strands which have been captured in all 24 clause modules and hence encode truth assignments satisfying each clause of formula F and therefore formula F itself. Step 4: Extract the answer strands from the final clause module, PCR-amplify, and ”read” the answer. For assigning truth values to variables x1 and x20, 1 ul aliquots of 10—, 20—, 30—, 40-, 50—, 60—, and 100- fold dilutions of the answer stock were PC R—amplified with primer sets: < xix-{O >, < XT,X-.fO >, < Xf,X-§0 >, < xix—50 >. Gel analysis of the PCR products for 10—, 20—, 30—, 40—, 50-fold dilutions showed no bands except for primer set: < X1F , X—ZFO > . These primer sets gave only a band corresponding to 300 bp. Based on this, x1 and x20 were assigned to be false. Analysis of the PCR products for the 60- and 100-fold dilutions showed no bands for any primer set. For assigning truth values to the variables x2, x3 - - - x19, and as a redundant test for the truth value of x20, a 1 ul aliquot of the 50-fold dilution of the answer stock was PCR-amplified with primer sets: < xtx’; >, < x1212: >, < xix“,T >, < xf,x',f > , where k = 2,3,- .. ,20. According to Fig.(1.6) gel analysis showed that in each case only one combination of primers gave a band and this band was of the expected length, compared with the one from Fig.(1.5). On this basis, truth-values were assigned to each variable. These experimentally derived truth values corresponded to the unique satisfying truth assignment for formula F given by Eq. (1.9). 21 To summarize, in this chapter we gave the computer science definition and their counterpart in statistical physics version of four optimization problems. Many ideas from statistical physics methods (like replica technique and the cav- ity method) have been applied to these problems and will be discussed more de- tailed in the next chapter. On the other hand, methods from computer science have proven useful in statistical physics, recent examples being belief propagation or exact methods like branch and cut in the study of spin glasses. We also de- scribed a novel procedures to attack the NP-complete problems based on the vast parallelism computation feature: an experiment using a DNA computer was in- troduced which illustrates that biological molecules can be used also for distinctly non-biological purposes. A minimalistic approach was taken in this experiment: a 20-variable instance of a 3-SAT problem was solved using (except during input and output) DNA Watson-Crick pairing and melting as the sole operation. Though computational theory would predict it, nonetheless it is remarkable that this basic molecular interaction could sustain such a complex computation. 22 hating coll (copw who) Hot Water Cold Water Hot Chamber Cold Chamber 9 >69 Figure 1.4: The DNA computer [101] 23 cooling coll (copper tube) 1 234557891081 2345678910 0123458789100 12345878910 Figure 1.5: Analysis of the full library. Purified full library was PC R-amplified un- der standard conditions for 15 cycles. PCR products were analyzed on 4% agarose gels. Lanes 1 and 2 correspond to primer set < X5, XkT > , lanes 3 and 4 correspond to primer pair < X5, XkT >, lanes 5 and 6 correspond to primer pair < X5, X; > , lanes 7 and 8 correspond to primer pair < X1,, X f > , lanes 9 and 10 correspond to primer pair < X5,le >, where. (A) k— — 11; (B) k = 14; (C) k = 17; (D) k = 20. Molecular weight markers are on the leftmost lane of each gel. A 12345670 B12345678 D12345670 E12345678 F12345678 812345870 - ‘ I (l‘. .I Figure 1.6: Readout of the answer: 1 — pl aliquots of a 50-fold dilution of the answer stock were PC R amplified under standard conditions for 25 cycles. PCR products were analyzed on 4% agarose gels. Lanes 1 and 2 correTpond to primer set < XlT,, X; >, lanes 3 and 4 correspond to primer pair < XlT,, Xk ,lanes 5 and 6 correspond to primer pair < X1" Xk >, lanes 7 and 8 correspond to primer pair < X5 ,Xk >wherewhere: (A) k: 2, (B) k: 5, (C) k=8; (D) k: 11 (E) k: 14 (F) k— — 17 (G) k = 20. Molecular weight markers are on the leftmost lane of each gel. c123‘5678 24 Chapter 2 Algorithms There are many useful algorithms in both, the P and NP classes. In this chapter we introduce some of these algorithms and some of their connections with statistical physics. We also introduce new algorithms for solving (at least approximatively) hard problems as they are related to the methods described in Chapter 3 and 4.In Chapter 5 we use these algorithms on applications to gene networks. Many difficult ground state problems like the Random Field Ising Model (RFIM), Spin Glasses in two dimensions, domain walls in random bond magnets, arrays of directed polymers or rigidity percolation are only a few problem from physics which are exactly solvable in polynomial time. We present three basic classes of polynomial network algorithms: (i) flow, (ii) minimal path and (iii) minimum spanning tree. These algorithms are greedy in the sense that the algorithms make the choice which is the best at that time. Greedy algorithms solve some problems exactly and provide approximations to hard com- putational problems. 25 2.1 Minimum Spanning Tree A spanning tree T of graph G with N vertices is a connected acyclic subgraph that spans all the nodes so the spanning tree has N — 1 edges. Given an undirected graph G = (N, E) with N nodes and E edges, with each each edge having a weight (or cost) c),- with (i, j) E E, we want to find a spanning tree called the minimum spanning tree which has the smallest total weight, measured as the sum of weights (costs) of the edges in the spanning tree. 2.1.1 Applications The minimum spanning tree problem arises in a number of applications: (i) de- signing physical systems, (ii) optimal message passing, (iii) reducing data storage and (iv) cluster analysis. Two applications of minimum spanning tree, namely de- signing physical systems and cluster analysis are now presented. Designing Physical Systems. We need to design a network that will connect com- ponents of a system that are geometrically dispersed and we are interested in the cheapest possible connection, a minimum spanning tree. This problem arises in the following settings: 0 Connect terminals and cable amongst the panels of electrical equipment in such a way to use the least possible length of wire. 0 Construct a pipeline network to connect a number of towns using the small- est possible total length of pipes. 0 Construct a digital computer system, composed of high-frequency circuitry, when it is important to minimize the length of wires between different com- ponents to reduce both capacitance and delay line effects. 26 Cluster Analysis Data points within a particular group of data, or cluster are more closely related to each other than data points not in a cluster. Suppose that we are interested in finding a partition of a set of n points in two-dimensional Euclidian space into clusters. A popular method for solving this problem, which we describe in the next section is Kruskal’s algorithm. At each intermediate iteration, Kruskal’s algorithm maintains a forest and adds edges in non-decreasing order of their weight. The nodes spanned by the trees at intermediate steps can be seen as different clusters and these clusters are often good solutions for the clustering problem. Kruskal’s algorithm can be thought of as providing 11 partitions: the first partition contains n clusters, each cluster containing a single point, and the last partition contains just one cluster with all the points connected in the minimum spanning tree. 2.1.2 Kruskal’s Algorithm and Prim’s Algorithm In the following we will describe two algorithms for solving the minimum span- ning tree problem: Kruskal’s algorithm and Prim’s algorithm. The two algorithms are greedy in the sense that at each step the best choice is made. The greedy strat- egy generally does not find globally optimal solutions, but it does find the globally optimal solution for the minimum spanning tree problem. Growing a minimum spanning tree. Assume that we have a connected, undirected graph G = (V, E) with a weight function w —> R and we want to find a minimum spanning tree for G. The algo- rithm manages a set A that is always a subset of some minimum spanning tree. At each step, an edge (u, v) is determined that can be added to A without violating this invariant, in the sense that A(u, v) is also a subset of a minimum spanning tree. We call such an edge a safe edge for A, since it can be safely added to A with- out destroying the invariant. Two notions that we need to introduce in order to 27 understand the algorithm are the definition of cut and of light edge. Definition: A cut (S, V — S) of an undirected graph G = (V, E) is a partition of V. An edge is a light edge crossing a cut if its weight is the minimum of any edge crossing the cut. Kruskal’s algorithm. Kruskal’s algorithm finds a safe edge to add to the growing forest by finding of all edges that connect any two trees in the forest, an edge (u, v) of least weight. Let C1 and C2 denote the two trees that are connected by (u, v). Since (u, v) must be a light edge connecting C1 to some other tree, then (u, v) must be a safe edge connecting C1 to some other tree. The total running time of Kruskal’s algorithm is O(E lg E) [23], page 505 . Prim 's algorithm Prim’s algorithm operates more like Dijkstra’s algorithm for finding shortest paths in a graph. In Prim’s algorithm, the edges in the set A always form a single tree. At each step, a light edge connecting a vertex in A to a vertex in V — A is added to the tree so the algorithm adds only edges that are safe for A. Then, when the algorithm terminates, the edges in A form a minimum spanning tree. 2.2 Shortest Path In a shortest path problem, we are given a weighted, directed graph G = (V, E), with weight function w —> R mapping edges to real-valued weights. The weight of a path p =< v0, v1,. . . , vk > is the sum of the weights of its constituent edges k w(Pl = Zw(Ui—1Ivi) (2-1) 121 We define the shortest path weight from u to v by 28 min{w(p) : u i—I v} if there is a path from u to v (5(u,v) = 00 otherwise A shortest path from vertex u to vertex v is then defined as any path p with weight w(p) = 6(u,v). Shortest-path algorithms typically exploit the property that a shortest path between two vertices contains other shortest paths within it. Dijkstra’s algorithm, named after its discoverer, dutch computer scientist Eds- ger Dijkstra, is an algorithm that solves the single-source shortest path problem for a directed graph with nonnegative edge weights. For example, if the vertices of the graph represent cities and edge weights represent driving distances between pairs of cities connected by a direct road, Dijkstra’s algorithm can be used to find the shortest route between two cities. Dijkstra algorithm is used for finding the minimum path, which works by growing outward from the staring node 3 in a similar way to breadth-first-search algorithm [96], [23]. At each step Dijkstra’s algorithm chooses to advance its growth front to the next unlabeled site which has the smallest distance from the starting node. The minimal path problem is closely related to a polymer in a ran- dom medium and hence to growth problems. In the minimal-path problem, one chooses the site at the growth front which has the minimum-cost path to the source, while in the minimus spanning tree problem one chooses the minimum-cost edge. The Bellman-Ford algorithm solves the single-source shortest-path problem in a more general case when edge weights can be negative. The algorithm returns a boolean value indicating whether or not there is a negative weight cycle that is reachable from the source. If there is such a cycle, the algorithm indicates that no solution exists and if there is no such a cycle, the algorithm produces the short- est paths and their weights. Shortest path provides a first approximation to key regulatory paths in gene networks, as discussed in Chapter 5. 29 2.3 Flow Algorithms In the same way that we can model a road map as a directed graph in order to find the shortest path from one point to another, we can also interpret a directed graph as a flow netwok. In graph theory, a network flow is an assignment of flow to the edges of a directed graph (called a flow network in this case) where each edge has a capacity such that the amount of flow along an edge does not exceed its capacity. In addition we have the restriction that the amount of flow into a node equals the amount of flow out of it, except if it is a source which only has outgoing flow, or a sink which has only incoming flow. A flow network can be used to simulate traffic in a road system, fluids in pipes, currents in an electrical circuit, flow in a porous media or anything similar in which something travels through a network of nodes. An important feature of flow algo- rithms is the discreteness imposed by requiring integer flows. In particular, a flow of one unit along a path can be used to model a non-intersecting polymer. We introduce the maximum flow problem as the simplest problem concerning flow networks and it asks what is the greatest rate at which material can be shipped from the source to the sink without violating any capacity constraints. 2.3.1 Flow Networks Consider a graph G(V, E) with nodes V and edges E, and special nodes source ‘ s (in-degree 0) and sink t (out-degree 0), let f (u,v) be the flow from node 11 to node v, and C(u, v) the capacity (maximum flow possible). A network flow is a real function f : V x V —> R with the following three properties for all nodes u and v: (i) Capacity constraints: f (u, v) s C(u, v). The flow along an edge cannot ex- ceed its capacity. (ii) Skew symmetry: f (u, v) = — f (v, u). The net flow from u to v must be the 30 opposite of the net flow from v to ii. (iii) Flow conservation: ZUEV f ( u, v) : 0, unless u = s or u = t. The net flow to a node is zero, except for the source, which produces flow, and the sink, which consumes flow. To define the maximum flow we need to introduce two concepts: the residual capacity and the augmenting path. The residual capacity of an edge is cf(u,v) : c(u,v) — f (u,v). This defines a residual network denoted G f(V, E f), giving the amount of available capacity. Note that there can be an edge from u to v in the residual network, even though there is no edge from u to v in the original net- work. Since flows in opposite directions cancel out, decreasing the flow from v to u is the same as increasing the flow from u to v. An augmenting path is a path (ul, 112,. . ., uk), where 111 = 5,11)( = t, and cf(u,-, ui+1) > 0, which means it is possi- ble to send more flow along this path. The max-flow min-cut theorem is a statement in optimization theory about maximal flows in flow networks and it states that the maximal amount of a flow is equal to the capacity of a minimal cut. Suppose G(V,E) is a finite directed graph and every edge (u, v) has a capacity c(u, v) (a non-negative real number). Further assume two vertices, the source s and the sink t, have been distinguished. As we already defined, a cut is a split of the nodes into two sets 5 and T, such that s is in S and t is in T, so there are 2V‘2 possible cuts in a graph. The capacity of a cut (S, T) is the sum of the capacity of all the edges crossing the cut, from the region S to the region T: c(S, T) = ZuES,v€Tl(u,v)€E c(u,v) 1 Theorem: Min-cut/Max-flow The following three conditions are equivalent: (i) f is a maximum flow in G (ii) The residual network G f contains no augmenting paths. (iii) f = c(S, T) for some cut (S, T). At first glance, finding the minimum cut looks like a hard problem because it seems to require searching over all possible 31 cuts, which is worse than exponential in the number of nods in the graph. How- ever simple polynomial algorithms exist. In many of the physics applications, the minimum cut is important because it corresponds to the interface structure in interface problems, and the domain struc- ture of random magnets. The residual graph contains the minimum cut structure which means that it contains information about complex ground states morpholo- gies. The residual graph and its minimum cut structure are used in two additional ways: testing the sensitivity of cuts to small perturbations [4] and finding all the cuts in degenerate systems [10]. The algorithms that we introduced in Sections 2.1 — 2.3 are useful for solving easy (i.e. polynomial) problems. In the next section we introduce algorithms for solving approximately hard (i.e. NP-complete) problems. 2.4 Message Passage Techniques We first explain the principles that describe belief propagation, which is an efficient way to approximate inference problems based on passing local messages. Infer- ence problems arise in computer vision, error-correcting codes and last but not least in statistical mechanics. We make a survey of inference problems from artifi- cial intelligence, computer vision, digital communications and statistical physics. Algorithms of these type are used in Chapter 4. 2.4.1 Brief Survey of Inference Problems An example from Artificial Intelligence (AI): Bayesian Networks In the artificial intelligence literature Bayesian Networks are the most popular type of graphical model. They are used in expert systems involving different domains like medical diagnoses, heuristic search and so on. 32 ./ \\ , . ...5 LL) is) \ \ / PK. v Figure 2.1: The fictional Asia Bayesian network [72] We will take an example from medicine. Let us assume that we want to con- struct a machine that automatically gives the diagnosis for patients. For each pa- tient we have information about symptoms and test results and we want to infer the probability that a given disease or set of diseases is causing the symptoms. We also assume that we know the statistical dependencies between different symp- toms, test results and disease. So let us consider the following example: (a) A recent trip to Asia (A) increases the chance of tuberculosis (T); (b) Smoking (S) is a risk for both lung cancer (L) and bronchitis (B); (c) The pres- ence of either (E) tuberculosis and lung cancer is indicated by an X-ray (X) result; (d) Dyspnoea (D) can be caused by bronchitis (B) or either (E) tuberculosis (T) or lung cancer (L). In Fig. (2.1) each node represents a variable that can be in a discrete number of states and call x; the variable representing different possible states of node i. Associated with each arrow is a conditional probability. For exam- ple, p(xL | x5) is the conditional probability for a patient having lung cancer given that he does or he does not smoke; for this link we say that the S-node is the parent of the L-node because xL is conditionally dependent on its according to figure. In this example, the overall joint probability that the patient has some combination 33 of symptoms, test-results and disease is the product of all the probabilities of the parent nodes and all the conditional probabilities: PM = P(xA)P(xS)P(xT l xA)p(xL | Xslpbcs | xs) PUB l XTIXL)P(XD l XBIXEWUX l XE) (2-2) More generally, a Bayesian network is a directed acyclic graph of N random vari— ables x; that defines a joint probability function: N p(x1,x2,~~, —1—Ip(x,- | Parent( (x,)) (2.3) i=1 The goal is to compute certain marginal probabilities (in our case we want to cal- culate the probability that a patient has a certain disease). From a mathematical point of view, marginal probabilities are defined in terms of sums over all the pos- sible states of all the other nodes in the system. We refer to marginal probabilities that we compute approximately as beliefs and denote the belief at node i by b(x,~). If we have some information about the nodes (for example we know that the pa- tient does not smoke) then we are able to fix the corresponding variable and we don’t have to sum over the unknown states of that node. Such a node we call an observable node otherwise we call it an hidden node. For small Bayesian networks, we can do marginalization sums directly, but unfortunately, the number of terms will grow exponentially with the number of hidden nodes in the network. Using Belief Propagation algorithms we can compute marginal probabilities, at least ap- proximately, in a time that grows only linearly with the number of nodes in the system. 34 2.4.2 An Example from Computer Vision: Pairwise Random Markov Fields (PRMF) In computer vision problems [98] we want to infer the details of whatever is out there based on the data that we are given. We assume that we observe some quan- tities about the image y;, and we want to infer some other quantities about the underlying scene xi. We also assume that there exists a statistical dependency between x,- and y,- at each position i and we write this as a joint compatibility func- tion <1>,~(x,-, y,-) (often called the ”evidence” for x,). Nodes i are arranged in a two- dimensional grid, and scene variables x,- should be compatibile with nearby scene variables xj, as represented by a compatibility function Til-(xi, x j), where ‘1’,- j, con- nects only neighbor positions. The overall joint probability of a scene x,- and an image y,- reads: 1 p(x,y) = Z [1%inin Xj) H¢i(inyi) (2-4) 1} 1 with Z a normalization constant and the firstproduct is over nearest neighbors (i j ) on the square lattice. Fig. (2.2) shows a graphical description where the filled-in circles represents the ”observed” image nodes y,- and the empty circles represents the ”hidden scene nodes”, xi. Our goal is to calculate the belief b(x,-) for all posi— tions i and in this way to be able to infer something about the underlying scene. A direct computation of marginal probabilities would take exponential time so we need a faster heuristic algorithm like the belief propagation algorithm. 2.4.3 Mapping to Statistical Physics The PRMF decribed earlier can de transformed into a form which physicists recog- nise. Define the interaction [,7 between the variables at neighboring nodes by 1;,- = ln‘Yil-(xij) and the field at each node by h;(x,-) : 1n (x,-,y,-). If we define 35 /. 20 /0 W3) Figure 2.2: A square lattice Pairwise Random Markov Field /0 /0 E—C ,J the energy as: E(x) Z — Zlij(xixj) — EMA") (25) (it) i and we use Boltzmann’s law P(x) = % Exr(:§T(x—i)) (2.6) where Z is the normalization constant called the partition function, we see that our pairwise MRF corresponds to statistical field theory at T = 1. If the number of states at each node is exactly two, the model reduce to the Ising model. For this case we change variables from Boolean variables x,- = 0 (1) to spin variables S; = —1 (1) and if we restrict to [,7 interactions which have a symmetric form than we have a spin glass energy function [84]: 5(5) = — Zlij 5i S) '— 211151 (2.7) (if) i 36 In the context of the Ising model, the inference problem of computing beliefs b(x,-) can be mapped onto the physics problem of computing local magnetizations: m,- = b(5i:1)— b(Si = —l) ' (2.8) The magnetization is equal to the marginal probability and the marginal probabil- ity is approximated by the beliefs. 2.4.4 Tanner Graphs and Factor Graphs In the following we introduce a special type of graph namely factor graph to repre- sent more than 2-body interactions. A factor graph is a bipartite graph that shows how a global function of many variables factors into a product of local functions and so goes beyong pairwise interactions [71]. Consider a set of N discrete-valued random variables X1, X2, - - - , X N, and con- sider x; the possible realizations of random variables X,. The joint probability func- tion factors into a product of functions which has the general form: P(X) = %nfa(xa) (2.9) with x = x1,x2, - -- ,xN and a is an index which labels M functions fA,fB,. . . ,fM; the function fa(xa) has arguments xa that are some subsets of x1,x2, . ..,xN. A factor graph is a bipartite graph that expresses the factorization structure in Eq. (2.9). It has a variable node (which we draw as a circle) for each variable x,, a factor node (which we draw as a square) for each function fa, with an edge connecting variable node i to factor node a if and only if x,- is an argument of fa. From now on we index variable nodes with letters starting with i, j, . . . and factor nodes starting with a, b, , . . .. Fig. (2.3) presents an example of the factor graph corresponding to 37 \. /’“\\ /rfl~\\ G) (3,) C 3 J ( 4 i Figure 2.3: A factor graph representing the joint probability distribution given by Eq. (2.10) p(x1,x2,x3,x4) = %fa(xlrx2lfbkx2rx3rx4lf6(x4) (2-10) The goal is to compute the marginal probability distributions. x,- are the states of variable node i; if S is a set of variable nodes then x5 denotes the states of the corre- sponding variable nodes. The marginal probability function obtained by marginal- izing p( x) onto the set of variable nodes S reads: Ps(Xs) = Z P(X) (2.11) X\X5 where the summation over 1: \ xs indicates that we sum over the states of all vari- able nodes which are not in set S. The belief propagation algorithm is a method for computing marginal probability functions [115], [113]. The computed marginal probability functions will be exact if the factor graph has no cycles, but the BP algo- rithm is still well-defined and empirically often gives good approximation results even when the factor graph does have cycles. 38 2.4.5 Standard Belief Propagation In the next section we explain the standard belief propagation procedure on pair- wise random markov fields. We assume the observed nodes y; are fixed, write <1>,-(x,-) as a short-hand for ,-(x,-, y,~) and focus on the joint probability distribution for the unknown variables x,: %H¢1,~(xu x1>lT(xI-) (2.12) Z(i1') In the BP algorithm, we introduce variables mii(xf) which can be intuitively inter- preted as a message from a hidden node i to a hidden node j about what state node j should be in (Fig. 2.4A). The belief at node i is proportional to the productof the local evidence at that node (<1),- (x,-) ), and all the messages coming into node i: 171051) = ((9)1061) H "111061) (2-13) 1'6N(i) where K is the normalization constant (the beliefs must sum to 1) and N (i) is the set of nodes neighboring node i(Fig. 2.4B). The message update rule is: -- +— 2(1),- (x,-)‘i’,-,-( x,-,x]) H mki( x,) (2.14) kENU )\1' and it is shown diagramatically in (Fig. 2.4C). On the right-hand-side of Eq. (2.14) we take the product over all messages going into node i except for the one coming from node j. Let us consider an example of a network with four hidden nodes like the one in Fig. (2.4). 39 4 A ”131*.” .. m12(x2)> m23(x3)> ” ,"m43(x3) 1 * 2 ‘ 3 m21(x1) m32(x2) ‘ "~m35(x5) m53(x3\)\i 5 B V > 4 A m D O . 3 1 2 . 4 Figure 2.4: A. An illustration of the messages passed in Belief Propagation; B. A diagramatic representation of (2.13); C. A diagramatic representation of the BP message update rules 2.14. The summation symbol indicates that the summation is over all the states of node i; D. A pairwise MRF with four hidden nodes. 40 The belief at node 1 using the belief propagation rule (2.14) is: l71(3‘1) = k2(x2)m32(x2)m42(x2) X2 = k1(x1)E‘I’12(x1, X2)¢2(x2) Z¢3(13W23(x21 x3) E(PMXU‘I’zszI x4) (215) 1’2 I3 I4 Reorganizing the sums, it is easy to see that the belief at node 1 (Fig. 2.4D) is the same as the exact marginal probabilities at node 1: b1(361) = k E p({x}) = P101) (2-16) x2,x3,x4 Belief propagation gives the exact marginal probabilities for all nodes in any singly connected graph. In this case each message needs to be computed only once which means that the whole computation takes a time proportional to the number of links in the graph (so the whole computaion takes much less time than the exponentially large time required to compute marginal probabilities naively). The two-node marginal probabilities pi), for two neighboring sites i and j, is obtained by marginalizing the joint probability function over every node, except the two nodes i and j: Pij(x1Ixj) = E P({Z}) (2.17) Ziizij=(x1IIj) The belief equation for the two-nodes belief reads: bi)=k¢1j(X1Ile¢i(x1) H mm» = )Zbaxmnm (2.19) Distance D (known as Kulback-Liebler distance) does not have all the properties that we normally associate with distances: it is not symmetric and does not satisfy the triangle inequality. It is always non-negative and it is equal to zero if and only if the two probability functions b(x) and p(x) are equal. In statistical physics we assume a Boltzmann distribution law: P(X) = Ze’T (2.20) where T is an order parameter that defines a scale of units for the energy, and for simplicity we assume T = 1. From Eqs. (2.19) and (2.20) we find: D(b({x}) ll P({X})) = §b({x})E({x}) + {217001111 b({x}) +1112 (221) 42 As we mentioned earlier, distance D = 0 (b( {x}) = p({x}) when the quantity = ngxllEH-fl) + ;b({.¥})lnb({x}) = W ({x})) -S(b( (-)){r} x {A 922 achives its minimal value F = — an (2.23) called the Helmohltz free energy. In statistical physics language terms G, U, S and F defined in eqs. (2.22) and (2.23) are called Gibbs energy, average energy, entropy and Helmholtz free energy. 2.4.7 The Mean-Field Free Energy Let us consider the case when joint probabilities have a particularly simple form: they are factorized over sites b({x}) = 11171051) (2.24) with constraint Z,- b,-(x,-) :2 1. With this mean-field approximation, the one-node beliefs are b,-(x,-) and the two node beliefs are b,,j(xi, xi) = b;(x,~)bj(xj). The energy configuration of a pairwise M RF is: E(){x}— —z;ln1p,-]( x;,x j) —Z;ln,-(x,-) (2.25) (1!) The mean-field average energy and the mean-field entropy are: LIMF( {b})= —(X:Zb,-( x;)b l)ln1/1,,( x,,xl-) —2};b;( xi) ),~(ln (x,) (2.26) (lj))x1xj ' 43 respectively SMF(b1) = — 221“le 11119100) (2-27) 1 X1 The mean-field Gibbs free energy GMF ——— UMP — S M p is a function of the full joint probability distribution while the mean-field free energy is only a function of one node belief. Since we know that G lies below GM; it is reasonable to search for that configuration of b,’s that minimizes G M p which in fact represents the standard variational justification for the mean-field theory. 2.4.8 The Bethe Free Energy In the following we derive Gibbs free energy as a function of both the one-node belief b,- (x,-) and the two-nodes belief bi,j(in x,) which obey the normalization con- ditions 2x1, b,~(x,-) = Zx;,x,- bi, 1(in xi) = 1. In this case the average energy reads: U = - 2171 1(in 1)) 1“ lib/(xv 36)) — 2171061) 111 (131061) (228) 1 j 1 The average energy when computed with the exact marginal probabilities p;(x,-) and Pij(in x,) will also have this form so if the one-node and two-node beliefs are exact, the average energy given by Eq. (2.28) will be exact. We could compute the entropy exactly if we could explicitly express the joint probability distribution b(x) in terms of the one node and two-node beliefs. For a singly connected graph we can do this, case in which the joint probability distribution reads: “(111171101de 1T: b1(x1)"—1 44 b(x) = (2.29) where q,- represents the number of nodes neighboring node i. From eqs. (2.29) and (2.27) we have: SBethe = — (E) r; 1911(inle 1“ l71,‘(3C1'Ix)') + 2011 — 1) gbikxil 1“ MM) (230) , 1 . ,~, . 1 1 . , So for a singly connected graph, the Bethe approximation of both the energy and the entropy will have the correct functional dependency of beliefs and the values of these beliefs that minimize the Bethe free energy GEM, = U — S BM“. will corre- spond to the exact marginal probabilities. For graphs with loops, the Bethe entropy and free energy will be only approximations [42]. In contrast to the mean-field free energy, the Bethe free energy in general is not an upper bound on the true free energy. If we introduce the local energies: 51071) = — 1n¢1(x1) (2-31) E,j(x,-, Xj) = — ln‘l’il-(xi, x1) — ln,(x,-) — lnCDI-(xj) (2.32) and using the marginalization constraints we obtain: U = — Z Z b;j(x,-,xj)E,-j (x;,x]-) + E(q; — 1) Zb,(x,)E,-(x;) (2.33) (if) xi'xi 1' x,- which has exactly the same form as the entropy for Bethe approximation where we replaced the In terms with the local energy E terms. The Bethe free energy is equal to the exact Gibbs free energy for pairwise M RF ’5 when the graph has no loops so the Bethe free energy is minimal for the correct marginals. So when there are no loops the BP beliefs are the global minima of the Bethe free energy. It turns out that a set of beliefs gives a BP fixed point in any graph if and only if they are locally stationary points of the Bethe free en- 45 ergy [113]. The BP algorithm for graphical models with loops is not guaranteed to converge but because BP fixed points correspond to Bethe free energy minima, one can simply choose to minimize the Bethe free energy directly. Such free energy minimizations are slower than the B P algorithm, but at least they are guaranteed to converge. The succes of BP algorithms is exciting because it means that many different types of problems that seem difficult to handle, involving graphs with many nodes and loops, can actually be handled using efficient and systematically correctable algorithms. These algorithms are much faster than Monte Carlo approaches, and the approximation to the free energy that they are effectively implementing are more accurate than mean-field approximations. These algorithms give a princi- pal framework for propagating, in parallel, information and uncertainty between nodes in a network. In the next section we will show that ideas from BP algorithms combined with ideas from statistical physics lead to a new class of algorithms, the survey propaga- tion algorithms which turn out to be very effective in solving NP-complete prob- lems. 2.5 Novel Techniques: Message Passing Algorithms and the Cavity Method For the K-SAT problem, well below the threshold, a generic problem has many solutions, which tend to form one giant cluster; the set of all satisfying assign- ments form a connected cluster in which it is possible to find a path between two solutions that requires short steps only. So, greedy algorithms and other simple heuristics can readily identify a solution by a local search process. Close to the critical thershold, however, the solution space breaks up into many 46 smaller clusters and solutions from separate clusters are usually far apart. Clusters that correspond to partial solutions (i.e. satisfy some but not all of the constraints) are exponentially more numerous than the clusters of solutions and act as traps for local search algorithms. 2.5.1 The Message Passing Solution of SAT on a Tree Survey propagation finds solutions to many problem instances in this hard region, including very large instance that cannot be solved using earlier methods. For ex- ample, for random 3 — SAT problem and close to the threshold, survey propaga- tion algorithms solve instances up to 107 variables, whereas DPLL algorithms can solve instance with usually less than 100 variables. The message passing procedure resembles in some aspect the BP algorithm described earlier, but with few differ— ences. The messages sent along the graph underlying the combinatorial problem are surveys of some elementary warning messages and are probability distribu- tions parametrized in a simple way: they describe how the Boolean variables are expected to fluctuate from cluster to cluster. Once the iterative equations for such probability distributions have reached convergence, it is possible to identify the Boolean variables which can be fixed and then simplify the problem. The whole process is repeated on the simplified problem until a solution is found. We describe the message passing procedure in the special case in which the factor graph is a tree. The algorithm uses elementary messages passed along the graph called cavity biases and cavity fields. We use the definition of these two type of messages introduced by Braunstein et. a1. [17]. The basic elementary message passed from one functionj node a to a variable node i is a Boolean number um- 6 0,1 called a cavity bias. The basic elemntary message passed from one variable node j to a function node a is an integer number h 1,, called a cavity field. 47 In order to compute the cavity field h 10' the variable j considers the incoming cavity biases which it receives from all the function nodes b to which is connected, except b = a (so from here the name cavity) and performs the sum: hja = ( Z ”(71) — < Z um) (2.34) bev. (1'1\a beV-~(1)\a If j has no other neighbor than a, then h 1,, 2: 0. In Eq. (2.34) we denoted by V(i) the set of function nodes a to which it is connected by an edge, by V+ (i ) the subset of V(i) consisting of function nodes a where the variable appears un-negated (the edge that connects a to i is a full line), and by V. (i) the subset of V(i) consisting of function nodes a where the variable appears negated (the edge that connects a to i is a dashed line). In order to compute the cavity bias uni, the function node a considers the in- coming cavity fields which it receives from all of the variable nodes j to which is connected, except j = i. If there is at least one j E V(a) \ i such that (1)“an S 0 then ua; = 0, otherwise an; = 1. If a has no other neighbor than i, then ua, = 1. Therefore: uai: N 601ml?) (2-35) 1€V(a)\i where 0 (x) is the step function. Using ideas from survey propagation algorithms it was possible to have a more profound picture of the solution space for the SAT problem. It was found [85] that there exists two critical regimes for a. For a < 01,, m 3.921 the set of satisfying assignment SM is connected, that is one can find a short path in SM to go from one solution to any other solution. For 01,, < a < ac z 4.267, SM becomes divided into subsets which are far apart in Hamming distance. If we denote with NC, E exp():(a)) the number of such clusters and with Ni", E exp(S,-,,,(a)) the number of solutions within each cluster, then in the language of statistical physics, the quantity 2 is called the complexity and Sin, the 48 Figure 2.5: A function node a and its neighborhood. The survey of cavity bias cab be computed from the knowledge of the joint probability distribution for all the cavity-biases in the set U, so those coming onto all variables node j which are neighbors of a, except j = i. 49 internal entropy. Although there exists in this phase an exponentially large number of clusters, each containing an exponential number of solutions, it is very difficult to find a solution because of the proliferation of metastable states. A metastable cluster is a cluster of assignments which all have the same fixed number say C of violated clauses, and such that one cannot find at a small Hamming distance of this cluster an assignment which violates strictly less than C clauses. This clustering phenomenon is particularly difficult to study in random sys- tems but recent progress in statistical physics made possible to develop new meth- ods like the cavity method [85] [86]. 2.5.2 The Cavity Method The cavity method, initially developed to study the Sherington Kirkpatrick model of spin glasses [68] is a powerful method to compute the properties of ground states in many condensed matter and optimization problems. The method is in principle equivalent to the replica method, but has more clearer interpretation that allows to determine solutions for problems which remain very difficult to under- stand in the replica formalism. Let us consider a system of N Ising spins, S,- = i1, i E 1,. . ., N, interacting with random couplings, with energy H = -— 2 1,5,5) (2.36) <1 )> The sum is over all links of a lattice. For each link (i j) the coupling [,1- is an in- dependent random variable chosen with the same probability distribution P( I) The aim is to compute in the infinite N limit, the value of the energy density of the global ground state, which is the configuration of Ising spins which minimizes the Hamiltonian given by Eq. (2.36). The ground state energy of a N spin system, 50 averaged over the distribution of samples is denoted by E N, so in other words we want to compute: = lim 3 (2.37) N—Im The cavity method is best exploited when the locally stucture of the underlying graph is a tree. There are various types of tree-like lattices, the most used by physi- cist are presented as follows: 0 The Cayley tree: starting from a central site i = 0, one builds a first shell of k + 1 neighbors. Then each of the first shell spins is connected to k new neighbors in the second shell etc. until one reaches the Lth shell which is the boundary. There is no overlap among the new neighbors, so that the graph is a tree. 0 The random graph with fluctuating connectivity: for each pair of indices (1 j ), a link is present with probability c/ N and absent with probability 1 — c / N . The number of links connected to a point is a random variable with a Poisson distribution, its mean being equal to c. o The random graph with fixed connectivity, equal to k + 1, called the Bethe lattice. The space of allowed graphs are all graphs such that the number of links connected to each point is equal to k + 1. The simplest choice is the case where every such graph has the same probability. There are several ways to calculate the energy density given by eq.(2.37) and in the following will adopt a formalism developed by Mazard and Parisi [82]. Let us introduce an intermediate object which is a spin glass model with N spins, on a slightly different random lattice, where q randomly chosen cavity spins have only k neighbors, while the other N — q spins all have k + 1 neighbors as in part A of Fig.(2.6). We call such a graph a GM, cavity graph. The cavity spins are 51 fixed, their values are 51,. . . , sq. The global ground state energy of the correspond- ing spin glass model depends on the values of the cavity spins. The intermediate construction of G N), turns out to be helpful. The basic opera- tions which one can perform on cavity graphs are the following (Figures A, B, C from Fig.(2.6)): o Iteration: By adding a new spin so of fixed value into the cavity, connect- ing it to k of the cavity spins say 51,. . .,sk, and optimizing the values of these k spins, we change the GM, into a GN+1,q_k+1 graph, so we have 6N=1, 6q=—k+1. a Link addition: By adding a new link between two randomly chosen cavity spins $1, 52, and optimizing the values of these two spins we change a G N), into a GN,q—2 graph, so (SN 2 0, (Sq = —2. 0 Site addition: By adding a new spin so into the cavity, connecting it to k + 1 of the cavity spins say 51, . . . , 5H], and optimizing the values of the k + 2 spins $1,...,sk+2, we change a GM, into a GN+1,q_k_1 graph, so (SN 2 1, (Sq = —k—1. It is straightforward to see that if we start from a GN,2(k+1) cavity graph and per- form k + 1 link additions, we get a G N,0 graph, i.e. the original spin glass problem with N spins. Starting from the same GN,2(k+1) cavity graph and performing two site additions we get a G N+2,0 graph, i.e. the original spin glass problem with N + 2 spins. Therefore the variation in the global ground state energy when going from N to N + 2 sites, (E N+2 — E N), is related to the average energy shifts AE“) for a site addition and AEm for a link addition, through: EN+2 — EN = 21113“) — (k +1)AE(2>. (2.38) 52 FIG. A S 51 S2 S3 6 S0 S4 55 FIG.B S1 S2 5.6 S3 55 S4 FIG.C s 3 6 S Figure 2.6: An example, for the case k :2 2, of a GN,6 cavity graph where q = 6 randomly chosen cavity spins have two neighbors only. Fig. A: All the other N — 6 spins outside the cavity are connected through a random graph such that every spin has k + 1 = 3 neighbors. Fig. B: Starting from G M6 cavity graph we can create a GN+2’0 graph by adding two sites. Fig. C: Starting from G M6 cavity graph we can create a G N,0 graph by adding three links. [82]. 53 Using the fact that the total energy is asymptotically linear in N, the energy density of the ground state is, E 13 — E, k 1 u = AlganN = JET—Y = AB“) — %AE(2). (2.39) An intuitive interpretation of this result given by 2.39 is that in order to go from N to N + 1 one should remove (k + 1) / 2 links (the energy for removing a link is minus the energy for adding a link) and then add a site. I When q / N << 1, generically, the distance on the lattice between two generic cavity spins is large (it diverges logarithmically in the thermodynamic limit). It is thus resonable to assume that various cavity spins become uncorrelated. This is the basic assumption of the replica symmetric solution and it is treated in a detailed way in the next chapter. 54 Chapter 3 Phase Transitions in Random Combinatorial Problems In the last decade tools from statistical physics of frustrated systems are being used to gain a better understanding of complexity, by mapping the optimization prob- lems onto the study of the ground state of disordered models [83], as we showed in Chapter 1. Concepts and methods from statistical physics have been applied to the analysis of the typical properties of optimization problems and also to different search algorithms in which temperature is an important control parameter [67]. In Section (3.1) we give an introduction to phase transitions in random combinatorial problems discussed in Chapter 1, Section (3.2) presents a more detailed discussion of K—SAT and Section (3.3) discusses replica symmetry breaking in spin glasses. 3.1 Phase Transitions The observation of threshold phenomena in random mathematical and computer science problems has shown that not only ground state properties but also the critical behavior at glassy phase transitions occurring in disordered systems could 55 be relevant for complexity theory, due to the so called intractability concentration phenomena [60]. For many NP-complete problems one or more order parameters can be defined, and hard instances occur around particular critical values of these order parame- ters. In addition, such critical values form a boundary that separates the space of problems into two regions. One region is under-constrained so the density of so- lutions is high, thus making it relatively easy to find a solution. The other region is over-constrained and very unlikely to contain a solution. If there are solutions in this over-constrained region, they have such a deep local minima that any rea- sonable algorithm is likely to find it. If there is no solution, then a backtrack search can usually establish this easily, since potential solution paths are usually cut off early in the search. Really hard problems occur on the boundary between these two regions, where the probability of finding a solution is low but non-negligible. At this point there are typically many local minima corresponding to almost so- lutions separated by high energy barriers. These almost solutions form deep local minima that may often trap search methods that rely on local information. Because it is possible to locate a region where hard problems occur, it is possible to predict whether a particular problem is likely to be easy to solve. 3.1.1 Phase transition for the SAT problem The most widely used complete algorithms (i.e., algorithms which are able to ei- ther find a satisfying assignment, or to prove that there is no such assignment) were developed by Davis and Putnam [27], improved later by Longemann and Loveland and are known in the literature as (DPLL) algorithms. DPLL operates by directed trial and error, using a search tree made of nodes connected through edges as follows: (1) A node corresponds to the choice of a variable. Depending on the value of the latter, DPLL takes one of the two possible edges; (2) Along an edge, 56 all logical implications of the last choice made are extracted; (3) DPLL goes back to step (1) unless a solution is found or a contradiction arises; in the latter case, DPLL backtracks to the closest incomplete node, inverts the attached variable and goes to step (2). If all the nodes carry two descendent edges, unsatisfiability is proven. The probability P (1x) that an instance is satisfiable as a function of a, for differ- ent sizes N is shown in the lower part of Fig.(3.1). In addition to being a decreasing function of a as expected from above, a striking phenomenon happens as N grows. An abrupt decrease of the probability takes place at a critical value ac z 4.3. In- stances with less than aCN clauses are almost surely satisfiable, whereas the ones with more than aCN clauses almost never have any solution. The upper part of Fig.(3.1) shows the median time necessary to solve a random 3 — SAT instance, that is to find a solution (below the threshold), or check that there is none (above the threshold). The general pattern of the complexity as a function of a, namely easy, hard and less hard resolutions is a generic feature valid for commonly used algorithms for hard computational problems. Mathematicians have been able to establish rigorous bounds on the threshold, 3.14 3 ac S 4.51, but the exact value seems out of reach with the available probabilistic techniques [1], [2]. Approxima- tion methods developed by physicists in the course of the study of phase transi- tions allow not only to estimate the value of the threshold but also to unveil the microscopic structure of the solutions and the mechanism leading to their disap- pearance. We already showed in Chapter 1 the mapping between the SAT prob- lem and statistical physics: we translated the SAT problem with its ingredients (Boolean variables, clauses) to statistical physics (Ising spins, interactions). The main idea is to introduce an energy function, which is equal to the number of un- satisfied clauses for each variables-spins configuration, and study the ground-state properties. The satisfiability of the 3 — SAT problem is therefore equivalent to the vanishing of ground-state energy. The physical scenario that was obtained for ran- 57 The 4.3 Point >. a: E a .0 o I— 0. Ratio of CIauses-to—Variables Mltchell, Selman, and Levesque 1991 Figure 3.1: The 4.3 point for the 3-SAT problem 58 dom 3-SAT is the following [15], [14]. For a > ac the ground-state energy becomes positive with probability one, whereas it vanishes for a < ac. Finding the exact solution of random 3 — SAT remains an open problem. However, combining some exact results with approximated techniques it was found [88] ac z 4.48, which is just 5% larger than the numerical value. For a slightly lower than ac the number of solutions remains enormous 2mm [88], and then vanishes abruptly at ac. Hence, increasing slightly the number of clauses is enough to make all solutions disappear simultaneously. Moreover, immediately beyond the transition, a finite fraction of over-constrained variables assuming the same value in each optimal configuration abruptly emerges. This backbone of variables plays the role of an order parameter: it vanishes for a > ac and jumps discontinuously to a value 15% at the transition. Hence, the 3 — SAT transition may be interpreted as a first order transition. 3.1.2 Phase transition for the Coloring problem By adding one edge at a time, we can determine exactly which edge causes a graph to become uncolorable. We call two nodes in a graph as frozen if they preserve the same color in all legal colorings. Culberson and Gent [25] studied for the 3- Coloring problem, the development of frozen pairs (so pairs that have the same color under all three colorings of the graph) and they found a jump at the thresh- old. For the K-SAT problem one measure for the order parameter is the backbone which is the number of variables that are frozen to a particular value under all satisfying assignments. For the Coloring problem it is more difficult to find and measure the order parameter because of the symmetry of solutions. If we add an edge to a colorable graph and then the graph becomes uncolorable, then in the graph without the edge the pair of vertices must have been colored with the same color in every coloring of the graph. This measurement process is called the frozen development process and it was 59 0.35 f - I T Y T I Y T 0.3 025 ” 0.2 0.15 0.1 0.05 9 Figure 3.2: The ratio of frozen pairs to (’2') plotted against the ratio M / N, where M is the number of frozen pairs [24]. shown that this set of pairs shows an explosive jump at the threshold. By identify- ing pairs of vertices that are frozen, a collapsed graph was created which has the same set of 3-colorings. If the measure were to be taken with respect to minimiza- tion of violated edges, then this sudden drop would mostly dissapear. This is due to the fact that for large sets of edges removing any of the violated edges from the threshold would result in a distinct set of 3-colorings. This could explain the dis- crepancy between the jump that is observed and the lack in certain statistical me- chanics analysis of the 3-coloring phase transition [106]. From Fig.(3.2) we see that the freezing process is not gradual; the addition of one edge can cause many pairs to become frozen simultaneously. Fig.(3.2) and Fig.(3.3) shows how the number of frozen and free pairs grows as the edge density increases. The full development process was run on 100 permutations at each value of N = 50, 75, . . . , 225 in step of 25. This is typical for a phase transition, and the sharpness suggests that there 60 9 Figure 3.3: The ratio of free pairs to (’2’) plotted against the ratio M / N, where M is the number of frozen pairs [24]. 61 will be a discontinuity at N —+ 00. 3.1.3 Phase transition for the Vertex Cover problem Let us consider a random graph with N vertices V = 1,2, . . ., N where any pair of vertices is connected randomly and independently by an edge with probability p. The expected number of edges becomes p(g’) = pN2 + O(N) , and the average connectivity of a vertex is p(N — 1). As we defined in Chapter 1 the mathemat- ical formulation of the VC problem is as follows: Consider an undirected graph G = (V,E) wittherticesi E V = 1,2,--- ,N and edges (i,j) E E C V x V. A vertex cover is a subset ch C V of vertices such that for all edges (1, j) E E there is at least one of its endpoints i or j in WC. We call the vertices that are in WC covered, whereas the vertices in its complement V \ ch are called uncovered. We map the random graph to a disordered spin systems and we seek to mini- mize the Hamiltonian: 1 N H({51}1{li/}) = 5 Z lij5s,,—155j,—1 (3-1) i,j=1 where 1;,- = 1 whenever there is an edge between vertex i and vertex j and zero otherwise. The covering state of the vertices is mapped to a configuration of N Ising spins: S; = 1 if i E ch so if vertex i is covered and S,- = —1 if i E V — ch so if vertex i is uncovered. The vertex-cover decision problem asks whether there are covers of fixed cardinality xN = [chl or in other words, we are interested if it is possible to cover all the edges of G by covering xN suitably chosen vertices. We have to minimize the Hamiltonian H under the constraint: 1 N N S,- = 2x — 1 (3.2) .:1 l 62 which fixes the cardinality of the cover set, or in physical terms, it fixes the magne- tization of the Ising spin systems. If this make the energy equal to zero, then there are no uncovered edges and the decision problem can be positively answered. If a minimal energy greater than zero is found, there is no vertex cover of cardinality xN, case in which the minimum energy represents the minimum number of un- covered edges. When the number xN of covering marks is lowered (the average connectivity c, which can be seen as edge concentration is kept constant), the model is expected to undergo a coverable-uncoverable transition. Using probabilistic tools, rigor- ous lower and upper bounds for this threshold [47] and the asymptotic behavior for large connectivities have been deduced [43]. In Fig.(3.4) the average ground- state energy density and the probability Pcov(x) that a graph is coverable with xN vertices are shown for different problem sizes N = 25, 50, 100. The average con- nectivity considered is c = 2, but qualitatively equivalent results are found for other values of c too. The results were obtained using the branch-and-bound al- gorithm presented with data averaged over 103 (N = 100) to 104 (N = 25,50) samples [56] [111]. As expected, the value of Pcov(x) increases with the fraction of covered vertices. Growing the size of the graphs , the curves become steeper. This indicates that in the thermodynamic limit N —> oo , a sharp threshold at xc z 0.39 appears. Above xc a graph is coverable with probability tending to one in the large N limit, while below xc it is almost surely uncoverable. In the language of a physi- cist, a phase transition from a coverable phase to an uncoverable phase occurs and it is usually denoted as the coveruncover transition. 3.1.4 Phase transition for the Number Partition Problem In Chapter 1 we defined the number partitioning problem as follows: given a se- quence of positive real numbers a1, a2, ..., a N the NPP consists of partitioning them 63 vaYYvi‘r j’Yva‘T‘VTY ffj 1. T If 08’ 0.6 0.4 o--—o N=25 fl-—--—D N=50 +——+ N=100 A 0.2 A A A 0.2 0.3 0.4 0. 5 Figure 3.4: Probability Pcov(x) that a cover exists for a random graph (c = 2) as a function of the fraction x of covered vertices. The result is shown for three different system sizes N = 25, 50, 100 (averaged for 103 — 104 samples). Lines are guides for the eyes only [57]. into two disjoint sets A1, A2 such that the difference I Edie/(1 a 1- — Ewe/(2 “1' is mini- mized. When the difference between the two sums is less than or equal to one then the partition is called perfect partition, otherwise is called imperfect partition. Fu [44] claims that in the random number partitioning problem no phase tran- sition of any kind is found. If Fu were right, NPP would be a notable exception to the observation that many NP-complete problems do have a phase transition, pa- rameterized by a control parameter that separates the easy from the hard to solve instances [21]. Mertens [79], [80], computed the expected number of perfect partition and for this they considered the binary representation of the n integers that are being par- titioned. They considered the case when in the bags they have even sums where each bag must add to the same target sum. In order to develop an annealed theory, the probabilities are averaged independently over the different digit positions. On average we expect half of the possible partitions to add up to a number with the same parity as the least significant bit to the target sum. The expected number of perfect partitions, reads: n < Sol >= 27 (3.3) where it was considered that the numbers were drawn unifome and at random from (0.1] . The expected number of solutions < Sol > and the problem size n defines a constrainedness parameter: _ log2 < Sol > n K = 1 (3.4) If K is small then the problem is under-constrained and there are a large number of solutions compared with the problem size. If K is large, the problem is over- constrained and only few solutions exist. The transition gets sharper with increas- ing 11. Finite-size scaling lead Gent and Walsh to find KC 2 0.96 for N —I 00. This 65 result is in contradiction with Fu’s claim but this type of phase transition can in- deed be found in the statistical mechanics of the number partitioning problem. Substituting the annealed value of solutions (Sol ) from Eq.(3.3) into Eq.(3.4) we get: K = _10:5221 (3.5) To estimate the critical value of K, Gent and Walsh [50] found the probability that a bag with an even sum has a perfect partition as a function of K for n varying from 6 to 30 and logzl from 0 to Zn. They generated 1000 problems at each value of A and 11. Similar results are seen using bags with an odd sum, and bags with both odd and even sums. As predicted, a phase transition occurs around K m 1 with the transition sharpening as N increases. Finite size scaling methods were applied to determine how the probability scales with problem size [9]. Around some critical point, it was predicted that problems of all sizes will be indistinguishable except for a change of scale. This suggests: K'—'KC Prob(perfect partition) 2 f (( )nl/V) (3.6) Kc where f is a fundamental function, KC is the critical point and n“ 1’ provides the change of scale. The fraction (K—Efi) plays the role of the reduced temperature (T — Tc)/TC in physical systems. Eq.(3.6) has a fixed point where K 2 KC and for all N, the probability is the constant value f (0). In order to estimate KC the fixed point was taken where the spread in probabilities is the smallest, which gives Kc = 0.96 :l: 0.02. To estimate 1/ the assumption that Eq. (3.6) holds at the point 50% was made, which gives u = 1 i 0.3. Then we can define a rescaled parameter: 1c -— 0.96 7 _ 0.96 " (3‘7) 66 Gent and Walh plotted the probability of a perfect partition as a function of '7. They found that the finite-size scaling provides both a simple and accurate model for the scaling of probability with problem size. A similar rescaling of the constrainedness parameter 1c describe the phase transition in satisfiability [48], constrained satisfac- tion [67] and traveling salesman problem [49]. 3.2 More details on K-SAT problem 3.2.1 Known results for the K-SAT problem As noted in Section 3.1.1, when the number of clauses becomes of the same order as the number of variables (M = aN) and in the large N limit (the thermodynamic limit), the K-SAT shows a striking threshold phenomena. Numerical experiments have shown that the probability of finding a solution (i.e. a correct Boolean as- signment) falls abruptly from one down to zero when it crosses a critical value ac( K) of the number of clauses per variable. Above ac (K) all clauses cannot be sat- isfied any longer and we are interested in minimizing the number of unsatisfied clauses, which is the optimization version of K-SAT referred as MAX-K-SAT which we already introduced in Chapter 1. Near aC(K), heuristic search algorithms get trapped in non-optimal solutions and a slow down effect is observed. We give a brief review of known results for the K-SAT problem that have been obtained in the framework of complexity theory. 0 For K = 2, 2 — SAT problem belongs to the class P of polynomial problems. For a > ac, MAX —- 2 — SAT is NP-complete [46]. Mapping of 2 — SAT on graph theory allows to derive rigorously the threshold value ac : 1 and an explicit 2 — SAT polynomial algorithm has been developed [5]. 67 o For K 2 3, both K-SAT and MAX-K-SAT belong to the NP-complete class. From a rigorous point of view only upper and lower bounds of aC(K) are known [29], [51]. Finite size scaling techniques, have been allowed one to determined precisely the numerical values of ac for K : 3, 4, 5, 6 [67]. o For K >> 1, clauses become decoupled and an asymptotic expression 01,; z 2" 1n 2 can be found. 3.2.2 Statistical mechanics of the K-SAT problem For a = M/ N > 0, K-SAT can be cast in the framework of statistical mechanics of random diluted systems by the identification of an energy-cost function E (K, or) equal to the number of violated clauses [66], [88], [1’], . The study of its ground state allows one to address the optimization version of the K-SAT problem as well as to characterize the space of solutions by its typical entropy, i.e., the degener- acy of the ground state. The vanishing condition on the ground state energy for a given K corresponds to the existence of a solution to the K-SAT problem and thus identifies a critical value aC(K) below which random formulas are satisfiable with probability one. For a > aC(K), the ground state energy becomes nonzero and gives information on the maximum number of satisfiable clauses, i.e.,‘on the MAX-K-SAT problem. As already defined in Eq.(3.8), the energy cost function (the number of violated clauses) is given by: M N E[C, S] = 2 5(2 Chs, — K) (3.8) 1:1 1:1 subject to the constraints 2111(3051 : K and 2111 C121, 2 K, I = 1...M. To com- pute the ground state energy, a fictitious temperature 1 / B is introduced to regular- ize all mathematical expressions and B —> 00 is taken at the end of the calculations. In this way we compute the free energy density at inverse temperature )3, averaged 68 over the clauses distribution: 1 Hp) : —fi—Nan[C] (3.9) with ZlCl = [am—1151081) (3.10) 51 The energy given by Eq.(3.8) is self averaging and can be obtained from the free en- ergy defined in Eq.(3.9). The overline denotes the average over the random clauses and is performed using the replica trick: __ l1 _ an = lim Z 1 11—90 11 (3.11) One prepares n replicas of the original system, evaluates the configurational aver- age of the product of their partition function Z", and then takes the limit 11 -—I 0. This technique, the replica method, is useful because it is easier to evaluate 7 than W. The typical properties of the ground state will then be recovered in the B —-> oo limit. 3.2.3 The simplest case, K = 1 The K = 1 case can be solved either by a direct combinatorial method or using the statistical physics approach. Moreover, the K = 1 case allows us to check the correctness of the statistical physics approach [89]. For this case, a sample of M clauses is completely described by giving the numbers t,- and f,- of clauses that a variable S,- must be true or false respectively. Then N Z[t, f] = Hoe—13“ + 1.2—(3ft) (3.12) i=1 69 so for the average disorder we have 1 —— M! Nan[t,f] = Emlnzmm = ln2- F?- +Iziooe”“lj(a)ln[cosh(%l)] (3.13) where I, denotes the 1”’ modified Bessel function. The ground state energy (so the average number of violated clauses) at zero temperature is: E55 2 gm — e—“10(a)— e‘“11(1x)] (3.14) Using a well known formula from thermodynamics E = U — T5, the ground state entropy (defined as % ln( number of degenerate ground states)) at zero temperature is then found to be: SGS = e—“Io(a) 1112 (3.15) These results are recovered using the RS ansatz for all a and B when K = 1. The finite value of the ground state entropy may be ascribed to the existence of un- frozen spins. The non-zero value of the ground state energy is due to the presence of completely unfrozen spins of magnetizations. Fig.(3.5) shows the plots of the above energy and entropy at zero temperature. 3.2.4 Replica symmetric solutions for all K The fraction of violated clauses is given by }l 1 e = —N—afian[C] (3.16) and the ground state energy depends only upon the magnetizations of order :1: (1 — O(e“z(3)) if any, and such contribution can be described by the introduction of the 70 '07: 0.69 0.5- 0.4; / 0.3: -,/ . 0.23 x 0. I l r’/ Figure 3.5: Typical fraction of violated clauses (bold line) and entropy (thin line) vs. a for K = 1 in the limit N ——> oo [89] new rescaled function R(z) = minding—é»; mug» (3.17) Also R(z) defined in Eq.(3.17) obeys the saddle-point equation: K—l °° u aK °° , R(z) = [mi—gcos(uz)exp[-F+aK/O gdz)R(z,)cos(u mm(1,zl,...zK_1))] (3.18) Then, the corresponding ground-state energy reads 00 K E6501) 2 11(1 — K)/O Hdz)R(z,)min(1,zl,...zK) + 1:1 01K °°K—1 °° —/ Hdz,R(z,)min(1,zl,...,zK_1) —/ dzR(z)z (3.19) 2 0 1:1 0 The saddle-point Eq.(3.18) is a self-consistent identity for R(z) in the range 2 E [0, 1] only. Outside this interval, it is just a definition of the functional order pa- 71 rameter R: R(z) = if [711111.160 — I) (3.201 lz—oo with 71 given by 1 —€—7‘ [0(71l]1<_1 11 = «K[ 2 (3.21) Inserting Eq.(3.18) in Eq.(3.17) we obtain 565(4) 2 1(1— [7110011) — Ke—l‘lihill (322) 2K So, in the RS context, the SAT to UNSAT transition corresponds to the emergence of peaks centered at x = :tl with finite weights, that is to a transition from '71 = 0 to 71 > 0. This result is in good agreement with our results using combinatorial methods presented in Chapter 4. In addition to Eq.(3.20), there exists other RS solutions to the saddle-point Eq.(3.16). For instance if we choose 20 = 1/ 2, the insertion process ends up after two iterations and generates Dirac peaks centered in all integer and half-integer numbers. More generally, for any integer p 2 1, we can define the solution of Eq.(3.18): R(z) = i r16(z — %) (3.23) I=—oo having exactly p peaks in the interval [0,1], whose centers are 21 = I / p, l = Qqu—L The self-consistency of Eq.(3.21) admits the solution 71 = 0 for any a. When K = 2, there is another solution 71(a) > 0 above a = 1 which maximizes E05 [83]. Therefore, the RS theory predicts that E05 : 0 for a S 1 and increases continu- ously when a > 1, giving back the rigorous result at ac(2) = 1. The transition taking place at ac is of second order with respect to the order parameter. As far as 2 — SAT is concerned the value of the threshold is correctly predicted and the RS 72 solution is exact for any value of a. For the same case K = 2 but for a 2 ac, the vanishing of the exponentially large number of solutions that were present below the threshold is surprisingly abrupt. It was shown [88] that this transition is due to the abrupt appearance of contradic- tory logical loops in all solutions at a 2 ac and not to the progressive decreasing of the number of solutions down to zero at the threshold. It was found [89], [83] ,that for K 2 3 and for sufficiently big values of a, the RS entropy is negative, whereas of course it has to be the logarithm of an integer number. This result is a consequence of replica symmetry breaking (RSB) effects. Despite the general complexity of such anapproach in diluted models and the technical difficulty of the K — SAT problem, recent attempts in this direction are successful [81]. In the next section we will describe the physical meaning of (RSB) for spin glasses without focusing on mathematical details. 3.3 Physical meaning of breaking the symmetry Initially was suspected that the negative entropy might have been caused by the inappropriate exchange of limits n —> 0 and N ——> oo in deriving the free energy. The correct order is N —> 00 after 11 —+ 0, but we take the limit N ——> 00 first so that the method of steepest descent is applicable. However it has been established that the assumption of replica symmetry q“); = q, V (a, B) : a 71$ B (the parameter q is introduced below) is the real source of the trouble. 3.3.1 Replica symmetric solution for the Sherington-Kirkpatrick model In order to understand the physical meaning of replica symmetry and replica sym- metry breaking we introduce a well studied infinite-range spin glass model, the 73 Sherington-Kirkpatrick (SK) model. For the SK model the Hamiltonian is given by: HAS] 2 — Z IiijSj — h ES,‘ (3.24) 1 1§i TC 2 1, and for T < TC there is another solution (the physical one) where q is different from zero. So, there is a phase transition at T 2 1, h = 0 while there is no transition at h ¢ 0. Although everything looks okay, detailed computation made by Sherington and Kirkpatrick [105] shows that the entropy becomes negative at small temper- ature. At zero temperature 5(0) = —1/27r z —1.7 which is unphysical so in this case the replica symmetric method leads to a disaster. On the other hand the 76 l'dlL met tem tem 1111 1111 Re; pa: pre As wt value of many thermodynamic functions computed within the above replica sym- metric solution do not disagree too much with the numerical data, even at low temperature where it is incorrect; for example the internal energy per spin at zero temperature (the ground state energy) gives 11(0) = —(2/7r)1/2 a. —0.798 (3.37) while numerical simulations [68], [59] give: U(0) = —0.76 i 0.01 (3.38) which are not too different from each other. 3.3.2 Experimental evidence of replica symmetry breaking Replica symmetry breaking affects the equilibrium properties of the system and in particular the magnetic susceptibility. For example let us consider a system in the presence of an external constant magnetic field, with the Hamiltonian given by: H[s] = Ho[s] + E125,- (3.39) 1 As soon as replica symmetry is broken we can define two magnetic susceptibilities which are different: 0 The magnetic susceptibility that we obtain when the system is constrained to remain in a valley. In the limit of zero magnetic field this susceptibility is given by X017 2 5(1 “ LIE/1)- . The total magnetic susceptibility (the system is allowed to change state as an effect of the magnetic field). In the limit of zero magnetic field this suscepti- 77 we ter let We Sa 911 th 20 so 40 50 60 7o 80 90 T00 Figure 3.6: FC- and ZFC-magnetization (higher and lower curve respectively) vs. temperature of C u(M n13.5%), H = 1 Oe. For such a low field the magnetization is proportional to susceptibility [28]. bility is given by 201 = 13 f qu(q)(1 - q) S 13(1 — 45A). Both susceptibilities are experimentally observable. The first susceptibility is when we measure if we cool in zero field and then add a very small magnetic field at low temperature. The field should be small enough in order to neglect non-linear ef- fects. ‘In this situation, when we change the magnetic field, the system remains inside a given state and it is not forced to jump from a state to another state and we measure the zero field cooled (ZFC) susceptibility, that corresponds to x04. The second susceptibility can be approximately measured doing a cooling in the pres- ence of a small field: in this case the system has the ability to choose the state which is most appropriate in presence of the applied field. This susceptibility, the so called field cooled (FC) susceptibility is nearly independent of the temperature and corresponds to xeq . To conclude, we can identify X01; and xgq with the ZFC susceptibility and with the FC susceptibility respectively. The experimental plot of the two susceptibilities 78 is shown in Fig.(3.6). They are clearly equal in the high temperature phase while they differ in the low temperature phase. The difference among the two suscepti- bilities is a signature of replica symmetry breaking and it can be explained in this framework. This phenomenon is due to the fact that a small change in the mag- netic field pushes the system into a slightly metastable state, which may decay only with a very long time scale. This happens only if there are many states which differ one from the other by a very small amount in free energy. 79 Chapter 4 Geometric Approach As is evident from the first three chapters, there is intense interest in the relations between statistical physics and computational complexity, from both the computer science and physics communities [30, 83, 85]. The physics approach to hard prob- lems is based on replica methods which is quite complex. In this chapter we de- velop methods to study hard problems analytically using methods similar to the message passing procedures that were discussed in Chapter 2. In the first part we introduce the network algorithms that are going to be used in applications from biology. In the second part we cover networks algorithms that have evolved from applications of statistical physics to hard computational problems. 4.1 Introduction The k—connectivity and k—core problems [99] have attracted interest, for example in designing redundant networks [69]. k—connectivity is the generalization of the conventional connectivity percolation problem to the requirement of k—fold con- nectivity. That is, a graph is k-connected if for each pair of vertices in the graph there exist at least k mutually independent paths connecting them. The k—core of a graph is the largest subgraph with minimum vertex degree k. The Bethe lattice 80 equations for the k-core were actually first derived in the context of k-bootstrap percolation [20]. k-bootstrap percolation is the percolation process found by re- cursively deleting all nodes, which have connectivity less than k. More recently the Bethe lattice k—core equations have been used to develop theories for rigidity percolation [31,91,94]. We give a brief introduction to the connectivity percolation and g— rigidity equations on Bethe lattices and then describe similar percolation processes which are important in the Viana-Bray spin-glass model, the coloring problem and the K-SAT problem. For these problems we develop equations for the probability that an infinite frozen cluster emerges. We then show that in the simplest approximation, this formalism reproduces the replica symmetric equa- tions in a surprisingly straightforward manner. Frozen order is a unifying concept in the analysis of glasses and geometrically frustrated systems in physics [13] and also in NP-complete problems in computer science, such as coloring [25] and K-SAT [30]. Frozen long-range order is most eas- ily understood at zero temperature. At zero temperature the paradigm geometry is to fix the variables on a surface of the system and then to test whether these frozen degrees of freedom cause the propagation of frozen order into the bulk of a sample. A spin is frozen only if the spin is fixed or constrained by the spin con- figurations of its neighbors, as we shall demonstrate explicitly below using the Viana-Bray model. Frozen order may occur even though the variables (e.g. the Spins in a spin glass) at each vertex of a graph look random. Furthermore, not all of the variables in the system need to be frozen. However for the system to be in the frozen ordered ground state, the frozen component must percolate. As discussed in Chapter 1 the vertex q-coloring problem is equivalent to find- ing the ground state of the q-state Potts anti-ferromagnet [114]. Each node of a complex graph may have any one of q colors. The objective is to find the color configuration which minimizes the number of edges which have the same color at 81 each end. The propagation of frozen color has many conceptual similarities with the propagation of rigidity in central force networks [61,92]. However there is a key difference, which makes the coloring problem NP-complete whereas the rigid- ity problem is polynomial. The key difference is that the constraints in coloring are distinguishable while the constraints in rigidity percolation are not. Spin glasses and many frustrated antiferromagnets map exactly to problems in the NP-complete class [83]. NP-complete problems are of central interest in computer science (CSE) [46] and have motivated many attempts to design quan- tum algorithms for their efficient solution. The phase transitions which physicists study often correspond to a change in the computational complexity of the corre- sponding CSE problem. Since these problems are of enormous interest in physics, CSE and also in practical applications it is not surprising that there is a burgeoning of efforts to better understand the phase transition which occurs in NP-complete problems. The physics community has applied the replica method to NP combinatorial problems with remarkable success [45, 56, 63, 89,90, 95, 108]. In addition new algo- rithms have been developed based on a combination of replica symmetry break- ing ideas from physics and belief propagation ideas from the artificial intelligence community [18, 85, 86]. Though the replica method is an excellent tool, it is quite difficult both technically and intuitively. In this chapter we show that a simple combinatorial procedure based on percolation ideas can reproduce many of the Successes of the replica method. The percolation process occurring at the phase transition can be thought of as either percolation of constraint or percolation of frozen order. We derive the replica symmetric theories for K-SAT, the Viana-Bray model and coloring using percolation concepts. Elsewhere we show how these procedures lead to new algorithms for hard problems. In the next section of we give a brief review of the analysis of connectivity 82 percolation on Bethe lattices and random graphs, and also describe its extension to k-connectivity percolation. Also we describe the analysis of the glass transition, at T = 0, in the Viana-Bray model. Then we focus on the coloring problem, and in the last section we present an analysis of K-SAT. 4.2 Connectivity and Rigidity percolation Percolation on diluted Bethe lattices was analyzed by Fisher and Essam [41], who defined the probability that a node is part of the infinite cluster, T. They found that the probability that a node is not on the infinite cluster, Q = 1 — T, only requires that all of its connected neighbors also not be part of the infinite cluster, so that Q = (1 - P(1 — (2))“ (4.1) where p is the probability that an edge is present in the Bethe lattice, and a = z — 1, where z is the co-ordination number of the Bethe lattice. Note that this expression may be written as, T2“ 0‘ T’1— T“ 4.2 E(JWH P) () which is more convenient for the generalization to rigidity percolation. From Eq.(4.2), it is easy to show that there is a phase transition at pc = Eli and that T ~ (p — pc) near the critical threshold. The phase transition is thus continu- ous with order parameter exponent one. Somewhat earlier, this transition was also studied in the graph theory community by Erdos and Rényi [33]. They con- centrated on random graphs, which consist of highly diluted complete graphs. A complete graph is a graph where every node is connected to every other node. In fact they defined p = c/ N, where c is finite and showed that a giant (extensive) connected cluster emerges at c = 1. They derived an equation for the probability 83 that a node is on the giant cluster, 7. Their equation is found from Eq.(4.2), by taking the limit p = c/N, N = z —I 00, to find 7 = 1 - e“C7. Near the critical point '7 ~ 2(c — 1) / c2 so, as expected based on the universality hypothesis, '7 also has an order parameter exponent of one. Rigidity percolation on Bethe lattices is described by a simple generalization of Eq.(4.2). In this generalization, each node has g degrees of freedom. For example if we wish to model rigidity percolation on central force networks, then g = d, where d is the lattice dimension. In order to make a giant g-rigid cluster, we need to constrain the g degrees of freedom at each node with at least g bonds, so we generalize Eq.(4.2) to: (I Tg : E (3‘) (pTg)’(1 — pTg)“_I (4.3) 12g which is the simple generalization of Eq.(4.2) to the requirement of at least g — 1 neighbor connections. Eq.(4.3) was first discovered in the context of a Bethe lattice theory for Bootstrap percolation [20] and has been used more recently to develop a Bethe lattice theory for rigidity percolation [31,91,94]. In the random graph limit, Eq.(4.3) reduces to, '13— e‘cls )2— C]—%— (4.4) When g = 1 this gives the Erdos and Rényi result [33] for the emergence of a gi- ant cluster in random graphs, while for g > 1, there is a discontinuous onset of a finite solution at a sharp threshold cg [94]. Numerical solution of Eq.(4.4) indi- cates that for g = 2, c2 2 3.3510(1). This value has also been found in a recent mathematical analysis [99] of the threshold for the emergence of the giant 3-core on random graphs. The k-core problem is equivalent to the k-bootstrap percola- tion problem. However the k + 1-core is in general different than the k-rigidity problem, and even on Bethe lattices and random graphs there are some important 84 differences.The most important difference is that for g-rigidity, the finite solution T3 is metastable for a range of c > cg [31,94]. The true rigidity transition actually sets in at c, > cg and is identified using constraint counting arguments [31,91]. Nevertheless the probability of being on the infinite rigid cluster is correctly found from Eq.(4.4), provided c > c,, where c, is the rigidity threshold [31,91]. As we shall see below the analogous theories for glassy combinatorial prob- lems, in particular the Viana-Bray model, K — SAT and q-coloring, provide solu- tions at the level of the replica symmetric theory. Moreover, the methodology we introduce here can be used to develop simple and accurate recursive algorithms for these glassy problems on general graphs [40]. In the case of first order transitions, as occurs in q-coloring (with q 2 3) and for K-SAT (K 2 3), the transition point we find below marks the onset of metastability. In order to find the true threshold we need the ground state energy. 4.3 Viana-Bray model We first analyze the onset of frozen order in the Viana-Bray(VB) spin-glass model [110], which provides a basic model for disordered and frustrated magnets, such as Eu JrSr1_ x5 [77]. The Hamiltonian for the VB model is, H 2 £1,755]- (4.5) .ij where S; 2 i1. The exchange constants 1;,- are randomly drawn from the distribu- tion, 1 1 - 131201)) = Pl§5(lij + I) + 5001) - 1)] + (1 — 105(k)) (4-6) We focus on the random graph limit p = c/ N and we introduce the following probabilities: 85 P = probability a site is frozen in the up state M = probability a site is frozen in the down state D = probability a site is degenerate In the absence of an applied field and within a symmetric assumption, P = M and D = 1 — 2M. We then need to consider only one of these probabilities. However for clarity and for ease of generalization, we continue to include M and P sepa- rately. In terms of these order parameters, the magnetization is given by m = P-M and the spin glass order parameter is q = P + M. The recurrence formula for P, using p = c/N is, i i “' P: ' k=01=k+1 k!l!(a — k — 1)! CP cM k cM cP (—+—(—+—)—5 2N 2N 2N 2N N(M + P))"‘“"" (4.7) [(1 This is understood as follows. If a bond connects a site at the lower level to a site at the upper level then the site at the upper level will be frozen up: if the connecting bond is ferromagnetic and the lower level spin is frozen up; or if the connecting bond is anti-ferromagnetic and the lower level spin is frozen down. This event has probability cP/2N + CM / 2N . Similarly, the probability that a spin at the upper level of the bond will be frozen down (negative) is given by, cM / 2N + cP/2N. The newly added spin at the upper level is frozen up if there are a larger number of connections from the upper to the lower level which prefer the frozen up state. The sum in Eq.(4.7) is thus restricted to events of this sort. The event (1 — c(P + M) / N) is the probability that a site at the lower level in the tree is either degenerate or disconnected from the newly added site. In the large N limit, Eq.(4.7) reduces to: oo oo (52‘! )k+I kll! = 1 — e—C‘710(cq) (4.8) where we have used the fact that we are considering a case where the magnetiza- tion 111 = 0. In that case, M = P = q/ 2, where q is the spin glass order parameter. 10 is the spherical Bessel function of zeroth order. The result given by Eq.(4.8) has been found before within the replica symmetric solution to the Viana-Bray (VB) model [63]. Thus symmetric constraint percolation (C P) in the VB model is equiva- lent to the ground state spin glass transition as found within the replica symmetric approach. The C P approach is attractive because is it is simple, it avoids the math- ematical difficulties of the replica method and it is physically transparent. The construction we have used makes it clear that simple connectivity is sufficient to ensure propagation of spin glass order in the VB model. Constraint percolation occurs at c = 1 and the order parameter approaches zero as q ~ fiflc — 1), so the CP transition in this case is continuous, with the same exponent as the Erdos-Rényi transition. 4.4 K-SAT As discussed in previous chapters, the satisfiability problems we consider ask the following question: given a set of binary variables, 2,- = 0, 1 or equivalently z,- = True orFalse, is it possible to satisfy a specified set of constraints on these variables? In the K — SAT case, a typical constraint is of the form, (Z; /\ 21° /\ Zk) (4.9) where /\ is the logical AND operation and the overline indicates a negated vari- able. This logical clause is satisfied (SAT) if any one of the variables in the clause is SAT. The variables 2,- and 2;, are SAT if they are true (T), which we take to be z; = 2,, = 1, while the variable 2]— is SAT when 2,- is false (P), which corre- sponds to z]- : 0. We shall also fix the number of variables in each clause to be 87 K, which is the K-SAT problem. In these SAT problems we shall randomly choose a set of M clauses and try to find the assignment of the binary variables which minimizes the number of violated clauses. Each variable appearing in a clause is negated with probability 1/ 2 and the number of variables is N. The key ratio is 01 = M / N. We would like to find the threshold for constraint percolation. That is, what is the threshold for the appearance of a giant cluster of clauses where each clause is completely specified or frozen. These completely specified clauses cannot be altered without increasing the total number of violated clauses, so that they are non-degenerate. There are three types of clauses in an optimal assignment of a formula: (i) Clauses that are SAT but are degenerate; (ii) Clauses that are SAT but are frozen; (iii) Clauses that are UNSAT but are degenerate. Only type (ii) clauses propagate constraint. We make a tree construction of the factor graph for the K — SATproblem as it is shown in Fig.(4.1). The probability that a variable is frozen and part of the giant frozen cluster is V and the probability that a clause is frozen and part of the giant frozen cluster is F. The branching of the variable nodes has maximum co- ordination M, but the probability that a link actually exists between a node and clause is p = K / N. We start by assuming that a variable is frozen at level 1 (as it is shown in Fig.4.1) and then determine the consequences of this assumption at levels 2 and 3. The probability P that level 2 clause is frozen, given the probability V, that a variable is frozen at level 1 is given by: F =(Z)K-1. (4.10) 2 ‘v ’7 \- / \ / \ J U Q Q) C) C) O b 1 V = probability a variable node is frozen F = probability a clause node is frozen Figure 4.1: The factor graph used to construct the recurrence relations. The circles denote variable nodes, while the square nodes are the clause nodes. V is the prob- ability that a variable node is frozen, while F is the probability that a clause node is frozen (see the text). We assume that a variable at level 1 is frozen and find the probability that a variable at level 3 is frozen. The clause nodes have co-ordination K, while the variable nodes have co-ordination M This equation is understood as follows. In order for a clause at level 2 to be frozen by the variables at levell, all of the level 1 variables to which it is connected must be frozen and in conflict with the assignment in the clause. This imposes a fixed assignment on the variable 3. This is the only configuration of variables at level 1 which propagates constraint through a clause to level 3. Now we must consider the cumulative effect of all of the clauses which are connected to the variable at level 3. There are M — 1 such clauses of which a fraction P propagate constraint (are frozen) according to the mechanism of the previous paragraph. Some of these frOzen clauses propagate the requirement x and others propagate the requirement f- The variable at level 3 then has three possible states: P = positive, N = negative and D=degenerate. The state of the level 3 variable is degenerate if the number 0f constrained connections which favor the positive state (x) is the same as the Ilumber of connections which favor the negative state (E). The probability this 89 variable is frozen (i.e. either negated or not) is V = P + N = 1 — D as we are considering the case where the probability that a variable is negated is 1 / 2. It is straightforward to generalize to the case of unequal probabilities. The probability that the node at level 3 is degenerate is then: _ M M! PF 2k M-2k D—g(k!)2(M—2k)!(2) (I‘m?) (4'11) Where we have used the fact that the probability that a connection occurs between a variable node and a clause node is p = K / N. Eq.(4.11) is understood as follows. The probability that a clause at level 2 is frozen and connected(i.e. it propagates constraint), and it requires the variable at level 3 to be x is pF / 2. The probability that a clause propagates constraint and it requires the variable at level 3 to be i is also pF / 2. The variable at level 3 is degenerate if these two events occur an equal number of times, hence the term ( pF/2)2k . The combinatorial factor gives all ways of arranging these events, taking into account that the x and E events are distinct. In the thermodynamic limit, using pM —— ocK, we find, °° 1 aKF aKF E 2k 0 2 Then we can write: v = 1 — D = 1 — e"“KFIO(aKF) (4.13) Where 10 is the spherical Bessel function of zero order. Note that 10(0) = 1. For Completeness, we note that the probability that the new variable is frozen in the Positive (not negated) state is: M! pF M M —— k I M—k—l P " gig. (0)2014 —k—I>!( 2 W1 ‘ pp) ‘4'”) The probability that the variable is frozen in the N state is the same as P for the case 90 o" A ' ._ 0.5 1‘ {.5 2 Figure 4.2: The probability that a clause is frozen, F, as a function of a, for 2-SAT. we are considering, where the variables have equal probability of being negated and not negated. Equations (4.14) and (4.10) provide the self consistent theory for the onset of a giant constrained cluster in K-SAT. We now analyze this theory for the two typical cases. The 2-SAT case (K = 2) In this case Eq.(4.10) is F = V / 2. Expanding Eq.(4.13) in powers of P, we then have, P = [1 — (1 — 2011-“ + 21121?2 +..)(1+ a2F2))] (4.15) NIH This has the trivial solution a = 1. It also has the non-trivial solution 2 . F ~ WM — 1) with (a — 1) <1 (4.16) Thus the random 2 — SAT giant cluster emerges smoothly at a = 1. Numerical Calculation of F from equations (4.10) and (4.13) is presented in Fig.4.2. The K 2 3-SAT case. In this case, Eq.(4.14) and Eq.(4.10) do not have a solution 91 01‘) ' f 0.05-- l ~ I l l 0 . a . . . - . .-7. , . . 4 4.5 5 5 5 6 Figure 4.3: The probability that a clause is frozen, F, as a function of oz, for 3-SAT. with a smooth behavior near the critical point. However they do have a non-trivial solution which has a discontinuous onset at a threshold value, «C(K). This solu- tion is found by iteration of Eq.(4.10) and Eq.(4.13) and the result is presented in Fig.(4.3). We find that although the emergence of the giant cluster is discon- tinuous, for any K > 2 the size of the first order jump decreases quite rapidly with increasing K. This indicates that the K-SAT transition is weakly first or- der and that an analytic analysis at large K is possible. The 3-SAT critical value which we find, ac(3) % 4.6673(3), is consistent with the replica symmetric solu- tion [89] for the metastability point, and significantly higher than the numerical Values for the K-SAT transition which lie around 4.3 [85]. The numerical results We have found (using Eq.(4.14)) for the metastable point and the jump in F at that point: 015(3) = 4.6673(3), 61-} 2 0.0680(1); 00(4) = 11.833(1), (SEC 2 0.0341(3); “c(S) = 2991(1), (SE; = 0.016(1) ; ac(6) 2 641(1), JFC = 0.0071(1). The lower branches (the unstable solutions for equations (4.10) and (4.13)) were found using the half-search interval method. 92 006: 005 00-4- 0.03- 002' 001- 0 . . , .1 . . . . _. , . ,. . II 115 12 125 13 HS 14 Figure 4.4: The probability that a clause is frozen, P, as a function of a, for 4-SAT. 4.5 Energy per variable For the K - SAT problem the energy gain of removing a variable node and all clauses connected to it reads: 8E 3E The energy gain of removing a clause is: e, : E(N,M) — E(N,M _1) z 38—; Also we can write: B_E_8(Ne)_ NEE— —aa—€ 3N 3N 81x 0N 8a and 3E __ 8(Ne) Be 301 36 _._. _— a—M— .9111 “5.151112% 93 (4.17) (4.18) In this way, Eq.(4.17) and Eq.(4.18) become: Be Be ev — e — aa—a + Kata (4.19) and 8E Be eC—E(N,M)—E(N,M—1) — 0W1.- 3a (4.20) Plugging Eq.(4.20) into Eq.(4.19) we obtain the true energy per variable for the K — SAT problem 6 : 5.11—(K —1)t‘t€c (4.21) For the 3 — SAT problem Eq.(4.21) is the same as the one obtained by Mezard [86] using the cavity approach. The energy density for removing a variable node reads: K 00 00 1 x [(+1 2 )2 )2 Km (2) (I = 31+ 252 (4.22) K=OI=K+1 ' ' with S1 and 52 given by: 00 _ K x 2K 81 : K208 x(K!)2 (a) (4.23) 00 00 K x 2K+n = E X) Win—1(5) (4.24) 94 In our notation x : aKF with F given by Eq.(4.10). Using the identities: 00 x 2K+Il Y KZOK !(KI<)!+n (E) : §1n+l(x) and we have for 51 and 52: S1 = ’ €_II](X) (4.25) and 82 = 2(1— e_x(10(x) + 211(x))) (4.26) Using Eq. (4.25) Eq.(4.26) and Eq.(4.22), the final expression for energy density when we remove a variable node reads: X e. = 5 (1 — e-xuom + 1100)) (4.27) The energy density when we remove a clause node is given by: K —x 6c: (K) = il—e 10(x) (4.28) 2 OK 2 where V = 1 - fa” 10(aKF ) as we already found (Eq.(4.13). In this way the true energy density (4.21) reads: 6 : ev — (K —1)aec = 2312 (1 — e—x(10(x) + Kl1(x))) (4.29) 95 0.05 r T . . . 0.03 l- 0.02 *- 0.01?- -0.0l *— l l -0.02 I l L A i I l 4 1’ l 6 Figure 4.5: Ground state energy for K = 3 using Eq.4.30 (upper curve) and using the cavity approach (lower curve line) as a function of a 4.5 5 5.5 For the K = 3 case, the energy given by Eq.(4.29) is represented in Fig.(4.5). In the same figure for the K = 3 case is represented the energy given by equation: a K e = / rm da (4.30) 0 From Fig.(4.5) we see that the metastable solution for 1355 = 0 is reached at ammsmble m 4.667 while the stable solution is reached at “stable m 5.181. Fig.(4.6) is for the K = 4 case, and we see that the ground state energy is reached at “metastable % 11.832 and “stable 3 14.369 96 0.2 f.“ "T 77"” i— I" 'T" L 'WV 77*?" “TV Tfi ‘ V ’T*'_»“ T i‘_' j T f 1 0.15 — 0.1 *— a F 1 0.05 *- .4 0 >— _a _005 I - . i l l A l I 10 12 l4 16 18 20 Figure 4.6: Ground state energy for K = 4 using Eq.4.30 (upper curve) and using the cavity approach (lower curve) as a function of a = M / N. 97 4.6 Coloring The Graph Coloring problem problem is simply stated but is very difficult to solve either analytically or numerically. Given a graph, or a lattice, and given a number a of available colors, the problem consists in finding a coloring of vertices such that no edge has its two end vertices of the same color. The possibility of finding such a solution depends on the way the graph is constructed and also on the number of colors. The minimally number of colors is the chromatic number of the graph. For large random graphs, there exists a critical average connectivity beyond which the graphs become uncolorable with probability going to one as the graph size (the number of vertices) goes to infinity. As discussed in Chapter 3 graphs generated close to their critical connectiv- ity are hard to color and therefore the study of critical instances is an algorithmic challenge for understanding the onset of computational complexity. For many NP problems, the complexity shows an easy-hard-easy pattern [25] [21] [70], [109]. Recently, methods from statistical physics have been applied to computational complexity problems one of them being the Coloring problem. The statistical physics approach is based on the introduction of a cost function (or energy), calcu- late the free energy in the large system limit (thermodynamic limit) and from here calculate the macroscopic quantities of interest. From the free energy we can cal- culate the ground state energy which allows us to make predictions of the graph colorability: a non-zero ground state energy indicates that random graphs are typ- ically not colorable. Apart from determining the colorability of the graph, from the ground state energy we can determine the typical minimal fraction of unsatisfied edges when the graph is not colorable. Also, the ground state entropy gives us information about the different coloring schemes that share the minimum number of unsatisfied edges (i.e. the degeneracy). 98 As shown in Chapter 1, the q-coloring problem can be mapped to a Potts model: E = 212.4... (4.31) i i where b),- = 1 (0) if an edge is present (absent) between sites i and j. This is the same as the energy of the Potts anti-ferromagnetic model which is an exam- ple of physical system with geometrical frustration [114]. Optimizing the color configuration of a graph is equivalent to finding the ground state of the Potts anti- ferromagnetic. A proper coloring of a graph corresponds to a zero energy ground state configuration of a Potts anti-ferromagnetic with q-state variables. In the next section we discuss the order parameter we use to describe frozen or- der, over-constrained variables and colorable variables. This is followed by a dis- cussion of the coloring problem on complete graphs where the problem is trivially solvable. The next two sections contain the analysis of the equations for percola- tion of frozen order on diluted Bethe lattices, with particular emphasis on the large coordination number limit. Then we present some results for the special case 4 = 2 which exhibits a threshold behavior similar to percolation, and also the more inter- esting case a = 3. The last section contains a brief summary and some concluding remarks. 4.7 The model and Limiting results As described above, each site has a variable which may have one of q colors, that is x; = 1,. . . , q. If a site has no neighbors, then the energy is the same for any of these a possible colors. However if the site has many neighbors we need to find the color configuration which minimizes the number of mono-color bonds, i.e. bonds which have the same color at each end. Of particular interest is the possibility that 99 asfie fixed whos re) of ex 0‘ a site has a unique color which minimizes the energy. That is, the color of a site is fixed by its neighbor configuration. We introduce two probabilities related to sites whose color is fixed, i.e. , frozen, namely: G = Probability that a site is frozen and colorable, H = Probability that a site is frozen and not colorable. A site is frozen and colorable if the site has a unique color, but the site can be col- ored without causing an increase in the energy cost of the system. That is, none of the neighbors of the site has the same color as the frozen site. In contrast, a site is frozen and uncolorable if its color is unique, but it has at least one connected neigh- bor which has the same color in the ground state. The sum of these probabilities, P z G + H, plays the most important role in the analysis. In order to be a globally frozen ground state, frozen order must percolate. If frozen order does not perco- late, then a surface exists which separate one region of frozen order from another region of frozen order. These two regions may be deformed with respect to each other and hence the ground state is unstable to a long wave length zero-energy excitation. We shall also find the probability, 0, that a site is not colorable and over-constrained (this probability includes the probability H of being frozen and colorable). Finally, we will find the probability U that a site is under-constrained. Under-constrained sites are colorable but are not frozen. The following conserva- tion law holds for these probabilities: O+G+U=1 (4.32) The energy is found from the order parameter by noticing that if we add one edge to the system, we have probability F2 / 2 of making a monocolor bond, so the en- ergy of the system in increased by one unit. That is, if both of the sites at the ends of the edge have frozen colors and their colors are the same, then the edge is frus- 100 frat wh wl df trated and costs unit energy. The probability that the two sites are frozen is F2, while the probability that the two sites have the same color is 1/ q. We thus have: E(B+1)—E(B) :ts—z — (4.33) where B is the number of bonds in the system before the edge is added. In a diluted complete graph, B = pN (N — 1)/ 2. In the random graph limit, p = c/ N and B = cN/2. Defining the energy density 6 : E/N, we find: 86_ FZC a—Cz Z—qc (4.34) Integrating this expression we have, CF2 6 c = — 4.35 ( ) 24 ( ) where c... is the coloring threshold. A simple example which can be solved explic- itly is a complete graph, where every site is connected to every other site (this is equivalent to c —> N / 2 in the above). We define n, to be the number of vertices which are assigned color I . Since every vertex is connected to every other vertex, the energy is simply: 1 q q l-3(r11,n;g,...,nq ):§;n ()n(—16(En1—N) (4.36) 1:1 121 If we define the fractions x1 = n, / N and take the limit N —> 00, we find, AP 4 q —— fo6(2x1 — 1) (4.37) 1:1 E(x1,x2,...,xq) = 2 1:1 This energy is a convex function of the fractions x1, so its minimum is either at a boundary of the domain, or it is a unique minimum in the interior of the domain. 101 A straightforward analysis demonstrates that the minimum is a symmetric solu- tion at x, = 1 / q and hence the energy is E = N2/2q. This state has degeneracy N! / ( (N /q)!)‘? which is all the ways of arranging q sets of n / q items each amongst the total of N items. In order to make contact with the energy that we defined in (4.35), we set c = N, and we find, €(C _. oo) = — (4.38) 4.8 Coloring on Bethe Lattice A Bethe lattice introduced by Hans Bethe in 1935 is a connected cycle-free graph where each node is connected to z neighbors with 2 called the coordination num- ber. Bethe lattice can be seen as a tree-like structure growing from a central node called root, with all the nodes arranged in shells around the central one. We solve the problem of coloring random lattices in the limit of large coordina- tion number. In this limit the Bethe lattice solution approaches that of the solution of random graphs. It is important to point out at the outset that trees are bipartite, so that coloring on simple trees is trivial. We only need two colors to color a bi- partite graph, so for any a > 2 there is no transition on trees with free boundaries. However there is a symmetric fixed point (i.e. all colors are equally probable) even on trees. This symmetric solution is produced either by choosing equal probability boundary conditions on the outer leaves on the trees or by explicitly equating the probabilities of all of the available colors. Consider a Bethe lattice of coordination number z. The limiting behavior of such lattices is calculated on one branch of the tree which has coordination num- ber a = N — 1. We consider randomly removing bonds on the Bethe lattice so that a bond is present with probability p. In the random graph limit N —> co and 102 p —+ c/ N where c is finite, there is typically a finite number of connections between neighboring sites. We solve the problem of finding the lowest cost coloring of di- luted Bethe lattices. As is usual on trees, this calculation can be done recursively due to the independence of the probabilities on a given level of the tree. In this problem we have 6] colors and we seek to assign colors to the vertices of the graph so that the minimum number of connected neighbor sites have the same color. Our analysis centers on the probability P, (I = 1, 2, ..q), which is the probability that a site is frozen in color I. We also introduce the probability G1, which is the probability that a site is frozen with color I and is colorable. The probability G1 is given by the recursion relation: 00 oo 00 00 a! —kzz=1k32=1 1.42:1). Eek "2'k3' '4’“qu q+1 (17152)“(pF3)"3--.(PFq)""(1 - PEFI)"4*‘5(S + Z, k: — a) (4.39) 122 This formula is understood as follows. In order for a site to be frozen in the color 1, all the other q — 1 colors must appear and be frozen on one of the connected neighbor sites. The probability that a neighbor site is connected and frozen in color I is pFl. This event may occur one or more times, so we must sum over k) = 1, ...oo We thus have a term (pl-‘1)"l for each of the required q — 1 frozen neighbor colors. However, we must also allow for the possibility of events which are not of the type pH, which leads to the term (1 — p )3 F1 )k4+1. This probability is summed from 0 to co as it does not have to exist in a configuration in order to ensure that G] be finite. Note however that (1 — p E F,) is by far the most likely event in the random graph limit, where p —+ c/ N. All of these probabilities are exclusive and independent. We must also allow for all ways of arranging this set of q + 1 exclusive events amongst the a possible connections between our newly added site and the sites at the lower level in the tree. This leads to the multinomial factor. Eq. (4.39) occurs 103 for each of the 11 colors that are allowed. Now we make the symmetric assumption in which each color is assumed to occur on frozen sites with equal probability. We thus can write G,- = G/q and F,- = F / q. Equation 4.39 reduces to: G : qk§11§1ukguzzokblk3l“’91!qu q q+1 (pp/q)z,...kz(1_ pokrws + 2 k1 — a) (4.40) 122 Using a convenient form for the 6 function we have: ._ .2] dz (3 —-612m. 2‘1+1 = °° _ q_1_, q—1 if dz _ pFrz qr); 1) ( r )2”, 2,11 eXPKl PF)2+ q l =qi(-1)"‘1"(q—1)(l—pF+fi:) (4.41) r=0 7' q Now we take the random graph limit, p = c/ N, a —> N to find: _ a! °° szk — °° 1 _ G_q27?/za+1(kzfi( (—)kq) Em“ [(1 pF)z] hf _1)4- 130 -pF)2 q-l q—r—l 51—1 —Cf+‘—2r—’ G=sg(—1) ( r )e = qe_cp[ecF/1 — 1]‘1-1 (4.42) The calculation of H, the probability that a site is frozen but not colorable (i.e. it has neighbors which have the same color) is similar though more complex. For a site to be uncolorable but frozen, it may have a color which occurs on a neighboring frozen site. However for color 1 to be frozen and colorable, the number of times a frozen neighbor has color 1 must be strictly smaller then the number of times any of the other q — 1 colors occur on frozen neighbors. There is an infinite series of terms of this sort, consisting of cases where the color 1 has 1,2, 3, neighbors of 104 the same color while all the other colors occur a strictly larger number of times for each case. The probability H 1 is given by, 00 CD a! _0 Slkzlk3l...kqlkq+1l (1+1— 1k2:s+lnk,,:s+1k q+1 (pF1)(pF2)k2...()k1qu (1—pZF,)1‘6(S+ 12k, —a) (4.43) Similar equations occur for H,. The case s = 0 correspond to the probability G, of Eq.(4.39). Making the symmetric assumption H 1 : H / q, F, = F / q, and using the integral form of the delta function we find, a! dz °° 1 sz S ”=1m/z«+1§;(7) ( E %<”—F"'>")q_1 €501 -— 1202) (4.44) k=s+l ' q 1:0 Using F 2: G + H together with Eq. (4.42) and Eq.(4.44) we have the key equations for the symmetric theory of constraint percolation. The solutions of these self- consistent equations lead to predictions for the behavior of F from which we may calculate other quantities of interest. For example, the probability that a site is not colorable or over-constrained is given by, 00 00 00 a! 0:2 E" 3;.” E okl'kz'k3'” kq'kq+1' k=l k2=1 q+1 (pmk‘ (pF2)"2...(qu)"1(1 — pZWr‘M )3 k, — a) (4.45) 1:] That is a site is over-constrained if all of the colors occur and are frozen on neigh- boring sites. Using the symmetric assumption and the integral form for the delta 105 function we have, a'/dz<°°1sz,q_1°°1 t =— —<—>) —<1-— 02 2m z“+1 ék! q gt![ P J = (1 — (GP/W (4.46) From this we find the simple the result: 0 = (1 — e‘i‘ifl (4.47) Finally, the probability that a site is under-constrained is given by the probability that there are q — 2 or fewer connected neighbors which have frozen colors. For the symmetric case, this leads to, u— "‘Fq—ZC" 00 CF” 5 448 — e Z 5 2(7) E ( - ) 5:0 k=1 ' which reduces to U = 1 — (1 — (CF/‘7)" — qe’CF(eCF/‘l —1)‘7_1 (4.49) Comparing Eqs. (4.42), (4.47) and (4.49), we have the expected normalization con- ditionG+O+U = 1. 4.9 Results For q = 2, we assume that F is continuous near the percolation threshold and expand this expression in powers of P which yields, r 2 CF — 2602 + 0((cr)3) (4.50) 106 l 0 . 8 ~ --.. O 6 ...oo.o°. .......o ..:::.M..... “...,...” O 4 .33. “1'3“:1; I ..O. ’00... 0 ...° 0000...... O 0 2 _ .0 .....o.0. .000... L. .......ooo --.QO.... 1 . 1 I A A . C 2 3 4 5 Figure 4.7: The coloring order parameters for q = 2. The lower two curves are the probability that a site is frozen and colorable, G (the s = 0 term in eq.(4.40)), and the probability that a site is frozen and frustrated, H (the s 2 1 term in eq.(4.40)). The top curve is the probability that a site has a frozen color F = G + H, which is found by solving eq.(4.40) with q = 2. This has the solution: F m 3éu — 1) c 2 1 (4.51) 80, other than a prefactor of 4/ 3c2 instead of 2 / c2, the critical behavior of the giant cluster probability is the same as is in random graphs [33,41]. For c well away from . the transition, we solve Eq.(4.50) numerically. The s and k sums converge rapidly and for the c range near critical threshold, only a few terms are required for high accuracy results. From the solution for P we obtain all of the results of interest and they are presented in Fig.(4.7). The continuous behavior of 2-coloring near threshold is ev- ident from these data. Using eq.(4.35), the energy density for q = 3 is represented in Fig.(4.9). For q = 3, an attempt to find a continuous transition by expanding in powers of F fails. 107 l O . 8 f O . 6 " ...oofi‘ouuooonu”””°' 00......”. o 4 33"” '"°’°° « :0 w ..o “My”... 0. O 2 .00 o 0...... a... + A g C 5 6 7 8 9 Figure 4.8: The coloring order parameters for q = 3. The lower two curves are the probability that a site is frozen and colorable, G (the s = 0 term in eq.(4.40)), and the probability that a site is frozen and frustrated, H (the s 2 1 term in Eq.(4.40)). The top curve is the probability that a site has a frozen color F = G + H, which is found by solving Eq.(4.40) with q = 3. Numerical solution of Eq.(4.40) is presented in Fig.(4.8) where it was seen that there is a jump discontinuity in the infinite frozen cluster probability at a sharp threshold. We find that c... = 5.14(1) and that the jump in the order parameter is AFC = 0.365(1). We thus find that the coloring transition for q = 3 is first order as has been found in numerical simulations on random graphs [25]. Our coloring threshold is consistent with a recent replica symmetric numeri- cal calculation, which yielded c, z 5.1 [108], but is significantly higher than that found in the simulation work of Culberson and Gent [25] where c... z 4.5 — 4.7 or in the numerical work on survey propagation [95], which yields c... z 4.42. Never- theless the nature of the transition is correctly captured by the simple CP theory. It is also important to note that the solution found here may also be metastable for a range of c, as was found in the rigidity case [31]. The onset of metastability is an important threshold from the point of algorithmic efficiency, as it marks the onset 108 (7). l I I I T I fi' I I 0.08 — l 0.06'_ r,” —‘ s 1 « '50 $— — ,9 — o 004 9’ c: _ ,9 1 LL] $6 0.02— — ()— p—— .- _- .- p— .— +—- r- -0.02 Figure 4.9: Energy density for the q = 3 case of glassy relaxation dynamics. The coloring theory developed above can be formulated in a very similar way to the formulation of the propagation of the k-core. However, there is a critical difference. The constraints in the coloring theory have to be treated as distinguish- able, while the constraints in the k-core calculation are indistinguishable. Although we concentrated on the symmetric theory, cavity methods [85] hold promise for generalizing this approach to the unsymmetric case, as will be pre- sented elsewhere. The coloring transition is continuous for q = 2 and discontin- uous for q 2 3, similarly K — SAT is continuous for K = 2 and discontinuous for K 2 3. In contrast the VB model of glasses has a continuous phase transition. As found in the rigidity percolation problem [94], processes which require more than 2-connectivity in order to propagate constraint have a tendency toward first order transitions. However, a counter example is rigidity percolation on triangular lat- 109 tices, where the rigidity transition is continuous [93]. It thus seems a difficult task to determine the conditions which produce continuous as opposed to discontinu- ous percolation transitions in complex combinatorial problems. 110 Chapter 5 Applications to System Biology Research at the intersection of the biological and computational sciences holds the potential to enable a number of important advances for both communities. Com- putational models of cell regulatory networks and biochemical signaling cascades are being constructed to elucidate the inner working of living cells. Biologists are utilizing these models to explore the logical implication of alternativecompeting hypothesis, to design drugs that are highly selective for specific targets, and to control the behaviors of cells in response to external inputs. Similarly, advances in the biological sciences are being used to drive innovation in the design of new computing architectures based on biomolecules. The inherent ability of DNA and RNA nucleotides to perform very big computations is being exploited to solve NP hard computational problems. 111 5.1 Combinatorial Optimization Methods for Dissect- ing Gene Regulatory Networks During Neuronal Differentiation Combinatorial and coordinated actions of transcription factors and signaling pro- teins dictate the determination of cell fate during retinal development. N RL (neu- ral retina leucine zipper) is the key transcriptional regulatory protein, which initi- ates a cascade of molecular events leading to functional rod photo-receptors from committed post-mitotic precursors. It controls the expression of most, if not all, rod-specific genes including the visual pigment rhodopsin. When N RL is absent, a rod precursor changes its course and becomes a cone, implying the existence of pools of progenitor cells that can acquire either a rod or a cone cell fate. To understand the gene regulatory networks (GRNs) during photoreceptor differen- tiation, Swaroop’s group [36] generated a transgenic mouse line (N RL :: GP P) that expresses enhanced green fluorescent protein (EGF P), under the control of N RL promoter, specifically in rod photo-receptors. Using FACS-purified photo- receptors, we have produced genome-wide expression profiles at five different developmental time points corresponding to distinct stages of differentiation and from multiple mouse mutants. These studies have provided huge lists of differ- entially expressed genes. However, manually sifting through the datasets to pos- tulate downstream targets and networks of co-regulated genes have been highly cumbersome and error-prone. We therefore develop computational methods for constructing key functional paths in the gene regulatory network for rod / cone differentiation. Technologies developed in these studies should find wider appli- cations in other microarray-based investigations. A comprehensive understanding of GRNs might lead to better design of drug targets for macular degeneration and retinal dystrophies. 112 5.1.1 Computational Background To identify functional motifs in gene regulatory networks, we seek develop new methods to cluster genes in a systematic manner, beginning with the most highly correlated gene pairs. At each level in our clustering procedure, we identify the gene pair in each cluster with the strongest correlation. The most highly corre- lated gene pairs occur in small clusters and are interpreted as functional motifs in the gene regulatory network. Motifs extracted from different knockout exper- iments and at different developmental time points provide key insight into func- tional pathways in the network. We developed the following clustering method using co-expression data: start- ing with small correlated clusters, which are candidates for motifs in the regulatory network. Clustering is carried out in a systematic way using Kruskal’s algorithm for minimum spanning tree, a well known combinatorial optimization procedure. The most highly correlated genes are clustered first and at later times less highly correlated gene clusters are merged. Each cluster-cluster merging event is charac- terized by the strength of its co-expression correlation. Using co-expression data at different time points and for different knockouts, gene expression changes and co-expressed motifs are expected to be modified con- tinuously during development (temporal changes). Using our clustering proce- dures, the hope is to characterize the way in which development modifies motifs and define how motifs are altered by different knockouts. 5.1.2 Biological Bakground Retinal development The vertebrate retina is an intriguingly complex yet a relatively simple model to investigate molecular details underlying complex cellular processes and higher 113 order functions of the central nervous system. It consists of six major neuron types and one glia. During retinal differentiation, these different cell types are generated in a predictable sequence from a common pool of multi-potent progenitors [19], [75]. The competence model of cell fate determination proposes that a heterogeneous pool of multi-potent progenitors passes through states of competence, in that it can produce a specific set of cell types. At the molecular level, this competence is acquired under the influence of extrinsic as well as cell-intrinsic mechanisms, which include transcriptional regulatory proteins [19], [75]. Functional mainte- nance and survival of mature retina also requires quantitatively precise gene ex- pression; over- or under-expression of certain cell-type specific genes results in retinopathies. Hence, elucidation of transcriptional regulatory networks in devel- oping and mature retina is fundamental in understanding the neuronal differenti- ation as well as disease pathogenesis. Photoreceptor differentiation and development In the mammalian retina, rod and cone photo-receptors account for over 70% of all cells; in most mammals, rods are almost 20-fold higher in number compared to cones. Cones are born earlier than rods during retinal development and rods develop over a much broader temporal window than cones Fig.(5.1). 114 510 E12 E14 £16 £18 R9 P2 P4 PE P P10 P12 P14 P21 Rod Birth tho Expression l Eye opening l Peak of Rod Birth OS Formation Mature Rods Figure 5.1: Time-line of rod photoreceptor birth, major developmental events, and the kinetics of N RL and rhodopsin (Rho) gene expression. Rod birth peaks at P1 - 2. At P6, expression of rhodopsin and several other rod-specific genes is observed. Outer segments begin to form at P10 and by P28 mature rods are formed. Adapted from Akimoto et al. [36] structural com tence ,, induction ,, stabilization .. . - . . Competent pa Transrtion Transmon Stabilization ”“99“” Differentiated stage factorls) stagel lactorts) stageZ factorts) Stage proteins stage RPCs :3. firms ”it Hi; _:- Crx + Crx + ch + Nfl + NerflEi . normal [Dds (r: Figure 5.2: A proposed model of photoreceptor differentiation, integrating the transcriptional regulatory functions of N RL and N R2153 [22]. 115 A majority of rods are born postnatally. Post—mitotic rod precursors exhibit variable delays, depending upon their time of birth (early or late), before express- ing the photo-pigment rhodopsin, a definitive marker of mature rods [36]. Though several transcription factors are shown to regulate rod gene expression, the rod photoreceptor-specific transcription factor N RL is required for rod differentia- tion [78]. It interacts with cone rod homeobox (CRX), photoreceptor—specific or- phan nuclear receptor (N R2E3), and other proteins to regulate the expression of most, if not all, rod-specific genes. Ablation of N RL in mice (N RL"/ ”) results in a rodless retina with S —cones [26]. A similar phenotype is observed in the retinal de- generation 7 for mice and in patients with enhanced S-cone syndrome, caused by mutations in a rod-specific orphan nuclear receptor N R2133. Significantly, N R2E3 expression is undetectable in the N RL — / — retina, suggesting that both N RL and N R2153 share similar regulatory functions. It was shown that NR2E3 is a direct transcriptional target of N RL in the retinal developmental hierarchy. Mu- tations in N RL and N R2E3 are associated with distinct retinal disease phenotypes ( [112], [62], [104], [87], [55]). As elaborated in Fig.(5.1) and Fig.(5.2), retinal progenitor cells (RPCs) un- dergo terminal mitosis at specific times during development. The post-mitotic cells (PM Cs) are directed towards photoreceptor lineage and pass through distinct transition stages. PMCS that actively express N RL are instructed to rod cell fate, whereas those not actively expressing N RL produce cones. Expression of N RL induces rod differentiation, but only subsequent N R2E3 expression stabilizes the cell fate by completely repressing cone genes. In the RD7 retina, the absence of N R2E3 transforms some of the early-born rod precursors to S-cones, while others may acquire rod-cone hybrid phenotype (private communications from Professor Swaroop). In the N RL — / — retina, since N RL and N R2E3 are not present, these potential rod precursors adopt the default S-cone fate. Ectopic expression of N R2E 3 116 in these cells can partially rescue the rod morphology and gene expression but not functionality [22]. Experimental design and data In order to understand the gene regulatory pathways during rod photoreceptor differentiation, it was generated a transgenic mouse line (N RL :: GF P) in wild- type background that expresses enhanced green fluorescent protein (E GP P) under the control of N RL promoter, specifically in rod photo-receptors [36]. To directly evaluate the origin of enhanced S-cones in the N RL — / — retina, the wild-type transgenic mice were inter-bred with N RL — / — mice. The GFP-tagged rod pre- cursors in this line take the identity of S-cones in the absence of N RL and N R2E3. Since a similar phenotype was observed by the loss of N R2133 function in RD7 mouse retina, N RL :: GF P crossed with RD7 mice comprised the third mouse line. To explore additional possible downstream targets of N R2153, it was ectopically expressed in the N RL — / — retina in the fourth mouse line using CRX promoter. Expression of N R2E3 in the N RL — / — retina completely suppressed cone differ- entiation and resulted in morphologically rod-like photo-receptors, which were however not functional, probably due to the absence of N RL [22]. Mating of dif- ferent transgenic lines with the N RL :: GFP mice allows tagging of rod precur— sors and mature rods with EGF P, facilitating their enrichment by fluorescence- activated cell sorting (FACS) [36]. The advantage of using the transgenic mouse lines is that purified photo- receptors from mutant mice can be obtained by FACS. Hence, one can generate gene profiles of a specific cell type during development (instead of using the whole retina). The four mutant transgenic mouse lines used have the following pheno- type: (i) wild-type where rods are tagged with GF P (both NRL and N R2E3 are expressed); (ii) N RL -— / —— where the GFP-tagged rod precursors take the identity 117 of S-cones (both N RL and N R2153 are absent); (iii) RD7 where rod-cone hybrid cells are tagged with GP P (N RL is expressed, N R2133 is absent); and (iv) N R2133 trans— genic in N RL — / — background where non-functional rods are tagged with GP P (N R2133 is expressed, but N RL is absent). Thus, these mouse lines represent dif- ferent combinations of the two key transcription factors, N RL and N R2E3, which are required for normal rod photoreceptor differentiation. Swaroop’s generated gene profile data from the GFP+ cells purified from the retinas of these transgenic mouse lines at five different developmental time points: E16, P2, P6, P10 and P28 as is shown in Fig.(5.1). Four independent samples were used at each time point. Affymetrix GeneChips 430 were used for gene profiling. 5.1.3 Construction of gene regulatory networks (GRN) from ex- perimental data Much of the initial work has focused on the development of techniques for ac- curate identification of differentially expressed genes and their statistical signifi- cance. However, the main difficulty in the analysis lies not in the identification of differentially expressed genes but in their interpretation. Hero’s group devel- oped an approach [116] that clusters genes based on their functionality rather than on the level of expression. The group implemented successfully [34] an algorithm that first constructs the relevance network (determined by three parameters: FDR, MAS and p) based on the estimation of pairwise gene profile correlation, and then discovers the biological significance of the network. The microarray data set available from these studies of mutant mice may be summarized in Fig.(5.3). Data currently exists for four mutant phenotypes at five different time points during photoreceptor development in mammalian retina. However, for a more accurate control of statistical significance of gene pair cor- relation an adaptive FDR controlling procedure [65] can be used; depending on 118 TIME I __.._ _._ . ._ . m. _ ._.. . . . _-._ ____ -.__..> '0 1 Erik wr {EM,(tl)} {Em.(t2)} z I 3; rd7 {Ewing} {E"‘7.(t2)} ; l NRL"' {ENRL"‘I(t1)} {ENMIUQ} rn V Figure 5.3: The phenotype matrix. Ell(t) represents the expression level of a set of genes, with each gene labeled by i, corresponding to the l-th experiment (i.e. knockout phenotype) at time t. the number of effects, one can choose which standard error estimator works best in conjunction with the original FDR controlling method. 5.1.4 Discovery of functions/pathways from constructed retinal GRN Biologists are interested in developing new methods for extracting functional path- ways from data like that illustrated in Fig.(5.3). One approach is to extend the rel- evance network approach [116] and to consider gene expression levels at different time points enabling the identification of the time course of functionally-related genes. The simplest multivariate statistic is the pair correlation between genes, which may be at different time points and / or in different phenotypes. A study of pair correlations at different time points would reveal time delays in the expression of downstream genes. Also, a study of correlations between different mutant phe- notypes could reveal the effect of a specific transcription factor (N RL or N R2153) on levels of gene expression. If there is a time delay in the effect of mutation on the expression of downstream genes, this will be revealed by multivariate analysis between different phenotypes at different times. An issue of considerable concern is that experimental uncertainty may obscure the trends in individual gene pairs, 119 100,00, @- , zo-a>rmmmoo' Figure 5.4: Clustering of genes based on correlation level. Small clusters of high correlation appear first. When clusters join, they are linked by an edge (thick edges in the figure) of lower correlation which gives a measure of the confidence associ- ated with the larger cluster. The two different degree of grayness are for the two types of co-expressions: up-, respectively down-regulated genes. so that identification of gene subsets which behave similarly provides a means to reduce statistical error. To alleviate this, we searched subsets to find gene clusters, which exhibit maximum average correlations. Clustering of genes. Starting with a set of genes which are not connected, we form clusters of genes that are most correlated by adding the most correlated gene pairs first (first level, from Fig.(5.4)). When clusters merge during this process, by construction, the correlation of last gene pair added before emerging is lower than the correlation of any other gene pair already in the clusters. We will call herein this last link, the weakest link. Larger clusters appear but only through relatively low correlation edges - the weakest links. As illustrated, the most strongly corre- lated clusters are small and are the most important motifs in the gene regulatory network. Using this approach we construct the entire network using the available data for different time points and for one knockout. 120 5.2 Preliminary results Using unpublished data from Swaroop group we present some results based on minimum spanning tree algorithm. For sorting the data we are using the false dis- covery rate (FDR) combined with the confidence interval method (CI) (also called the FDRCI method) of wild type compared to N RL — / — at 5 different time points run with a minimum fold change of 2. The wild type was used as the control. Wild type , is the typical form of an organism, strain, gene, or characteristic as it occurs in nature and it refers to the most common phenotype in the natural population. As we already showed in Fig.(5.4) the clustering of genes is based on the level of correlation. We performed two different ways of clustering. The first method is when small clusters of high correlation appears first. When clusters join, they are linked by an edge of lower correlation (which we called the weakest link) which gives a measure of the confidence associated with the large cluster. This is actually the minimum spanning tree. The second method of clustering is when small clus- ters of low correlation appear first. In this way we actually clusters genes which are anti-correlated. To construct the network we need to determine the weights for the edges that connect the genes. For this we use the following scaling equation: w-- = s —— . I] l 1i l + l 1 j I In Eq.(5.1) wii represents the weight of the link that connects gene 1' to gene 1'; 1;, 11' represent the co-expression level for the two genes and a is a scaling parameter. The minus sign in Eq.(5.1) corresponds to the case where the clustering is made with genes that are correlated whereas the plus sign corresponds to the case when the genes are anti-correlated. Initially we constructed a relatively small network with only 30 genes where 121 erKO-thtp-comparison Embryonic 16 Poet-natal 2 Post-mtal 6 Post-natal 10 Adult Gausmoi AFC GonoSyinbol AFC Geno Symbol arc menisci AFC GonoSyinbol AFC Nri -14.93 Nri 25.13 Lrp4 83.17 Nil 80.32 Rho -9821 -1i.03 Lrp4 -1o.79 Nil 8822 Rho -38.13 Rho -75.19 57618Rik -9.84 27ooow602Rik -9.8 Pdeeb 25.31 Rho 88.5 -- 82.51 Shehpi -9.39 -« -6.78 Tciapeb -11.74 Rho .3127 Nri 81.02 erimi 8.45 N12e3 8.81 Nr2e3 -10.58 Rho 24.81 Rho 87.88 7300170201111 828 Tciapeb 5.74 -9.78 Lrp4 21.85 60811 52.51 1116085558111 823 M608555811' -5.61 Rho 8.88 Pdeab 21.18 Rho 42.85 MGC85558 8.17 Exti -5.56 Co30033M19Rik 822 N2e3 . -17 Stem 34.78 lriai 8.17 Tciapea 8.55 1110051818Rik -7.8 1110051818Rik 48.84 083000260601 30.03 Lip4 8.14 4230057G18Rik 5.54 erz -7.59 6114111 -12.41 A93omeK24Rik 28.85 QM -7.37 Tciapea -5.38 -- ~7.45 9430059P22Rik -9.55 0830002608 28.49 (35092 825 Sag -5.18 - -7.44 ani 8.73 Kcnji4 2528 23100-17l15Rik 8.18 Pdeeb -5 28104211041111 .711 P9112 -722 ani 24.98 19330188A19Rik 8.01 116085558 4.99 Samd7 8.73 Cngai -7.04 PdeGb 23.82 Tdamb 5.93 Egrt 4.9 Gad1 8.71 Dp1l1 .703 -- 22.17 0751107158 2.54 4930544621Rik 2.74 Hist3h28 5.81 2900027603Rik 11.18 4933408F15 14.34 - 255 Hnrpai 2.77 0030009J22Riii 5.73 C030009022Rik 11.47 7530404M11Riii 14.85 ‘ i 2.55 01102 2.77 Panki 5.74 Maps 1222 611812 14.87 2.55 Egg 2.66 6230400614Fiik 625 .Socsa 12.02 Opntsw 15.03 2.58 ... 2.88 - 6.48 C130078007Rik 12.89 60149 15.11 2.85 38324510081111 2.87 131 8.95 Casp7 13.15 Aria 15.12 2.74 my 2.9 Minna 7.99 .- 13.47 Ana 17.58 2.82 001115111 2.95 Em 828 86037008 13m 03117 18.89 2.89 8230400614Rik 2.99 $0133 8.54 Kcne2 13.85 2900027603Rik 19.31 3.13 Mtnrta 3.08 611812 9.5 4933408F15 14.33 Opnisw 20.19 325 Smpdaa 3.13 Moxd1 10.11 Moxd1 14.67 cm 22.41 3.32 6111112 3.52 30053 10.38 Sloane 18.18 9191 23.17 3.34 -- 3.89 Page 10.83 611312 18.33 Fkbpe 24.1 3.38 [4me 4 Kcne2 11.72 Gucaia 25.41 Fabp7 30.59 3.45 by 8.04 Opnisw 1221 Opnisw 30.55 Cloris 33.73 Figure 5.5: The FDRCI output of wild type compared to N RL — / — at 5 different time points run with a minimum fold change of 2. Only the most down- and -up 30 regulated genes are shown. we choose the most 15 up- and the most 15 down- co-expressed anti-correlated genes as is shown in Fig.(5.5). Fig. (5.6) shows the anti-correlation network for three different time points: em- bryonic 16, post-natal 6 and two months (adult). As we can see from this figure, for the embryonic 16 time point, the first link added is between the gene leucine rich repeat transmembrane neuronal 4 (Lrtm4) which is a rat tetraspan protein and gene transcription factor AP — 2 beta (chap2b). As time evolves the network becomes less bushy as is shown in Fig.(5.6) for post-natal 6 and adult. In the same way 122 E16 P6 5 e '0 1 ‘0 92 35 7 ‘l 19 i 9 11 2| " l\. 4 ‘ z] \ m 3 " ' ' 2. 17 22 Adult Figure 5.6: Minimum Spanning Tree corresponding to the most highly anti- correlated 30 genes at embryonic 16, post-natal 6 and 2 months (adult) for erKO- thfp. 123 Figure 5.7: Minimum Spanning Tree corresponding to the highly anti-correlated genes at embryonic 16 for erKO-thfp. that we constructed the network from Fig.(5.6), using the data provided from Swa- roop’s lab we constructed the whole network corresponding to the E 16 time point. Fig.(5.7) shows this network where 961 genes were used. From this network we can identify the most connected genes which are also called hubs. The presence of hubs is very important in keeping together the network. As time evolves, the size of the hubs is reduced (Fig.(5.8),Fig.(5.9),Fig.(5.10),and Fig.(5.11)). For the adult time point, the structure of the network shows again a bushy structure as is shown in Fig.(5.11). As we can see from this figure, more hubs are present but with lower 124 Figure 5.8: Minimum Spanning Tree corresponding to the highly anti-correlated genes at post-natal 2 for erKO-thfp. 125 Figure 5.9: Minimum Spanning Tree corresponding to the highly anti-correlated genes at post-natal 6 for erKO-thfp. 126 Figure 5.10: Minimum Spanning Tree corresponding to the highly anti-correlated genes at post-natal 10 for erKO-thfp. 127 ‘2' r r!’ -— — I *- ’ ° ' 1;. 5%3'3’44 3.}; 1' ;.- «fifim: F n . l 7' 1 ' .‘ ~‘~ ' I. ‘. 5' 066 453351211 9 I c 1.x. ’1 HY.‘ ~59 P} ‘33.": .. 9.6.4:. o -.- 0" r . 3". . ‘. . . .-.'-. \‘ . I I ‘ 00 14k £0 ‘0 Figure 5.11: Minimum Spanning Tree corresponding to the highly anti-correlated genes at 2 months (adult) for erKO-thfp. 128 ”51‘. “'5‘. MWHN‘E “:1. n... connectivity. By looking at the network we can see that the network correspond- ing to time points P2, P6, P10 and adult resemble networks. Scale-free networks proved of crucial importance in understanding many inter-disciplinary areas span— ning from Internet [97] and world wide web [16] to social networks [76] and gene regulatory networks [6]. Scale-free networks are characterized by a power-law degree distribution where the probability (P) that a node has k links is given by P(k) ~ k‘7 and 'y is a degree exponent. For most biological networks it is believed that 2 < 'y < 3. The probability that a node is highly connected proved statisti- cally to be more significant than in random graphs and the network’s properties are often determined by this relatively small number of highly connected nodes (or hubs). There is hope that these gene-hubs are also very important from the biological point of view. However from the histogram presented in Fig.(5.12) we see that there is a differ- ence between E 16 and the other four networks obtained at P2, P6, P10 and adult. As we can see in Fig.(5.12) these four networks display the typical case of a power law behavior while E 16 has a predominance of highly connected hubs. We found the most connected hubs for five different time points. As is shown in Fig.(5.7), Fig.(5.9) and Fig.(5.11) we have: (i) the most connected gene ertm4 at E 16 has connectivity 101;(ii) the most connected gene Pn p at P2 has connectivity 15; (iii) there are two most connected, one of them being Clu at P6 with connectivity 11 ; (iv) the most connected gene Rtel at P10 has connectivity 28, and finally (v) the most connected gene 4933412F11Rik at 2 months has connectivity 15. Next, we constructed the correlated network (the minus sign case in Eq.(5.1)) for three different time-points: e16, post-natal 2 and post-natal 6. The results are presented in Fig.(5.13), Fig.(5.14) and Fig.(5.15). These three networks resemble scale free networks. For the correlated case the network at E 16 looks like a scale- free network, while for the anti-correlated case and for the same time point this is 129 I I I I I I I I] I I I l l I l I: Erdos-chyi network :1 H 516 0 L 5 l : l I r 1 - _‘ H P2 -5 F — 1 0—0 P6 - -1 0.1 :- H P10 '10 f 'l 1 E -—- Adult _15 _ _- i " .- 1 l i l 1 I 1 -1 ' .. -20 . 0 0.5 l 15 2 g 0.0l :— 11 T: l I 1 l l l I ll 1 I l l l l l l l I l l l l l 1 1 0.0001 10 l00 1000 it Figure 5.12: k vs. N(k) for the anti-correlated case corresponding to the 5 networks corresponding to the anti-correlated case. 130 Figure 5.13: Minimum Spanning Tree corresponding to the highly correlated genes at embryonic 16 for erKO-thfp. not the case. 131 Figure 5.14: Minimum Spanning Tree corresponding to the highly correlated genes at post-natal 2 for erKO-thfp. 132 I l . . Q ' .-" i a... -. ‘-.. .3 ' ' . I ..-‘ ' .. " .. ‘ 1 .. -” AW047325 P ‘- . b . ’O.‘- ' , ‘ , . '. ‘3 :- s‘, ... ' o 0' 9 t — 9 1. , 1 .1.) g 0. ‘0 -..,_‘_.--. . 1 "3;": .-.II. 1. 4,-1.4. - , H ‘ b .. -. «pr ' H- 94-, -' .. 1.2:. ’i't" 2’1‘1‘ ” . . ,I .1" .-' '"dy." ....... _ ,' I x 1.. L...- t....'.’. .' .x .0 ’ :11:-.__ ,. ‘ t ." ..- .— .-.— .__‘_.._ ._—" ._. .,. h “‘2‘“ ’, ....‘ . I‘ . - . 0 ’ it- - .. an a .. f . ."‘.-."_ . ‘ . k I. .’ Q ‘ g 0’ " -- . "O O-r' 0‘ Q .1 g f 9 V‘ - 1.- 3.4,. ‘7 ‘0‘. . . ' . 0"... . . ‘ .2 .- .. .. '1 . ‘1 '0 1 ---‘l— ’I’ "5““- t _ n O has" I; .‘ "’“fz’ '1.- °.'§. . ’ .. ”'9 I "v4 ' ..O. "at. . .' ,9“ 1 ,. i' ‘0 .-.._ '-' ‘ ‘ .‘ ' 9 'O’. Q Q', . 9 .. 1' . . ". '1‘ ' . . Q. ‘4 i ' ,‘fi . . ,. . 1‘... 'Q.‘ . .. 9". J *i ‘50 ‘3 Q ., ‘ ' $ 31; 'o '1.“ .180. .1, .P' 3.. ‘1', 4t _Q 0 8 if. '03. ..‘ l . ‘1‘, f. I t.’ f . . 1815'- .‘ d’ o '. Figure 5.15: Minimum Spanning Tree corresponding to the highly correlated genes at post-natal 6 for erKO-thfp. 133 I I I r I I I: 1» o—o E16 3 _ P2 . ” o—o P6 1 01— HII P10 — ' 5 Adult 5 >5 0.01.— 1 D. : 2 0.001 :— .E (1%)] l I l l l l l l l l_ l l l L L 1 1 1 10 100 K Figure 5.16: k vs. N (k) for the anti-correlated case corresponding to the 5 networks corresponding to the correlated case. We found the hubs for the five different time points that we considered throughout this analysis. As it is shown in Fig.(5.13), Fig.(5.14) and Fig.(5.15) we found: (i) the most connected gene Ttll44 at E 16 has connectivity 21;(ii) the most connected gene 2310007D03Rik at P2 has connectivity 9; (iii) the most con- nected gene AW047325 at P6 has connectivity 10; (iv) the most connected gene Mak3 at P10 has connectivity 23, and finally (v) the most connected gene Ki f a p3 at 2 months has connectivity 15. In Fig.(5.16) we draw the histograms for the five net- works corresponding to the correlated case; N (k) represents the number of genes with connectivity k. Although this work is preliminary we have identified sev- eral key candidates as targets for treatment of different retinal diseases. In our networks, genes are vertices and pairwise co-expressions are network edges. Tra- ditional work on gene networks focuses on the two aspects: discovery of networks 134 which are statistical significant (False Discovery Rate) and discovery of networks which are biological significant (Minimum Acceptable Strength). We used unpublished data from mice retina provided from Swaroop’s lab from University of Michigan, but our method of clustering and finding networks can be applied to any other biological system not only to retina. We plan to take this work to the next level and identify the topology of cellular pathways involved in photoreceptor differentiation and to to identify possible targets for treatement of different eye diseases. 135 Chapter 6 Conclusion This thesis covers two areas of application of statistical physics. Firstly it covers application of statistical physics to computer science problems, namely to complex network systems. Seccondly it covers application of statistical physics to biology. Application to complex networks from computer science is based on one of the most important properties specific to NP problems namely that many apparently different problems can be mapped to each other so that solutions are preserved under the mapping. An important observation is that problems whose order pa- rameter is at the critical boundary are typically hard. We considered problems from combinatorial optimization as models in statistical physics. The cost func- tion was renamed as the Hamiltonian, a term more familiar to physicists and the ground states correspond to the solutions of optimization problems. We outlined a few application of polynomial combinatorial optimization algo- rithms useful in extracting the universal zero-temperature properties for different disordered systems. Dijkstra’s algorithm can be used for models of non-directed elastic lines on graphs with isotropically correlated random potentials. The pre- flow-push algorithm for minimum-cut/max-flow problems can be used to inves- tigate transitions that occur in a model for elastic manifolds in a periodic potential 136 in the presence of point disorder. The existence of many degenerate states and / or metastable states is due to frus- tration. It is well known among computational physicists that for investigating disordered systems, finding the equilibrium states (or the ground states) is nearly impossible for large systems. Usually the size of the systems used is at most of the order of few hundred variables. A break-through technique that we discussed came from statistical physics where the cavity method was used successfully for systems with few millions of variables. Problems that are NP-complete are unlikely to be solved exactly. In statistical physics, we are interested in results in the thermodynamic limit N —+ 00 rather than for finite systems. So after we mapped problems from computer science to problems from statistical physics, we focused on classifying problems from statis- tical physics according to their computational complexity. For many NP-complete problems one more order parameter can be defined, and usually hard instances oc- cur around particular critical values of these order parameters. These critical val- ues form a boundary that separates the space of problems into three regions. One under-constrained region where the density of solutions is large so we can easily find solutions, one over-constrained region where is very unlikely to find a solu- tion and a third region which is between these two under- and over-constrained regions. Usually in this third region hard problems occur. We foccused on phase transitions from statistical physics and from computer science point of view but phase transitions are also interesting from the mathematical point of view because they are singular points. Mainly, in this thesis the novel contribution is presented in Chapter 4 and in Chapter 5. In Chapter 4 we introduced a more elegant and simple method for NP-complete problems, starting with Viana-Bray model and then continuing with the K-SAT 137 and Coloring problems. We studied analytically and numerically these three hard problems using a method similar to the message passing procedure. A local or- der parameter which is important in the analysis of phase transition in frustrated combinatorial systems is the probability that a node is frozen in a particular state. There is a percolative transition when an infinite connected cluster of these frozen nodes emerges. The emergence of frozen order can be considered to be a form of constraint percolation which proved to be useful in making analogies with rigidity percolation and its associated matching problem. We showed that a giant frozen cluster emerges at the phase transition and the emergence is continuous for 2-SAT problem and is discontinuous for the K 2 2 case. For the Coloring problem on random Bethe lattice, when q = 2 (so the two colors case), the onset of constraint percolation is of second order, similar to emergence of giant component in ran- dom graphs. However, using numerical methods it was shown that the onset of constraint percolation for q = 3 (the three colors case) is of first order. Advances in molecular biology, analytical and computational techniques are proving to be powerful tools to systematically investigate the complex molecu- lar processes underlying biological systems. In particular, using high-throughput gene expression arrays, we are able to measure the output of the gene regulatory networks. To identify functional motifs in gene regulatory networks, in Chapter 5 we developed novel methods to cluster genes in a systematic manner, beginning with the most highly correlated gene pairs. At each level in our clustering proce- dure, we identified the gene pair in each cluster with the strongest correlation. The most highly correlated gene pairs occur in small clusters and are interpreted as functional motifs in the gene regulatory network. As a future work we will focus on discovering functional pathways in these networks and on developing methods to control the activity of key functional pathways in these networks, particularly signaling pathways. 138 4 ~ 1! BIBLIOGRAPHY 139 ]'FI Bibliography [1] D. Achlioptas, L. M. Kirousis, E. Kranakis, and D. Krizans. Rigorous Results for Random (2+p)SAT., 14:63, 1997. [2] D. Achlioptas, A. Naor, and Y. Peres. Rigorous Location of Phase Transitions in Hard Optimization Problems. [3] L. M. Adleman. Molecular Computation of Solutions to Combinatorial Prob- lems. Science, 266(5187):1021—1024, 1994. [4] R. K. Ahuja, T. L. Magnanti, and I. B. Orlin. Network Flows: Theory, Algorithms and Applications. Prentice-Hall, N], 1993. [5] B. Aspvall, M. F. Plass, and R. E. Tarjan. A Linear-Time Algorithm for Testing the Truth of Certain Quantified Boolean Formulas. Inform. Process. Letter, 8:121—123. [6] A. L. Barabasi and R. Albert. Emergence of Scaling in Complex Networks. Science, 286:509—512, 1999. [7] F. Barahona. On the Computational Complexity of Ising Spin Glass Models . 1. Phys. A., 15:3241—3253, 1982. [8] F. Barahona, M. Grotschel, M. Iunger, and G. Reinelt. An Application of Combinatorial Optimization to Statistical Physics and Circuit Layout De- sign. Oper. Res., 36:493—513, 1988. [9] M. N. Barber. Finite-Size Scaling. Phase Transitions and Critical Phenomena, 8:145—266, 1983. [10] S. Bastea. Degeneracy Algorithm for Random Magnets. Phys. Rev. E, 58:7978, 1998. [11] E. B. Baum. Building an Associative Memory Vastly Larger Than the Brain. Science, 268:542—545, 1995. [12] R. I. Baxter. Colorings of a Hexagonal Lattice. I. of Mathematical Physics, 11:784—789, 1970. 140 [13] K. Binder and A. P. Young. Spin Glasses: Experimental Facts, Theoretical Concepts and Open Questions. Rev. Mod. Phys, 58:801, 1986. [14] G. Biroli, S. Coco, and R. Monasson. Phase Transitions and Complexity in Computer Science: An Overview of the Statistical Physics Approach to the Random Satisfiability Problem. Physica A, 306. [15] G. Biroli, R. Monasson, and M. Weigt. A Variational Description of the Ground-State Structure in Random Satisfibility Problems. Eur. Phys. I. B, 14:551, 2000. [16] James M. Bower and Hamid Bolouri (editors). Computational Modeling of Genetic and Biochemical Networks. Computational Molecular Biology Series, Addison-Wesley, Reading, Mass, 2001. [17] A. Braunstein, M. Mezard, and R. Zecchina. Survey Propagation: an Algo- rithm for Satisfiability. Random Structures and Algorithms, 27:201—226, 2005. [18] A. Braunstein, R. Mulet, A. Pagnani, M. Weigt, and R. Zecchina. Polynomial iterative algorithms for coloring and analyzing random graphs. Cond-Mat, 0304558, 2003. [19] C. L. Cepko, C. P. Austin, X. Yang, M. Alexiades, and D. Ezzeddine. Cell Fate Determination in the Vertebrate Retina. PNAS, 93:589—595, 1996. [20] I. Chalupa, P.L. Leath, and G. R. Reich. Bootstrap percolation on a Bethe lattice. 1. Phys. C., 122L31, 1979. [21] P. Cheeseman, B. Kafensky, and W. M. Taylor. Where the Really Hard Prob- lems Are. Proceedings of the 12th I ICAI, pages 331-337, 1991. [22] H. Cheng, T. S. Aleman, A. V. Cideciyan, R. Khanna, S. G. Jacobson, and A. Swaroop. In Vivo Function of the Orphan Nuclear Receptor NR2E3 in Es- tablishing Photoreceptor Identity During Mammalian Retinal Development. In Press. [23] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT press, 1990. [24] I. Culberson and I. Gent. Empirical Evidence for an Asymptotic Disconti- nuity in the Backbone of the 3-Coloring Phase Transition. APES-16-1999, 16, 1999. [25] I. Culberson and I. Gent. Frozen Development in Graph Coloring. Theor. Comp. Sci., 265:227, 2001. [26] L. L. Daniele, C. Lillo, A. L. Lyubarsky, S. S. Nikonov, N. Philp, A. I. Mears, A. Swaroop, D. S. Williams, and E. Pugh Ir. Cone-Like Morphological, Molecular, and Electrophysiological Features of the Photoreceptors of the NRL Knockout Mouse. IOVS, 46:2156—2167, 2005. 141 [27] M. Davis and H. Putnam. A Computing Procedure for Quantification The- ory. Assoc. Comput. Mach., 7:201—215, 1960. [28] C. Djurberg, K. Jonason, and P. Nordblad. Magnetic Relaxation Phenomena in a CuMn Spin Glass. arXiv:cond-mat/9810314, 1998. [29] O. Dubois. Counting the Number of Solutions for Instance of Satisfiability. Theor. Comp. Science, 81:49—64, 1991. [30] O. Dubois, R. Monasson, B. Selman, and R. Zecchina. Phase Transitions in Random Combinatorial Problems. In O. Dubois, R. Monasson, B. Selman, and R. Zecchina, editors, Theoretical Computer Science, volume 265. 2001. [31] P. M. Duxbury, D. J. Jacobs, M. F. Thorpe, and C. Moukarzel. Floppy Modes and the Free Energy: Rigidity and Connectivity Percolation on Bethe Lattice. Phys. Rev. E, 59:2084, 1999. [32] S. F. Edwards and P. W. Anderson. Theory of Spin Glasses. J. Phys F: Metal Phys, 5:965, 1975. [33] P. Erdos and A. Rényi. On the Evolution of Random Graphs. Publ. Math. Inst. Hung. Acad. Sci., 5:17, 1960. [34] D. Zhu et al. Network Constrained Clustering for Gene Microarray Data. Bioinformatics, 21(21):4014—4020, 2005. [35] K. Sakamoto et. al. Molecular Computation by DNA Hairpin Formation. Science, 288:1223—1226, 2000. [36] M. Akimoto et al. Targeting GFP to Newborn Rods by NRL Promoter and Temporal Expression Profiling of Flow-Sorted Photoreceptors. PNAS, 10(103):3890—3895, 2006. [37] Q. Liu et. al. DNA Computing on Surfaces. Nature, 403:175—179, 2000. [38] Y. Benenson et. al. An Autonomous Molecular Computer for Logical Control of Gene Expression. Nature, (429):423—429, 2004. [39] D. Faulhammer, A. R. Cukras, R. J. Lipton, and L. F. Landweber. Molecular Computation: RNA Solutions to Cheese Problem. Proc. Natl. Acad. Sci. USA, 97 (4):1385-1 389, 2000. [40] C. W. Fay, J. W. Liu, and P. M. Duxbury. Maximum Independent Set on Diluted Triangular Lattices. Phys. Rev. E, 73:056112, 2006. [41] M. E. Fisher and J. W. Essam. Some Cluster Size and Percolation Problems. J. Math. Phys, 2:609, 1961. [42] B.J. Frey and D.J.C. Mackay. A Revolution: Belief Propagation in Graphs with Cycles. Adv. in Neural Information Porcessing System, 10, 1998. 142 [43] A. M. Frieze. On the Independence Number of Random Graphs. Discrete Mathematics, 81:3171—175, 1990. [44] Y. Fu. The Uses and Abuses of Statistical Mechanics in Computational Complexity, volume 1. Lectures in Sciences and Complexity. Addison-Wesley, Reading, Mass, 1989. [45] Y. Fu and PW. Anderson. Application of Statistical Mechanics to NP Com- plete Problems in Combinatorial Optimization. J. Phys. A: Math. Gen., 1921605, 1986. [46] M. R. Garey and D. S. Johnson. Computers and Intractability. Academic Press, 1979. [47] P. G. Gazmuri. Independent Sets in Random Sparse Graphs. Networks, 14:367—377, 1984. [48] I. P. Gent, E. Maclntryre, P. Prosser, and T. Walsh. Scaling Effects in the CSP Phase Transition. Principles and Practice of Constraint Programming, pages 70— 87, 1995. [49] I. P. Gent and T. Walsh. The TSP Phase Transition. Artificial Intelligence, 88(Issue 1-2):349—358, 1996. [50] LP. Gent and T. Walsh. Phase Transitions and Annealed Theories: Number Partitioning as a Case Study. 12th European Conference on Artificial Intelligence, pages 170-174, 1996. [51] A. Goerdt. A Threshold for Unsatisfiability. Journal of Computer and System Sciences, 53:469—486. [52] F. Guarnieri, M. Fliss, and C. Bancroft. Making DNA Add. Science, 273:220— 223, 1996. [53] A. Suyama H. Yoshida. Solution to 3-SAT by Breadth First Search. DNA Based Computers: DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 54(5):9—20, 1999. [54] F. Hadlock. Finding a Maximum Cut of a Planar Graph in Polynomial Time. SIAM. ]. Comput, 4:221—225, 1975. [55] N. B. Haider, S. G. Jacobson, A. V. Cideciyan, R. Swiderski, L. M. Streb, C. Searby, G. Beck, R. Hockey, D. B. Hanna, and S. Gorman. Mutation of a Nuclear Receptor Gene, NR2E3, Causes Enhanced S Cone Syndrome, a Disorder of Retinal Cell Fate. Nat Genet, 24:127—131. [56] A. K. Hartrnann and M. Weigt. Statistical Mechanics Perspective on the Phase Transition in Vertex Covering of Finite-Connectivity Random-Graphs. Theor. Comp. Sci., 265:199, 2001. 143 [57] A. K. Hartmann and M. Weigt. Statistical Mechanics of the Vertex-Cover Problem. Journal of Physics A: Mathematical and General, 36:11069—11093, 2003. [58] B. Hayes. Can’t Get No Satisfaction. American Scientist, March - April 1997. [59] G. L. Van Hemmen and R. G. Palmer. The Replica Method and Solvable Spin Glass Model. J. Phys. A, 12:563, 1979. [60] T. Hogg, B. A. Huberman, and C. Williams. Phase Transitions and the Search Problems. Artifical Intelligence, 81:1—15, 1996. [61] D. J. Jacobs and M. F. Thorpe. Generic Rigidity Percolation in Two Dimen- sions. Phys. Rev. E, 5323683, 1996. [62] S. G. Jacobson, A. Sumaroka, T. S. Aleman, A. V. Cideciyan, S. B. Schwartz, A. J. Roman, R. R. McInnes, V. C. Sheffield, E. M. Stone, and A. Swaroop. Nu- clear Receptor NR2E3 Gene Mutations Distort Human Retinal Laminar Ar- chitecture and Cause an Unusual Degeneration. HMG, 13:1893—1902, 2004. [63] I. Kanter and H. Sompolinsky. Mean-Field Theory of Spin-Glasses with Fi- nite Coordination Number. Phys. Rev. Lett., 58:164, 1987. [64] R. M. Karp. Reducibility Among Combinatorial Optimization. Complexity of Computer Computations, R. Miller and J. Teacher, eds. Plenum Press, NYz85— 103, 1972. [65] M. K. Kimel and Y. Benjamini. The False Discovery Rate for Multiple Testing in Factorial Experiments. Techical Report, 2006. [66] S. Kirkpatrick, G. Gyorgyi, N. Tishby, and L Troyansky. The Statistical Me- chanics of K-Satisfaction. Adv. Neur. Inform. Proc. Syst., 6:439—446, 1994. [67] S. Kirkpatrick and B. Selman. Critical Behaviour in the Satisfiability of Ran- dom Boolean Expressions. Science, 264:1297, 1994. [68] S. Kirkpatrick and D. Sherrington. Infinite-Ranged Models of Spin-Glasses. Physical Review B, 17(11):4384, 1978. [69] S. Kirkpatrick, W.W. Wilcke, R.B. Garner, and H. Huels. Percolation in Dense Storage Arrays. Physica A, 314:220—229, 2002. [70] A. D. Korshunov. The Main Properties of Random Graphs With a Large Number of Vertices and Edges. Russian Math. Surveys, pages 121—198, 1985. [71] ER. Kschischang, B.J. Frey, and H. A. Loeliger. Factor Graphs and the Sum Product Algorithm. IEEE Trans. Info. Theory, 47:498-519, 2001. [72] S. L. Lauritzen and D. J. Spiegelhalter. Local Computations with Probabilities on Graphical Structures and their Application to Expert Systems. [ournal of the Royal Statistical Society, B(50):157—224, 1988. 144 [73] D. Lidar and O. Biham. Simulating Ising Spin Glasses on a Quantum Com- puter. Phys. Rev. E, 56(3):3661-3681, 1997. [74] R. Lipton. DNA Solution of Hard Computational Problems. Science, 268(5210):542—545, 1995. [75] F. J. Livesey and C. L. Cepko. Vertebrate Neural Cell-Fate Determination: Lessons from the Retina. Nat. Rev. Neuroscience, 2:109—118, 2001. [76] Juyong Park M. E. J. Newman. Why Social Networks Are Different From Other Types of Networks. Phys. Rev. E, 682036122, 2003. [77] H. Maletta and W. Felsch. Insulating Spin-Glass System ExSr1_xS. Phys. Rev. B, 20:1245—1260, 1979. [78] A. J. Mears, M. Kondo, P. K. Swain, Y. Takada, R. A. Bush, T. L. Saunders, P. A. Sieving, and A. Swaroop. NRL Is Required For Rod Photoreceptor Development. Nat. Genet, 29:447—452, 2001. [79] S. Mertens. Phase Transition in the Number Partitioning Problem. prl, 81(20):4281—4284, 1998. [80] S. Mertens. A Physicist’s Approach to Number Partitioning. Theoretical Com- puter Science, 265:79—108, 2001. [81] S. Mertens, M. Mézard, and R. Zecchina. Threshold Values of Random K- SAT From the Cavity Method. Random Structure and Algorithms, 28(3):340— 373, 2006. [82] M. Mézard and G. Parisi. The Cavity Method at Zero Temperature. Journal of Statistical Physics, 111, 2003. [83] M. Mezard, G. Parisi, and M. Virasoro. Spin Glass Theory and Beyond: An In- troduction to the Replica Method and its Applications. World Scientific Lectrures Notes in Physics, 1987. [84] M. Mezard, G. Parisi, and MA. Virasoro. Spin Glass Theory and Beyond, vol- ume 9 of World Scientific Lecture Notes in Physics. 1988. [85] M. Mézard, G. Parisi, and R. Zecchina. Analytic and Algorithmic Solultion of Random Satisfiabilty Problem . Science, 297:812—815, 2002. [86] M. Mézard and R. Zecchina. Random K-Satisfiability: from an Algorithmic Solution to a New Efficient Algorithm. Phys. Rev. E, 66256126, 2002. [87] A. H. Milam, L. Rose, A. V. Cideciyan, M. R. Barakat, W. X. Tang, N. Gupta, T. S. Aleman, A. F. Wright, E. M. Stone, and V. C. Sheffield. The Nuclear Re- ceptor NR2E3 Plays a Role in Human Retinal Photoreceptor Differentiation and Degeneration. PNAS, 99:473—478, 2002. 145 [88] R. Monasson and R. Zecchina. Entropy of the K-Satisfiability Problem. Phys— ical Review Letters, 76(21):3881, 1996. [89] R. Monasson and R. Zecchina. Statistical Mechanics of the Random K- Satisfiability Model. Phys. Rev. E, 56:1357—1370, 1997. [90] R. Monasson, R. Zecchina, S. Kirkpatrick, and L. Troyansky B. Selman. Deter- mining Computational Complexity From Characteristic Phase Transitions. Nature, 400:133—137, 1999. [91] C. Moukarzel. A Fast Algorithm for Backbones. Int. J. Mod. Phys C, 9:887, 1998. [92] C. Moukarzel and P. M. Duxbury. Stressed Backbone and Elasticity of Ran- dom Centra-Force Systems. Phys. Rev. Lett., 75:4055—4058, 1995. [93] C. Moukarzel and P. M. Duxbury. Comparison of Rigidity and Connectivity Percolation in Two Dimensions. Phys. Rev. E, 592614—2622, 1999. [94] C. Moukarzel, P. M. Duxbury, and P. L. Leath. First Order Rigidity on Cayley Trees. Phys. Rev. E, 55:5800—5811, 1997. [95] R. Mulet, A. Pagnani, M. Weight, and R. Zecchina. Coloring Random Graphs. Phys. Rev. Lett., 892268701, 2002. [96] C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Prentice Hall, 1982. [97] Juyong Park and M. E. J. Newman. The Origin of Degree Correlations in the Internet and Other Networks. Phys. Rev. E, 682026112, 2003. [98] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Academic Press, 1988. [99] B. Pittel, J. Spencer, and N. Worrnald. Sudden Emerges of a Giant K-Core in a Random Graph. J. Comb. Theory B, 672111, 1996. [100] Q. Quyang, P. D. Kaplan, S. Liu, and A. Libchaber. DNA Solution of the Maximal Clique Problem. Science, 278(446-449)2446—449, 1997. [101] S. Ravinderjit, N. Chelyapov, C. Johnson, P. W. K. Rothemund, and L Adle- man. Solution of a 20-Variab1e 3-SAT Problem on a DNA Computer. Science, 296:499-502, 2002. [102] P. W. K. Rothemund. DNA Computers: DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 27275—119, 1996. [103] S. Roweis, E. Winfree, R. Burgoyne, N .V. Chelyapov, M.F. Goodman, P.W.K. Rothemund, and L.M. Adleman. A Sticker-Based Model for DNA Compu- tation. Journal of computational biology, 52615—629, 1998. 146 [104] D. Sharon, MA. Sandberg, R.C. Caruso, E.L. Berson, and TR Dryja. Shared Mutations in NR2E3 in Enhanced S-Cone Syndrome, Goldmann-Favre Syn- drome, and Many Cases of Clumped Pigmentary Retinal Degeneration. Arch. Ophthalmology, 121:1316—1323, 2003. [105] D. Sherrington and S. Kirkpatrick. Solvable Model of a Spin-Glass. Physical Review Letters, 35(26):]792, December 1975. [106] P. Svenson and M. G. Nordahl. Relaxation in Graph Coloring and Satisfia- bility Problems. Phys. Rev. E, 5923983, 1999. [107] G. Tolouse. Theory of the Frustration Effect in Spin Glasses. Commun. Phys, 2(4)2115 — 119, 1977. [108] J. van Mourik and D. Saad. Random Graph Coloring: Statistical Physics Approach. Phys. Rev. E, 66256120, 2002. [109] B. Vandegriend and J. Culberson. The gm, Phase Transition Is Not Hard for the Hamiltonian Cycle Problem. Journal of Artificial Intelligence Research, 92219—245, 1998. [110] L. Viana and A.J. Bray. Phase Diagrams for Dilute Spin Glasses. J. of Phys. C, 1823037, 1985. [111] M. Weigt and A. K. Hartmann. Minimal Vertex Covers on Finite- Connectivity Random Araphs: A Hard-Sphere Lattice-Gas Picture. Phys. Rev. E, 632056127, 2001. [112] A. E. Wright, A. C. Reddick, S. B. Schwartz, J. S. Ferguson, T. S. Aleman, U. Kellner, B. Jurklies, A. Schuster, E. Zrenner, and B. Wissinger. Mutation Analysis of NR2E3 and NRL Genes in Enhanced 8 Cone Syndrome. Hum M utat, 242439, 2004. [113] W.T.Freeman, J.S. Yedidia, and Y. Weiss. Constructing Free Energy: Approx- imations and Generalized Belief Propagations Algorithms. Technical Report, 2002. [114] F. Y. Wu. The Potts Model. Review Modern Physics, 54(235), 1982. [115] IS. Yedidia, W.T. Freeman, and Y. Weiss. Bethe Free Energies: Kikuchi Ap- proximations and Belief Propagation Algorithms. Technical Report, 2001. [116] D. Zhu and A. O. Hero. Gene Co-Expression Network Discovery with Con- trolled Statistical and Biological Significance. ICASSP, pages 369—372, 2005. 147 ' 1111111111111iii